
By Easton, 2 June 2022

Merge big CSV files with Pandas and Python

I deal with large CSV files at work, mostly database dumps. Google Sheets and Numbers/Excel just can't keep up with formula changes across 300k+ lines. What do we do? We use Python.

The use case here is replicating the VLOOKUP function with a left join. We want to pull the matching values from our reference CSV file while keeping every row of our own data. Reference rows that match nothing are left out (keeping those as well would make it a full outer join).
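To make the difference between the two joins concrete, here is a small sketch on in-memory data. The frames, column names other than "zip", and values are made up for illustration and are not the real dumps.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
data = pd.DataFrame({
    "name": ["a", "b", "c"],
    "zip": ["10001", "94105", "00000"],  # "00000" has no match in the reference
})
reference = pd.DataFrame({
    "zip": ["10001", "94105", "60601"],  # "60601" is not used by any data row
    "lat": [40.75, 37.79, 41.89],
    "long": [-73.99, -122.39, -87.62],
})

# Left join: every row of `data` is kept; unmatched rows get NaN for lat/long.
left = data.merge(reference, on="zip", how="left")

# Full outer join: the unused reference row ("60601") is kept as well.
outer = data.merge(reference, on="zip", how="outer")

print(len(left), len(outer))
```

The left join returns one row per row of our data, which is the VLOOKUP behavior we are after. The outer join also drags in reference rows we never asked about.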

Here's what our data looks like:

We want to look up the lat/long in our reference file and join on zip code. Here's what that data looks like:

Note the matching "zip" column names.

Here's the code:
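A minimal sketch of that left join on zip code follows. The function name `merge_on_zip` and its path arguments are hypothetical, and it assumes both files have a column named "zip". Reading the zip column as text and dropping duplicate reference zips are precautions added in this sketch, not steps confirmed above.

```python
import pandas as pd


def merge_on_zip(data_path, reference_path, output_path):
    """Left-join the reference CSV onto the data CSV using the shared "zip" column.

    Hypothetical helper; all three paths are supplied by the caller.
    """
    # Read zip codes as text so leading zeros (e.g. "02134") are not lost.
    data = pd.read_csv(data_path, dtype={"zip": str})
    reference = pd.read_csv(reference_path, dtype={"zip": str})

    # Assumption: one row per zip is wanted from the reference file.
    # Duplicate zips there would otherwise multiply rows in the result.
    reference = reference.drop_duplicates(subset="zip")

    # how="left" keeps every row of `data`, like VLOOKUP filling in a column.
    merged = data.merge(reference, on="zip", how="left")

    merged.to_csv(output_path, index=False)
    return merged
```

Calling something like `merge_on_zip("data.csv", "reference.csv", "merged.csv")` (file names are placeholders) would write the joined file. Rows whose zip is missing from the reference end up with empty lat/long cells.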
