steffenstein.com

PSQL: Insert millions of rows fast!

Insert lots of data in your Postgres databases as fast as you can.

2020-11-11

Postgres SQL - handle data the SMART way

There are many ways to import rows into your DB. But if you have a lot of them, you don't want to insert it row by row.

Mass of CSV files

Manipulate CSV files with sed and pandas

Create a python script:

import csv
import datetime
import os
import glob
import pandas as pd
import subprocess

scids = glob.glob("/data/in/*.csv")
path = "/data/out/"
headercols=[]
print("Start: "+str(datetime.datetime.now()))
for m in scids:
    print(m)
    filename = os.path.basename(m)
    print("Running symbol: "+filename)
    subprocess.call(["sed -i -e 's/,//' "+m], shell=True)
    df = pd.read_csv(m, parse_dates= ['Date Time'], usecols = ['Date Time',' Open',' High',' Low',' Last',' Volume',' # of Trades',' Bid Volume',' Ask Volume'])
    df.columns = ['time','open','high','low','close','vol','tcount','bidv','askv']
    df['symbol'] = str(os.path.splitext(filename)[0])
    df.to_csv(os.path.join(path,filename),index=False,columns=['time','symbol','open','high','low','close','vol','tcount','bidv','askv'])
    os.remove(m)
    print("End: "+str(datetime.datetime.now()))

Insert multiple CSV files with header

Fire up the terminal

for i in *; do psql -d DATABASE -U USER -c "\COPY public.TABLE FROM '$i' DELIMITER ',' CSV HEADER;"; done
Avatar of Steffen Stein

Hi, my name is Steffen 👋 I'm a webdeveloper from Cologne, Germany 🇩🇪 Coding abroad and traveling the world 🌎 ✈️