Hi all, I have a reasonably large dataset to load via SQL INSERT statements (3.3M inserts spread over 4 or 5 tables) into an empty database. The DB is created using SQLAlchemy, and I then drop all indexes with SQL statements. The inserts are performed by running sqlite3 my.db < my_sql_text_file-xx-of-20 for each of the 20 text files (each file containing ~120K statements). In each file I wrap every 500 statements in a BEGIN; ... COMMIT; block to batch them. Even with the indexes dropped and the BEGIN/COMMIT batching, things start fast but get progressively slower:
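For concreteness, the batching in each file looks roughly like this (a simplified sketch; the statement list and output filename below are stand-ins, and the real run writes 20 files of ~120K statements each):

```python
# Sketch of how each SQL file is batched: every 500 INSERT
# statements get wrapped in one BEGIN; ... COMMIT; transaction.
BATCH_SIZE = 500

def write_batched_sql(statements, path):
    with open(path, "w") as f:
        for i, stmt in enumerate(statements):
            if i % BATCH_SIZE == 0:
                if i > 0:
                    f.write("COMMIT;\n")
                f.write("BEGIN;\n")
            f.write(stmt + "\n")
        if statements:
            f.write("COMMIT;\n")

# Stand-in data: the real files hold ~120K generated INSERTs.
statements = ["INSERT INTO t VALUES (%d);" % n for n in range(1200)]
write_batched_sql(statements, "batched.sql")
```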
I0321 16:23:55.884863 3924 gordon_generate_database.py:120] Running sqlite3 /tmp/test.db < /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00000-of-00020.
I0321 16:24:01.305967 3924 gordon_generate_database.py:123] Ran in 5.42035007477 sec
I0321 16:24:01.306075 3924 gordon_generate_database.py:120] Running sqlite3 /tmp/test.db < /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00001-of-00020.
I0321 16:24:09.609338 3924 gordon_generate_database.py:123] Ran in 8.3025021553 sec
I0321 16:24:09.609455 3924 gordon_generate_database.py:120] Running sqlite3 /tmp/test.db < /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00002-of-00020.
I0321 16:24:43.088789 3924 gordon_generate_database.py:123] Ran in 33.480257988 sec
I0321 16:24:43.088905 3924 gordon_generate_database.py:120] Running sqlite3 /tmp/test.db < /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00003-of-00020.
I0321 16:25:20.215434 3924 gordon_generate_database.py:123] Ran in 37.126128912 sec
I0321 16:25:20.215564 3924 gordon_generate_database.py:120] Running sqlite3 /tmp/test.db < /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00004-of-00020.
I0321 16:30:39.857538 3924 gordon_generate_database.py:123] Ran in 319.643631935 sec
I0321 16:30:39.857649 3924 gordon_generate_database.py:120] Running sqlite3 /tmp/test.db < /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00005-of-00020.
I0321 16:38:17.367182 3924 gordon_generate_database.py:123] Ran in 457.509384871 sec
I0321 16:38:17.367294 3924 gordon_generate_database.py:120] Running sqlite3 /tmp/test.db < /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00006-of-00020.
I0321 16:48:04.492243 3924 gordon_generate_database.py:123] Ran in 587.122864962 sec
I0321 16:48:04.492315 3924 gordon_generate_database.py:120] Running sqlite3 /tmp/test.db < /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00007-of-00020.
I0321 16:59:39.222325 3924 gordon_generate_database.py:123] Ran in 694.731889009 sec
I0321 16:59:39.222442 3924 gordon_generate_database.py:120] Running sqlite3 /tmp/test.db < /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00008-of-00020.
I0321 17:12:18.610039 3924 gordon_generate_database.py:123] Ran in 759.385544062 sec
I0321 17:12:18.610147 3924 gordon_generate_database.py:120] Running sqlite3 /tmp/test.db < /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00009-of-00020.
I0321 17:26:25.676740 3924 gordon_generate_database.py:123] Ran in 847.068361998 sec
I0321 17:26:25.676851 3924 gordon_generate_database.py:120] Running sqlite3 /tmp/test.db < /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00010-of-00020.
I0321 17:40:35.345407 3924 gordon_generate_database.py:123] Ran in 849.668410063
I0321 17:40:35.345524 3924 gordon_generate_database.py:120] Running sqlite3 /tmp/test.db < /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00011-of-00020.

I believe I could do this faster with .csv, but I'd like to avoid the extra code to go from SQL to CSV. Any hints?

Doug Eck

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users