I fixed the slowness by sorting the sql statements in ascending key order and dumping to a new text file, then running sqlite on the new text file. That *really* helped :-) -Doug
On Mon, Mar 21, 2011 at 5:47 PM, Douglas Eck <d...@google.com> wrote: > Hi all, > I've got a reasonably large dataset to load via sql insert statements > (3.3M inserts spread out over 4 or 5 tables) > into an empty database. The Db is created using sqlalchemy. I then > drop all indexes using sql statements. > The inserts are performed by calling sqlite3 my.db < my_sql_text_file-xx-of-20 > for each of the 20 text files (with each file containing ~120K statements). > In each file I surround every 500 statements with > BEGIN; > COMMIT; > so as to batch my statements. > Even with dropped indexes and the BEGIN COMMIT blocking, things start > fast but get progressively sllooooowww: > > I0321 16:23:55.884863 3924 gordon_generate_database.py:120] Running > sqlite3 /tmp/test.db < > /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00000-of-00020. > I0321 16:24:01.305967 3924 gordon_generate_database.py:123] Ran in > 5.42035007477 sec > I0321 16:24:01.306075 3924 gordon_generate_database.py:120] Running > sqlite3 /tmp/test.db < > /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00001-of-00020. > I0321 16:24:09.609338 3924 gordon_generate_database.py:123] Ran in > 8.3025021553 sec > I0321 16:24:09.609455 3924 gordon_generate_database.py:120] Running > sqlite3 /tmp/test.db < > /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00002-of-00020. > I0321 16:24:43.088789 3924 gordon_generate_database.py:123] Ran in > 33.480257988 sec > I0321 16:24:43.088905 3924 gordon_generate_database.py:120] Running > sqlite3 /tmp/test.db < > /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00003-of-00020. > I0321 16:25:20.215434 3924 gordon_generate_database.py:123] Ran in > 37.126128912 sec > I0321 16:25:20.215564 3924 gordon_generate_database.py:120] Running > sqlite3 /tmp/test.db < > /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00004-of-00020. > I0321 16:30:39.857538 3924 gordon_generate_database.py:123] Ran in > 319.643631935 sec > I0321 16:30:39.857649 3924 gordon_generate_database.py:120] Running > sqlite3 /tmp/test.db < > /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00005-of-00020. > I0321 16:38:17.367182 3924 gordon_generate_database.py:123] Ran in > 457.509384871 sec > I0321 16:38:17.367294 3924 gordon_generate_database.py:120] Running > sqlite3 /tmp/test.db < > /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00006-of-00020. > I0321 16:48:04.492243 3924 gordon_generate_database.py:123] Ran in > 587.122864962 sec > I0321 16:48:04.492315 3924 gordon_generate_database.py:120] Running > sqlite3 /tmp/test.db < > /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00007-of-00020. > I0321 16:59:39.222325 3924 gordon_generate_database.py:123] Ran in > 694.731889009 sec > I0321 16:59:39.222442 3924 gordon_generate_database.py:120] Running > sqlite3 /tmp/test.db < > /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00008-of-00020. > I0321 17:12:18.610039 3924 gordon_generate_database.py:123] Ran in > 759.385544062 sec > I0321 17:12:18.610147 3924 gordon_generate_database.py:120] Running > sqlite3 /tmp/test.db < > /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00009-of-00020. > I0321 17:26:25.676740 3924 gordon_generate_database.py:123] Ran in > 847.068361998 sec > I0321 17:26:25.676851 3924 gordon_generate_database.py:120] Running > sqlite3 /tmp/test.db < > /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00010-of-00020. > I0321 17:40:35.345407 3924 gordon_generate_database.py:123] Ran in > 849.668410063 > I0321 17:40:35.345524 3924 gordon_generate_database.py:120] Running > sqlite3 /tmp/test.db < > /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00011-of-00020. > > I believe I could do this faster with .csv but I'd like to avoid the > extra code to go from SQL to CSV. > > Any hints? > > Doug Eck > -- Douglas Eck Research Scientist, Google Areas: Music and Machine Learning _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users