I fixed the slowness by sorting the sql statements in ascending key
order and dumping to a new text file, then running sqlite on the new
text file.  That *really* helped :-)
-Doug

On Mon, Mar 21, 2011 at 5:47 PM, Douglas Eck <d...@google.com> wrote:
> Hi all,
> I've got a reasonably large dataset to load via sql insert statements
> (3.3M inserts spread out over 4 or 5 tables)
> into an empty  database. The Db is created using sqlalchemy. I then
> drop all indexes using sql statements.
> The inserts are performed by calling sqlite3 my.db < my_sql_text_file-xx-of-20
> for each of the 20 text files (with each file containing ~120K statements).
> In each file I surround every 500 statements with
> BEGIN;
> COMMIT;
> so as to batch my statements.
> Even with dropped indexes and the BEGIN COMMIT blocking, things start
> fast but get progressively sllooooowww:
>
> I0321 16:23:55.884863  3924 gordon_generate_database.py:120] Running
> sqlite3 /tmp/test.db <
> /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00000-of-00020.
> I0321 16:24:01.305967  3924 gordon_generate_database.py:123] Ran in
> 5.42035007477 sec
> I0321 16:24:01.306075  3924 gordon_generate_database.py:120] Running
> sqlite3 /tmp/test.db <
> /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00001-of-00020.
> I0321 16:24:09.609338  3924 gordon_generate_database.py:123] Ran in
> 8.3025021553 sec
> I0321 16:24:09.609455  3924 gordon_generate_database.py:120] Running
> sqlite3 /tmp/test.db <
> /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00002-of-00020.
> I0321 16:24:43.088789  3924 gordon_generate_database.py:123] Ran in
> 33.480257988 sec
> I0321 16:24:43.088905  3924 gordon_generate_database.py:120] Running
> sqlite3 /tmp/test.db <
> /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00003-of-00020.
> I0321 16:25:20.215434  3924 gordon_generate_database.py:123] Ran in
> 37.126128912 sec
> I0321 16:25:20.215564  3924 gordon_generate_database.py:120] Running
> sqlite3 /tmp/test.db <
> /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00004-of-00020.
> I0321 16:30:39.857538  3924 gordon_generate_database.py:123] Ran in
> 319.643631935 sec
> I0321 16:30:39.857649  3924 gordon_generate_database.py:120] Running
> sqlite3 /tmp/test.db <
> /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00005-of-00020.
> I0321 16:38:17.367182  3924 gordon_generate_database.py:123] Ran in
> 457.509384871 sec
> I0321 16:38:17.367294  3924 gordon_generate_database.py:120] Running
> sqlite3 /tmp/test.db <
> /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00006-of-00020.
> I0321 16:48:04.492243  3924 gordon_generate_database.py:123] Ran in
> 587.122864962 sec
> I0321 16:48:04.492315  3924 gordon_generate_database.py:120] Running
> sqlite3 /tmp/test.db <
> /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00007-of-00020.
> I0321 16:59:39.222325  3924 gordon_generate_database.py:123] Ran in
> 694.731889009 sec
> I0321 16:59:39.222442  3924 gordon_generate_database.py:120] Running
> sqlite3 /tmp/test.db <
> /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00008-of-00020.
> I0321 17:12:18.610039  3924 gordon_generate_database.py:123] Ran in
> 759.385544062 sec
> I0321 17:12:18.610147  3924 gordon_generate_database.py:120] Running
> sqlite3 /tmp/test.db <
> /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00009-of-00020.
> I0321 17:26:25.676740  3924 gordon_generate_database.py:123] Ran in
> 847.068361998 sec
> I0321 17:26:25.676851  3924 gordon_generate_database.py:120] Running
> sqlite3 /tmp/test.db <
> /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00010-of-00020.
> I0321 17:40:35.345407  3924 gordon_generate_database.py:123] Ran in
> 849.668410063
> I0321 17:40:35.345524  3924 gordon_generate_database.py:120] Running
> sqlite3 /tmp/test.db <
> /tmp/gordon_generate_databaseu0g1_c/gordon_sql-00011-of-00020.
>
> I believe I could do this faster with .csv but I'd like to avoid the
> extra code to go from SQL to CSV.
>
> Any hints?
>
> Doug Eck
>



-- 
Douglas Eck
Research Scientist, Google
Areas: Music and Machine Learning
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to