I am creating an SQLite database via Python, and trying to understand some performance issues.
This application does 3.8M inserts. Most inserts are to a main database that ends up being around 293MB. Around 740K of the inserts with larger data records are to 25 related databases of around 600MB each. The main database and 1 subdatabase are active together, then the subdatabase is closed when it gets to around 600MB and a new one is opened. This is an archiving application. As a performance baseline, I changed the application to just write out plain sequential text files, and it takes around 62 minutes to complete. This includes all the Python overhead and the raw hard drive overhead to write out the same amount of data in a "best case" scenario: Time: 3754.16 seconds Files: 708120 Bytes: 31565490710 real 62m34.335s user 53m55.492s sys 2m58.305s If I use SQLite and commit every 5 seconds, I get this performance: Time: 11383.95 seconds Files: 708120 Bytes: 31565490710 real 189m45.061s user 55m58.638s sys 4m46.528s If I commit every 30 seconds, I get this: Time: 13021.34 seconds Files: 708120 Bytes: 31565490710 real 217m2.078s user 56m9.647s sys 4m59.850s I believe fsync/fdatasync are significant performance issues, so I thought that it should improve performance if I start a transaction, do ALL of the inserts, then commit. Since I'm starting with an empty database, the journal should stay mostly empty, the database will be built without any syncs, and when I commit, the small journal will be deleted. I also added pragma synchronous=off. (All of these test are run with pragma lockmode exclusive.) But when I tried this, the performance was slowest of all: Time: 15356.42 seconds Files: 708120 Bytes: 31565490710 real 255m57.523s user 55m51.215s sys 4m22.173s Can anyone shed light on why building a database inside a single transaction would be slower than periodically commiting? Thanks, Jim _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users