I’m importing a large data set with a lot of rows — an entire Wikipedia dump, about 60GB, one article per row — into a brand new SQLite database in WAL mode. What’s the fastest way to import it?
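(For concreteness, here's a minimal sketch of the kind of setup I mean, using Python's sqlite3 module; the file name, table schema, and pragma choices are stand-ins rather than my actual code.)

import sqlite3

# Fresh database in WAL mode; relax durability a bit for a one-shot, rebuildable import.
db = sqlite3.connect("wikipedia.db", isolation_level=None)  # autocommit: transactions managed by hand
db.execute("PRAGMA journal_mode=WAL")
db.execute("PRAGMA synchronous=NORMAL")  # fewer fsyncs than FULL; commonly paired with WAL
db.execute("CREATE TABLE IF NOT EXISTS articles (title TEXT, body TEXT)")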
I started with one big transaction, but noticed that (of course) the WAL file was growing rapidly while the main database file stayed tiny. I figured this would become inefficient, so I stopped the run and adjusted my code to commit and re-open a transaction every 10,000 rows. With that, the import started quickly, but as time went on the commits took longer and longer, until the process was spending most of its time committing. (I wasn't able to finish the job; it ran into an unrelated fatal error in my code about ⅔ of the way through.)

Would it have been faster to use a single transaction? Even if the commit at the end is epic-length, it wouldn't be rewriting the same b-tree nodes over and over again. And if so, would periodic WAL checkpoints help?

—Jens
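P.S. In case it helps to see the shape of it, here's a rough sketch of the batched loop I described, with a spot marked for an explicit WAL checkpoint. It assumes the autocommit-mode connection and the articles table from the snippet above; the batch and checkpoint intervals are just placeholders, not anything I've measured.

import sqlite3
from typing import Iterable, Tuple

BATCH_SIZE = 10_000        # commit every N rows, as in my run
CHECKPOINT_EVERY = 50      # hypothetical: checkpoint once every 50 batches

def import_articles(db: sqlite3.Connection, rows: Iterable[Tuple[str, str]]) -> None:
    """Insert (title, body) rows in batched transactions, checkpointing the WAL periodically."""
    cur = db.cursor()
    cur.execute("BEGIN")
    for i, (title, body) in enumerate(rows, start=1):
        cur.execute("INSERT INTO articles (title, body) VALUES (?, ?)", (title, body))
        if i % BATCH_SIZE == 0:
            cur.execute("COMMIT")  # the step that kept getting slower and slower
            if (i // BATCH_SIZE) % CHECKPOINT_EVERY == 0:
                # Fold WAL pages back into the main database file so the WAL
                # doesn't keep growing between commits.
                cur.execute("PRAGMA wal_checkpoint(TRUNCATE)")
            cur.execute("BEGIN")
    cur.execute("COMMIT")

In the real run the rows come from streaming the dump file, but any iterator of (title, body) pairs would do here.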