I have no user-defined indices in my db, and want to do a largish number
of inserts (a few billion). I COMMIT every 10 million INSERTs or so --
so if my app dies (or I want to kill it) I don't have to start over.
Row sizes are small, a couple hundred bytes across 15ish columns.
The primary key on the table is a pair of integers.
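To make the loop concrete, here's roughly what it looks like. This is
only a sketch: the table, column, and variable names are made up, the
real table has about 15 small columns, and "db" is an open handle from
the Tcl API (the open itself is sketched in the PS):

    db eval {
        CREATE TABLE IF NOT EXISTS samples (
            run_id  INTEGER,
            seq     INTEGER,
            payload TEXT,            -- plus a dozen or so more small columns
            PRIMARY KEY (run_id, seq)
        )
    }

    set n 0
    db eval {BEGIN}
    foreach row $rows {
        lassign $row run seq data
        db eval {
            INSERT INTO samples (run_id, seq, payload)
            VALUES ($run, $seq, $data)
        }
        # commit every 10 million rows or so, then start a fresh transaction
        if {[incr n] % 10000000 == 0} {
            db eval {COMMIT}
            db eval {BEGIN}
        }
    }
    db eval {COMMIT}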
After a few BEGIN/INSERT/COMMIT cycles the journal file grows
mid-transaction to a pretty big size, e.g. around 1 GB against a 14 GB
db file, meaning (right?) that SQLite wrote to ~1 GB of the existing
pages during that round of INSERTs.
This means the time spent doing a batch of INSERTs goes up with the
number of existing rows, which is a big frowny-face.*
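(If it helps pin that down: the rollback journal is just the db file
name with "-journal" appended, so something like this from a second Tcl
shell shows its size mid-batch. The filename here is made up:)

    set jnl mydata.db-journal
    if {[file exists $jnl]} {
        puts "journal: [format %.1f [expr {[file size $jnl] / 1048576.0}]] MB"
    }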
I'd really love to avoid writing a big journal file. And I'd love to
avoid doing a billion-row insert in one transaction.
So, if I'm only doing INSERTs, they're always brand-new rows, and there
are no indices, why does SQLite need to update existing pages in
the DB? (Is it updating an auto-index based on the primary key?) Is
there a way to avoid it?
Thanks!
Eric
PS I'm using SQLite 3.6.23.1 with defaults, except PRAGMA
synchronous=OFF and foreign_keys=1 (my only foreign key refs are to
tiny tables); the setup is sketched below. I'm using the Tcl API, which
probably doesn't matter for this question.
% db eval {pragma compile_options}
ENABLE_FTS3 ENABLE_LOCKING_STYLE=0 TEMP_STORE=1 THREADSAFE=1
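The setup, roughly -- the db filename is made up, and everything not
shown is left at its defaults:

    package require sqlite3
    sqlite3 db mydata.db
    db eval {PRAGMA synchronous = OFF}
    db eval {PRAGMA foreign_keys = 1}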
* I found out today my users are running a RAID 5 array on the
deployment box, so it's an even bigger frowny-face than it would have
been otherwise.
--
Eric A. Smith
We don't like their sound, and guitar music is on the way out.
-- Decca Recording Co. rejecting the Beatles, 1962.