I appreciate everyone's thoughts about this.

Knowing that larger batch sizes help is interesting. Unfortunately, we don't
always control the batch size. We're using 1,000 as an optimistic estimate,
but records arrive as they arrive, and we may simply have to commit after a
while regardless of how many have accumulated.
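
For concreteness, our write path boils down to something like the sketch
below: one prepared INSERT reused inside explicit transactions, with a
COMMIT every batch_size rows. The table and columns here are invented for
illustration, not our real schema, and error handling is abbreviated.

#include <sqlite3.h>

/* Insert nrows rows, committing every batch_size of them.
   Table meta(key, value) is a placeholder, not our real schema. */
static int insert_batched(sqlite3 *db, int nrows, int batch_size)
{
    sqlite3_stmt *stmt;
    int i, rc;

    rc = sqlite3_prepare_v2(db,
            "INSERT INTO meta(key, value) VALUES(?1, ?2)", -1, &stmt, NULL);
    if (rc != SQLITE_OK) return rc;

    sqlite3_exec(db, "BEGIN", NULL, NULL, NULL);
    for (i = 0; i < nrows; i++) {
        sqlite3_bind_int(stmt, 1, i);
        sqlite3_bind_text(stmt, 2, "payload", -1, SQLITE_STATIC);
        sqlite3_step(stmt);
        sqlite3_reset(stmt);

        /* Commit and start a new transaction every batch_size rows. */
        if ((i + 1) % batch_size == 0) {
            sqlite3_exec(db, "COMMIT", NULL, NULL, NULL);
            sqlite3_exec(db, "BEGIN", NULL, NULL, NULL);
        }
    }
    sqlite3_exec(db, "COMMIT", NULL, NULL, NULL);
    sqlite3_finalize(stmt);
    return SQLITE_OK;
}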

Knowing that more OS file cache or a faster disk helps is also interesting.
Unfortunately, switching to SSDs is non-trivial. We will have a whole fleet
of machines, each storing several hundred terabytes, and the SQLite
databases hold metadata about that data. We might be able to dedicate one
SSD per machine to the metadata; we haven't explored that yet. We also can't
count on much OS disk cache, since it will probably be consumed by writes of
things other than this metadata.
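
If we do carve out a dedicated SSD or a small, fixed slice of memory for
the metadata, we'd probably start from pragmas along these lines. The
specific values are guesses we would still have to measure, not
recommendations.

#include <sqlite3.h>

/* Tuning we may try for a memory- and I/O-constrained writer.
   Values are untested guesses for our workload. */
static void apply_pragmas(sqlite3 *db)
{
    /* Bound SQLite's own page cache (negative value = size in KiB). */
    sqlite3_exec(db, "PRAGMA cache_size = -65536;", NULL, NULL, NULL);  /* ~64 MiB */

    /* WAL batches fsyncs and keeps readers out of the writer's way. */
    sqlite3_exec(db, "PRAGMA journal_mode = WAL;", NULL, NULL, NULL);

    /* NORMAL trades a little durability for fewer fsyncs. */
    sqlite3_exec(db, "PRAGMA synchronous = NORMAL;", NULL, NULL, NULL);
}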

Still, all of your observations are useful.

We are comparing against leveldb, which seems to have much better write
performance even in a limited-memory situation. Of course, it offers much
less than SQLite: it is a partially-ordered key/value store rather than a
relational database.
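
For comparison, the leveldb side of our benchmark is essentially just puts
through its C API, something like the following. The database path and key
are placeholders.

#include <stdio.h>
#include <leveldb/c.h>

/* Minimal leveldb write: one key/value put, no schema, no SQL. */
int main(void)
{
    char *err = NULL;

    leveldb_options_t *opts = leveldb_options_create();
    leveldb_options_set_create_if_missing(opts, 1);

    leveldb_t *db = leveldb_open(opts, "/tmp/meta-test", &err);
    if (err != NULL) { fprintf(stderr, "open: %s\n", err); leveldb_free(err); return 1; }

    leveldb_writeoptions_t *wopts = leveldb_writeoptions_create();
    leveldb_put(db, wopts, "key-0001", 8, "payload", 7, &err);
    if (err != NULL) { fprintf(stderr, "put: %s\n", err); leveldb_free(err); }

    leveldb_writeoptions_destroy(wopts);
    leveldb_close(db);
    leveldb_options_destroy(opts);
    return 0;
}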

Michael Black writes:

Referencing the C program I sent earlier... I've found that a COMMIT every
1M records does best. (I had an extra zero on my earlier 100,000 figure,
which gives the EKG appearance.)
I averaged 25,000 inserts/sec over 50M records with no big knees in the
performance (there is a noticeable knee on the commit around 12M records),
but the average performance curve is pretty smooth.
Commit in smaller batches than that and you're flushing out the index too
often, which seems to cause an awful lot of disk thrashing.
During the 1M-record commit the CPU drops to a couple of percent and the
disk I/O is pretty constant, albeit slow.

P.S. I'm using 3.7.15.1
