On 17/09/2007 1:07 PM, Joe Wilson wrote:
--- "D. Richard Hipp" <[EMAIL PROTECTED]> wrote:
I have been struggling with the performance of insertions in SQLite.
Here we have a very simple case:
A table with an integer autoincrement primary key and a text field that
is unique.
CREATE TABLE my (id INTEGER PRIMARY KEY, url);
CREATE UNIQUE INDEX myurl ON my(url);
My application requires inserting up to 10 million records in batches
of 20 thousand records.
For each group of 20000 records, first insert them into a TEMP table.
Call the temp table t1. Then transfer the records to the main table
as follows:
INSERT OR IGNORE INTO my(url) SELECT url FROM t1 ORDER BY url;
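The staging technique above can be sketched in Python with the stdlib sqlite3 module. This is a minimal illustration, not DRH's exact code; the table names match the thread, the URL values and the in-memory database are made up for the example. The point of the ORDER BY is that sorted inserts touch the index's pages sequentially rather than at random:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a file path in practice; :memory: for the sketch
conn.executescript("""
    CREATE TABLE my (id INTEGER PRIMARY KEY, url);
    CREATE UNIQUE INDEX myurl ON my(url);
    CREATE TEMP TABLE t1 (url);
""")

def insert_batch(conn, urls):
    """Stage one batch in the temp table, then transfer in sorted order
    so the unique index is updated in key order."""
    with conn:  # one transaction per batch
        conn.execute("DELETE FROM t1")
        conn.executemany("INSERT INTO t1(url) VALUES (?)", ((u,) for u in urls))
        conn.execute("INSERT OR IGNORE INTO my(url) SELECT url FROM t1 ORDER BY url")

insert_batch(conn, ["http://a.example", "http://b.example", "http://a.example"])
print(conn.execute("SELECT COUNT(*) FROM my").fetchone()[0])  # duplicate ignored -> 2
```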
That temp-table staging technique gave me no performance improvement
in my testing -- it was actually slower:
http://www.mail-archive.com/sqlite-users@sqlite.org/msg22143.html
Mind you, the table I was testing against had 4 indexes, whereas the above
table has 2. I also wasn't using "OR IGNORE". There might be a difference.
Just setting pragma cache_size to a huge value and inserting into
the table normally in large batches resulted in better performance in
my case. It may have already been mentioned, but a large database
page_size value also helps minimize disk writes.
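Those two pragmas can be combined with large batched transactions as follows. This is a sketch under assumed settings: the 8192-byte page size, the 100000-page cache, and the generated URLs are illustrative numbers, not values from the thread. Note that page_size must be set before any table is created in a new database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# page_size takes effect only on a database with no tables yet
conn.execute("PRAGMA page_size = 8192")
# cache_size is measured in pages: 100000 * 8 KiB is roughly 800 MB of cache
conn.execute("PRAGMA cache_size = 100000")
conn.executescript("""
    CREATE TABLE my (id INTEGER PRIMARY KEY, url);
    CREATE UNIQUE INDEX myurl ON my(url);
""")
# one transaction per 20000-row batch, as in the OP's workload
with conn:
    conn.executemany("INSERT OR IGNORE INTO my(url) VALUES (?)",
                     ((f"http://example.com/{i}",) for i in range(20000)))
print(conn.execute("SELECT COUNT(*) FROM my").fetchone()[0])  # 20000
```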
A couple of thoughts:
OTTOMH, time to search the index is approximately O(D * log(K)), where
D = depth of the tree and K = number of keys per block, and K ** D is O(N).
So:
(1) Big block means big K and thus small D
(2) Long keys (like URLs!!) means small K and thus big D
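The effect of points (1) and (2) can be illustrated with some back-of-the-envelope arithmetic. The per-key overhead, the 8-byte integer key, and the 100-byte URL key below are invented round numbers for illustration, not SQLite's actual on-disk format:

```python
import math

def btree_depth(n_keys, page_size, key_bytes, overhead=16):
    """Rough estimate: K keys fit per page, and depth D satisfies K ** D >= N."""
    k = max(2, page_size // (key_bytes + overhead))  # keys per block
    return math.ceil(math.log(n_keys) / math.log(k))

n = 10_000_000
# Short keys on big pages: big K, small D
print(btree_depth(n, page_size=8192, key_bytes=8))    # -> 3
# Long URL keys on small pages: small K, big D
print(btree_depth(n, page_size=1024, key_bytes=100))  # -> 8
```

With the long keys, every search (and every uniqueness check on insert) walks nearly three times as many levels, which is why shrinking the key matters.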
Further to point (2), the OP seems unsure about whether his URLs are
unique or not. Perhaps storing another column containing a 64-bit hash,
with an index on that column, might be the way to go -- the shorter key
might well outweigh the extra cost of checking for duplicates.
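One possible shape of that hash-column idea, sketched in Python: the choice of SHA-1 truncated to 8 bytes is an assumption (any well-distributed 64-bit hash would do), and the collision check via a comparison on the URL itself is the "extra cost" mentioned above:

```python
import hashlib
import sqlite3

def url_hash64(url: str) -> int:
    """64-bit hash of the URL: first 8 bytes of SHA-1, shifted into
    SQLite's signed 64-bit INTEGER range."""
    h = int.from_bytes(hashlib.sha1(url.encode()).digest()[:8], "big")
    return h - 2**63

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my (id INTEGER PRIMARY KEY, url TEXT, urlhash INTEGER);
    CREATE INDEX myhash ON my(urlhash);  -- short integer keys, shallow index
""")

def insert_unique(conn, url):
    h = url_hash64(url)
    # 64-bit collisions are possible, so confirm against the URL itself
    dup = conn.execute(
        "SELECT 1 FROM my WHERE urlhash = ? AND url = ?", (h, url)).fetchone()
    if dup is None:
        conn.execute("INSERT INTO my(url, urlhash) VALUES (?, ?)", (url, h))

for u in ["http://a.example", "http://b.example", "http://a.example"]:
    insert_unique(conn, u)
print(conn.execute("SELECT COUNT(*) FROM my").fetchone()[0])  # 2
```

The unique index on the long url column is replaced by a plain index on the 8-byte hash, so the tree stays shallow even at 10 million rows.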