On Wed, 15 Dec 2004 08:47:34 -0500, D. Richard Hipp <[EMAIL PROTECTED]> wrote:
> Christopher Petrilli wrote:
> > Has anyone had any experience in storing a million or more rows in a
> > SQLite3 database?  I've got a database that I've been building, which
> > gets 250 inserts/second, roughly, and which has about 3M rows in it.
> > At that point, the CPU load is huge.
> >
> 
> The other thing to remember is that when a table has 5 separate
> indices (4 explicit indices + 1 primary key) then each INSERT or
> DELETE operation is really doing 6 inserts or deletes.  There
> is one insert/delete for the main table and one for each of the
> indices.  So you would expect insert performance to be at least
> six times slower on a table with 5 indices versus a table with
> no indices.
> 
> The other thing to remember is that when you insert on a table,
> the new row goes at the very end of the table, which is typically
> very fast.  (The BTree backend for SQLite is optimized for the
> common case of inserting a row at the end of a table.)  But an
> insert into an index will usually occur somewhere in the middle
> of the indice, and thus will likely involve some rebalancing
> operations to make space for the new entry and to keep the tree
> level.  Inserting into an index is thus typically a little
> slower than inserting into a table.  Hence we expect inserting
> into a table with 5 indices to be more than 6 times slower than
> inserting into a table with no indices.
> 
> So the "base insert rate" of SQLite is about 25000 rows/second.
> Divide by 6 because you have 5 indices.  Divide by 4 because
> you are using synchronous=OFF instead of BEGIN...COMMIT.  This
> leaves us with an expected insert rate in your application of
> about 1000 rows/second.  We are still missing a factor of 4.
> 
> Could there be a hardware difference?  What kind of equipment
> are you running on?

Actually, I do wrap inside BEGIN/COMMIT, as that was the first thing I
tried.  That created a HUGE increase in performance.  There's some
numbers up on my blog, BTW:

http://www.amber.org/~petrilli/archive/2004/11/28/sqlite_insertion_performance.html

Notice the odd behavior?  Anyway...

Past thatt, I've decided to do some of my own database manipulations,
and only add the additional indices after the database has been
"closed" for insertions.  That seems to help a lot, and doesn't
require but a couple seconds (I throw it into a separate thread to
do).

Hardware is a AMD64/3000, 2Gb RAM, SATA drives, Fedora Core 3

One thing I've noticed is that if I turn of synchronous, the
filesystem slowly slows down, which is fun, but it doesn't do so
enough that it's a major issue.

I'm using the APSW wrapper for Python, which is basically a very thin
wrapper over the basic API, and does nothing special, so I'm 99% sure
it's not that.  You'll notice in the web page that performance seems
to change radically at several points.

Chris

-- 
| Christopher Petrilli
| [EMAIL PROTECTED]

Reply via email to