Hmmmmmmmmmmmm, the name of the database IS SQL "LITE".  My guess is that if
it had been meant to be a monstrously scalable solution, it would have been
architected that way in the first place.

One thing I find interesting about SQLite is that this past week I tested it
with up to 360,000 records.  I saw no performance degradation inserting,
querying, updating, or deleting. I am very pleased.  In fact, for the
solution I have used it for, I have tossed SQL Server from the enterprise
distributed version.  It scales plenty well on just everyday hardware now.

The funny thing is that, for its simplicity, it is probably the best database
there is for its current area of use.  I think if you want to scale it, go
buy a mainframe! : - )

If you want to go in and modify it to be scalable, you ought to put in the
time to implement the requested features, then sell it.  You have an
excellent code base to start from.

Just some thoughts.

Allan 

-----Original Message-----
From: Avner Levy [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, November 04, 2003 3:56 AM
To: D. Richard Hipp
Cc: [EMAIL PROTECTED]
Subject: Re: [sqlite] Performance problem

Hi,
We have just finished testing the same scenario with MySQL, and amazingly it
continued to insert 1,500-3,000 rows per second even when the database had
60,000,000 records.
I don't know how this magic is done, but that's what the tests showed (on
the same machine / OS).
There is one thing I don't understand. SQLite works in an optimistic mode,
which means the changes are written to the database while the original data
is backed up to the journal.
In that case I would expect the "commit" to be about as fast as deleting a
file. But the tests show that the commit can take up to 5 minutes when the
database is large.
Does anybody have an explanation?

I saw the comments about using OS features to solve the problem.
Personally, if we choose SQLite, we need it on many platforms and as part of
a product installation. Requiring us, as users of SQLite, to configure
filesystem journaling on customers' computers is not practical.

Personally I'm a big fan of SQLite (great tool), and I really hope that this
issue can be solved (so we can choose SQLite).
I understand that solving it demands big changes, but I believe they would
take SQLite into the "big players" league.

Thanks
    Avner

D. Richard Hipp wrote:

> Avner Levy wrote:
>
>> Hi,
>> I've written the following program to test SQLite's performance for
>> a specific scenario I have.
>> I've used most of the tricks I've found, but when the database gets big
>> the performance still becomes unacceptable.
>> The performance problem happens, of course, only if the indexes are
>> defined.
>> Since I need the indexes, is there any other option that can make it
>> faster?
>> When the database gets big, the insert rate drops to 50 rows per second.
>> Thanks in advance.
>>
>> * It seems that the larger the DB gets, the journal copies more and 
>> more pages out of the database into the file, which kills performance.
>>
>
> Each transaction in your test involves 30000 INSERTs into a single 
> table defined (roughly) as follows:
>
>     CREATE TABLE test(a,b);
>     CREATE INDEX idx1 ON test(a);
>     CREATE INDEX idx2 ON test(b);
>
> In these INSERTs, the values for test.a are non-decreasing.  That 
> means that new entries into the "test" table and into the "idx1" index 
> always go into the same place in their B*Trees - specifically at the end.
> But the values for test.b are random, which means that entries into
> idx2 are distributed throughout its B*Tree.  Because changes to test 
> and idx1 are localized, only a handful of disk blocks are changed 
> (many others are added, but few existing ones are changed) but the 
> changes to idx2 are not localized, which means that nearly every disk 
> block associated with idx2 must be changed.  Modifying the thousands
> of disk blocks associated with idx2 is what is taking so long.  I do 
> not know anything I can do inside SQLite to make it go any faster.  I 
> do not know of any other way of creating an index that could work 
> around this issue.
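
[Just to make the scenario concrete: below is a rough, untested sketch in
Python (standard sqlite3 module) of the pattern Richard describes, batches
of 30000 INSERTs where test.a is non-decreasing and test.b is random. This
is my own approximation, not Richard's TCL script; the batch count and
value ranges are invented.]

    import random
    import sqlite3

    conn = sqlite3.connect("test.db")
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS test(a, b);
        CREATE INDEX IF NOT EXISTS idx1 ON test(a);
        CREATE INDEX IF NOT EXISTS idx2 ON test(b);
    """)

    a = 0
    for batch in range(10):        # number of 30000-row transactions is arbitrary
        rows = []
        for _ in range(30000):
            a += random.choice((0, 1))                   # test.a is non-decreasing
            rows.append((a, random.randrange(1 << 30)))  # test.b is random
        with conn:                 # one transaction per batch of 30000 INSERTs
            conn.executemany("INSERT INTO test(a, b) VALUES (?, ?)", rows)
    conn.close()
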
>
> If you disable idx2, you will see that the inserts go much faster.  I 
> also observe that the journal file is about 1/3rd the size of the 
> database, which is consistent with the observation that every disk 
> block associated with idx2 needs to be changed.  My TCL test script 
> for this situation is appended to this message.
>
> If you really are doing millions of INSERTs prior to doing any 
> SELECTs, you might consider deferring the creation of idx2 until after 
> all the data has been put into the database.  Generally speaking, it 
> is a little faster to create an index for existing data than it is to 
> create the index at the same time that the data is being inserted.  
> But the difference is not that great.  And, of course, this won't help 
> if in reality you need to do some SELECTs along the way...
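
[A minimal sketch of that deferred-index approach, under the same assumed
Python setup as above: bulk-load with only the localized index idx1 defined,
then create idx2 once, in one pass, at the end.]

    import random
    import sqlite3

    conn = sqlite3.connect("test.db")
    with conn:
        conn.execute("CREATE TABLE IF NOT EXISTS test(a, b)")
        conn.execute("CREATE INDEX IF NOT EXISTS idx1 ON test(a)")  # localized, cheap to maintain

    with conn:   # bulk-load while idx2 does not yet exist
        conn.executemany(
            "INSERT INTO test(a, b) VALUES (?, ?)",
            ((i, random.randrange(1 << 30)) for i in range(1000000)),
        )

    with conn:   # build the non-localized index over the existing data
        conn.execute("CREATE INDEX IF NOT EXISTS idx2 ON test(b)")
    conn.close()
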
>
> I'm curious to know how other database engines deal with this problem.
> Have you tried a similar experiment on MySQL, or PostgreSQL, or even 
> Oracle?  How did they perform?
>
> Can any readers suggest ways that I have not thought of for making 
> large numbers of non-localized INSERTs go faster?




---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

