Re: [sqlite] SQLite performance with mid-size databases

2004-06-17 Thread ben . carlyle
Raymond,

So far on this list I have only seen a reports of scalability problems 
with sqlite when the real problems were specific queries being given to 
sqlite that were constructed in a way that sqlite does not process well. 
In every such case I can recall an alternative form of the query was able 
to be produced (usually by DRH personally) that did not exhibit the 
problem. Knowing the code, there is no reason for sqlite to be scaling 
linearly in queries unless it is being forced to do table scans.

It is true that sqlite doesn't optimise queries as well as major 
databases. It's not designed to. Users must take some care in constructing 
their queries and ensuring the queries are suited to sqlite's design if 
performance might be a problem. At the same time, such queries usually 
perform much better than those of major databases because of the vastly 
reduced optimisation and connection overhead that sqlite affords.

As with every database technology. If you care about performance you have 
to understand some things about the design of your underlying technology. 
For sqlite the design is simple and the experts are extremely responsive. 
If you're having problems and can provide a clear, specifc description of 
your problem you will get help. While queries such as "I think sqlite 
scales linearly, but I can't tell you want queries I'm issuing to make 
that happen" and "My scroll list seems slow, but I don't know what is 
happening between the GUI and the database or what queries are going on" 
are unlikely to solicit helpful response, "I am issuing this query on that 
database schema with about 100 thousand rows each carrying 2k of data" is 
likely to be something members of this list can help you solve.

I'd like to take this soapbox opportunity to again thank drh and his 
associates for the wonderful work they put into sqlite and their genine 
personal commitment to this project. Sqlite is a great product, and a 
well-targeted one.

Benjamin





Raymond Irving <[EMAIL PROTECTED]>
18/06/2004 12:09 AM

 
To: [EMAIL PROTECTED]
        cc: 
    Subject:    Re: [sqlite] SQLite performance with mid-size databases



Well this does not sound good at all. I would think
that SQLite would at least do a better job at queries.

The issues with performance and scalability is of
great concern. I was planning on create some new apps
that use SQLite, but now I'm wondering is this little
database is capable of handle over 2 Gigabytes (GB) of
data even though the docs says it can handle up to 2
Terabytes (TB).

Does it really make sense to cut back on performance
in order to keep the library size small?

Will SQLite 3.0 fix these problems?

__
Raymond Irving

--- "D. Richard Hipp" <[EMAIL PROTECTED]> wrote:
> Richard Kuo wrote:
> > 
> > I suspect some unnecessary disk access has to be
> the problem...despite
> > the small amount of new guide information being
> queried out, disk bytes
> > read is several times higher than with MS access
> and scrolling back over
> > previously accessed areas of data is visibly
> faster...indicating that
> > the disk cache is very favorably impacting the
> speed of the queries.
> > 
> 
> If each of your rows contains 2K of data, that means
> each database entry
> is using about 2 overflow pages.  You can change
> this by increasing the
> page size.  Try recompiling SQLite after changing
> the SQLITE_PAGE_SIZE
> macro to 16384.
> 
> You might also trying switching to SQLite version
> 3.0.0 which will be
> released tomorrow.
> 
> Also tomorrow, I will be making available a database
> analysis tool
> for version 2.8 databases that will help us to
> better understand
> how information is stored on disk for your database,
> and possibly
> provide some clues about why you are having
> problems.





-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [sqlite] SQLite performance with mid-size databases

2004-06-17 Thread D. Richard Hipp
Ulrik Petersen wrote:
this is also just a stab in the dark, and I subscribe to the digest 
version of the mailinglist, so I may not have the latest.

Perhaps you are doing something like
SELECT A.x, A.y
FROM A
WHERE   A.rowid = xxx
OR  A.rowid = yyy
OR  A.rowid = zzz
OR  A.rowid = ...
etc.etc. with may OR-conditions.
I have noticed that SQLite (and PostgreSQL, for that matter) slows down 
quite a bit when the number of WHERE-conditions reaches beyond a 
somewhat low number, say around 10.

In SQLite, any use of the OR operator in a WHERE clause more or
less shuts down the query optimizer, disables all indices, and
forces a full table scan.  If you want to write a query like the
one shown above, do it this way:
   SELECT * FROM A
   WHERE A.rowid IN (xxx,yyy,zzz,...);
The optimizer understands the IN operator just fine.
SQLite's query optimizer can handle up to 32 AND terms in the
WHERE expression before it begins to have problems.
--
D. Richard Hipp -- [EMAIL PROTECTED] -- 704.948.4565
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Re: [sqlite] SQLite performance with mid-size databases

2004-06-17 Thread Raymond Irving
See below:

--- Jacob Engstrand <[EMAIL PROTECTED]> wrote:
>  To disable the
> I1A index,
> rewrite the WHERE clause like this:
> 
>   SELECT * FROM t1 WHERE a+0=5 AND b='xyzzy';
> 
> To disable the I1B index you could write
> 
>   SELECT * FROM t1 WHERE a=5 AND b LIKE 'xyzzy';
> 

Why not add a feature to SQLite to allow the user to
choose the index to use? Maybe something like this:

SELECT * FROM t1 WHERE a=5 abd b='xyzzy' USE INDEX(a)

Wouldn't this be much more straiht forward and easier
to read?


__
Raymond Irving


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [sqlite] SQLite performance with mid-size databases

2004-06-17 Thread Raymond Irving

Well this does not sound good at all. I would think
that SQLite would at least do a better job at queries.

The issues with performance and scalability is of
great concern. I was planning on create some new apps
that use SQLite, but now I'm wondering is this little
database is capable of handle over 2 Gigabytes (GB) of
data even though the docs says it can handle up to 2
Terabytes (TB).

Does it really make sense to cut back on performance
in order to keep the library size small?

Will SQLite 3.0 fix these problems?

__
Raymond Irving

--- "D. Richard Hipp" <[EMAIL PROTECTED]> wrote:
> Richard Kuo wrote:
> > 
> > I suspect some unnecessary disk access has to be
> the problem...despite
> > the small amount of new guide information being
> queried out, disk bytes
> > read is several times higher than with MS access
> and scrolling back over
> > previously accessed areas of data is visibly
> faster...indicating that
> > the disk cache is very favorably impacting the
> speed of the queries.
> > 
> 
> If each of your rows contains 2K of data, that means
> each database entry
> is using about 2 overflow pages.  You can change
> this by increasing the
> page size.  Try recompiling SQLite after changing
> the SQLITE_PAGE_SIZE
> macro to 16384.
> 
> You might also trying switching to SQLite version
> 3.0.0 which will be
> released tomorrow.
> 
> Also tomorrow, I will be making available a database
> analysis tool
> for version 2.8 databases that will help us to
> better understand
> how information is stored on disk for your database,
> and possibly
> provide some clues about why you are having
> problems.
> 
> -- 
> D. Richard Hipp -- [EMAIL PROTECTED] -- 704.948.4565
> 
> 
>
-
> To unsubscribe, e-mail:
> [EMAIL PROTECTED]
> For additional commands, e-mail:
> [EMAIL PROTECTED]
> 
> 


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



RE: [sqlite] SQLite performance with mid-size databases

2004-06-16 Thread Richard Kuo
Tim,

We have been running into many issues with corrupt MDAC installations.
Switching to SQLite fixed this admirably.  However, database access is
now visibly slower for all our users.

I have been paying careful attention to use only one index in queries
and to make very simple queries whenever possible.  I cannot imagine our
query could possibly be the source, as we are actually querying rows out
strictly by rowid only in this particular case, having moved the actual
lookup of needed rowid's outside SQLite in order to try and isolate the
performance hit that SQLite was introducing.

I suspect some unnecessary disk access has to be the problem...despite
the small amount of new guide information being queried out, disk bytes
read is several times higher than with MS access and scrolling back over
previously accessed areas of data is visibly faster...indicating that
the disk cache is very favorably impacting the speed of the queries.

Richard

-Original Message-
From: Tim Anderson [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, June 16, 2004 10:33 AM
To: Richard Kuo
Subject: RE: [sqlite] SQLite performance with mid-size databases


> -Original Message-
> From: Richard Kuo [mailto:[EMAIL PROTECTED]
> Sent: 16 June 2004 05:04
> To: [EMAIL PROTECTED]
> Subject: [sqlite] SQLite performance with mid-size databases
> 
> Hi.  We are using SQLite to store and retrieve data rows
> where each row is roughly 2K total in size and in a table of 
> 15 columns.  The total size of the database ranges from 100-300 MB.
>  
> The problem we are seeing is that query and insert
> performance is unusually bad and scales up linearly with 
> database size.  Compared to MS Access, the query times are 
> several times slower. 

Richard,

I've done extensive comparision of SQLite vs Access and in general I'd
say SQLite is faster. But you can't get meaningful results without being
much more specific, and also testing where exactly the slowdown occurs.
I'm sure there are some queries where Access can use indexes, but SQLite
will scan the entire table or tables; obviously that's a huge perf. hit.
With careful SQL tuning you can usually find a way around it. Access
isn't a bad db engine either, as long as it's not used over a network
:-) The main advantages of SQLite are that it is:

- smaller
- more customisable since you have the source
- no dependencies to speak of, whereas Access needs MDAC etc.
- cross-platform

Tim


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [sqlite] SQLite performance with mid-size databases

2004-06-16 Thread Derrell . Lipman
On 2003-01-27, at 00.00, D. Richard Hipp wrote:

> Enterprise scale database engines do a more sophisticated job
> of selecting indices (when there is a choice like this) by
> collecting lots of statistics on the indices and using complex
> algorithms to make the choice.  SQLite takes the easy way out
> and makes an arbitrary choice.  With SQLite, it is up to you,
> the query author, to select an appropriate index when the choice
> of indices might make a difference.  You can disable the other
> index by modifying the WHERE clause.  To disable the I1A index,
> rewrite the WHERE clause like this:
>
>   SELECT * FROM t1 WHERE a+0=5 AND b='xyzzy';
>
> To disable the I1B index you could write
>
>   SELECT * FROM t1 WHERE a=5 AND b LIKE 'xyzzy';
>
> I should probably write some documentation talking about this
> and put it on the website

This statement to disable I1B, however, is not likely exactly correct since
(according to the docs) LIKE is case- *insensitive*, so it would match 'XYZZY'
and 'XyZzY' as well (but not 'PLUGH' or 'PLOVER' :-).  Using GLOB instead of
LIKE should work as intended:

SELECT * FROM t1 WHERE a=5 AND b GLOB 'xyzzy';

Derrell

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [sqlite] SQLite performance with mid-size databases

2004-06-16 Thread Jacob Engstrand
On 2004-06-16, at 06.04, Richard Kuo wrote:
Empirically speaking, we display our data in a scrolling 2
dimensional grid format.  With MS access, this grid responds
instantaneously when moving through the grid.  With SQLite, there is
very noticable stalling and lag and the disk i/o is higher than MS
Access by roughly a factor of 10.

Hello Richard,
I too hit a rather puzzling SELECT performance problem last year. 
Apparently, if you have a table with many records, and you perform a 
SELECT on an indexed column where most of the values are the same, 
performance goes way down.

The mail correspondence from January 2003 does not seem to be in the 
archive, so below I have pasted a very enlightening response from DRH 
to my question.

Hope this helps.
/jak


On 2003-01-27, at 00.00, D. Richard Hipp wrote:
Example:
 CREATE TABLE t1(a,b,c);
 CREATE INDEX i1a ON t1(a);
 CREATE INDEX i1b ON t1(b);
Next you insert lots of data where the value for T1.B is
usually the same, say 'xyzzy'.  Then you do a query:
 SELECT * FROM t1 WHERE a=5 AND b='xyzzy';
When it is compiling this query, the optimizer can choose to
use either index I1A and the "a=5" expression or it can choose
to use index I1B and the "b='xyzzy'" expression.  The choice
it makes is arbitrary.
But I1B would be the wrong choice because almost every entry
in T1 is a match for "b='xyzzy'".  So the query must read in
every one of these entries and check each one to see if "a=5".
This can take even longer than doing a full table scan.
I1A is the right index to use here because only a few entries
of T1 will match "a=5".  So only a few entries have to be
read in and checked for "b='xyzzy'" and the query goes MUCH
faster.
Enterprise scale database engines do a more sophisticated job
of selecting indices (when there is a choice like this) by
collecting lots of statistics on the indices and using complex
algorithms to make the choice.  SQLite takes the easy way out
and makes an arbitrary choice.  With SQLite, it is up to you,
the query author, to select an appropriate index when the choice
of indices might make a difference.  You can disable the other
index by modifying the WHERE clause.  To disable the I1A index,
rewrite the WHERE clause like this:
 SELECT * FROM t1 WHERE a+0=5 AND b='xyzzy';
To disable the I1B index you could write
 SELECT * FROM t1 WHERE a=5 AND b LIKE 'xyzzy';
I should probably write some documentation talking about this
and put it on the website
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Re: [sqlite] SQLite performance with mid-size databases

2004-06-16 Thread Nuno Lucas
This could be related with the grid handling code.
In another post I already talked about the "SELECT COUNT(*) ..." performance problem, 
that can only be solved by tuning the code in the grid control itself.
If the code is generic, it's a strong possibility it isn't optimized for this, and 
assumes a count of records taking not time at all.
Another strong possibility is the grid control using a SQLite wraper already slow by 
nature.

I am wondering if there is a possibility of being notified of new/deleted rows without 
the use of triggers in simple way. What I'm thinking about is in generic database 
browsers, that want to update their grids if another process changes the database (and 
a viewer shouldn't create triggers in the database he is watching, only if the user 
creates one). I know one can register hooks for auth access, but that seems a bit too 
much and only related to the local process/thread.
Another use is for "server" applications, to be notified of changes in the database 
from local clients (so he could invalidate his cache, etc.).


Regards,
~Nuno Lucas


P.S. - I'm very weak on SQL, but I think it isn't possible to create temp triggers. Is 
this right?


=== On 2004-06-16, Randall Fox wrote ===
>On Tue, 15 Jun 2004 23:04:04 -0500, you wrote:
>
>>Hi.  We are using SQLite to store and retrieve data rows where each
>>row is roughly 2K total in size and in a table of 15 columns.  The total
>>size of the database ranges from 100-300 MB.
>> 
>>The problem we are seeing is that query and insert performance is
>>unusually bad and scales up linearly with database size.  Compared to MS
>>Access, the query times are several times slower.  Frankly I was a bit
>>shocked at this considering that most people seem to think the
>>performance is good. However, I don't see anything that we are doing
>>wrong...we query the rows we want only by rowid.  I'm very puzzled that
>>this hasn't come up a lot in my searches of the mailing list, but
>>perhaps the slower query times aren't a concern for many of the
>>applications using SQLite.
>> 
>>Empirically speaking, we display our data in a scrolling 2
>>dimensional grid format.  With MS access, this grid responds
>>instantaneously when moving through the grid.  With SQLite, there is
>>very noticable stalling and lag and the disk i/o is higher than MS
>>Access by roughly a factor of 10.
>> 
>>I suppose I am looking to see if anyone is seeing the same results
>>that I am seeing, and wondering if this is known and expected to be the
>>case.  The speed results on the website seem way off to me or must be so
>>skewed towards a small dataset that they do not apply in a real world
>>scenario.  I would also like to state that I am very impressed with the
>>simplicity of SQLite, which is rare to find these days.  It was very
>>easy to get up and running.  I'm just having trouble getting past the
>>performance issues.  Any explanation would be helpful.
>> 
>>Richard Kuo
>
>
>How do you fill in the grid control?  Is it storing the data, or do
>you provide the data when requested? (owner data..)  You may need to
>implement some caching if it isn't implemented already, I know some
>controls have this set up in them, and it could be that either access
>is taking advantage of this, or is caching it from w/in the DB
>itself..
>
>Also, did you implement a integer primary key, and use indexing? 
>
>Randall Fox


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [sqlite] SQLite performance with mid-size databases

2004-06-16 Thread Randall Fox
On Tue, 15 Jun 2004 23:04:04 -0500, you wrote:

>Hi.  We are using SQLite to store and retrieve data rows where each
>row is roughly 2K total in size and in a table of 15 columns.  The total
>size of the database ranges from 100-300 MB.
> 
>The problem we are seeing is that query and insert performance is
>unusually bad and scales up linearly with database size.  Compared to MS
>Access, the query times are several times slower.  Frankly I was a bit
>shocked at this considering that most people seem to think the
>performance is good. However, I don't see anything that we are doing
>wrong...we query the rows we want only by rowid.  I'm very puzzled that
>this hasn't come up a lot in my searches of the mailing list, but
>perhaps the slower query times aren't a concern for many of the
>applications using SQLite.
> 
>Empirically speaking, we display our data in a scrolling 2
>dimensional grid format.  With MS access, this grid responds
>instantaneously when moving through the grid.  With SQLite, there is
>very noticable stalling and lag and the disk i/o is higher than MS
>Access by roughly a factor of 10.
> 
>I suppose I am looking to see if anyone is seeing the same results
>that I am seeing, and wondering if this is known and expected to be the
>case.  The speed results on the website seem way off to me or must be so
>skewed towards a small dataset that they do not apply in a real world
>scenario.  I would also like to state that I am very impressed with the
>simplicity of SQLite, which is rare to find these days.  It was very
>easy to get up and running.  I'm just having trouble getting past the
>performance issues.  Any explanation would be helpful.
> 
>Richard Kuo


How do you fill in the grid control?  Is it storing the data, or do
you provide the data when requested? (owner data..)  You may need to
implement some caching if it isn't implemented already, I know some
controls have this set up in them, and it could be that either access
is taking advantage of this, or is caching it from w/in the DB
itself..

Also, did you implement a integer primary key, and use indexing? 

Randall Fox

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



RE: [sqlite] SQLite performance with mid-size databases

2004-06-15 Thread Ionut Filip
Hi Richard,

Here is a tip to speed up the insert operations: use prepared statements
instead of plain INSERTs. If you have to insert more than one record with
the same format the performance increase is significant.

Transactions also speeds up db operations, there are more detailed topics
about this on the list.

> The speed results on the website [...]
Here I think you need to make sure the concurrency access to the DB is not
a bottleneck.

Ionut Filip

PS: You can also try to post a sample (partial) schema of your database.


-Original Message-
From: Richard Kuo [mailto:[EMAIL PROTECTED]
Sent: Wednesday, June 16, 2004 7:04 AM
To: [EMAIL PROTECTED]
Subject: [sqlite] SQLite performance with mid-size databases


Hi.  We are using SQLite to store and retrieve data rows where each
row is roughly 2K total in size and in a table of 15 columns.  The total
size of the database ranges from 100-300 MB.
 
The problem we are seeing is that query and insert performance is
unusually bad and scales up linearly with database size.  Compared to MS
Access, the query times are several times slower.  Frankly I was a bit
shocked at this considering that most people seem to think the
performance is good. However, I don't see anything that we are doing
wrong...we query the rows we want only by rowid.  I'm very puzzled that
this hasn't come up a lot in my searches of the mailing list, but
perhaps the slower query times aren't a concern for many of the
applications using SQLite.
 
Empirically speaking, we display our data in a scrolling 2
dimensional grid format.  With MS access, this grid responds
instantaneously when moving through the grid.  With SQLite, there is
very noticable stalling and lag and the disk i/o is higher than MS
Access by roughly a factor of 10.
 
I suppose I am looking to see if anyone is seeing the same results
that I am seeing, and wondering if this is known and expected to be the
case.  The speed results on the website seem way off to me or must be so
skewed towards a small dataset that they do not apply in a real world
scenario.  I would also like to state that I am very impressed with the
simplicity of SQLite, which is rare to find these days.  It was very
easy to get up and running.  I'm just having trouble getting past the
performance issues.  Any explanation would be helpful.
 
Richard Kuo


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]