Re: [sqlite] speedtest result is obsolete

2005-02-09 Thread Dan Keeley
I think you people are missing the point here: the performance increase 
you're seeing is all down to OS caching and will vary across different 
ports.  It's nothing to do with SQLite, and will affect every package.

Therefore the only way to fairly compare mysql/postgres/sqlite is to make 
sure the machine is cleanly booted before running any tests.

(Well, and then maybe run the tests twice in succession, so the caching 
effect can be taken into account.)

From: Christian Smith [EMAIL PROTECTED]
Reply-To: sqlite-users@sqlite.org
To: sqlite-users@sqlite.org, [EMAIL PROTECTED]
Subject: Re: [sqlite] speedtest result is obsolete
Date: Tue, 8 Feb 2005 18:27:21 + (GMT)
On Tue, 8 Feb 2005, Chris Schirlinger wrote:
 I did a small test to see if performance was linear with time.
 I wanted to make sure it was suitable for my application.
 It seems with both indexed and unindexed tables it doesn't take
 significantly longer to do the 1,000,000th insert than it did the
 first.

I've discovered there are oodles of factors with this sort of thing,
all of which depend on what you actually want to do. With 2 million rows,
a simple select statement (select * from wibble where key = 1)
returning 2000 records out of a dataset of 2 million+ takes between
3 and 10 seconds.
Doing a keyed search is no guarantee that you won't touch *every* single
page in the table, if the rows are inserted in random order. Try this:
sqlite> create table wibble2 as select * from wibble;
sqlite> delete from wibble;
sqlite> insert into wibble select * from wibble2 order by key;
sqlite> drop table wibble2;
Assuming key is the key field you want, the records will be inserted into
wibble in key order. Selecting by key will then touch the least number of
pages, speeding up the select.

The *SECOND* time you call this, it's instant due mostly to SQLite's
caching and HDD caching; however, in our case, the 10 second wait at
the start was a major issue.
What do you expect? SQLite can't second-guess what might be needed and
load pages in the background ready for use.

The only way we could get correct test results for our purposes was
to clean boot between every test, and then the results are
depressing. Still trying to get past this.

Why? Does your program require the machine to be rebooted before use?
I'm not trying to be facetious, but your tests seem invalid without
further explanation.
Christian

--
/\
\ /ASCII RIBBON CAMPAIGN - AGAINST HTML MAIL
 X   - AGAINST MS ATTACHMENTS
/ \



Re: [sqlite] speedtest result is obsolete

2005-02-09 Thread Chris Schirlinger
 I think you people are missing the point here: the performance increase 
 you're seeing is all down to OS caching and will vary across different 
 ports.  It's nothing to do with SQLite, and will affect every package.
 
 Therefore the only way to fairly compare mysql/postgres/sqlite is to make 
 sure the machine is cleanly booted before running any tests.
 
 (Well, and then maybe run the tests twice in succession, so the caching 
 effect can be taken into account.)

This was what we found, and we tested dozens of databases and 
home-grown systems.

Some tools did better with initial caching and some better once 
cached.

All our speed tests started on a clean boot, then we tested several 
iterations. 

Even Delphi wrappers for the same back end (in one case SQLite) 
differed in the speed they worked at. It has taken months to get to the 
point where we are finally happy with the results we are getting from 
a DB.



Re: [sqlite] speedtest result is obsolete

2005-02-08 Thread Christian Smith
On Tue, 8 Feb 2005, Chris Schirlinger wrote:

 I did a small test to see if performance was linear with time.
 I wanted to make sure it was suitable for my application.
 It seems with both indexed and unindexed tables it doesn't take
 significantly longer to do the 1,000,000th insert than it did the
 first.

I've discovered there are oodles of factors with this sort of thing,
all of which depend on what you actually want to do. With 2 million rows,
a simple select statement (select * from wibble where key = 1)
returning 2000 records out of a dataset of 2 million+ takes between
3 and 10 seconds.


Doing a keyed search is no guarantee that you won't touch *every* single
page in the table, if the rows are inserted in random order. Try this:

sqlite> create table wibble2 as select * from wibble;
sqlite> delete from wibble;
sqlite> insert into wibble select * from wibble2 order by key;
sqlite> drop table wibble2;

Assuming key is the key field you want, the records will be inserted into
wibble in key order. Selecting by key will then touch the least number of
pages, speeding up the select.
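The rebuild above can be sketched end to end with Python's sqlite3 module. This is only an illustration; the `wibble` table, `key` column, and sample data are the hypothetical names used in this thread:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE wibble (key INTEGER, payload TEXT)")
# Insert rows in deliberately random key order.
con.executemany("INSERT INTO wibble VALUES (?, ?)",
                [(k, "row-%d" % k) for k in (3, 1, 4, 1, 5, 9, 2, 6)])

# Rebuild the table so rows are physically stored in key order.
con.executescript("""
    CREATE TABLE wibble2 AS SELECT * FROM wibble;
    DELETE FROM wibble;
    INSERT INTO wibble SELECT * FROM wibble2 ORDER BY key;
    DROP TABLE wibble2;
""")

# A plain scan now returns rows in key order, because SQLite scans
# in rowid order and the rowids were assigned during the ordered insert.
keys = [r[0] for r in con.execute("SELECT key FROM wibble")]
print(keys)  # → [1, 1, 2, 3, 4, 5, 6, 9]
```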


The *SECOND* time you call this, it's instant due mostly to SQLite's
caching and HDD caching; however, in our case, the 10 second wait at
the start was a major issue.


What do you expect? SQLite can't second-guess what might be needed and
load pages in the background ready for use.



The only way we could get correct test results for our purposes was
to clean boot between every test, and then the results are
depressing. Still trying to get past this.


Why? Does your program require the machine to be rebooted before use?

I'm not trying to be facetious, but your tests seem invalid without
further explanation.

Christian



-- 
/\
\ /ASCII RIBBON CAMPAIGN - AGAINST HTML MAIL
 X   - AGAINST MS ATTACHMENTS
/ \


Re: [sqlite] speedtest result is obsolete

2005-02-08 Thread Chris Schirlinger
 Doing a keyed search is no guarantee that you won't touch *every* single
 page in the table, if the rows are inserted in random order. Try this:
...cut...
 Assuming key is the key field you want, the records will be inserted into
 wibble in key order. Selecting by key will then touch the least number of
 pages, speeding up the select.

Ahhh, excellent idea; this seems similar to a clustered index, where 
the data is actually stored on disk in order. That is actually what 
we wanted, but SQLite didn't seem to support that sort of index.

 Why? Does your program require the machine to be rebooted before use?
 
 I'm not trying to be facetious, but your tests seem invalid without
 further explanation.

No, perfectly understandable. This is not a test for SQLite in 
particular, but an issue I discovered a long time ago when doing 
speed tests.

Speed tests on any system (SQLite, basic files, whatever) are *SLOWER* 
the first time you run them, due mainly to HDD caching and 
whatever other caching the program does with the data. We would get 
lovely 20k records per second on updates and then on a fresh reboot 
discover we dropped to 20 records per second for the first 2000 records.

Unfortunately, our program has the following requirement: user turns 
on machine, user immediately does some NASTY damn data retrieval or 
update, closes the program and turns off the machine.

This first retrieval is our issue, and is slow. What I have working 
now seems to be good, however.

We did several things:

1) Set the page size of the database to 4096, which matches most 
people's NTFS partitions, and has sped up the initial access nicely. Won't 
work for everyone, but most, I think.
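For what it's worth, in SQLite 3 the page size is set with a PRAGMA that only takes effect before anything is written to a new database file (changing it on an existing database requires a VACUUM). A minimal sketch, with an invented table name:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "app.db")
con = sqlite3.connect(path)
# Must run before the first table is written to the new file;
# on an existing database it only takes effect after VACUUM.
con.execute("PRAGMA page_size = 4096")
con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
con.commit()

page_size = con.execute("PRAGMA page_size").fetchone()[0]
print(page_size)  # → 4096
```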

2) Spawn several threads as the program is opened, artificially 
caching the data before the user has a chance to push any buttons and 
wonder why they take 2 seconds to respond.
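That warm-up idea might be sketched as below. This is only an illustration (the `wibble` table and the data are invented), and note that Python's sqlite3 connections shouldn't be shared across threads, so the warm-up thread opens its own:

```python
import os
import sqlite3
import tempfile
import threading

path = os.path.join(tempfile.mkdtemp(), "warm.db")
con = sqlite3.connect(path)
con.execute("CREATE TABLE wibble (key INTEGER, payload TEXT)")
con.executemany("INSERT INTO wibble VALUES (?, ?)",
                [(k, "x" * 100) for k in range(10_000)])
con.commit()

def warm_cache(db_path):
    # A throwaway full scan drags the table's pages into the OS
    # file cache before the user pushes any buttons.
    c = sqlite3.connect(db_path)
    c.execute("SELECT count(*) FROM wibble").fetchone()
    c.close()

t = threading.Thread(target=warm_cache, args=(path,), daemon=True)
t.start()
# ... build the UI here while the cache warms ...
t.join()

count = con.execute("SELECT count(*) FROM wibble").fetchone()[0]
print(count)  # → 10000
```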

Gonna try your ordering idea; that will help IMMENSELY for the initial 
DB, though I can see how it would slowly fall out of order as the user 
updates. 

Frankly, SQLite has been the closest I've seen to resolving this issue 
(well, it DOES actually solve the issue). Love this DB :)

Personally I think it's asking a bit much (I mean, 20 million rows on 
some guy's Windows 95 box? And you want it HOW fast?)

Thanks



Re: [sqlite] speedtest result is obsolete

2005-02-08 Thread D. Richard Hipp
On Wed, 2005-02-09 at 09:30 +1100, Chris Schirlinger wrote:
  Doing a keyed search is no guarantee that you won't touch *every* single
  page in the table, if the rows are inserted in random order. Try this:
 ...cut...
  Assuming key is the key field you want, the records will be inserted into
  wibble in key order. Selecting by key will then touch the least number of
  pages, speeding up the select.
 
 Ahhh, excellent idea; this seems similar to a clustered index, where 
 the data is actually stored on disk in order. That is actually what 
 we wanted, but SQLite didn't seem to support that sort of index.
 

Assuming you are not using an INTEGER PRIMARY KEY, you could
do something like this:

   CREATE TABLE wibble2 AS
     SELECT * FROM wibble ORDER BY cluster-index;
   DROP TABLE wibble;
   ALTER TABLE wibble2 RENAME TO wibble;

The above is perhaps a simplification, depending on your
schema.  But you should get the idea.  By default, SQLite
puts records into a table in the order in which they are
inserted.  So if you insert them in cluster order, that's
the way they will appear in the table and you can take
advantage of locality of reference.

Of course, as you insert and delete items from the table,
it gradually becomes less and less clustered.  But all
clustered tables work that way, do they not?  So you
periodically rebuild the table to restore locality of
reference when things deteriorate too much.

Another trick you can pull is to create an index that
contains every column in the table with the cluster index
columns occurring first.  That will double the size of your
database.  But when SQLite can get all of the information it
needs out of the index, it does not bother to consult the
table itself.  So the index will always stay clustered.
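A sketch of that covering-index trick, using Python's sqlite3 module (the table, columns, and index name are invented for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE wibble (key INTEGER, a TEXT, b TEXT)")
con.executemany("INSERT INTO wibble VALUES (?, ?, ?)",
                [(k % 50, "a%d" % k, "b%d" % k) for k in range(1000)])

# Every column, with the cluster column first: queries on key can be
# answered entirely from the index, which stays in key order.
con.execute("CREATE INDEX wibble_cover ON wibble (key, a, b)")

for row in con.execute(
        "EXPLAIN QUERY PLAN SELECT a, b FROM wibble WHERE key = 7"):
    print(row)  # recent SQLite versions report a COVERING INDEX scan

rows = con.execute("SELECT a, b FROM wibble WHERE key = 7").fetchall()
print(len(rows))  # → 20
```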
-- 
D. Richard Hipp [EMAIL PROTECTED]



Re: [sqlite] speedtest result is obsolete

2005-02-08 Thread Chris Schirlinger
 Another trick you can pull is to create an index that
 contains every column in the table with the cluster index
 columns occurring first.  That will double the size of your
 database.  But when SQLite can get all of the information it
 needs out of the index, it does not bother to consult the
 table itself.  So the index will always stay clustered.

Interesting idea. Will check that out.

I did the clustering trick (insert into wibble select * from wibble2 
order by key1, key2;), which basically sorted the table in the order 
the query wants it, and there was a noticeable speed increase.

Sure, the table will slowly become less and less clustered as data is 
added, but frankly it's something I am willing to live with, 
considering the user would have to use the program for 10+ years 
before the data they added would compare with the data that's already 
there.

Now the slowest points of the operation aren't DB access and 
retrieval any more, but the application and parsing of the data once 
gathered.

Way to go SQLite :)



[sqlite] speedtest result is obsolete

2005-02-07 Thread Yasuo Ohgaki
Hi,

The speed test result is obsolete

http://sqlite.org/speed.html

Here are my results.

http://www.ohgaki.net/download/speedtest.html
http://www.ohgaki.net/download/speedtest-pgsql-nosync.html

The latter one is without fsync for PostgreSQL. All DBMSs were
tested with default RPM package settings.


CPU    AthlonXP 2500+
HDD    ATA133 7200rpm
Memory 1GB
OS     MomongaLinux1/kernel-2.4.27-4mk7/glibc-2.3.2-13m

PostgreSQL-7.4.5-1m
MySQL-4.0.20-4m
SQLite-2.8.13-1m


To be fair to PostgreSQL, 'vacuumdb -a -f' is executed after
each test.

These DBMS packages are a little old... The latest releases are

PostgreSQL 8.0.1
MySQL 4.1.9
SQLite 3.0.8

Results may change with the latest versions, but mine should be
more accurate.

Regards,

--
Yasuo Ohgaki


Re: [sqlite] speedtest result is obsolete

2005-02-07 Thread Clay Dowling

Yasuo Ohgaki said:

 http://www.ohgaki.net/download/speedtest.html
 http://www.ohgaki.net/download/speedtest-pgsql-nosync.html

The tests were very interesting.  Based on what I see in those reports,
any one of the three should be suitable for most tasks, with the engine
chosen based on the features needed.  Dr. Hipp's advice to build a good
database abstraction layer into your software is even more important given
this result.

Clay

-- 
Lazarus Notes from Lazarus Internet Development
http://www.lazarusid.com/notes/
Articles, Reviews and Commentary on web development


Re: [sqlite] speedtest result is obsolete

2005-02-07 Thread Hugh Gibson
I would be interested to know the results for very large data sets. 
Indications on the list have been that performance suffers when the number 
of records gets very big (> 1 million), possibly due to use of an internal 
sort.

Hugh


Re: [sqlite] speedtest result is obsolete

2005-02-07 Thread Jay

I did a small test to see if performance was linear with time.
I wanted to make sure it was suitable for my application.
It seems with both indexed and unindexed tables it doesn't take
significantly longer to do the 1,000,000th insert than it did the
first.







Re: [sqlite] speedtest result is obsolete

2005-02-07 Thread Christopher Petrilli
On Mon, 7 Feb 2005 10:09:58 -0800 (PST), Jay [EMAIL PROTECTED] wrote:
 
 I did a small test to see if performance was linear with time.
 I wanted to make sure it was suitable for my application.
 It seems with both indexed and unindexed tables it doesn't take
 significantly longer to do the 1,000,000th insert than it did the
 first.

My experience is that it depends heavily on the number of indexes.
With just a primary key index, it's pretty linear (though it does
degrade); however, with 4-5 indexes it degrades very quickly after a
certain point.  I could try to genericize my test rigs and publish
some numbers if that would be interesting.
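A generic rig of the sort described here might look like the following sketch, using Python's sqlite3 module. The schema, batch size, and index count are all invented for illustration; the point is simply to time successive batches and watch whether they grow:

```python
import random
import sqlite3
import time

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, a INT, b INT, c INT, d INT)")
# The secondary indexes are what make inserts degrade as the table grows.
for col in ("a", "b", "c", "d"):
    con.execute("CREATE INDEX idx_%s ON t (%s)" % (col, col))

def insert_batch(size):
    """Insert one batch of random rows; return the elapsed seconds."""
    rows = [tuple(random.randrange(1_000_000) for _ in range(4))
            for _ in range(size)]
    start = time.perf_counter()
    with con:  # one transaction per batch
        con.executemany(
            "INSERT INTO t (a, b, c, d) VALUES (?, ?, ?, ?)", rows)
    return time.perf_counter() - start

timings = [insert_batch(10_000) for _ in range(5)]
print(["%.3fs" % s for s in timings])  # look for growth batch over batch

total = con.execute("SELECT count(*) FROM t").fetchone()[0]
print(total)  # → 50000
```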

Chris
-- 
| Christopher Petrilli
| [EMAIL PROTECTED]


Re: [sqlite] speedtest result is obsolete

2005-02-07 Thread Chris Schirlinger
 I would be interested to know the results for very large data sets. 
 Indications on the list have been that performance suffers when the number 
 of records gets very big (> 1 million), possibly due to use of an internal 
 sort.

I must say, with a 2+ million row data set we aren't getting 
anywhere near these sorts of speed results. 

Probably on the order of 10 times slower, if not much, much more. 
However, I haven't written any generic tools to reconfirm this; rather, 
this is based on the results our current SQLite3 project is spitting 
back.