Mark Drago <[EMAIL PROTECTED]> wrote:
> 
> The query I'm running is the following:
> select count(host), host from log group by host;
> 
> The results include the first time the query is run (when the file is
> not cached) and then the times of a few runs after that (when the file
> is cached).
> 
> SQLite: 221.9s, 1.6s, 1.6s, 1.6s
>  MySQL:   2.2s, 1.8s, 1.8s, 1.8s
> 
> It is apparent that SQLite is reading the entire database off of the
> disk and MySQL somehow is not.  The MySQL query cache is not in use on
> this machine and MySQL does not claim very much memory for itself before
> the test is conducted (maybe 30M).
> 
> I've tried looking in to the output from 'explain' to see if SQLite was
> using the index that I have on the 'host' column, but I don't think it
> is.  The output from 'explain' is included below.  Note that the
> 'explain' output is from a different machine which is running SQLite
> 3.3.5 compiled from source as the SQLite on FC5 kept Segfaulting when I
> tried to use 'explain'.
> 

From the EXPLAIN output, I see that your query is using
the HOST index only.  It is never even looking at the LOG
table.  And it is doing a single straight pass through the
HOST table - no binary searchs or anything like that to
slow it down.  This is all working correctly.

I'm guessing that the HOST index must be really badly
fragmented within the database file and that this is
preventing the read-ahead logic in your filesystem from
working well (or at all, probably).  Hence, each page
read from the disk is requiring one full revolution of
the disk platter and maybe some head movement too.  That
is what is slowing things down so when you have a cold
cache.

MySQL is trouncing SQLite in this case probably for three
separate reasons:  (1) MySQL keeps each index in a separate
disk file so that all the information is closer together.
Hence the filesystem read-ahead logic is able to do a better
job.  (2) MySQL probably implements prefix compression on
its indices, which will make the indices much smaller.  This
is something that we need to add to SQLite (prefix compression)
but is not currently available.  And (3) MySQL may well do
a better job of keeping pages of the index in logical order.
But it does so at the expense of disk space.  MySQL and
SQLite have different objectives here.

If you wait to do the

   CREATE INDEX host ON log(host);

until right before you run your query (instead of creating
the index at the very beginning) that will tend to increase
the locality of the index and might make a big difference
in cold-cache performance.  Other than that, I cannot think
of anything to help, other than for me to get busy and
implement prefix compression on indices - and that is not
likely to happen this week  :-)  The other thing you might
try is using a large page size for the database: 32K instead
of 1K.  That will slow down insert performance somewhat but
should increase locality of reference and make cold-cache
queries run faster.  Maybe experiement with intermediate
page size values too:  4k or 8K.

If I think of anything else I will let you know.  Thanks for
bringing this case to my attention.  I will be working on it
over the coming months.

--
D. Richard Hipp   <[EMAIL PROTECTED]>


Reply via email to