This is a common issue on the mailing list.  The first time you do
count(*), SQLite (actually your OS) has to load data into memory.
Most OS's will keep the file in a buffer cache, so the 2nd count(*)
doesn't have to read from disk.

Here's a timing from my own system, after a purge command to clear the
buffer cache:

$ time sqlite3 hb.db 'select count(*) from logs'
-- Loading resources from /Users/jim/.sqliterc
count(*)
----------
734909

real    0m0.580s
user    0m0.190s
sys     0m0.034s

Same command again, with the file cached:

$ time sqlite3 hb.db 'select count(*) from logs'
-- Loading resources from /Users/jim/.sqliterc
count(*)
----------
734909

real    0m0.189s
user    0m0.165s
sys     0m0.019s

This time is consistent no matter how many times I run it, because the
file is still cached.  Doing a purge command to clear the cache and
re-running the query, we get:

$ purge
$ time sqlite3 hb.db 'select count(*) from logs'
-- Loading resources from /Users/jim/.sqliterc
count(*)
----------
734909

real    0m0.427s
user    0m0.175s
sys     0m0.024s

On my system, there is not a huge difference, but it is consistent.
Now, if you have a fragmented file system, you will see a much larger
difference.  There are many posts on the mailing list about both file
system fragmentation and logical fragmentation within the SQLite file
itself.  Your first count(*) is subject to these fragmentation
effects, while your 2nd usually is not, because the file is in memory.

Some people on the list believe fragmentation is an unimportant detail
you shouldn't worry about, because you have little control over it.
That may be true, but it's useful to understand how it can affect
performance.  I think you are seeing this first hand.

Jim
--
HashBackup: easy onsite and offsite Unix backup
http://www.hashbackup.com



On Mon, Feb 21, 2011 at 9:37 AM, Sven L <larvpo...@hotmail.se> wrote:
>
> Same result :(
> Note that I have compiled SQLite with the following switches:
> SQLITE_ENABLE_STAT2
> SQLITE_THREADSAFE=2
>
> I've learnt that COUNT(*) is slower than COUNT(ID), since * means the engine 
> has to traverse all columns and it might even return another value if there 
> are NULL-values...
>
> Also, this is quite interesting:
>
> sqlite> EXPLAIN QUERY PLAN SELECT COUNT(ItemID) FROM Items;
> 0|0|0|SCAN TABLE Items (~1000000 rows)
> sqlite> EXPLAIN QUERY PLAN SELECT COUNT(*) FROM Items;
> sqlite>
>
> I would expect an index scan on the first statement. The second statement 
> tells me nada?!
>
> Thanks for your help!
>
>
>> From: slav...@bigfraud.org
>> Date: Mon, 21 Feb 2011 14:24:50 +0000
>> To: sqlite-users@sqlite.org
>> Subject: Re: [sqlite] COUNT() extremely slow first time!
>>
>>
>> On 21 Feb 2011, at 2:23pm, Sven L wrote:
>>
>> > SELECT COUNT(ItemID) FROM Items;
>> >
>> > This takes around 40 seconds the first time! WHY?!
>>
>> Try again, doing everything identically except that instead of the above 
>> line use
>>
>> SELECT COUNT(*) FROM Items;
>>
>> Simon.
>> _______________________________________________
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to