> boun...@sqlite.org] On Behalf Of Bart Smissaert
> 
> That seems to be the answer and after some quick testing it looks it makes
it
> more efficient as well!
> 
> On Tue, Oct 18, 2011 at 1:36 PM, Kit <kit.sa...@gmail.com> wrote:
> > 2011/10/16 Fabian <fabianpi...@gmail.com>:
> >> How can you limit a count-query? I tried:
> >> SELECT COUNT(*) FROM table LIMIT 5000
> >
> > SELECT min(COUNT(*),5000) FROM table;
> > --

My first impression was that the count(*) inside the min() would access all
the records anyway (perhaps not all the columns though) and thus in fact
still access more than 5000 records, even if it was hidden by the min
function.

To test this I took a database containing tables with millions of records
and executed the following statements:

1. select count(*) from table;

2. select min(count(*), 5000) from table;

3. select count(*) from (select null from table limit 5000);

Now to really test properly the computer would have to be restarted before
each test in order to avoid buffering of the disks, etc.
However, I could not be bothered in this case and just used three separate
tables with identical structure but a different number of records (between 4
and 5 million). As expected with the difference in execution times between
the three statements it did not matter that much.

1. Around 150 seconds

2. Around 14 seconds

3. Around 50 milliseconds !!!

Thus it seems that although variation two perhaps avoids fully reading the
records, it still have to touch them (or perhaps the primary key) which is
better than a normal full select count(*) on the table. Perhaps somebody
know the inner workings better and can explain the difference of factor 10
between variation 1 and 2.

To actually limit the count statement only variation 3 actually works (which
has been suggested earlier in the thread) and is of course several orders of
magnitude faster.

Somebody had also suggested: 
select count(*) from (select 1 from table limit 5000);

Notice the inner select of 1 instead of null. 
As predicted this is slightly slower at around 60 milliseconds. 

So selecting null is always better in sub queries when only the count or
existence of records are needed and not actual values from any of the
columns.


/Frank Missel

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to