Re: [sqlite] Count(1)

2008-04-04 Thread Scott Hess
I think the main hit to be avoided is in reading all of the interior
and leaf pages into the page cache.  Once you've done that the
additional cost of actually processing the contents of those pages is
going to be really small.

-scott


On Fri, Apr 4, 2008 at 9:48 AM, Noah Hart <[EMAIL PROTECTED]> wrote:
> Questions to the SQLite maintainers...
>
>  The docs tell us that ...
> ** The page headers looks like this:
> **
> **   OFFSET   SIZE DESCRIPTION
> **  0   1  Flags. 1: intkey, 2: zerodata, 4: leafdata,
>  8: leaf
> **  1   2  byte offset to the first freeblock
> **  3   2  number of cells on this page
>
>  Since the count of cells in use stored in for each btree page?
>  Wouldn't it be pretty easy to optimize count(*) by
>
>  
>  count = 0
>  Btree_MOVE_TO_FIRST_ENTRY
>  while not Btree_END_OF_TREE
>   count += NUMBER_OF_ENTRIES_ON_THIS_CHILD_PAGE
>   Btree_MOVE_TO_NEXT_CHILD_PAGE
>  return count;
>  
>
>  With large rows contents lengths, the savings would be minimal
>  However even with rows contents lengths around 100, the savings would be
>  10x
>
>  Regards -- Noah
>
>
>
>
>  -Original Message-
>  From: [EMAIL PROTECTED]
>  [mailto:[EMAIL PROTECTED] On Behalf Of Scott Hess
>  Sent: Friday, April 04, 2008 9:15 AM
>  To: General Discussion of SQLite Database
>  Subject: Re: [sqlite] Count(1)
>
>  What I meant when I said "full table scan" is that it has to read at
>  least something for every single row in the table.  So the following
>  are going to be the same:
>
>   SELECT COUNT(*) FROM t;
>   SELECT COUNT(rowid) FROM t;
>
>  It won't have to scan any overflow pages, but it will have to hit all
>  the leaf nodes.
>
>  You could certainly do a full scan on an index other than the rowid.
>  It might involve much less reading if the indexed items are small
>  relative to the overall row.  Not sure if SQLite does this
>  optimization for you or not (I don't think it much matters - it's
>  still going to bel O(N), just with a lower constant).
>
>  -scott
>
>
>  On Fri, Apr 4, 2008 at 8:19 AM, Samuel Neff <[EMAIL PROTECTED]>
>  wrote:
>  > Scott,
>  >
>  >  Is it really a full table scan or just an index scan (at least in the
>  case
>  >  where no data is needed from the table as in the original sample that
>  had no
>  >  join or where clause).
>  >
>  >  Thanks,
>  >
>  >  Sam
>  >
>  >
>  >
>  >  On Thu, Apr 3, 2008 at 4:12 PM, Scott Hess <[EMAIL PROTECTED]> wrote:
>  >
>  >  > A little bit more info:  SELECT COUNT(*) is implemented as a full
>  >  > table scan, so SQLite is visiting every row in the table, which
>  will
>  >  > get slower and slower as the table gets bigger and the database
>  >  > fragments.  This differs from many database engines (which
>  implement
>  >  > an optimization for this)  Doing the trigger thing means that it
>  only
>  >  > visits the specific row that contains the count.
>  >  >
>  >  > -scott
>  >  >
>  >  >
>  >  --
>  >  -
>  >  We're Hiring! Seeking passionate Flex, C#, or C++ (RTSP, H264)
>  developer.
>  >  Position is in the Washington D.C. metro area. Contact
>  >  [EMAIL PROTECTED]
>  >
>  >
>  > ___
>  >  sqlite-users mailing list
>  >  sqlite-users@sqlite.org
>  >  http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>  >
>  ___
>  sqlite-users mailing list
>  sqlite-users@sqlite.org
>  http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
>
>
>  CONFIDENTIALITY NOTICE:
>  This message may contain confidential and/or privileged information. If you 
> are not the addressee or authorized to receive this for the addressee, you 
> must not use, copy, disclose, or take any action based on this message or any 
> information herein. If you have received this message in error, please advise 
> the sender immediately by reply e-mail and delete this message. Thank you for 
> your cooperation.
>
>
>
>
>  ___
>  sqlite-users mailing list
>  sqlite-users@sqlite.org
>  http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Count(1)

2008-04-04 Thread Jay A. Kreibich
On Fri, Apr 04, 2008 at 09:14:52AM -0700, Scott Hess scratched on the wall:
> What I meant when I said "full table scan" is that it has to read at
> least something for every single row in the table.  So the following
> are going to be the same:
> 
>   SELECT COUNT(*) FROM t;
>   SELECT COUNT(rowid) FROM t;

  These are extremely similar, but not the same.

  SELECT COUNT(*) FROM t;   uses a version of "count" that expects zero
  arguments.  The core of the operation looks like this:

3 OpenRead   0 2 000   
4 SetNumColumns  0 0 000   
5 Rewind 0 8 000   

6 AggStep0 0 1 count(0)   00   
7 Next   0 6 000   

8 Close  0 0 000   
9 AggFinal   1 0 0 count(0)   00   
10SCopy  1 2 000   
11ResultRow  2 1 000   
12Halt   0 0 000   

  In specific, the "main loop" is just AggStep and Next over and over.
  It just walks the B-Tree and never deals with row records.


  SELECT COUNT(rowid) FROM t;  uses a version of "count" that expects
  one argument.  In this case, the value of "rowid".  The core of that
  operation looks like this:

4 OpenRead   0 2 000   
5 SetNumColumns  0 0 000   
6 Rewind 0 10000   

7 Rowid  0 3 000   
8 AggStep0 3 1 count(1)   01   
9 Next   0 7 000   

10Close  0 0 000   
11AggFinal   1 1 0 count(1)   00   
12SCopy  1 3 000   
13ResultRow  3 1 000   
14Halt   0 0 000   

 In this case there is an extra step in the loop, as the rowid is
 fetched and passed to count.  As I understand it, that rowid value is
 taken directly out of the B-Tree, however, so you still don't need to
 read the actual row-record data.

   -j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"'People who live in bamboo houses should not throw pandas.' Jesus said that."
   - "The Ninja", www.AskANinja.com, "Special Delivery 10: Pop!Tech 2006"
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Count(1)

2008-04-04 Thread Noah Hart
Questions to the SQLite maintainers...

The docs tell us that ...
** The page headers looks like this:
**
**   OFFSET   SIZE DESCRIPTION
**  0   1  Flags. 1: intkey, 2: zerodata, 4: leafdata,
8: leaf
**  1   2  byte offset to the first freeblock
**  3   2  number of cells on this page

Since the count of cells in use stored in for each btree page?
Wouldn't it be pretty easy to optimize count(*) by


count = 0
Btree_MOVE_TO_FIRST_ENTRY
while not Btree_END_OF_TREE
 count += NUMBER_OF_ENTRIES_ON_THIS_CHILD_PAGE
 Btree_MOVE_TO_NEXT_CHILD_PAGE
return count;


With large rows contents lengths, the savings would be minimal
However even with rows contents lengths around 100, the savings would be
10x

Regards -- Noah


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Scott Hess
Sent: Friday, April 04, 2008 9:15 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Count(1)

What I meant when I said "full table scan" is that it has to read at
least something for every single row in the table.  So the following
are going to be the same:

  SELECT COUNT(*) FROM t;
  SELECT COUNT(rowid) FROM t;

It won't have to scan any overflow pages, but it will have to hit all
the leaf nodes.

You could certainly do a full scan on an index other than the rowid.
It might involve much less reading if the indexed items are small
relative to the overall row.  Not sure if SQLite does this
optimization for you or not (I don't think it much matters - it's
still going to bel O(N), just with a lower constant).

-scott


On Fri, Apr 4, 2008 at 8:19 AM, Samuel Neff <[EMAIL PROTECTED]>
wrote:
> Scott,
>
>  Is it really a full table scan or just an index scan (at least in the
case
>  where no data is needed from the table as in the original sample that
had no
>  join or where clause).
>
>  Thanks,
>
>  Sam
>
>
>
>  On Thu, Apr 3, 2008 at 4:12 PM, Scott Hess <[EMAIL PROTECTED]> wrote:
>
>  > A little bit more info:  SELECT COUNT(*) is implemented as a full
>  > table scan, so SQLite is visiting every row in the table, which
will
>  > get slower and slower as the table gets bigger and the database
>  > fragments.  This differs from many database engines (which
implement
>  > an optimization for this)  Doing the trigger thing means that it
only
>  > visits the specific row that contains the count.
>  >
>  > -scott
>  >
>  >
>  --
>  -
>  We're Hiring! Seeking passionate Flex, C#, or C++ (RTSP, H264)
developer.
>  Position is in the Washington D.C. metro area. Contact
>  [EMAIL PROTECTED]
>
>
> ___
>  sqlite-users mailing list
>  sqlite-users@sqlite.org
>  http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users



CONFIDENTIALITY NOTICE: 
This message may contain confidential and/or privileged information. If you are 
not the addressee or authorized to receive this for the addressee, you must not 
use, copy, disclose, or take any action based on this message or any 
information herein. If you have received this message in error, please advise 
the sender immediately by reply e-mail and delete this message. Thank you for 
your cooperation.


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Count(1)

2008-04-04 Thread Scott Hess
What I meant when I said "full table scan" is that it has to read at
least something for every single row in the table.  So the following
are going to be the same:

  SELECT COUNT(*) FROM t;
  SELECT COUNT(rowid) FROM t;

It won't have to scan any overflow pages, but it will have to hit all
the leaf nodes.

You could certainly do a full scan on an index other than the rowid.
It might involve much less reading if the indexed items are small
relative to the overall row.  Not sure if SQLite does this
optimization for you or not (I don't think it much matters - it's
still going to bel O(N), just with a lower constant).

-scott


On Fri, Apr 4, 2008 at 8:19 AM, Samuel Neff <[EMAIL PROTECTED]> wrote:
> Scott,
>
>  Is it really a full table scan or just an index scan (at least in the case
>  where no data is needed from the table as in the original sample that had no
>  join or where clause).
>
>  Thanks,
>
>  Sam
>
>
>
>  On Thu, Apr 3, 2008 at 4:12 PM, Scott Hess <[EMAIL PROTECTED]> wrote:
>
>  > A little bit more info:  SELECT COUNT(*) is implemented as a full
>  > table scan, so SQLite is visiting every row in the table, which will
>  > get slower and slower as the table gets bigger and the database
>  > fragments.  This differs from many database engines (which implement
>  > an optimization for this)  Doing the trigger thing means that it only
>  > visits the specific row that contains the count.
>  >
>  > -scott
>  >
>  >
>  --
>  -
>  We're Hiring! Seeking passionate Flex, C#, or C++ (RTSP, H264) developer.
>  Position is in the Washington D.C. metro area. Contact
>  [EMAIL PROTECTED]
>
>
> ___
>  sqlite-users mailing list
>  sqlite-users@sqlite.org
>  http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Count(1)

2008-04-04 Thread Nicolas Williams
On Fri, Apr 04, 2008 at 11:19:53AM -0400, Samuel Neff wrote:
> Is it really a full table scan or just an index scan (at least in the case
> where no data is needed from the table as in the original sample that had no
> join or where clause).

Either way it's O(N) instead of O(1), which is what the original poster
expected.  Not fetching the column data will hardly help the poster.

I think the trigger workaround is just fine.

It might be nice if SQLite just optimized this anyways (e.g., keeping
the row count in sqlite_master or some other sqlite_* table), mostly to
avoid the FAQ, and to help porting to SQLite.

Nico
-- 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Count(1)

2008-04-04 Thread Jay A. Kreibich
On Fri, Apr 04, 2008 at 11:19:53AM -0400, Samuel Neff scratched on the wall:
> Scott,
> 
> Is it really a full table scan or just an index scan (at least in the case
> where no data is needed from the table as in the original sample that had no
> join or where clause).

  I wondered about this myself, and did some digging last night.

  "count(*)" is translated into a zero-argument version of "count", so
  no actual column data is required if it is applied directly to a
  whole, real table.

  From the VDBE codes, it looks like it is just walks the B-tree of the
  table (which is the same as an index scan), but never actually reads
  the rows.  'Next' is called, but 'Column' is not.

   -j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"'People who live in bamboo houses should not throw pandas.' Jesus said that."
   - "The Ninja", www.AskANinja.com, "Special Delivery 10: Pop!Tech 2006"
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Count(1)

2008-04-04 Thread Samuel Neff
Scott,

Is it really a full table scan or just an index scan (at least in the case
where no data is needed from the table as in the original sample that had no
join or where clause).

Thanks,

Sam


On Thu, Apr 3, 2008 at 4:12 PM, Scott Hess <[EMAIL PROTECTED]> wrote:

> A little bit more info:  SELECT COUNT(*) is implemented as a full
> table scan, so SQLite is visiting every row in the table, which will
> get slower and slower as the table gets bigger and the database
> fragments.  This differs from many database engines (which implement
> an optimization for this)  Doing the trigger thing means that it only
> visits the specific row that contains the count.
>
> -scott
>
>
-- 
-
We're Hiring! Seeking passionate Flex, C#, or C++ (RTSP, H264) developer.
Position is in the Washington D.C. metro area. Contact
[EMAIL PROTECTED]
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Count(1)

2008-04-03 Thread Scott Hess
A little bit more info:  SELECT COUNT(*) is implemented as a full
table scan, so SQLite is visiting every row in the table, which will
get slower and slower as the table gets bigger and the database
fragments.  This differs from many database engines (which implement
an optimization for this)  Doing the trigger thing means that it only
visits the specific row that contains the count.

-scott


On Thu, Apr 3, 2008 at 5:03 AM, P Kishor <[EMAIL PROTECTED]> wrote:
> On 4/3/08, Mahalakshmi.m <[EMAIL PROTECTED]> wrote:
>  > Hi,
>  >
>  >  I am having 4 records in my Harddisk.
>  >  My Processor speed is 400 Mhz.
>  >
>  >  For "SELECT COUNT(1) FROM MUSIC ;" its getting more time to display the
>  >  count.
>  >  I have also tried with "SELECT COUNT(*) FROM MUSIC ;This also take more
>  >  time.
>
>  SELECT Count(whatever) takes time. There is no way around that.
>
>
>  >  Is there any other way we can get the Total number of records.
>
>  If speed of getting total number of records is important to you, keep
>  a running total in a separate "count_rows" table, and update that
>  total via TRIGGERs as you insert into, delete from  or update your
>  tables. Then retrieve the total from the count_rows table instead of
>  using COUNT. Here is the idea...
>
>  CREATE TABLE count_rows (tablename, count_of_rows);
>  CREATE TRIGGERs AFTER (DELETE|INSERT|UPDATE) ON table
>   ... UPDATE count_rows SET count_of_rows = ? WHERE tablename = 'table' ...
>
>
>  Then, later on, instead of
>
>  SELECT COUNT(*) FROM table
>
>  You do
>
>  SELECT count_of_rows FROM count_rows WHERE tablename = ?
>
>
>  Search the archives. Many of your questions will be answered.
>
>
>
>
>  >  Please help to solve this.
>  >
>  >  Thanks & Regards,
>  >  Mahalakshmi
>  >
>  >
>  >  ___
>  >  sqlite-users mailing list
>  >  sqlite-users@sqlite.org
>  >  http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>  >
>
>
>  --
>  Puneet Kishor http://punkish.eidesis.org/
>  Nelson Institute for Environmental Studies http://www.nelson.wisc.edu/
>  Open Source Geospatial Foundation (OSGeo) http://www.osgeo.org/
>
>
> ___
>  sqlite-users mailing list
>  sqlite-users@sqlite.org
>  http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Count(1)

2008-04-03 Thread Ken
You could use a trigger to keep the running total in a seperate table.


"Mahalakshmi.m" <[EMAIL PROTECTED]> wrote: Hi,

I am having 4 records in my Harddisk.
My Processor speed is 400 Mhz.

For "SELECT COUNT(1) FROM MUSIC ;" its getting more time to display the
count.
I have also tried with "SELECT COUNT(*) FROM MUSIC ;This also take more
time.
Is there any other way we can get the Total number of records.
Please help to solve this.

Thanks & Regards,
Mahalakshmi


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Count(1)

2008-04-03 Thread P Kishor
On 4/3/08, Mahalakshmi.m <[EMAIL PROTECTED]> wrote:
> Hi,
>
>  I am having 4 records in my Harddisk.
>  My Processor speed is 400 Mhz.
>
>  For "SELECT COUNT(1) FROM MUSIC ;" its getting more time to display the
>  count.
>  I have also tried with "SELECT COUNT(*) FROM MUSIC ;This also take more
>  time.

SELECT Count(whatever) takes time. There is no way around that.

>  Is there any other way we can get the Total number of records.

If speed of getting total number of records is important to you, keep
a running total in a separate "count_rows" table, and update that
total via TRIGGERs as you insert into, delete from  or update your
tables. Then retrieve the total from the count_rows table instead of
using COUNT. Here is the idea...

CREATE TABLE count_rows (tablename, count_of_rows);
CREATE TRIGGERs AFTER (DELETE|INSERT|UPDATE) ON table
 ... UPDATE count_rows SET count_of_rows = ? WHERE tablename = 'table' ...


Then, later on, instead of

SELECT COUNT(*) FROM table

You do

SELECT count_of_rows FROM count_rows WHERE tablename = ?


Search the archives. Many of your questions will be answered.



>  Please help to solve this.
>
>  Thanks & Regards,
>  Mahalakshmi
>
>
>  ___
>  sqlite-users mailing list
>  sqlite-users@sqlite.org
>  http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>


-- 
Puneet Kishor http://punkish.eidesis.org/
Nelson Institute for Environmental Studies http://www.nelson.wisc.edu/
Open Source Geospatial Foundation (OSGeo) http://www.osgeo.org/
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Count(1)

2008-04-03 Thread Mahalakshmi.m
Hi,

I am having 4 records in my Harddisk.
My Processor speed is 400 Mhz.

For "SELECT COUNT(1) FROM MUSIC ;" its getting more time to display the
count.
I have also tried with "SELECT COUNT(*) FROM MUSIC ;This also take more
time.
Is there any other way we can get the Total number of records.
Please help to solve this.

Thanks & Regards,
Mahalakshmi


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users