Re: [sqlite] Count(1)
I think the main hit to be avoided is in reading all of the interior and leaf pages into the page cache. Once you've done that the additional cost of actually processing the contents of those pages is going to be really small. -scott On Fri, Apr 4, 2008 at 9:48 AM, Noah Hart <[EMAIL PROTECTED]> wrote: > Questions to the SQLite maintainers... > > The docs tell us that ... > ** The page headers looks like this: > ** > ** OFFSET SIZE DESCRIPTION > ** 0 1 Flags. 1: intkey, 2: zerodata, 4: leafdata, > 8: leaf > ** 1 2 byte offset to the first freeblock > ** 3 2 number of cells on this page > > Since the count of cells in use stored in for each btree page? > Wouldn't it be pretty easy to optimize count(*) by > > > count = 0 > Btree_MOVE_TO_FIRST_ENTRY > while not Btree_END_OF_TREE > count += NUMBER_OF_ENTRIES_ON_THIS_CHILD_PAGE > Btree_MOVE_TO_NEXT_CHILD_PAGE > return count; > > > With large rows contents lengths, the savings would be minimal > However even with rows contents lengths around 100, the savings would be > 10x > > Regards -- Noah > > > > > -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of Scott Hess > Sent: Friday, April 04, 2008 9:15 AM > To: General Discussion of SQLite Database > Subject: Re: [sqlite] Count(1) > > What I meant when I said "full table scan" is that it has to read at > least something for every single row in the table. So the following > are going to be the same: > > SELECT COUNT(*) FROM t; > SELECT COUNT(rowid) FROM t; > > It won't have to scan any overflow pages, but it will have to hit all > the leaf nodes. > > You could certainly do a full scan on an index other than the rowid. > It might involve much less reading if the indexed items are small > relative to the overall row. Not sure if SQLite does this > optimization for you or not (I don't think it much matters - it's > still going to bel O(N), just with a lower constant). > > -scott > > > On Fri, Apr 4, 2008 at 8:19 AM, Samuel Neff <[EMAIL PROTECTED]> > wrote: > > Scott, > > > > Is it really a full table scan or just an index scan (at least in the > case > > where no data is needed from the table as in the original sample that > had no > > join or where clause). > > > > Thanks, > > > > Sam > > > > > > > > On Thu, Apr 3, 2008 at 4:12 PM, Scott Hess <[EMAIL PROTECTED]> wrote: > > > > > A little bit more info: SELECT COUNT(*) is implemented as a full > > > table scan, so SQLite is visiting every row in the table, which > will > > > get slower and slower as the table gets bigger and the database > > > fragments. This differs from many database engines (which > implement > > > an optimization for this) Doing the trigger thing means that it > only > > > visits the specific row that contains the count. > > > > > > -scott > > > > > > > > -- > > - > > We're Hiring! Seeking passionate Flex, C#, or C++ (RTSP, H264) > developer. > > Position is in the Washington D.C. metro area. Contact > > [EMAIL PROTECTED] > > > > > > ___ > > sqlite-users mailing list > > sqlite-users@sqlite.org > > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > > > CONFIDENTIALITY NOTICE: > This message may contain confidential and/or privileged information. If you > are not the addressee or authorized to receive this for the addressee, you > must not use, copy, disclose, or take any action based on this message or any > information herein. If you have received this message in error, please advise > the sender immediately by reply e-mail and delete this message. Thank you for > your cooperation. > > > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Count(1)
On Fri, Apr 04, 2008 at 09:14:52AM -0700, Scott Hess scratched on the wall: > What I meant when I said "full table scan" is that it has to read at > least something for every single row in the table. So the following > are going to be the same: > > SELECT COUNT(*) FROM t; > SELECT COUNT(rowid) FROM t; These are extremely similar, but not the same. SELECT COUNT(*) FROM t; uses a version of "count" that expects zero arguments. The core of the operation looks like this: 3 OpenRead 0 2 000 4 SetNumColumns 0 0 000 5 Rewind 0 8 000 6 AggStep0 0 1 count(0) 00 7 Next 0 6 000 8 Close 0 0 000 9 AggFinal 1 0 0 count(0) 00 10SCopy 1 2 000 11ResultRow 2 1 000 12Halt 0 0 000 In specific, the "main loop" is just AggStep and Next over and over. It just walks the B-Tree and never deals with row records. SELECT COUNT(rowid) FROM t; uses a version of "count" that expects one argument. In this case, the value of "rowid". The core of that operation looks like this: 4 OpenRead 0 2 000 5 SetNumColumns 0 0 000 6 Rewind 0 10000 7 Rowid 0 3 000 8 AggStep0 3 1 count(1) 01 9 Next 0 7 000 10Close 0 0 000 11AggFinal 1 1 0 count(1) 00 12SCopy 1 3 000 13ResultRow 3 1 000 14Halt 0 0 000 In this case there is an extra step in the loop, as the rowid is fetched and passed to count. As I understand it, that rowid value is taken directly out of the B-Tree, however, so you still don't need to read the actual row-record data. -j -- Jay A. Kreibich < J A Y @ K R E I B I.C H > "'People who live in bamboo houses should not throw pandas.' Jesus said that." - "The Ninja", www.AskANinja.com, "Special Delivery 10: Pop!Tech 2006" ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Count(1)
Questions to the SQLite maintainers... The docs tell us that ... ** The page headers looks like this: ** ** OFFSET SIZE DESCRIPTION ** 0 1 Flags. 1: intkey, 2: zerodata, 4: leafdata, 8: leaf ** 1 2 byte offset to the first freeblock ** 3 2 number of cells on this page Since the count of cells in use stored in for each btree page? Wouldn't it be pretty easy to optimize count(*) by count = 0 Btree_MOVE_TO_FIRST_ENTRY while not Btree_END_OF_TREE count += NUMBER_OF_ENTRIES_ON_THIS_CHILD_PAGE Btree_MOVE_TO_NEXT_CHILD_PAGE return count; With large rows contents lengths, the savings would be minimal However even with rows contents lengths around 100, the savings would be 10x Regards -- Noah -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Scott Hess Sent: Friday, April 04, 2008 9:15 AM To: General Discussion of SQLite Database Subject: Re: [sqlite] Count(1) What I meant when I said "full table scan" is that it has to read at least something for every single row in the table. So the following are going to be the same: SELECT COUNT(*) FROM t; SELECT COUNT(rowid) FROM t; It won't have to scan any overflow pages, but it will have to hit all the leaf nodes. You could certainly do a full scan on an index other than the rowid. It might involve much less reading if the indexed items are small relative to the overall row. Not sure if SQLite does this optimization for you or not (I don't think it much matters - it's still going to bel O(N), just with a lower constant). -scott On Fri, Apr 4, 2008 at 8:19 AM, Samuel Neff <[EMAIL PROTECTED]> wrote: > Scott, > > Is it really a full table scan or just an index scan (at least in the case > where no data is needed from the table as in the original sample that had no > join or where clause). > > Thanks, > > Sam > > > > On Thu, Apr 3, 2008 at 4:12 PM, Scott Hess <[EMAIL PROTECTED]> wrote: > > > A little bit more info: SELECT COUNT(*) is implemented as a full > > table scan, so SQLite is visiting every row in the table, which will > > get slower and slower as the table gets bigger and the database > > fragments. This differs from many database engines (which implement > > an optimization for this) Doing the trigger thing means that it only > > visits the specific row that contains the count. > > > > -scott > > > > > -- > - > We're Hiring! Seeking passionate Flex, C#, or C++ (RTSP, H264) developer. > Position is in the Washington D.C. metro area. Contact > [EMAIL PROTECTED] > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users CONFIDENTIALITY NOTICE: This message may contain confidential and/or privileged information. If you are not the addressee or authorized to receive this for the addressee, you must not use, copy, disclose, or take any action based on this message or any information herein. If you have received this message in error, please advise the sender immediately by reply e-mail and delete this message. Thank you for your cooperation. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Count(1)
What I meant when I said "full table scan" is that it has to read at least something for every single row in the table. So the following are going to be the same: SELECT COUNT(*) FROM t; SELECT COUNT(rowid) FROM t; It won't have to scan any overflow pages, but it will have to hit all the leaf nodes. You could certainly do a full scan on an index other than the rowid. It might involve much less reading if the indexed items are small relative to the overall row. Not sure if SQLite does this optimization for you or not (I don't think it much matters - it's still going to bel O(N), just with a lower constant). -scott On Fri, Apr 4, 2008 at 8:19 AM, Samuel Neff <[EMAIL PROTECTED]> wrote: > Scott, > > Is it really a full table scan or just an index scan (at least in the case > where no data is needed from the table as in the original sample that had no > join or where clause). > > Thanks, > > Sam > > > > On Thu, Apr 3, 2008 at 4:12 PM, Scott Hess <[EMAIL PROTECTED]> wrote: > > > A little bit more info: SELECT COUNT(*) is implemented as a full > > table scan, so SQLite is visiting every row in the table, which will > > get slower and slower as the table gets bigger and the database > > fragments. This differs from many database engines (which implement > > an optimization for this) Doing the trigger thing means that it only > > visits the specific row that contains the count. > > > > -scott > > > > > -- > - > We're Hiring! Seeking passionate Flex, C#, or C++ (RTSP, H264) developer. > Position is in the Washington D.C. metro area. Contact > [EMAIL PROTECTED] > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Count(1)
On Fri, Apr 04, 2008 at 11:19:53AM -0400, Samuel Neff wrote: > Is it really a full table scan or just an index scan (at least in the case > where no data is needed from the table as in the original sample that had no > join or where clause). Either way it's O(N) instead of O(1), which is what the original poster expected. Not fetching the column data will hardly help the poster. I think the trigger workaround is just fine. It might be nice if SQLite just optimized this anyways (e.g., keeping the row count in sqlite_master or some other sqlite_* table), mostly to avoid the FAQ, and to help porting to SQLite. Nico -- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Count(1)
On Fri, Apr 04, 2008 at 11:19:53AM -0400, Samuel Neff scratched on the wall: > Scott, > > Is it really a full table scan or just an index scan (at least in the case > where no data is needed from the table as in the original sample that had no > join or where clause). I wondered about this myself, and did some digging last night. "count(*)" is translated into a zero-argument version of "count", so no actual column data is required if it is applied directly to a whole, real table. From the VDBE codes, it looks like it is just walks the B-tree of the table (which is the same as an index scan), but never actually reads the rows. 'Next' is called, but 'Column' is not. -j -- Jay A. Kreibich < J A Y @ K R E I B I.C H > "'People who live in bamboo houses should not throw pandas.' Jesus said that." - "The Ninja", www.AskANinja.com, "Special Delivery 10: Pop!Tech 2006" ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Count(1)
Scott, Is it really a full table scan or just an index scan (at least in the case where no data is needed from the table as in the original sample that had no join or where clause). Thanks, Sam On Thu, Apr 3, 2008 at 4:12 PM, Scott Hess <[EMAIL PROTECTED]> wrote: > A little bit more info: SELECT COUNT(*) is implemented as a full > table scan, so SQLite is visiting every row in the table, which will > get slower and slower as the table gets bigger and the database > fragments. This differs from many database engines (which implement > an optimization for this) Doing the trigger thing means that it only > visits the specific row that contains the count. > > -scott > > -- - We're Hiring! Seeking passionate Flex, C#, or C++ (RTSP, H264) developer. Position is in the Washington D.C. metro area. Contact [EMAIL PROTECTED] ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Count(1)
A little bit more info: SELECT COUNT(*) is implemented as a full table scan, so SQLite is visiting every row in the table, which will get slower and slower as the table gets bigger and the database fragments. This differs from many database engines (which implement an optimization for this) Doing the trigger thing means that it only visits the specific row that contains the count. -scott On Thu, Apr 3, 2008 at 5:03 AM, P Kishor <[EMAIL PROTECTED]> wrote: > On 4/3/08, Mahalakshmi.m <[EMAIL PROTECTED]> wrote: > > Hi, > > > > I am having 4 records in my Harddisk. > > My Processor speed is 400 Mhz. > > > > For "SELECT COUNT(1) FROM MUSIC ;" its getting more time to display the > > count. > > I have also tried with "SELECT COUNT(*) FROM MUSIC ;This also take more > > time. > > SELECT Count(whatever) takes time. There is no way around that. > > > > Is there any other way we can get the Total number of records. > > If speed of getting total number of records is important to you, keep > a running total in a separate "count_rows" table, and update that > total via TRIGGERs as you insert into, delete from or update your > tables. Then retrieve the total from the count_rows table instead of > using COUNT. Here is the idea... > > CREATE TABLE count_rows (tablename, count_of_rows); > CREATE TRIGGERs AFTER (DELETE|INSERT|UPDATE) ON table > ... UPDATE count_rows SET count_of_rows = ? WHERE tablename = 'table' ... > > > Then, later on, instead of > > SELECT COUNT(*) FROM table > > You do > > SELECT count_of_rows FROM count_rows WHERE tablename = ? > > > Search the archives. Many of your questions will be answered. > > > > > > Please help to solve this. > > > > Thanks & Regards, > > Mahalakshmi > > > > > > ___ > > sqlite-users mailing list > > sqlite-users@sqlite.org > > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > > > > -- > Puneet Kishor http://punkish.eidesis.org/ > Nelson Institute for Environmental Studies http://www.nelson.wisc.edu/ > Open Source Geospatial Foundation (OSGeo) http://www.osgeo.org/ > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Count(1)
You could use a trigger to keep the running total in a seperate table. "Mahalakshmi.m" <[EMAIL PROTECTED]> wrote: Hi, I am having 4 records in my Harddisk. My Processor speed is 400 Mhz. For "SELECT COUNT(1) FROM MUSIC ;" its getting more time to display the count. I have also tried with "SELECT COUNT(*) FROM MUSIC ;This also take more time. Is there any other way we can get the Total number of records. Please help to solve this. Thanks & Regards, Mahalakshmi ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Count(1)
On 4/3/08, Mahalakshmi.m <[EMAIL PROTECTED]> wrote: > Hi, > > I am having 4 records in my Harddisk. > My Processor speed is 400 Mhz. > > For "SELECT COUNT(1) FROM MUSIC ;" its getting more time to display the > count. > I have also tried with "SELECT COUNT(*) FROM MUSIC ;This also take more > time. SELECT Count(whatever) takes time. There is no way around that. > Is there any other way we can get the Total number of records. If speed of getting total number of records is important to you, keep a running total in a separate "count_rows" table, and update that total via TRIGGERs as you insert into, delete from or update your tables. Then retrieve the total from the count_rows table instead of using COUNT. Here is the idea... CREATE TABLE count_rows (tablename, count_of_rows); CREATE TRIGGERs AFTER (DELETE|INSERT|UPDATE) ON table ... UPDATE count_rows SET count_of_rows = ? WHERE tablename = 'table' ... Then, later on, instead of SELECT COUNT(*) FROM table You do SELECT count_of_rows FROM count_rows WHERE tablename = ? Search the archives. Many of your questions will be answered. > Please help to solve this. > > Thanks & Regards, > Mahalakshmi > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > -- Puneet Kishor http://punkish.eidesis.org/ Nelson Institute for Environmental Studies http://www.nelson.wisc.edu/ Open Source Geospatial Foundation (OSGeo) http://www.osgeo.org/ ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Count(1)
Hi, I am having 4 records in my Harddisk. My Processor speed is 400 Mhz. For "SELECT COUNT(1) FROM MUSIC ;" its getting more time to display the count. I have also tried with "SELECT COUNT(*) FROM MUSIC ;This also take more time. Is there any other way we can get the Total number of records. Please help to solve this. Thanks & Regards, Mahalakshmi ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users