Thanks for the responses, guys.

> I would never have any table with 150 columns. It should be possible to
> keep the schema for your table in your head.

Unfortunately, those are the cards I have been dealt. The columns are just
buckets of data (bucket1, bucket2, bucket3, etc.). Each bucket could hold a
month's or a week's worth of data; it varies from table to table. The tables
can have up to two years' worth of data loaded, so that could be 24 monthly
buckets or 104 weekly buckets.

We don't have any control over the machines this code is running on, so
hardware changes are not an option. These machines are definitely less than
ideal for what we are trying to get them to do, but what can you do.

> Given this, if the fastest time is 3.6 seconds, you have a very fast
> hard drive. Can you confirm this?

I can confirm this; my tests were run on a machine with a solid-state
drive.

Side note - how do I become a member of this list? Do my posts always need
to be approved?

On Fri, Feb 25, 2011 at 4:57 AM, Max Vlasov <max.vla...@gmail.com> wrote:

> On Fri, Feb 25, 2011 at 1:51 AM, Greg Barker <fle...@fletchowns.net>
> wrote:
>
> > I ran my tests again, this time setting the cache_size based on the
> > page_size. For some reason I wasn't able to replicate the timing I got
> > before for the 1024 page size (40.2 seconds), which is really confusing
> > me, since I was careful to make sure nothing was running in the
> > background while I was testing.
> >
> > page_size/cache_size: SELECT COUNT time
> > 1024/2000: 20.83s
> > 4096/500: 14.4s
> > 8192/250: 8.8s
> > 16384/125: 5.0s
> > 32768/62: 3.6s
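
For reference, a minimal C sketch (sqlite3 API) of how a page_size/cache_size
pair like those above can be applied; the file name is a placeholder, and
PRAGMA page_size only takes effect on a freshly created database or after a
VACUUM:

/* Sketch only: "test.db" is a placeholder, error handling is minimal. */
#include <stdio.h>
#include <sqlite3.h>

int main(void)
{
    sqlite3 *db;
    if (sqlite3_open("test.db", &db) != SQLITE_OK) {
        fprintf(stderr, "open failed: %s\n", sqlite3_errmsg(db));
        return 1;
    }

    /* e.g. the last row above: 32768-byte pages with a 62-page cache (~2 MB) */
    sqlite3_exec(db, "PRAGMA page_size = 32768;", 0, 0, 0);
    sqlite3_exec(db, "VACUUM;", 0, 0, 0);   /* rebuild the file with the new page size */
    sqlite3_exec(db, "PRAGMA cache_size = 62;", 0, 0, 0);

    /* ... run the timed SELECT COUNT(*) here ... */

    sqlite3_close(db);
    return 0;
}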
> >
> > I'm assuming reducing the cache size like that will definitely have an
> > impact on the performance of an actual application? Optimizing
> > performance for an application where both the queries and data can take
> > many different shapes and sizes is beginning to seem like quite a
> > daunting task. What do you do if there could be anywhere between 30 and
> > 150 columns?
> >
> >
> Greg, first, I suggest forgetting about the SQLite cache for scanning
> operations. The OS cache is also in use, and it's a known fact that the
> two sometimes store the same data twice, so disabling or shrinking one
> still leaves the other in effect.
>
> I saw that the db you have is about 400 MB in size. Given this, if the
> fastest time is 3.6 seconds, you have a very fast hard drive. Can you
> confirm this? Scanning such a big database with a hard drive that is at
> least five years old should take 10 seconds or even longer. Did you reboot
> your computer this time? By the way, a faster way to clear the system
> cache for a particular file is to "touch" the file with
> CreateFile(...FILE_FLAG_NO_BUFFERING).
>
> I tried to emulate your database with the table
> CREATE TABLE [TestTable] ([Id] Integer Primary key autoincrement, [a] Text,
> [b] Text, [c] Text, [d] Text);
> and filled it with 1,000,000 records, with a, b, c and d each about 100
> bytes. I saw no big difference between the 1024 and 8192 page sizes.
> Copying the db to NUL took about 20 seconds; the 1024-page SELECT COUNT
> took 25 seconds and the 8192-page one took 20 seconds.
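
One plausible way to fill such a table from C (a sketch, not necessarily how
Max generated his data) is a prepared statement inside a single transaction:

/* Sketch only: inserts 1,000,000 rows of ~100-byte filler text. */
#include <string.h>
#include <sqlite3.h>

int fill_test_table(sqlite3 *db)
{
    char val[101];
    memset(val, 'x', 100);          /* ~100 bytes of filler text per column */
    val[100] = '\0';

    sqlite3_stmt *stmt;
    sqlite3_exec(db, "BEGIN;", 0, 0, 0);
    sqlite3_prepare_v2(db,
        "INSERT INTO TestTable(a, b, c, d) VALUES(?, ?, ?, ?);", -1, &stmt, 0);

    for (int i = 0; i < 1000000; i++) {
        for (int col = 1; col <= 4; col++)
            sqlite3_bind_text(stmt, col, val, -1, SQLITE_STATIC);
        sqlite3_step(stmt);
        sqlite3_reset(stmt);
    }

    sqlite3_finalize(stmt);
    sqlite3_exec(db, "COMMIT;", 0, 0, 0);
    return SQLITE_OK;
}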
>
> Anyway, SELECT COUNT is usually a very slow operation regardless of the
> tweaks we can apply; in almost every scenario it's equivalent to reading
> all of the table's data. I suppose the only exception is when your records
> are big enough to take advantage of bypassing overflow data. So, in a
> table with 100 records each containing 1 MB of text, SELECT COUNT will
> probably be very fast.
>
> Max Vlasov
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
