Re: [sqlite] Strange cache behavior

Pavel Ivanov Wed, 22 Sep 2010 08:13:21 -0700

> Is it ok for cache to behave like this or some optimization is possible to
> fix this?

For this particular case I believe you can do some optimization by
making your own implementation of cache.
Also I believe such "strange" behavior of cache is pretty much
explainable. Remember that standard implementation of cache replaces
pages on LRU basis, i.e. if cache is too large then new page replaces
the oldest page, the one which access time is smallest.
So let's say for execution of this query
SELECT Count(*) FROM WikiData WHERE Id < 9812
it needs to read pages 1 through 2000 of the database in exactly this
order. For execution of this query
SELECT Count(*) FROM WikiData WHERE Id < 10000
it needs to read pages 1 through 2001 in exactly this order. But when
it wants to load 2001st page it sees that cache is too large, so it
replaces 1st page with 2001st. Then when you want to execute this
query once more it need to start reading from 1st page. It's not in
cache and cache is too large so it replaces the oldest page - frankly
it's the 2nd one. After 1st page it needs to read 2nd which is not in
cache and it replaces the oldest page - 3rd one, and so on.

As you see simple LRU algorithm in this particular case forces SQLite
to read all pages again although they were in memory a tiny moment
ago. And the problem can be solved only by temporary (e.g. for the
time of statement execution) allowing the cache to grow above set
limit.

Pavel

On Wed, Sep 22, 2010 at 10:36 AM, Max Vlasov <max.vla...@gmail.com> wrote:
> Hi,
>
> playing with my admin about cache size (there was a question today related
> to the cache size), noticed a strange thing with cache. It's Windows,
> initially it was v3.6.10, but the same is for 3.7.2
>
> I sometimes mentioned that I can track vfs requests for every select in my
> admin. In this case it was interesting to look at the numbers related to the
> cache size.
>
> The default PRAGA page_size=2000; so the threshold level is about 2000*1024
> = 2,048,000 bytes.
> A select for a test table filled with wikipedia abstract article was
>
> SELECT Count(*) FROM WikiData WHERE Id < ?
>
> This query doesn't produce tiny result in the front-end, but at the same
> time uses cache and tried to keep the pages accessed in it.
>
> And the number of ? that corresponds to the cache_size threshold is about
> 9813.
>
> The things I noticed:
> - If the select accesses the pages total size of whose is lower than the
> cache size, everything is ok. I mean, trying to select any subset of the
> data leads to only 16 bytes reading.
> SELECT Count(*) FROM WikiData WHERE Id < 9812   =>   VFS   2,045,968
> SELECT Count(*) FROM WikiData WHERE Id < 1000   =>   VFS   16
> SELECT Count(*) FROM WikiData WHERE Id > 5000 and id < 7000   =>   VFS   16
> these are expected results
>
> - But when there was a request that filled all the cache and the following
> request is a subset of the first one, the latter unfortunately doesn't take
> advantage of the pages already in cache
> SELECT Count(*) FROM WikiData WHERE Id < 10000   => VFS 2,082,838
> SELECT Count(*) FROM WikiData WHERE Id < 9800  => VFS 2,023,440
>
> - Also when there are two consecutive requests that probably larger than the
> cache size, they will do the full reading (or almost full reading)
> SELECT Count(*) FROM WikiData WHERE Id < 10000   => VFS 2,082,838
> SELECT Count(*) FROM WikiData WHERE Id < 10001  => VFS 2,062,352
> SELECT Count(*) FROM WikiData WHERE Id < 9999  => VFS 2,061,328
>
> Is it ok for cache to behave like this or some optimization is possible to
> fix this?
>
> I don't know if it's possible to do similar tests with other front-ends. If
> someone really interested, I can upload my tool, but it's just not
> completely ready for mainstream.
>
> Thanks,
>
> Max
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Strange cache behavior

Reply via email to