Re: [sqlite] experimental (better?) usage based sqlite cache

2013-02-25 Thread Gabriel Corneanu

That's why I asked about feedback.
My implementation is ~100 lines longer, so I think it's still "lite".
There is nothing complex in it; apart from heap implementation, there 
are quite a few simplifications in the original code.


Gabriel

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] experimental (better?) usage based sqlite cache

2013-02-25 Thread Gabriel Corneanu
I thought more about a "minus" (subtract minimum), but this might be a 
better option.


Regards,
Gabriel

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] experimental (better?) usage based sqlite cache

2013-02-25 Thread Ryan Johnson

On 25/02/2013 7:24 AM, Simon Slavin wrote:

On 25 Feb 2013, at 11:33am, Howard Chu  wrote:


Gabriel Corneanu wrote:

Following a few other discussions, I had the feeling that sqlite should
benefit from a cache which discards cached pages in a least frequently
used order.

Just offhand, classical LRU is quite poor in terms of lock overhead.

Gabriel writes "least frequently used".  Howard writes "least recently used".  
You're not writing about the same thing.

And the speed advantages of any algorithm used must be assessed before anything 
new is implemented.  SQLite is meant to be 'lite' and have almost nothing in.  
Complicated algorithms and use-counting should be added only if they improve 
things a lot.
(This is also partly a response to Howard, who suggests to just rely on 
the OS fs cache)


There seem to be quite a few sqlite3 users on platforms where I would 
not trust the OS to provide effective caching (embedded, smartphone, 
etc.) and we have seen complaints on the list from them about slow I/O. 
Granted, that's often due to logging overheads, but if we're going to 
see any real benefit from some new caching strategy in sqlite3, it's 
likely to show up there.


As for `lite', this could a pluggable extension, just like the fts 
stuff. Lots of people don't need it, but it can be acquired if needed.


Ryan

P.S. locking overheads are a bogey-man on a belligerently 
single-threaded system...


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] experimental (better?) usage based sqlite cache

2013-02-25 Thread Hick Gunter
Just before the first fetch counter overflows, right shift all counters by 1. 
This does not change the purge order, keeps the counters in limits and happens 
infrequently (depending on the size of the counter).

-Ursprüngliche Nachricht-
Von: Gabriel Corneanu [mailto:gabrielcorne...@gmail.com]
Gesendet: Montag, 25. Februar 2013 12:25
An: sqlite-users@sqlite.org
Betreff: [sqlite] experimental (better?) usage based sqlite cache

Following a few other discussions, I had the feeling that sqlite should benefit 
from a cache which discards cached pages in a least frequently used order.
It generally means, index pages, often used data pages, etc, should be 
preferred (meaning kept in memory) compared to some infrequent used pages.
This helps me where I have big files which are mostly written once, but I also 
have some small tables with summaries; these should be better cached, the same 
for indices.

I first implemented a custom cache in Delphi (Pascal) using some high level 
(generic) containers (hash for keys,  heap for usage data); there is a 
significant overhead due to the classes I used and maybe also compiler 
differences.
My own usage shows some visible improvements, therefore I took some time to 
implement it directly in core (pcache1).

I would like to ask anyone who sees this interesting to try and give some 
feedback about benefits (if at all :)).
Feedback / results / benchmarks are welcome.

If it is useful, I would be happy to contribute it.
The diff is done against 3.7.15.2 ; I'm not sure if it makes it to the list, so 
here is the diff text:
http://pastebin.com/RrzqWjWv

Technical details:
- each page has a fetch counter
- the LRU list is changed to a heap, arranged according to this counter
- when discarding pages, the page with the minimum fetch counter is selected 
Apart from the heap operations, the other changes are quite straightforward.

I run some tests to check for errors, maybe someone can check if the 
initialization is done in proper place (especially for shared cache group).
There is an important catch; the fetch counter overflow. I don't have yet a 
definitive idea how/when to limit or to correct it.
So this problem is currently postponed until the tests show actual benefit / 
interest.

Regards,
Gabriel





--
 Gunter Hick
Software Engineer
Scientific Games International GmbH
Klitschgasse 2 – 4, A - 1130 Vienna, Austria
FN 157284 a, HG Wien
Tel: +43 1 80100 0
E-Mail: h...@scigames.at

This e-mail is confidential and may well also be legally privileged. If you 
have received it in error, you are on notice as to its status and accordingly 
please notify us immediately by reply e-mail and then delete this message from 
your system. Please do not copy it or use it for any purposes, or disclose its 
contents to any person as to do so could be a breach of confidence. Thank you 
for your cooperation.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] experimental (better?) usage based sqlite cache

2013-02-25 Thread Howard Chu

Simon Slavin wrote:


On 25 Feb 2013, at 11:33am, Howard Chu  wrote:


Gabriel Corneanu wrote:

Following a few other discussions, I had the feeling that sqlite should
benefit from a cache which discards cached pages in a least frequently
used order.


Just offhand, classical LRU is quite poor in terms of lock overhead.


Gabriel writes "least frequently used".  Howard writes "least recently used".  
You're not writing about the same thing.


Doh, you're right. Sorry for the noise, going back to get some caffeine now.

--
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] experimental (better?) usage based sqlite cache

2013-02-25 Thread Simon Slavin

On 25 Feb 2013, at 11:33am, Howard Chu  wrote:

> Gabriel Corneanu wrote:
>> Following a few other discussions, I had the feeling that sqlite should
>> benefit from a cache which discards cached pages in a least frequently
>> used order.
> 
> Just offhand, classical LRU is quite poor in terms of lock overhead.

Gabriel writes "least frequently used".  Howard writes "least recently used".  
You're not writing about the same thing.

And the speed advantages of any algorithm used must be assessed before anything 
new is implemented.  SQLite is meant to be 'lite' and have almost nothing in.  
Complicated algorithms and use-counting should be added only if they improve 
things a lot.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] experimental (better?) usage based sqlite cache

2013-02-25 Thread Gabriel Corneanu

I'm not sure we're talking about the same thing.
For me caching here means avoiding IO, not memory and/or locks.
The heap itself also needs some work, but logarithmic.
The default value of 2000 pages cache should me enough for most useful 
pages (indices, roots...), having an overhead of max ~11 entries to update.


Regards,
Gabriel
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] experimental (better?) usage based sqlite cache

2013-02-25 Thread Howard Chu

Gabriel Corneanu wrote:

Following a few other discussions, I had the feeling that sqlite should
benefit from a cache which discards cached pages in a least frequently
used order.


Just offhand, classical LRU is quite poor in terms of lock overhead. The CLOCK 
refinement scales much better, because no reorganizing of LRU lists is needed 
during page references.


And of course, having gone thru all of these exercises of fancy 
application-level cache algorithms already, it's still obvious that the best 
approach is to leave it to the kernel.


--
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] experimental (better?) usage based sqlite cache

2013-02-25 Thread Gabriel Corneanu
Following a few other discussions, I had the feeling that sqlite should 
benefit from a cache which discards cached pages in a least frequently 
used order.
It generally means, index pages, often used data pages, etc, should be 
preferred (meaning kept in memory) compared to some infrequent used pages.
This helps me where I have big files which are mostly written once, but 
I also have some small tables with summaries; these should be better 
cached, the same for indices.


I first implemented a custom cache in Delphi (Pascal) using some high 
level (generic) containers (hash for keys,  heap for usage data); there 
is a significant overhead due to the classes I used and maybe also 
compiler differences.
My own usage shows some visible improvements, therefore I took some time 
to implement it directly in core (pcache1).


I would like to ask anyone who sees this interesting to try and give 
some feedback about benefits (if at all :)).

Feedback / results / benchmarks are welcome.

If it is useful, I would be happy to contribute it.
The diff is done against 3.7.15.2 ; I'm not sure if it makes it to the 
list, so here is the diff text:

http://pastebin.com/RrzqWjWv

Technical details:
- each page has a fetch counter
- the LRU list is changed to a heap, arranged according to this counter
- when discarding pages, the page with the minimum fetch counter is selected
Apart from the heap operations, the other changes are quite straightforward.

I run some tests to check for errors, maybe someone can check if the 
initialization is done in proper place (especially for shared cache group).
There is an important catch; the fetch counter overflow. I don't have 
yet a definitive idea how/when to limit or to correct it.
So this problem is currently postponed until the tests show actual 
benefit / interest.


Regards,
Gabriel



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users