All,
I was reading the details of a recent update to the Apache Hadoop engine, and
one of the changes was to their cache algorithm.
Hadoop had a cache which used a simple LRU algorithm. As we know, most LRU
algorithms have the problem that "hot" pages can be flushed when a query which
requires reading millions of pages is submitted - the Firebird equivalent of a
full table scan. The reads of cold pages push the "hot" pages out of the cache.
Their solution was to define 2 cache levels:
- Level 1: for pages which had only been accessed a single time - this
used a simple index for locating the page in the list. Once a page had been
accessed more than once it was promoted to the second cache.
- Level 2: for pages which have been accessed/referenced more than once -
this uses an LRU.
The configuration defines the ratio of memory allocated between the two caches
(default = 25%/75%, IIRC).
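For illustration, the scheme described above can be sketched roughly as follows. This is a minimal sketch in Python, not Hadoop's actual implementation; the class and method names are my own, and level 1 is modeled as a simple FIFO-evicted dict while level 2 is a true LRU:

```python
from collections import OrderedDict

class TwoLevelCache:
    """Sketch of a two-level page cache: pages seen once sit in a
    probationary level-1 list; a second access promotes them into a
    protected level-2 LRU, so a long scan only churns level 1."""

    def __init__(self, capacity, level1_ratio=0.25):
        # Split total capacity per the configured ratio (default 25%/75%).
        self.l1_cap = max(1, int(capacity * level1_ratio))
        self.l2_cap = max(1, capacity - self.l1_cap)
        self.l1 = OrderedDict()  # pages accessed exactly once
        self.l2 = OrderedDict()  # "hot" pages, maintained in LRU order

    def get(self, page):
        if page in self.l2:               # hot page: refresh its LRU position
            self.l2.move_to_end(page)
            return self.l2[page]
        if page in self.l1:               # second access: promote to level 2
            value = self.l1.pop(page)
            self._insert_l2(page, value)
            return value
        return None                       # cache miss

    def put(self, page, value):
        if page in self.l2:
            self.l2[page] = value
            self.l2.move_to_end(page)
        elif page in self.l1:             # second touch also promotes
            self.l1.pop(page)
            self._insert_l2(page, value)
        else:                             # first access: probationary level 1
            self.l1[page] = value
            if len(self.l1) > self.l1_cap:
                self.l1.popitem(last=False)  # evict oldest single-access page

    def _insert_l2(self, page, value):
        self.l2[page] = value
        if len(self.l2) > self.l2_cap:
            self.l2.popitem(last=False)   # evict least-recently-used hot page
```

The point of the split shows up under a scan-like workload: pages read once flow through level 1 and are evicted there, while a page that was referenced twice sits in level 2 and survives the scan.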
I realize that 2 cache levels will make individual cache requests slower, but
they have the benefit of being more likely to keep the "hot" pages in memory
for longer, thus improving overall performance.
I also know that there has been historical discussion of changing the engine to
recognize table scans/NATURAL reads and backups, and to modify the cache
operation accordingly.
But I wonder if this would be an approach that Firebird should consider, since
it seems to address the known issues while not requiring significant
modifications.
Sean
Firebird-Devel mailing list, web interface at
https://lists.sourceforge.net/lists/listinfo/firebird-devel