All,

I was reading the details of a recent update to the Apache Hadoop engine, and 
one of the changes was to their cache algorithm.

Hadoop had a cache which used a simple LRU algorithm.  As we know, most LRU 
algorithms have the problem that "hot" pages can be flushed when a query which 
requires reading millions of pages is submitted - the Firebird equivalent of a 
full table scan - because the reads of those cold pages push the "hot" pages 
out of the cache.
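
To make the problem concrete, here is a minimal single-level LRU sketch in C++ 
(purely illustrative - the names are mine, and this is not Firebird's or 
Hadoop's actual cache code).  Once a scan has touched as many distinct cold 
pages as the cache can hold, every previously "hot" page has been evicted:

    #include <cstddef>
    #include <cstdint>
    #include <iostream>
    #include <list>
    #include <unordered_map>

    class LruCache {
    public:
        explicit LruCache(std::size_t capacity) : capacity_(capacity) {}

        // Touch a page: move it to the MRU end, evicting the LRU page if full.
        void touch(std::uint64_t page) {
            auto it = index_.find(page);
            if (it != index_.end()) {
                order_.erase(it->second);       // already cached: refresh position
            } else if (order_.size() == capacity_) {
                index_.erase(order_.back());    // full: evict least recently used
                order_.pop_back();
            }
            order_.push_front(page);
            index_[page] = order_.begin();
        }

        bool contains(std::uint64_t page) const { return index_.count(page) != 0; }

    private:
        std::size_t capacity_;
        std::list<std::uint64_t> order_;        // most recently used at the front
        std::unordered_map<std::uint64_t, std::list<std::uint64_t>::iterator> index_;
    };

    int main() {
        LruCache cache(1000);                                   // 1000-page cache
        for (std::uint64_t p = 0; p < 10; ++p) cache.touch(p);  // 10 "hot" pages
        for (std::uint64_t p = 100000; p < 101000; ++p) cache.touch(p);  // one scan
        std::cout << "hot page 0 still cached? " << cache.contains(0) << "\n";  // prints 0
    }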

Their solution was to define 2 cache levels:

-       Level 1: for pages which have been accessed only a single time - this 
uses a simple index for locating the page in the list.  Once a page has been 
accessed more than once, it is promoted to the second cache.

-       Level 2: for pages which have been accessed/referenced more than once - 
this uses an LRU.

Within the configuration, the ratio of memory allocated between the two caches 
can be set (the default was 25%/75%, IIRC).  A rough sketch of the scheme 
follows below.
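
Here is that sketch in C++, written from my reading of the description (the 
class name, the level1Ratio parameter and the other names are my own invention 
- this is not Hadoop's code, nor a proposal for Firebird's actual cache 
implementation):

    #include <cstddef>
    #include <cstdint>
    #include <iostream>
    #include <list>
    #include <unordered_map>

    class TwoLevelCache {
        using PageList = std::list<std::uint64_t>;

        // One cache segment with its own capacity and recency order.
        struct Segment {
            std::size_t capacity = 0;
            PageList order;                                     // MRU at the front
            std::unordered_map<std::uint64_t, PageList::iterator> index;

            bool contains(std::uint64_t p) const { return index.count(p) != 0; }

            void insert(std::uint64_t p) {                      // p must not be present
                if (order.size() >= capacity && !order.empty()) {
                    index.erase(order.back());                  // evict this segment's LRU tail
                    order.pop_back();
                }
                order.push_front(p);
                index[p] = order.begin();
            }

            void remove(std::uint64_t p) {
                order.erase(index.at(p));
                index.erase(p);
            }
        };

    public:
        // level1Ratio is the configurable split between the caches (25%/75% default, IIRC).
        TwoLevelCache(std::size_t totalPages, double level1Ratio) {
            level1_.capacity = static_cast<std::size_t>(totalPages * level1Ratio);
            level2_.capacity = totalPages - level1_.capacity;
        }

        void touch(std::uint64_t page) {
            if (level2_.contains(page)) {
                level2_.remove(page);           // already "hot": refresh its LRU position
                level2_.insert(page);
            } else if (level1_.contains(page)) {
                level1_.remove(page);           // second access: promote to level 2
                level2_.insert(page);
            } else {
                level1_.insert(page);           // first access: goes to level 1 only
            }
        }

        bool isHot(std::uint64_t page) const { return level2_.contains(page); }

    private:
        Segment level1_;    // pages referenced once; a scan only churns this segment
        Segment level2_;    // pages referenced more than once; a plain LRU
    };

    int main() {
        TwoLevelCache cache(1000, 0.25);
        for (std::uint64_t p = 0; p < 10; ++p) {
            cache.touch(p); cache.touch(p);                     // establish "hot" pages
        }
        for (std::uint64_t p = 100000; p < 110000; ++p) cache.touch(p);  // big scan
        std::cout << "hot page 0 still hot? " << cache.isHot(0) << "\n"; // prints 1
    }

The point is that a scan never touches the same page twice, so it only churns 
level 1, and the pages already in level 2 survive it.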

I realize that 2 caches will make a single cache lookup slower, but they have 
the benefit of being more likely to keep the "hot" pages in memory for longer, 
thus improving overall performance.

I also know that there has been historical discussion of changing the engine to 
recognize table scans/NATURAL reads and backups, and to modify the cache 
behaviour accordingly.

But I wonder if this is an approach that Firebird should consider, since it 
seems to address the known issues without requiring significant modifications.


Sean

