[ https://issues.apache.org/jira/browse/ACCUMULO-4626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976955#comment-15976955 ]
Josh Elser commented on ACCUMULO-4626:
--------------------------------------

Gotcha, the monitor was just the first place I would've looked for cache info off the bat. TabletServer logs make a bit more sense.

> improve cache hit rate via weak reference map
> ---------------------------------------------
>
>                 Key: ACCUMULO-4626
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4626
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: tserver
>            Reporter: Adam Fuchs
>              Labels: performance, stability
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> When a single iterator tree references the same RFile blocks in different
> branches we sometimes get cache misses for one iterator even though the
> requested block is held in memory by another iterator. This is particularly
> important when using something like the IntersectingIterator to intersect
> many deep copies. Instead of evicting completely, keeping evicted blocks in
> a WeakReference value map can avoid re-reading blocks that are currently
> referenced by another deep copied source iterator.
> We've seen this in the field for some of Sqrrl's queries against very large
> tablets. The total memory usage for these queries can be equal to the size of
> all the iterator block reads times the number of readahead threads times the
> number of files times the number of IntersectingIterator children when cache
> miss rates are high. This might work out to something like:
> {code}
> 16 readahead threads * 200 deep copied children * 99% cache miss rate * 20
> files * 252KB per reader = ~16GB of memory
> {code}
> In most cases, evicting to a weak reference value map changes the cache miss
> rate from very high to very low and has a dramatic effect on total memory
> usage.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
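
For readers unfamiliar with the technique named in the issue description, here is a minimal sketch of evicting to a weak reference value map. It is not the Accumulo patch and does not use Accumulo's block cache API. The class name WeakVictimCache, its methods, and the choice of an access-ordered LinkedHashMap as the strong LRU tier are all assumptions made for illustration. The idea is that a block evicted from the LRU tier is kept behind a WeakReference, so a later lookup can still find it for as long as some other reader (for example a deep-copied iterator) holds a strong reference to it.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not Accumulo's BlockCache implementation.
final class WeakVictimCache<K, V> {

  // A weak reference that remembers its key, so stale entries can be removed
  // from the evicted map once the garbage collector clears the value.
  private static final class KeyedRef<K, V> extends WeakReference<V> {
    final K key;

    KeyedRef(K key, V value, ReferenceQueue<V> queue) {
      super(value, queue);
      this.key = key;
    }
  }

  private final Map<K, KeyedRef<K, V>> evicted = new HashMap<>();
  private final ReferenceQueue<V> queue = new ReferenceQueue<>();
  private final LinkedHashMap<K, V> lru;

  WeakVictimCache(final int capacity) {
    // An access-ordered LinkedHashMap acts as the strong (LRU) tier.
    this.lru = new LinkedHashMap<K, V>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        if (size() > capacity) {
          // Instead of dropping the block completely, keep a weak handle to it.
          evicted.put(eldest.getKey(),
              new KeyedRef<>(eldest.getKey(), eldest.getValue(), queue));
          return true;
        }
        return false;
      }
    };
  }

  synchronized void put(K key, V value) {
    drainCleared();
    evicted.remove(key);
    lru.put(key, value);
  }

  synchronized V get(K key) {
    drainCleared();
    V value = lru.get(key);
    if (value != null) {
      return value;
    }
    KeyedRef<K, V> ref = evicted.remove(key);
    if (ref == null) {
      return null; // never cached, or already collected and drained
    }
    value = ref.get();
    if (value != null) {
      lru.put(key, value); // still alive elsewhere: promote back to the LRU tier
    }
    return value; // null means the block was collected and must be re-read
  }

  // Remove map entries whose blocks have been garbage collected.
  private void drainCleared() {
    Object ref;
    while ((ref = queue.poll()) != null) {
      // Two-argument remove: only drop the entry if it is still this reference.
      evicted.remove(((KeyedRef<?, ?>) ref).key, ref);
    }
  }
}
```

In this sketch the weak tier adds no memory pressure of its own: a block stays reachable through it only while some reader still holds the block, which is the situation the description says produces avoidable cache misses. Once the last strong reference goes away the collector is free to reclaim the block, and the next get for that key returns null so the caller re-reads it.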