Re: Large static structures in M/R heap

2013-02-27 Thread David Rosenstrauch
On 02/27/2013 01:42 PM, Adam Phelps wrote: We have a job that uses a large lookup structure that gets created as a static class during the map setup phase (and we have JVM reuse enabled, so this only takes place once). However, of late this structure has grown drastically (due to items beyond our con

Re: Large static structures in M/R heap

2013-02-27 Thread Robert Evans
Have you looked at things like CDB (http://cr.yp.to/cdb.html), which would allow you to keep most of the file on disk and cache the hot parts in memory? Whether that helps really depends on your access pattern. Alternatively, you could give yourself more heap and take up two slots for your map task. Also, if it is big e

Re: Large static structures in M/R heap

2013-02-27 Thread Adam Phelps
We actually use CDBs a good bit outside of M/R. This is something worth looking into, but the big structure we're currently using is a giant tree-based lookup table whose access pattern is pretty random, so I don't think caching would be of much use. There is a lesser (but still large) structure