+1 for compressed pointers. Sent from my mobile. Please excuse the typos.
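For the record, enabling them is just a JVM flag. Roughly what that looks like in conf/hadoop-env.sh -- a sketch only, exact variable names depend on your Hadoop version, and -XX:+UseCompressedOops needs a Java 6u14 or later HotSpot JVM:

    # heap size for the Hadoop daemons, in MB (the thread below already uses 6000)
    export HADOOP_HEAPSIZE=6000

    # turn on compressed object pointers for the JobTracker only
    # (flag exists on 64-bit HotSpot from Java 6u14; harmless to set elsewhere too)
    export HADOOP_JOBTRACKER_OPTS="-XX:+UseCompressedOops $HADOOP_JOBTRACKER_OPTS"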
On 2010-06-22, at 4:18 AM, Steve Loughran <ste...@apache.org> wrote:

> Bobby Dennett wrote:
>> Thanks all for your suggestions (please note that Tan is my co-worker;
>> we are both working to try and resolve this issue)... we experienced
>> another hang this weekend and increased the HADOOP_HEAPSIZE setting to
>> 6000 (MB), as we do periodically see "java.lang.OutOfMemoryError: Java
>> heap space" errors in the jobtracker log. We are now looking into the
>> resource allocation of the master node/server to ensure we aren't
>> experiencing any issues due to the heap size increase. In parallel, we
>> are also working on building "beefier" servers -- stronger CPUs, 3x more
>> memory -- for the node running the primary namenode and jobtracker
>> processes, as well as for the secondary namenode.
>>
>> Any additional suggestions you might have for troubleshooting/resolving
>> this hanging jobtracker issue would be greatly appreciated.
>
> Have you tried:
>
> * Compressed object pointers on the Java 6 server JVM? They reduce heap
>   usage.
>
> * Bolder: the JRockit JVM. Not officially supported in Hadoop, but I
>   liked using it right up until Oracle stopped giving away the updates
>   with security patches. It has a much better heap, and has had
>   compressed pointers for a long time (== more stable code).
>
> I'm surprised it's the JT that is OOM-ing; anecdotally it's the NN and
> 2ary NN that use more, especially if the files are many and the
> blocksize small. The JT should not be tracking that much data over time.
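Also worth doing before throwing more heap at it: confirm which daemon is actually under heap pressure. A quick sketch using only the JDK tools (adjust the PID and sampling interval to taste):

    # find the JobTracker's PID (main class is org.apache.hadoop.mapred.JobTracker)
    jps -l | grep JobTracker

    # sample heap occupancy and GC time every 10 seconds;
    # a constantly full old gen with rising FGCT points at real heap exhaustion
    jstat -gcutil <JT pid> 10s

Running the same thing against the NN and 2ary NN PIDs should show quickly whether the JT really is the odd one out.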