OK, thanks Avery, I'll try it. I'm not sure how I would do that on a running AWS EMR instance, but I can do it on a local stand-alone Hadoop running a smaller version of the job and see if anything jumps out.
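
(For the local test, a minimal sketch of grabbing such a histogram, assuming a standard JDK jps/jmap on the box running the task; the grep for the Hadoop 1.x "Child" task JVM is an assumption about how that process shows up:)

    # find the PID of the map task's child JVM
    jps -lm | grep Child

    # dump a class histogram of live objects (this triggers a full GC)
    jmap -histo:live <pid> > histo.txt

    # the top entries show which classes are holding the most heap
    head -n 30 histo.txt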
On Wed, Aug 28, 2013 at 4:57 PM, Avery Ching <ach...@apache.org> wrote:

> Try dumping a histogram of memory usage from a running JVM and see where
> the memory is going. I can't think of anything in particular that
> changed...
>
> On 8/28/13 4:39 PM, Jeff Peters wrote:
>
>> I am tasked with updating our ancient (circa 7/10/2012) Giraph to
>> giraph-release-1.0.0-RC3. Most jobs run fine, but our largest job now
>> runs out of memory using the same AWS elastic-mapreduce configuration
>> we have always used. I have never tried to configure either Giraph or
>> the AWS Hadoop. We build for Hadoop 1.0.2 because that's closest to the
>> 1.0.3 AWS provides us. The 8 x m2.4xlarge cluster we use seems to
>> provide 8*14 = 112 map tasks, each with a 2GB heap. Our code is
>> completely unchanged except as required to adapt to the new Giraph
>> APIs. Our vertex, edge, and message data are completely unchanged. On
>> smaller jobs that work, the aggregate heap usage high-water mark seems
>> about the same as before, but the "committed heap" seems to run higher.
>> I can't even make it work on a cluster of 12; in that case I get one
>> map task that ends up with nearly twice as many messages as most of the
>> others, so it runs out of memory anyway. It only takes one to fail the
>> job. Am I missing something here? Should I be configuring my new Giraph
>> in some way I didn't need to with the old one?
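
(For context on the "2GB heap each" figure: on Hadoop 1.x the per-map-task heap normally comes from mapred.child.java.opts. A hedged sketch of raising it for the Giraph job; the property name is standard Hadoop 1.x, while the -Xmx value, the jar name, and how it gets applied on EMR are assumptions for illustration:)

    <!-- mapred-site.xml (or an EMR bootstrap action that sets it) -->
    <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx3072m</value>
    </property>

    # or per job, on the Giraph command line
    hadoop jar giraph-job.jar org.apache.giraph.GiraphRunner \
      -Dmapred.child.java.opts=-Xmx3072m \
      ...

(Note that raising -Xmx without reducing the number of map slots per node can oversubscribe the instance's memory, so the two usually have to be adjusted together.)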