Try dumping a histogram of memory usage from a running JVM and see where the memory is going. I can't think of anything in particular that changed...
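For example, something along these lines against one of the map task JVMs (a minimal sketch; it assumes the standard JDK tools are on the node's path, and <pid> is just a placeholder for the task's process id):

  jps -lv                        # list running JVMs to find the child map task's pid
  jmap -histo <pid> | head -30   # top classes by instance count and bytes
  jmap -histo:live <pid>         # same, but forces a full GC first so only live objects are counted

Comparing the top entries between the old and new builds should show which classes are holding the extra memory.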

On 8/28/13 4:39 PM, Jeff Peters wrote:

I am tasked with updating our ancient (circa 7/10/2012) Giraph to giraph-release-1.0.0-RC3. Most jobs run fine, but our largest job now runs out of memory using the same AWS Elastic MapReduce configuration we have always used. I have never tried to configure either Giraph or the AWS Hadoop.

We build for Hadoop 1.0.2 because that's closest to the 1.0.3 AWS provides us. The 8 x m2.4xlarge cluster we use seems to provide 8*14=112 map tasks, each with a 2GB heap. Our code is completely unchanged except as required to adapt to the new Giraph APIs, and our vertex, edge, and message data are completely unchanged. On smaller jobs that work, the aggregate heap-usage high-water mark seems about the same as before, but the "committed heap" seems to run higher.

I can't even make it work on a cluster of 12. In that case one map task seems to end up with nearly twice as many messages as most of the others, so it runs out of memory anyway, and it only takes one failed task to fail the job. Am I missing something here? Should I be configuring the new Giraph in some way I didn't need to with the old one?
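For reference, a rough sketch of the knobs involved on a stock Hadoop 1.0.x setup (the property names are the standard Hadoop 1.x ones; the values are only illustrative, matching what the job above reports, not taken from an actual EMR bootstrap):

  mapred.tasktracker.map.tasks.maximum=14   # map slots per node (8 nodes x 14 = 112 tasks)
  mapred.child.java.opts=-Xmx2048m          # per-task JVM heap

And if the growth turns out to be message-related, Giraph 1.0 has out-of-core message settings that can be passed as custom arguments to GiraphRunner (option names as I recall them; worth verifying against GiraphConstants in the release being used):

  -ca giraph.useOutOfCoreMessages=true
  -ca giraph.maxMessagesInMemory=1000000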

