Dear all,

I'm facing a problem when I run a large graph job (currently 1.6T, growing to 16T later): it always fails with java.lang.OutOfMemoryError: Java heap space after loading a certain number of vertices (near 59000000). I tried adding options like:

-Dgiraph.useOutOfCoreGraph=true -Dmapred.child.java.opts="-XX:-UseGCOverheadLimit"

or

-Dmapred.child.java.opts="-Xmx16384" -Dgiraph.yarn.task.heap.mb=36570

but the problem remains, even though I can see those values shown in the metadata. I'm not sure whether the max memory value in this VertexInputSplitsCallable log line is related to the Java heap size:

INFO [load-0] org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: Loaded 46975802 vertices at 68977.49310291892 vertices/sec 0 edges at 0.0 edges/sec Memory (free/total/max) = 475.08M / 2759.00M / 2759.00M

But I noticed that the main log *always* shows:

INFO [AsyncDispatcher event handler] org.apache.hadoop.mapred.JobConf: Task java-opts do not specify heap size. Setting task attempt jvm max heap size to -Xmx2868m

*no matter what arguments I add*, even when I run normal Hadoop jobs. Any ideas about this? The log follows.

2016-10-20 21:25:49,008 ERROR [netty-client-worker-2] org.apache.giraph.comm.netty.NettyClient: Request failed
java.lang.OutOfMemoryError: Java heap space
    at io.netty.buffer.UnpooledHeapByteBuf.<init>(UnpooledHeapByteBuf.java:45)
    at io.netty.buffer.UnpooledByteBufAllocator.newHeapBuffer(UnpooledByteBufAllocator.java:43)
    at io.netty.buffer.AbstractByteBufAllocator.heapBuffer(AbstractByteBufAllocator.java:136)
    at io.netty.buffer.AbstractByteBufAllocator.heapBuffer(AbstractByteBufAllocator.java:127)
    at io.netty.buffer.AbstractByteBufAllocator.buffer(AbstractByteBufAllocator.java:85)
    at org.apache.giraph.comm.netty.handler.RequestEncoder.write(RequestEncoder.java:81)
    at io.netty.channel.DefaultChannelHandlerContext.invokeWrite(DefaultChannelHandlerContext.java:645)
    at io.netty.channel.DefaultChannelHandlerContext.access$2000(DefaultChannelHandlerContext.java:29)
    at io.netty.channel.DefaultChannelHandlerContext$WriteTask.run(DefaultChannelHandlerContext.java:906)
    at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:36)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
    at java.lang.Thread.run(Thread.java:745)
2016-10-20 21:25:55,299 ERROR [netty-client-worker-1] org.apache.giraph.comm.netty.NettyClient: Request failed
java.lang.OutOfMemoryError: Java heap space
    at io.netty.buffer.UnpooledHeapByteBuf.<init>(UnpooledHeapByteBuf.java:45)
    at io.netty.buffer.UnpooledByteBufAllocator.newHeapBuffer(UnpooledByteBufAllocator.java:43)
    at io.netty.buffer.AbstractByteBufAllocator.heapBuffer(AbstractByteBufAllocator.java:136)
    at io.netty.buffer.AbstractByteBufAllocator.heapBuffer(AbstractByteBufAllocator.java:127)
    at io.netty.buffer.AbstractByteBufAllocator.buffer(AbstractByteBufAllocator.java:85)
    at org.apache.giraph.comm.netty.handler.RequestEncoder.write(RequestEncoder.java:81)
    at io.netty.channel.DefaultChannelHandlerContext.invokeWrite(DefaultChannelHandlerContext.java:645)
    at io.netty.channel.DefaultChannelHandlerContext.access$2000(DefaultChannelHandlerContext.java:29)
    at io.netty.channel.DefaultChannelHandlerContext$WriteTask.run(DefaultChannelHandlerContext.java:906)
    at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:36)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
    at java.lang.Thread.run(Thread.java:745)
2016-10-20 21:26:06,731 ERROR [main] org.apache.giraph.graph.GraphMapper: Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6737a445
java.lang.IllegalStateException: waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6737a445
    at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:193)
    at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:151)
    at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:136)
    at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:99)
    at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:233)
    at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:316)
    at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:409)
    at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:629)
    at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:284)
    at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:93)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:202)
    at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:312)
    at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:185)
    ... 16 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.giraph.utils.UnsafeByteArrayOutputStream.<init>(UnsafeByteArrayOutputStream.java:81)
    at org.apache.giraph.conf.ImmutableClassesGiraphConfiguration.createExtendedDataOutput(ImmutableClassesGiraphConfiguration.java:1161)
    at org.apache.giraph.comm.SendPartitionCache.addVertex(SendPartitionCache.java:77)
    at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.sendVertexRequest(NettyWorkerClientRequestProcessor.java:248)
    at org.apache.giraph.worker.VertexInputSplitsCallable.readInputSplit(VertexInputSplitsCallable.java:231)
    at org.apache.giraph.worker.InputSplitsCallable.loadInputSplit(InputSplitsCallable.java:267)
    at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:211)
    at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
    at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
2016-10-20 21:26:06,737 ERROR [main] org.apache.giraph.worker.BspServiceWorker: unregisterHealth: Got failure, unregistering health on /_hadoopBsp/job_1476386340018_0175/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/hadoop18.umd.com_23 on superstep -1
2016-10-20 21:26:06,746 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.IllegalStateException: run: Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6737a445
    at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:104)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.IllegalStateException: waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6737a445
    at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:193)
    at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:151)
    at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:136)
    at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:99)
    at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:233)
    at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:316)
    at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:409)
    at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:629)
    at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:284)
    at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:93)
    ... 7 more
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:202)
    at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:312)
    at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:185)
    ... 16 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.giraph.utils.UnsafeByteArrayOutputStream.<init>(UnsafeByteArrayOutputStream.java:81)
    at org.apache.giraph.conf.ImmutableClassesGiraphConfiguration.createExtendedDataOutput(ImmutableClassesGiraphConfiguration.java:1161)
    at org.apache.giraph.comm.SendPartitionCache.addVertex(SendPartitionCache.java:77)
    at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.sendVertexRequest(NettyWorkerClientRequestProcessor.java:248)
    at org.apache.giraph.worker.VertexInputSplitsCallable.readInputSplit(VertexInputSplitsCallable.java:231)
    at org.apache.giraph.worker.InputSplitsCallable.loadInputSplit(InputSplitsCallable.java:267)
    at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:211)
    at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
    at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

Thank you so much!

Best,
Hai
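
For reference, a minimal sketch of how the two memory figures in question could be checked outside the cluster. It assumes, without confirmation from the Giraph source, that the free/total/max values in the VertexInputSplitsCallable line are read from java.lang.Runtime, in which case max roughly tracks the effective -Xmx of the task JVM. The class HeapReport and both of its methods are hypothetical helpers made up for illustration; they are not part of Giraph or Hadoop. The second helper reflects a documented JVM rule: an -Xmx value without a k, m or g suffix is interpreted as bytes, so -Xmx16384 is not a request for 16 GB.

```java
import java.util.Locale;

public class HeapReport {

    /** Formats byte counts as megabytes in the same layout as the log line quoted above. */
    public static String format(long freeBytes, long totalBytes, long maxBytes) {
        double mb = 1024.0 * 1024.0;
        return String.format(Locale.ROOT,
                "Memory (free/total/max) = %.2fM / %.2fM / %.2fM",
                freeBytes / mb, totalBytes / mb, maxBytes / mb);
    }

    /**
     * Returns true if the java-opts string contains an -Xmx option with an
     * explicit k/m/g unit suffix. Deliberately simple: it only splits on
     * whitespace and does not handle quoting.
     */
    public static boolean hasHeapSizeWithUnit(String javaOpts) {
        for (String opt : javaOpts.trim().split("\\s+")) {
            if (opt.matches("-Xmx\\d+[kKmMgG]")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Report the heap limits of the JVM this program is running in.
        Runtime rt = Runtime.getRuntime();
        System.out.println(format(rt.freeMemory(), rt.totalMemory(), rt.maxMemory()));

        // Check the two option strings from the post.
        System.out.println(hasHeapSizeWithUnit("-XX:-UseGCOverheadLimit"));
        System.out.println(hasHeapSizeWithUnit("-Xmx16384"));
    }
}
```

Running it with an explicit heap size, for example java -Xmx2868m HeapReport, prints a line in the same layout as the Giraph log, which can then be compared with the 2759.00M reported above.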