Thanks, Agrta, for your response. What exactly should I do to increase the min and max RAM? In which conf file, or with which command/arguments? (My giraph-site.xml is empty by default.)
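For reference while I wait: on MRv2 the map-task heap seems to be set in mapred-site.xml (or per job with -D). A sketch of what I understand, assuming a 12 GB container with the heap kept below the container size — the property names are the standard Hadoop ones, but the values are only illustrative:

```xml
<!-- mapred-site.xml (or pass per job: -Dmapreduce.map.memory.mb=12000) -->
<!-- Illustrative values only; container size and heap must fit your nodes. -->
<configuration>
  <!-- YARN container size for each map task (Giraph workers run as map tasks) -->
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>12000</value>
  </property>
  <!-- JVM heap inside that container; keep it below the container size -->
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx9600m</value>
  </property>
</configuration>
```

If I read the earlier JobConf message ("Task java-opts do not specify heap size") correctly, no -Xmx was reaching the task JVM at all; mapreduce.map.java.opts is where MRv2 looks for it, while mapred.child.java.opts is the deprecated MRv1 name.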
From what I found online about increasing the heap size (I am not sure this is the same thing as the min/max RAM size you mentioned), many people suggest increasing mapred.child.java.opts or HADOOP_DATANODE_OPTS, but neither helped. My problem happens during "VertexInputSplitsCallable: readVertexInputSplit:", so I tried increasing mapreduce.map.memory.mb and decreasing the number of containers/workers. Currently I am using 248 workers with mapreduce.map.memory.mb=12000 and ratio=0.7. This helps, but I now face two new problems:

1. Superstep -1 is extremely slow: it takes 7-8 hours to load a 150G graph, e.g.:

org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: 106 out of 248 workers finished on superstep -1 on path /_hadoopBsp/job_1477020594559_0012/_vertexInputSplitDoneDir

In the log I see lines like:

INFO [main] org.apache.giraph.comm.netty.NettyClient: logInfoAboutOpenRequests: Waiting interval of 15000 msecs, 2499 open requests, waiting for it to be <= 0, MBytes/sec received = 0.0001, MBytesReceived = 0.0058, ave received req MBytes = 0, secs waited = 92.12 MBytes/sec sent = 10.4373, MBytesSent = 961.4983, ave sent req MBytes = 0.3244, secs waited = 92.12

Finishing those 2499 open requests takes a very long time. *I am not sure whether this is normal.*

2. I tried the out-of-core graph option, but I am not sure I am using it correctly. I added -Dgiraph.useOutOfCoreGraph=true -ca isStaticGraph=true,giraph.maxPartitionsInMemory=10, but how do I know whether it is working? I suspect that when I try the 15T graph, the problem will be worse.

What should I do? Thanks for your help.

Best,

Hai

On Sun, Oct 23, 2016 at 7:11 AM, Agrta Rawat <agrta.ra...@gmail.com> wrote:

> Hi Hai,
>
> Please check your Giraph configurations. Try increasing the min and max RAM
> size in your configurations.
> This should help.
>
> Regards,
> Agrta Rawat
>
>
> On Sat, Oct 22, 2016 at 7:46 PM, Hai Lan <lanhai1...@gmail.com> wrote:
>
>> Can anyone help with this?
>>
>> Thanks a lot!
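In case it helps to see everything in one place, this is roughly how I understand those flags would be combined on a single GiraphRunner command line. This is only a sketch: the jar name, computation class, I/O formats, and paths are placeholders, the memory values mirror the ones mentioned above, and the -D/-ca options are the ones from this thread (it is not runnable outside a cluster):

```shell
# Invocation sketch only; jar, class, formats, and paths are placeholders.
hadoop jar my-giraph-job.jar org.apache.giraph.GiraphRunner \
  -Dmapreduce.map.memory.mb=12000 \
  -Dmapreduce.map.java.opts=-Xmx9600m \
  -Dgiraph.useOutOfCoreGraph=true \
  MyComputation \
  -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
  -vip /input/graph \
  -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
  -op /output/graph \
  -w 248 \
  -ca isStaticGraph=true,giraph.maxPartitionsInMemory=10
```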
>>
>>
>> On Thu, Oct 20, 2016 at 9:48 PM, Hai Lan <lanhai1...@gmail.com> wrote:
>>
>>> Dear all,
>>>
>>> I'm facing a problem when I run a large graph job (currently 1.6T; it
>>> will be 16T later): it always fails with a java.lang.OutOfMemoryError:
>>> Java heap space error once a certain number of vertices (around
>>> 59000000) has been loaded. I tried adding:
>>> -Dgiraph.useOutOfCoreGraph=true
>>> -Dmapred.child.java.opts="-XX:-UseGCOverheadLimit" OR
>>> -Dmapred.child.java.opts="-Xmx16384"
>>> -Dgiraph.yarn.task.heap.mb=36570
>>>
>>> but the problem remains, even though I can see those values shown in
>>> the metadata.
>>>
>>> I'm not sure whether the max memory value in this
>>> VertexInputSplitsCallable info is related to the Java heap size:
>>> INFO [load-0] org.apache.giraph.worker.VertexInputSplitsCallable:
>>> readVertexInputSplit: Loaded 46975802 vertices at 68977.49310291892
>>> vertices/sec 0 edges at 0.0 edges/sec Memory (free/total/max) = 475.08M /
>>> 2759.00M / 2759.00M
>>>
>>> But I noticed that the main log *always* shows:
>>> INFO [AsyncDispatcher event handler] org.apache.hadoop.mapred.JobConf:
>>> Task java-opts do not specify heap size. Setting task attempt jvm max heap
>>> size to -Xmx2868m
>>> *no matter what arguments I add*, even when I run normal Hadoop jobs.
>>>
>>> Any ideas about this? The log follows.
>>>
>>> 2016-10-20 21:25:49,008 ERROR [netty-client-worker-2] org.apache.giraph.comm.netty.NettyClient: Request failed
>>> java.lang.OutOfMemoryError: Java heap space
>>>     at io.netty.buffer.UnpooledHeapByteBuf.<init>(UnpooledHeapByteBuf.java:45)
>>>     at io.netty.buffer.UnpooledByteBufAllocator.newHeapBuffer(UnpooledByteBufAllocator.java:43)
>>>     at io.netty.buffer.AbstractByteBufAllocator.heapBuffer(AbstractByteBufAllocator.java:136)
>>>     at io.netty.buffer.AbstractByteBufAllocator.heapBuffer(AbstractByteBufAllocator.java:127)
>>>     at io.netty.buffer.AbstractByteBufAllocator.buffer(AbstractByteBufAllocator.java:85)
>>>     at org.apache.giraph.comm.netty.handler.RequestEncoder.write(RequestEncoder.java:81)
>>>     at io.netty.channel.DefaultChannelHandlerContext.invokeWrite(DefaultChannelHandlerContext.java:645)
>>>     at io.netty.channel.DefaultChannelHandlerContext.access$2000(DefaultChannelHandlerContext.java:29)
>>>     at io.netty.channel.DefaultChannelHandlerContext$WriteTask.run(DefaultChannelHandlerContext.java:906)
>>>     at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:36)
>>>     at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
>>>     at java.lang.Thread.run(Thread.java:745)
>>> 2016-10-20 21:25:55,299 ERROR [netty-client-worker-1] org.apache.giraph.comm.netty.NettyClient: Request failed
>>> java.lang.OutOfMemoryError: Java heap space
>>>     at io.netty.buffer.UnpooledHeapByteBuf.<init>(UnpooledHeapByteBuf.java:45)
>>>     at io.netty.buffer.UnpooledByteBufAllocator.newHeapBuffer(UnpooledByteBufAllocator.java:43)
>>>     at io.netty.buffer.AbstractByteBufAllocator.heapBuffer(AbstractByteBufAllocator.java:136)
>>>     at io.netty.buffer.AbstractByteBufAllocator.heapBuffer(AbstractByteBufAllocator.java:127)
>>>     at io.netty.buffer.AbstractByteBufAllocator.buffer(AbstractByteBufAllocator.java:85)
>>>     at org.apache.giraph.comm.netty.handler.RequestEncoder.write(RequestEncoder.java:81)
>>>     at io.netty.channel.DefaultChannelHandlerContext.invokeWrite(DefaultChannelHandlerContext.java:645)
>>>     at io.netty.channel.DefaultChannelHandlerContext.access$2000(DefaultChannelHandlerContext.java:29)
>>>     at io.netty.channel.DefaultChannelHandlerContext$WriteTask.run(DefaultChannelHandlerContext.java:906)
>>>     at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:36)
>>>     at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
>>>     at java.lang.Thread.run(Thread.java:745)
>>> 2016-10-20 21:26:06,731 ERROR [main] org.apache.giraph.graph.GraphMapper: Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6737a445
>>> java.lang.IllegalStateException: waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6737a445
>>>     at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:193)
>>>     at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:151)
>>>     at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:136)
>>>     at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:99)
>>>     at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:233)
>>>     at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:316)
>>>     at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:409)
>>>     at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:629)
>>>     at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:284)
>>>     at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:93)
>>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>     at javax.security.auth.Subject.doAs(Subject.java:415)
>>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
>>>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
>>> Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
>>>     at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>>>     at java.util.concurrent.FutureTask.get(FutureTask.java:202)
>>>     at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:312)
>>>     at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:185)
>>>     ... 16 more
>>> Caused by: java.lang.OutOfMemoryError: Java heap space
>>>     at org.apache.giraph.utils.UnsafeByteArrayOutputStream.<init>(UnsafeByteArrayOutputStream.java:81)
>>>     at org.apache.giraph.conf.ImmutableClassesGiraphConfiguration.createExtendedDataOutput(ImmutableClassesGiraphConfiguration.java:1161)
>>>     at org.apache.giraph.comm.SendPartitionCache.addVertex(SendPartitionCache.java:77)
>>>     at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.sendVertexRequest(NettyWorkerClientRequestProcessor.java:248)
>>>     at org.apache.giraph.worker.VertexInputSplitsCallable.readInputSplit(VertexInputSplitsCallable.java:231)
>>>     at org.apache.giraph.worker.InputSplitsCallable.loadInputSplit(InputSplitsCallable.java:267)
>>>     at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:211)
>>>     at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
>>>     at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>     at java.lang.Thread.run(Thread.java:745)
>>> 2016-10-20 21:26:06,737 ERROR [main] org.apache.giraph.worker.BspServiceWorker: unregisterHealth: Got failure, unregistering health on /_hadoopBsp/job_1476386340018_0175/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/hadoop18.umd.com_23 on superstep -1
>>> 2016-10-20 21:26:06,746 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.IllegalStateException: run: Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6737a445
>>>     at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:104)
>>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>     at javax.security.auth.Subject.doAs(Subject.java:415)
>>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
>>>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
>>> Caused by: java.lang.IllegalStateException: waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6737a445
>>>     at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:193)
>>>     at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:151)
>>>     at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:136)
>>>     at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:99)
>>>     at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:233)
>>>     at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:316)
>>>     at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:409)
>>>     at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:629)
>>>     at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:284)
>>>     at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:93)
>>>     ... 7 more
>>> Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
>>>     at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>>>     at java.util.concurrent.FutureTask.get(FutureTask.java:202)
>>>     at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:312)
>>>     at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:185)
>>>     ... 16 more
>>> Caused by: java.lang.OutOfMemoryError: Java heap space
>>>     at org.apache.giraph.utils.UnsafeByteArrayOutputStream.<init>(UnsafeByteArrayOutputStream.java:81)
>>>     at org.apache.giraph.conf.ImmutableClassesGiraphConfiguration.createExtendedDataOutput(ImmutableClassesGiraphConfiguration.java:1161)
>>>     at org.apache.giraph.comm.SendPartitionCache.addVertex(SendPartitionCache.java:77)
>>>     at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.sendVertexRequest(NettyWorkerClientRequestProcessor.java:248)
>>>     at org.apache.giraph.worker.VertexInputSplitsCallable.readInputSplit(VertexInputSplitsCallable.java:231)
>>>     at org.apache.giraph.worker.InputSplitsCallable.loadInputSplit(InputSplitsCallable.java:267)
>>>     at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:211)
>>>     at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
>>>     at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>     at java.lang.Thread.run(Thread.java:745)
>>>
>>> Thank you so much!
>>>
>>> Best,
>>>
>>> Hai
>>
>>
>