Hi, I have the same problem. I added the following to mapred-site.xml and hadoop-env.sh, but the problem persists. I tried various values (below), but nothing increases the memory.
mapred-site.xml:

<property> <name>mapred.child.java.opts</name> <value>-Xms256m </value> <value>-Xmx4096m</value> </property>

hadoop-env.sh:

export HADOOP_HEAPSIZE=3072
export HADOOP_OPTS="-Xmx4096m"

2016-11-04 17:57 GMT+02:00 Agrta Rawat <agrta.ra...@gmail.com>:

> Hi,
> Did you try running your code on a small data set? Did it work? You also
> have to increase the Xms and Xmx options in the Hadoop configuration file. I
> do not remember the file name exactly, but you will probably find such an
> entry in mapred-site.xml.
>
> :)
>
> Thanks,
> Agrta Rawat
>
> On Sun, Oct 23, 2016 at 11:17 PM, Hai Lan <lanhai1...@gmail.com> wrote:
>
>> More info:
>>
>> If I add -Dgiraph.useOutOfCoreGraph=true, it can run successfully, but
>> superstep -1 is extremely slow. If I do not add -Dgiraph.useOutOfCoreGraph=true,
>> it loads much faster but shows an error while waiting for roughly the last 10
>> workers to finish superstep -1. The error is:
>>
>> org.apache.giraph.master.BspServiceMaster: *barrierOnWorkerList: Missing
>> chosen workers* [Worker(hostname=trantor17.umiacs.umd.edu, MRtaskID=124,
>> port=30124), Worker(hostname=trantor17.umiacs.umd.edu, MRtaskID=126,
>> port=30126), Worker(hostname=trantor17.umiacs.umd.edu, MRtaskID=128,
>> port=30128), Worker(hostname=trantor17.umiacs.umd.edu, MRtaskID=130,
>> port=30130)] on superstep -1
>> 2016-10-23 10:40:16,358 ERROR [org.apache.giraph.master.MasterThread]
>> org.apache.giraph.master.MasterThread: masterThread: Master algorithm
>> failed with IllegalStateException
>> java.lang.IllegalStateException: coordinateVertexInputSplits: Worker
>> failed during input split (currently not supported)
>>
>> This error looks just like https://issues.apache.org/jira/browse/GIRAPH-904,
>> but there are no upper-case characters in my hostnames.
>>
>> Any ideas about this?
>>
>> Many Thanks,
>>
>> Hai
>>
>> On Sun, Oct 23, 2016 at 8:36 AM, Hai Lan <lanhai1...@gmail.com> wrote:
>>
>>> Thanks Agrta,
>>>
>>> Thanks for your response.
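One likely culprit in the snippet above: a Hadoop `<property>` element takes a single `<value>` child, so the two `<value>` tags shown are not valid configuration and one of them is typically ignored. A minimal sketch of the intended form, with both JVM flags inside one value (the 256m/4096m figures are carried over from the attempt above, not a recommendation):

```xml
<!-- mapred-site.xml: both JVM flags belong in a single <value> element -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xms256m -Xmx4096m</value>
</property>
```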
How exactly can I increase the min and max
>>> RAM? (In which conf file, or with which command/arguments? My
>>> giraph-site.xml is empty, as by default.)
>>>
>>> From what I saw online about increasing the heap size (I am not sure it is
>>> the same thing as the min/max RAM size you mentioned), many people suggest
>>> increasing mapred.child.java.opts OR HADOOP_DATANODE_OPTS.
>>>
>>> But these did not help. My problem happens during "VertexInputSplitsCallable:
>>> readVertexInputSplit:", so I tried increasing mapreduce.map.memory.mb
>>> and decreasing the number of containers/workers. Currently I am using 248
>>> workers with mapreduce.map.memory.mb=12000 and ratio=0.7. This helps, but I
>>> face new problems:
>>>
>>> 1. Superstep -1 is extremely slow; loading a 150G graph takes 7-8 hours:
>>> e.g.
>>> org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: 106 out
>>> of 248 workers finished on superstep -1 on path
>>> /_hadoopBsp/job_1477020594559_0012/_vertexInputSplitDoneDir
>>>
>>> I see log lines like:
>>> INFO [main] org.apache.giraph.comm.netty.NettyClient:
>>> logInfoAboutOpenRequests: Waiting interval of 15000 msecs, 2499 open
>>> requests, waiting for it to be <= 0, MBytes/sec received = 0.0001,
>>> MBytesReceived = 0.0058, ave received req MBytes = 0, secs waited = 92.12
>>> MBytes/sec sent = 10.4373, MBytesSent = 961.4983, ave sent req MBytes =
>>> 0.3244, secs waited = 92.12
>>>
>>> Finishing those 2499 open requests takes a very long time. *I am not
>>> sure whether this is normal.*
>>>
>>> 2. I tried the out-of-core graph option, but I am not sure I am using it
>>> correctly. I did add -Dgiraph.useOutOfCoreGraph=true
>>> -ca isStaticGraph=true,giraph.maxPartitionsInMemory=10. But how do I know
>>> whether it is working?
>>>
>>> I suspect that when I try the 15T graph, the problem will be worse. What
>>> should I do?
>>>
>>> Thanks for your help.
>>>
>>> Best,
>>> Hai
>>>
>>> On Sun, Oct 23, 2016 at 7:11 AM, Agrta Rawat <agrta.ra...@gmail.com> wrote:
>>>
>>>> Hi Hai,
>>>>
>>>> Please check your Giraph configuration. Try increasing the min and max RAM
>>>> sizes in your configuration.
>>>> This should help.
>>>>
>>>> Regards,
>>>> Agrta Rawat
>>>>
>>>> On Sat, Oct 22, 2016 at 7:46 PM, Hai Lan <lanhai1...@gmail.com> wrote:
>>>>
>>>>> Can anyone help with this?
>>>>>
>>>>> Thanks a lot!
>>>>>
>>>>> On Thu, Oct 20, 2016 at 9:48 PM, Hai Lan <lanhai1...@gmail.com> wrote:
>>>>>
>>>>>> Dear all,
>>>>>>
>>>>>> I am facing a problem when I run a large graph job (currently 1.6T; it
>>>>>> will be 16T later): it always shows a java.lang.OutOfMemoryError: Java
>>>>>> heap space error once a specific number of vertices (near 59000000) has
>>>>>> been loaded. I tried adding:
>>>>>> -Dgiraph.useOutOfCoreGraph=true
>>>>>> -Dmapred.child.java.opts="-XX:-UseGCOverheadLimit" OR
>>>>>> -Dmapred.child.java.opts="-Xmx16384"
>>>>>> -Dgiraph.yarn.task.heap.mb=36570
>>>>>>
>>>>>> but the problem remains, even though I can see those values shown in the
>>>>>> metadata.
>>>>>>
>>>>>> I am not sure whether the max memory value in this
>>>>>> VertexInputSplitsCallable info is related to the Java heap size:
>>>>>> INFO [load-0] org.apache.giraph.worker.VertexInputSplitsCallable:
>>>>>> readVertexInputSplit: Loaded 46975802 vertices at 68977.49310291892
>>>>>> vertices/sec 0 edges at 0.0 edges/sec Memory (free/total/max) = 475.08M /
>>>>>> 2759.00M / 2759.00M
>>>>>>
>>>>>> But I noticed that the main log *always* shows:
>>>>>> INFO [AsyncDispatcher event handler] org.apache.hadoop.mapred.JobConf:
>>>>>> Task java-opts do not specify heap size. Setting task attempt jvm max
>>>>>> heap size to -Xmx2868m
>>>>>> *no matter what arguments I add*, even when I run normal Hadoop jobs.
>>>>>>
>>>>>> Any ideas about this? The log follows.
>>>>>>
>>>>>> 2016-10-20 21:25:49,008 ERROR [netty-client-worker-2]
>>>>>> org.apache.giraph.comm.netty.NettyClient: Request failed
>>>>>> java.lang.OutOfMemoryError: Java heap space
>>>>>> at io.netty.buffer.UnpooledHeapByteBuf.<init>(UnpooledHeapByteBuf.java:45)
>>>>>> at io.netty.buffer.UnpooledByteBufAllocator.newHeapBuffer(UnpooledByteBufAllocator.java:43)
>>>>>> at io.netty.buffer.AbstractByteBufAllocator.heapBuffer(AbstractByteBufAllocator.java:136)
>>>>>> at io.netty.buffer.AbstractByteBufAllocator.heapBuffer(AbstractByteBufAllocator.java:127)
>>>>>> at io.netty.buffer.AbstractByteBufAllocator.buffer(AbstractByteBufAllocator.java:85)
>>>>>> at org.apache.giraph.comm.netty.handler.RequestEncoder.write(RequestEncoder.java:81)
>>>>>> at io.netty.channel.DefaultChannelHandlerContext.invokeWrite(DefaultChannelHandlerContext.java:645)
>>>>>> at io.netty.channel.DefaultChannelHandlerContext.access$2000(DefaultChannelHandlerContext.java:29)
>>>>>> at io.netty.channel.DefaultChannelHandlerContext$WriteTask.run(DefaultChannelHandlerContext.java:906)
>>>>>> at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:36)
>>>>>> at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
>>>>>> at java.lang.Thread.run(Thread.java:745)
>>>>>> 2016-10-20 21:25:55,299 ERROR [netty-client-worker-1]
>>>>>> org.apache.giraph.comm.netty.NettyClient: Request failed
>>>>>> java.lang.OutOfMemoryError: Java heap space
>>>>>> at io.netty.buffer.UnpooledHeapByteBuf.<init>(UnpooledHeapByteBuf.java:45)
>>>>>> at io.netty.buffer.UnpooledByteBufAllocator.newHeapBuffer(UnpooledByteBufAllocator.java:43)
>>>>>> at io.netty.buffer.AbstractByteBufAllocator.heapBuffer(AbstractByteBufAllocator.java:136)
>>>>>> at io.netty.buffer.AbstractByteBufAllocator.heapBuffer(AbstractByteBufAllocator.java:127)
>>>>>> at io.netty.buffer.AbstractByteBufAllocator.buffer(AbstractByteBufAllocator.java:85)
>>>>>> at org.apache.giraph.comm.netty.handler.RequestEncoder.write(RequestEncoder.java:81)
>>>>>> at io.netty.channel.DefaultChannelHandlerContext.invokeWrite(DefaultChannelHandlerContext.java:645)
>>>>>> at io.netty.channel.DefaultChannelHandlerContext.access$2000(DefaultChannelHandlerContext.java:29)
>>>>>> at io.netty.channel.DefaultChannelHandlerContext$WriteTask.run(DefaultChannelHandlerContext.java:906)
>>>>>> at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:36)
>>>>>> at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
>>>>>> at java.lang.Thread.run(Thread.java:745)
>>>>>> 2016-10-20 21:26:06,731 ERROR [main] org.apache.giraph.graph.GraphMapper:
>>>>>> Caught an unrecoverable exception waitFor: ExecutionException occurred
>>>>>> while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6737a445
>>>>>> java.lang.IllegalStateException: waitFor: ExecutionException
>>>>>> occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6737a445
>>>>>> at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:193)
>>>>>> at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:151)
>>>>>> at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:136)
>>>>>> at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:99)
>>>>>> at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:233)
>>>>>> at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:316)
>>>>>> at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:409)
>>>>>> at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:629)
>>>>>> at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:284)
>>>>>> at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:93)
>>>>>> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>>>>>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>>>>> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
>>>>>> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
>>>>>> Caused by: java.util.concurrent.ExecutionException:
>>>>>> java.lang.OutOfMemoryError: Java heap space
>>>>>> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>>>>>> at java.util.concurrent.FutureTask.get(FutureTask.java:202)
>>>>>> at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:312)
>>>>>> at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:185)
>>>>>> ... 16 more
>>>>>> Caused by: java.lang.OutOfMemoryError: Java heap space
>>>>>> at org.apache.giraph.utils.UnsafeByteArrayOutputStream.<init>(UnsafeByteArrayOutputStream.java:81)
>>>>>> at org.apache.giraph.conf.ImmutableClassesGiraphConfiguration.createExtendedDataOutput(ImmutableClassesGiraphConfiguration.java:1161)
>>>>>> at org.apache.giraph.comm.SendPartitionCache.addVertex(SendPartitionCache.java:77)
>>>>>> at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.sendVertexRequest(NettyWorkerClientRequestProcessor.java:248)
>>>>>> at org.apache.giraph.worker.VertexInputSplitsCallable.readInputSplit(VertexInputSplitsCallable.java:231)
>>>>>> at org.apache.giraph.worker.InputSplitsCallable.loadInputSplit(InputSplitsCallable.java:267)
>>>>>> at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:211)
>>>>>> at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
>>>>>> at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
>>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>> at java.lang.Thread.run(Thread.java:745)
>>>>>> 2016-10-20 21:26:06,737 ERROR [main] org.apache.giraph.worker.BspServiceWorker:
>>>>>> unregisterHealth: Got failure, unregistering health on
>>>>>> /_hadoopBsp/job_1476386340018_0175/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/hadoop18.umd.com_23
>>>>>> on superstep -1
>>>>>> 2016-10-20 21:26:06,746 WARN [main] org.apache.hadoop.mapred.YarnChild:
>>>>>> Exception running child : java.lang.IllegalStateException: run:
>>>>>> Caught an unrecoverable exception waitFor: ExecutionException occurred
>>>>>> while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6737a445
>>>>>> at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:104)
>>>>>> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>>>>>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>>>>> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
>>>>>> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
>>>>>> Caused by: java.lang.IllegalStateException: waitFor:
>>>>>> ExecutionException occurred while waiting for
>>>>>> org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6737a445
>>>>>> at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:193)
>>>>>> at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:151)
>>>>>> at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:136)
>>>>>> at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:99)
>>>>>> at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:233)
>>>>>> at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:316)
>>>>>> at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:409)
>>>>>> at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:629)
>>>>>> at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:284)
>>>>>> at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:93)
>>>>>> ... 7 more
>>>>>> Caused by: java.util.concurrent.ExecutionException:
>>>>>> java.lang.OutOfMemoryError: Java heap space
>>>>>> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>>>>>> at java.util.concurrent.FutureTask.get(FutureTask.java:202)
>>>>>> at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:312)
>>>>>> at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:185)
>>>>>> ... 16 more
>>>>>> Caused by: java.lang.OutOfMemoryError: Java heap space
>>>>>> at org.apache.giraph.utils.UnsafeByteArrayOutputStream.<init>(UnsafeByteArrayOutputStream.java:81)
>>>>>> at org.apache.giraph.conf.ImmutableClassesGiraphConfiguration.createExtendedDataOutput(ImmutableClassesGiraphConfiguration.java:1161)
>>>>>> at org.apache.giraph.comm.SendPartitionCache.addVertex(SendPartitionCache.java:77)
>>>>>> at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.sendVertexRequest(NettyWorkerClientRequestProcessor.java:248)
>>>>>> at org.apache.giraph.worker.VertexInputSplitsCallable.readInputSplit(VertexInputSplitsCallable.java:231)
>>>>>> at org.apache.giraph.worker.InputSplitsCallable.loadInputSplit(InputSplitsCallable.java:267)
>>>>>> at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:211)
>>>>>> at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
>>>>>> at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
>>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>> at java.lang.Thread.run(Thread.java:745)
>>>>>>
>>>>>> Thank you so much!
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Hai
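One detail worth checking in the attempts quoted earlier in the thread: `-Dmapred.child.java.opts="-Xmx16384"` has no unit suffix, so the JVM reads it as 16384 bytes, which it rejects as too small; with a unit suffix (`-Xmx16384m`) it means 16 GB. A hedged sketch of a run with explicit, unit-suffixed heap options (the jar, computation class, and input-format names are placeholders for illustration, not taken from this thread; the worker count echoes the 248 mentioned above):

```
# Hypothetical invocation; jar/class/format names are illustrative only.
# Note the "m" suffix: -Xmx16384m (megabytes), not -Xmx16384 (bytes).
hadoop jar giraph-examples.jar org.apache.giraph.GiraphRunner \
  -Dmapred.child.java.opts="-Xms4096m -Xmx16384m" \
  -Dgiraph.useOutOfCoreGraph=true \
  org.apache.giraph.examples.SimpleShortestPathsComputation \
  -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
  -vip /input -op /output -w 248
```

Keep in mind that `mapred.child.java.opts` must fit inside the container size set by `mapreduce.map.memory.mb`, or YARN will kill the task for exceeding its memory limit.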