Hi, I added the -Dgiraph.waitForPerWorkerRequests=true parameter and got this error:

Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.ArrayList$SubList.listIterator(ArrayList.java:1095)
    at java.util.AbstractList.listIterator(AbstractList.java:299)
    at java.util.ArrayList$SubList.iterator(ArrayList.java:1087)
    at java.util.AbstractCollection.toArray(AbstractCollection.java:180)
    at java.util.regex.Pattern.split(Pattern.java:1241)
    at java.util.regex.Pattern.split(Pattern.java:1273)
    at org.apache.giraph.examples.LongDoubleNullTextInputFormat$LongDoubleNullDoubleVertexReader.getCurrentVertex(LongDoubleNullTextInputFormat.java:86)
    at org.apache.giraph.io.internal.WrappedVertexReader.getCurrentVertex(WrappedVertexReader.java:90)
    at org.apache.giraph.worker.VertexInputSplitsCallable.readInputSplit(VertexInputSplitsCallable.java:182)
    at org.apache.giraph.worker.InputSplitsCallable.loadInputSplit(InputSplitsCallable.java:275)
    at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:227)
    at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
    at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:67)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

So, does this mean that there is no other solution but to increase the physical memory?

Cheers,
Darshan

On 16 Jul 2017, at 23:04, Hassan Eslami <hsn.esl...@gmail.com> wrote:

Hi,

giraph.useOutOfCoreMessages is no longer in use. The main problem here is that you are using the default flow control mechanism (NoOpFlowControl), which causes a lot of outstanding/received messages. As a consequence, you fill up the memory very quickly, and the job fails for various reasons.

Please use the following options instead:

-Dgiraph.isStaticGraph=false
-Dgiraph.useOutOfCoreGraph=true
-Dgiraph.waitForPerWorkerRequests=true

Note: the static graph has a known bug with the out-of-core mechanism.

Hope it helps,
Hassan

On Sun, Jul 16, 2017 at 1:54 PM, Darshan Mallenahalli Shankaralingappa <dshankaralinga...@ntent.com> wrote:

Hi,

I am trying to run the PageRank algorithm using Giraph on a 3.5 billion node web graph on a relatively small Hadoop cluster (6 nodes with 225GB RAM total). I set giraph.useOutOfCoreGraph and giraph.useOutOfCoreMessages to true, and the application was killed after some time.

I am running the Giraph job using this command:

yarn jar giraph-examples-1.2.0-for-hadoop-2.6.0-jar-with-dependencies.jar \
    org.apache.giraph.GiraphRunner \
    -Dgiraph.yarn.task.heap.mb=58880 \
    -Dgiraph.isStaticGraph=true \
    -Dgiraph.useOutOfCoreGraph=true \
    -Dgiraph.useOutOfCoreMessages=true \
    org.apache.giraph.examples.PageRankComputation \
    -vif org.apache.giraph.examples.LongDoubleNullTextInputFormat \
    -vip /user/darshan/AdjList/ \
    -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
    -op /user/darshan/giraph_3.5B_ooc/ \
    -w 8 \
    -mc org.apache.giraph.examples.RandomWalkVertexMasterCompute \
    -wc org.apache.giraph.examples.RandomWalkWorkerContext \
    -ca org.apache.giraph.examples.RandomWalkVertex.teleportationProbability=0.15f \
    -ca org.apache.giraph.examples.RandomWalkVertex.maxSupersteps=21

Here is a log from the zookeeper:

2017-07-12 08:08:35,026 WARN [netty-client-worker-1] org.apache.giraph.comm.netty.handler.ResponseClientHandler: exceptionCaught: Channel failed with remote address <url>/<ip>:30006
java.lang.ArrayIndexOutOfBoundsException: 1075052547
    at org.apache.giraph.comm.flow_control.NoOpFlowControl.getAckSignalFlag(NoOpFlowControl.java:52)
    at org.apache.giraph.comm.netty.NettyClient.messageReceived(NettyClient.java:796)
    at org.apache.giraph.comm.netty.handler.ResponseClientHandler.channelRead(ResponseClientHandler.java:87)
    at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338)
    at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:324)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:153)
    at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338)
    at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:324)
    at org.apache.giraph.comm.netty.InboundByteCounter.channelRead(InboundByteCounter.java:74)
    at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338)
    at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:324)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:785)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:126)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:485)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:452)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:346)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
    at java.lang.Thread.run(Thread.java:745)

I think this issue is related to the messaging stack rather than the algorithm. If not, can someone please help me with this or at least point me in the right direction?

Cheers,
Darshan
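
For reference, the options suggested in the reply can be folded into the original command roughly as follows. This is only a sketch: it assumes every other argument stays exactly as in the original invocation, the variable names GIRAPH_OPTS and CMD are illustrative, and nothing in the thread confirms that this combination resolves the out-of-memory error.

```shell
#!/bin/sh
# Sketch only: all flags are taken from the thread; GIRAPH_OPTS and CMD are
# illustrative names. Compared with the original command:
#   - giraph.useOutOfCoreMessages is dropped (reported as no longer in use)
#   - giraph.isStaticGraph is switched to false
#   - giraph.waitForPerWorkerRequests=true is added
GIRAPH_OPTS="-Dgiraph.yarn.task.heap.mb=58880 \
-Dgiraph.isStaticGraph=false \
-Dgiraph.useOutOfCoreGraph=true \
-Dgiraph.waitForPerWorkerRequests=true"

CMD="yarn jar giraph-examples-1.2.0-for-hadoop-2.6.0-jar-with-dependencies.jar \
org.apache.giraph.GiraphRunner $GIRAPH_OPTS \
org.apache.giraph.examples.PageRankComputation \
-vif org.apache.giraph.examples.LongDoubleNullTextInputFormat \
-vip /user/darshan/AdjList/ \
-vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
-op /user/darshan/giraph_3.5B_ooc/ \
-w 8 \
-mc org.apache.giraph.examples.RandomWalkVertexMasterCompute \
-wc org.apache.giraph.examples.RandomWalkWorkerContext \
-ca org.apache.giraph.examples.RandomWalkVertex.teleportationProbability=0.15f \
-ca org.apache.giraph.examples.RandomWalkVertex.maxSupersteps=21"

# Print the assembled command for review instead of submitting it.
echo "$CMD"
```

The script only prints the command so the flags can be checked first; submitting the job would mean running the printed line on a cluster node.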