Hi,
I added the -Dgiraph.waitForPerWorkerRequests=true parameter and got this error:
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.ArrayList$SubList.listIterator(ArrayList.java:1095)
at java.util.AbstractList.listIterator(AbstractList.java:299)
at java.util.ArrayList$SubList.iterator(ArrayList.java:1087)
at java.util.AbstractCollection.toArray(AbstractCollection.java:180)
at java.util.regex.Pattern.split(Pattern.java:1241)
at java.util.regex.Pattern.split(Pattern.java:1273)
at org.apache.giraph.examples.LongDoubleNullTextInputFormat$LongDoubleNullDoubleVertexReader.getCurrentVertex(LongDoubleNullTextInputFormat.java:86)
at org.apache.giraph.io.internal.WrappedVertexReader.getCurrentVertex(WrappedVertexReader.java:90)
at org.apache.giraph.worker.VertexInputSplitsCallable.readInputSplit(VertexInputSplitsCallable.java:182)
at org.apache.giraph.worker.InputSplitsCallable.loadInputSplit(InputSplitsCallable.java:275)
at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:227)
at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:67)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
So, does this mean that there is no other solution but to increase the physical
memory?
Cheers,
Darshan
On 16 Jul 2017, at 23:04, Hassan Eslami wrote:
Hi,
giraph.useOutOfCoreMessages is no longer in use.
The main problem here is that you are using the default flow control mechanism
(NoOpFlowControl), which results in a lot of outstanding/received messages. As
a consequence, memory fills up very quickly and the job can fail for various
reasons. Please use the following options instead:
-Dgiraph.isStaticGraph=false -Dgiraph.useOutOfCoreGraph=true
-Dgiraph.waitForPerWorkerRequests=true
Note: the static graph has a known bug with the out-of-core mechanism.
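As a rough sketch, the original command with these options swapped in might look like the following. Everything other than the three options above is copied unchanged from the original command in this thread. The giraph_opts and cmd_line variable names are only illustrative, and the script prints the command for review rather than submitting anything to YARN.

```shell
#!/bin/sh
# Dry-run sketch: assemble the job command with the suggested options and
# print it for review. Nothing is submitted to YARN by this script.
set -eu

# The three suggested options, plus the heap setting from the original command.
# giraph.useOutOfCoreMessages is dropped because it is no longer in use.
giraph_opts="-Dgiraph.yarn.task.heap.mb=58880 \
-Dgiraph.isStaticGraph=false \
-Dgiraph.useOutOfCoreGraph=true \
-Dgiraph.waitForPerWorkerRequests=true"

# $giraph_opts is intentionally unquoted so it splits into separate arguments.
set -- yarn jar giraph-examples-1.2.0-for-hadoop-2.6.0-jar-with-dependencies.jar \
  org.apache.giraph.GiraphRunner \
  $giraph_opts \
  org.apache.giraph.examples.PageRankComputation \
  -vif org.apache.giraph.examples.LongDoubleNullTextInputFormat \
  -vip /user/darshan/AdjList/ \
  -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
  -op /user/darshan/giraph_3.5B_ooc/ \
  -w 8 \
  -mc org.apache.giraph.examples.RandomWalkVertexMasterCompute \
  -wc org.apache.giraph.examples.RandomWalkWorkerContext \
  -ca org.apache.giraph.examples.RandomWalkVertex.teleportationProbability=0.15f \
  -ca org.apache.giraph.examples.RandomWalkVertex.maxSupersteps=21

cmd_line="$*"
printf '%s\n' "$cmd_line"
# To submit the job for real, replace the printf line with: "$@"
```

Printing first makes it easy to confirm that -Dgiraph.isStaticGraph is now false and that the unused option is gone before spending cluster time on a run.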
Hope it helps,
Hassan
On Sun, Jul 16, 2017 at 1:54 PM, Darshan Mallenahalli Shankaralingappa wrote:
Hi,
I am trying to run the PageRank algorithm using Giraph on a 3.5 billion
node web graph on a relatively small Hadoop cluster (6 nodes with 225GB
RAM total).
I set giraph.useOutOfCoreGraph and giraph.useOutOfCoreMessages to true,
and the application was killed after some time.
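For a sense of scale, here is a back-of-envelope lower bound on the raw vertex data alone. It assumes, going by the name of LongDoubleNullTextInputFormat, an 8-byte long id and an 8-byte double value per vertex. It ignores edges, messages, and Java object overhead, so the real footprint will be considerably larger.

```shell
#!/bin/sh
# Lower-bound estimate of raw vertex payload (ids and values only).
set -eu

vertices=3500000000     # 3.5 billion nodes, as stated above
bytes_per_vertex=16     # assumption: 8-byte long id + 8-byte double value

payload_bytes=$((vertices * bytes_per_vertex))
payload_gb=$((payload_bytes / 1000000000))

# Prints: raw id+value payload: at least 56 GB (cluster RAM: 225 GB)
echo "raw id+value payload: at least ${payload_gb} GB (cluster RAM: 225 GB)"
```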
I am running the giraph job using this command:
yarn jar giraph-examples-1.2.0-for-hadoop-2.6.0-jar-with-dependencies.jar \
  org.apache.giraph.GiraphRunner \
  -Dgiraph.yarn.task.heap.mb=58880 \
  -Dgiraph.isStaticGraph=true \
  -Dgiraph.useOutOfCoreGraph=true \
  -Dgiraph.useOutOfCoreMessages=true \
  org.apache.giraph.examples.PageRankComputation \
  -vif org.apache.giraph.examples.LongDoubleNullTextInputFormat \
  -vip /user/darshan/AdjList/ \
  -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
  -op /user/darshan/giraph_3.5B_ooc/ \
  -w 8 \
  -mc org.apache.giraph.examples.RandomWalkVertexMasterCompute \
  -wc org.apache.giraph.examples.RandomWalkWorkerContext \
  -ca org.apache.giraph.examples.RandomWalkVertex.teleportationProbability=0.15f \
  -ca org.apache.giraph.examples.RandomWalkVertex.maxSupersteps=21
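One thing worth checking in this command is how the requested heap compares with the cluster's memory. The sketch below just multiplies the -w and giraph.yarn.task.heap.mb values from the command. It assumes the 225GB figure can be treated as 225 x 1024 MB and that all 8 workers would run at the same time; how YARN actually schedules the containers is not shown in this thread.

```shell
#!/bin/sh
# Compare total requested worker heap with the stated cluster RAM.
set -eu

workers=8             # from -w 8
heap_mb=58880         # from -Dgiraph.yarn.task.heap.mb=58880
cluster_ram_gb=225    # total RAM across the 6 nodes, as stated

requested_mb=$((workers * heap_mb))
cluster_mb=$((cluster_ram_gb * 1024))   # assumption: GB treated as 1024 MB

# Prints: requested heap: 471040 MB, cluster RAM: 230400 MB
echo "requested heap: ${requested_mb} MB, cluster RAM: ${cluster_mb} MB"

if [ "$requested_mb" -gt "$cluster_mb" ]; then
  echo "total worker heap exceeds total cluster RAM"
fi
```

If the numbers really are as stated, the combined heap request is roughly twice the available RAM, so the worker count or heap size may need adjusting regardless of the out-of-core settings.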
Here is a log from the zookeeper:
2017-07-12 08:08:35,026 WARN [netty-client-worker-1] org.apache.giraph.comm.netty.handler.ResponseClientHandler: exceptionCaught: Channel failed with remote address hdpbcn-01.lv.ntent.com/10.100.21.118:30006
java.lang.ArrayIndexOutOfBoundsException: 1075052547
at org.apache.giraph.comm.flow_control.NoOpFlowControl.getAckSignalFlag(NoOpFlowControl.java:52)
at org.apache.giraph.comm.netty.NettyClient.messageReceived(NettyClient.java:796)
at org.apache.giraph.comm.netty.handler.ResponseClientHandler.channelRead(ResponseClientHandler.java:87)
at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338)
at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:324)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:153)
at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338)
at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:324)
at org.apache.giraph.comm.netty.InboundByteCounter.channelRead(InboundByteCounter.java:74)
at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338)
at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:324)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:785)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:126)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:485)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:452)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:346)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
at java.lang.Thread.run(Thread.java:745)
I think this issue is related to the messaging stack rather than the
algorithm.
If not, can someone please help me with this or at least point me in the
right direction?
Cheers,
Darshan