Hi Armando & Claudio,

I tested your latest patch; unfortunately, it still doesn't work for me.

I applied the patch to the current trunk, copied my algorithm code over, and ran it on the cluster. If it runs in memory, everything works well. With out-of-core enabled, I run into the same deadlock as before.

Here's the command I used:

hadoop jar giraph-examples-1.1.0-SNAPSHOT-for-hadoop-1.2.1-jar-with-dependencies.jar \
  org.apache.giraph.GiraphRunner org.apache.giraph.examples.hyperball.HyperBall \
  --vertexInputFormat org.apache.giraph.examples.hyperball.HyperBallTextInputFormat \
  --vertexInputPath hdfs:///ssc/grades/data/twitter-negative/ \
  --vertexOutputFormat org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
  --outputPath hdfs:///ssc/tmp-123/ \
  --combiner org.apache.giraph.examples.hyperball.HyperLogLogCombiner \
  --outEdges org.apache.giraph.edge.LongNullArrayEdges \
  --workers 24 \
  --customArguments giraph.oneToAllMsgSending=true,giraph.isStaticGraph=true,giraph.numComputeThreads=15,giraph.numInputThreads=15,giraph.numOutputThreads=15,giraph.maxNumberOfSupersteps=30,giraph.useOutOfCoreGraph=true,giraph.maxPartitionsInMemory=20
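
In case it helps with reading the thread dump further down, here is a rough sketch of a helper that groups threads by the monitor they hold or are waiting to lock. It is not part of Giraph; the function name `summarize_monitors` is made up for illustration, and it assumes the usual HotSpot dump layout (a quoted thread name at the start of a line, and `- locked <0x...>` / `- waiting to lock <0x...>` markers).

```python
import re
from collections import defaultdict

# A thread entry starts with its quoted name at the beginning of a line.
THREAD_RE = re.compile(r'^"([^"]+)"', re.MULTILINE)
# Monitor markers as printed by HotSpot; "parking to wait for" is ignored on purpose.
LOCK_RE = re.compile(r'- (locked|waiting to lock) <(0x[0-9a-f]+)>')


def summarize_monitors(dump):
    """Return {monitor address: (holder thread names, waiter thread names)}.

    Hypothetical helper: only monitors with at least one waiter are reported.
    """
    holders = defaultdict(set)
    waiters = defaultdict(set)
    # Split the dump into per-thread chunks at each quoted thread name.
    starts = [m.start() for m in THREAD_RE.finditer(dump)]
    for begin, end in zip(starts, starts[1:] + [len(dump)]):
        chunk = dump[begin:end]
        name = THREAD_RE.match(chunk).group(1)
        for kind, addr in LOCK_RE.findall(chunk):
            (holders if kind == "locked" else waiters)[addr].add(name)
    return {addr: (sorted(holders[addr]), sorted(waiters[addr]))
            for addr in waiters}
```

Passing the full dump text as a string should list, for each contended monitor, which thread currently holds it and which threads are blocked on it.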

Best,
Sebastian

Here's a stack trace of one of the workers:

2014-02-20 11:05:17
Full thread dump Java HotSpot(TM) 64-Bit Server VM (23.6-b04 mixed mode):

"Attach Listener" daemon prio=10 tid=0x00007f98f0096000 nid=0x78cb waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"netty-server-worker-15" prio=10 tid=0x00007f9910025000 nid=0x77b3 runnable [0x00007f9914aa1000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x00000000adfc3140> (a io.netty.channel.nio.SelectedSelectionKeySet)
        - locked <0x00000000adfc3218> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000000adfc3040> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:596)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:306)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-server-worker-14" prio=10 tid=0x00007f9910023000 nid=0x77b2 runnable [0x00007f9914ba2000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x00000000ae3a0680> (a io.netty.channel.nio.SelectedSelectionKeySet)
        - locked <0x00000000ae22fae8> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000000ae3a0590> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:596)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:306)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-server-worker-13" prio=10 tid=0x00007f9910021000 nid=0x77b1 runnable [0x00007f9914ca3000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x00000000ae474e30> (a io.netty.channel.nio.SelectedSelectionKeySet)
        - locked <0x00000000ae4780c8> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000000ae474d40> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:596)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:306)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-server-worker-12" prio=10 tid=0x00007f991001f800 nid=0x77b0 runnable [0x00007f9914da4000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x00000000adfee950> (a io.netty.channel.nio.SelectedSelectionKeySet)
        - locked <0x00000000adfeec20> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000000adfee860> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:596)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:306)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-server-worker-11" prio=10 tid=0x00007f991001d800 nid=0x77af runnable [0x00007f9914ea5000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x00000000ae0fc9b0> (a io.netty.channel.nio.SelectedSelectionKeySet)
        - locked <0x00000000ae0fcc80> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000000ae0fc8c0> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:596)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:306)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-server-worker-10" prio=10 tid=0x00007f991001c000 nid=0x77ae waiting for monitor entry [0x00007f9914fa6000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.giraph.partition.DiskBackedPartitionStore.getOrCreatePartition(DiskBackedPartitionStore.java:234)
        - waiting to lock <0x00000000ae3c8458> (a org.apache.giraph.partition.DiskBackedPartitionStore$MetaPartition)
        at org.apache.giraph.comm.requests.SendWorkerVerticesRequest.doRequest(SendWorkerVerticesRequest.java:114)
        at org.apache.giraph.comm.netty.handler.WorkerRequestServerHandler.processRequest(WorkerRequestServerHandler.java:60)
        at org.apache.giraph.comm.netty.handler.WorkerRequestServerHandler.processRequest(WorkerRequestServerHandler.java:36)
        at org.apache.giraph.comm.netty.handler.RequestServerHandler.channelRead(RequestServerHandler.java:103)
        at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338)
        at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:324)
        at org.apache.giraph.comm.netty.handler.RequestDecoder.channelRead(RequestDecoder.java:103)
        at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338)
        at io.netty.channel.DefaultChannelHandlerContext.access$700(DefaultChannelHandlerContext.java:29)
        at io.netty.channel.DefaultChannelHandlerContext$8.run(DefaultChannelHandlerContext.java:329)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:354)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:353)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-server-worker-9" prio=10 tid=0x00007f991001a800 nid=0x77ad runnable [0x00007f99150a7000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x00000000adf80e78> (a io.netty.channel.nio.SelectedSelectionKeySet)
        - locked <0x00000000adf81148> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000000adf80d88> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:596)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:306)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-server-exec-7" prio=10 tid=0x00007f98f8015800 nid=0x77ac waiting on condition [0x00007f99151a8000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000000ae444d38> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
        at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
        at io.netty.util.concurrent.SingleThreadEventExecutor.takeTask(SingleThreadEventExecutor.java:219)
        at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:34)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-server-worker-8" prio=10 tid=0x00007f9910018800 nid=0x77ab runnable [0x00007f99152a9000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x00000000ae0fd200> (a io.netty.channel.nio.SelectedSelectionKeySet)
        - locked <0x00000000ae0fd2d8> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000000ae0fd100> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:596)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:306)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-server-worker-7" prio=10 tid=0x00007f9910016800 nid=0x77aa waiting for monitor entry [0x00007f99153aa000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.giraph.partition.DiskBackedPartitionStore.getOrCreatePartition(DiskBackedPartitionStore.java:234)
        - waiting to lock <0x00000000ae3c8458> (a org.apache.giraph.partition.DiskBackedPartitionStore$MetaPartition)
        at org.apache.giraph.comm.requests.SendWorkerVerticesRequest.doRequest(SendWorkerVerticesRequest.java:114)
        at org.apache.giraph.comm.netty.handler.WorkerRequestServerHandler.processRequest(WorkerRequestServerHandler.java:60)
        at org.apache.giraph.comm.netty.handler.WorkerRequestServerHandler.processRequest(WorkerRequestServerHandler.java:36)
        at org.apache.giraph.comm.netty.handler.RequestServerHandler.channelRead(RequestServerHandler.java:103)
        at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338)
        at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:324)
        at org.apache.giraph.comm.netty.handler.RequestDecoder.channelRead(RequestDecoder.java:103)
        at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338)
        at io.netty.channel.DefaultChannelHandlerContext.access$700(DefaultChannelHandlerContext.java:29)
        at io.netty.channel.DefaultChannelHandlerContext$8.run(DefaultChannelHandlerContext.java:329)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:354)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:353)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-server-exec-6" prio=10 tid=0x00007f9928452800 nid=0x77a9 waiting on condition [0x00007f99154ab000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000000ae4466a0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
        at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
        at io.netty.util.concurrent.SingleThreadEventExecutor.takeTask(SingleThreadEventExecutor.java:219)
        at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:34)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-server-exec-5" prio=10 tid=0x00007f99006d6000 nid=0x77a8 waiting on condition [0x00007f99155ac000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000000ae44df48> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
        at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
        at io.netty.util.concurrent.SingleThreadEventExecutor.takeTask(SingleThreadEventExecutor.java:219)
        at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:34)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-server-worker-6" prio=10 tid=0x00007f9910014800 nid=0x77a7 waiting for monitor entry [0x00007f99156ad000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.giraph.partition.DiskBackedPartitionStore.getOrCreatePartition(DiskBackedPartitionStore.java:234)
        - waiting to lock <0x00000000ae3c8458> (a org.apache.giraph.partition.DiskBackedPartitionStore$MetaPartition)
        at org.apache.giraph.comm.requests.SendWorkerVerticesRequest.doRequest(SendWorkerVerticesRequest.java:114)
        at org.apache.giraph.comm.netty.handler.WorkerRequestServerHandler.processRequest(WorkerRequestServerHandler.java:60)
        at org.apache.giraph.comm.netty.handler.WorkerRequestServerHandler.processRequest(WorkerRequestServerHandler.java:36)
        at org.apache.giraph.comm.netty.handler.RequestServerHandler.channelRead(RequestServerHandler.java:103)
        at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338)
        at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:324)
        at org.apache.giraph.comm.netty.handler.RequestDecoder.channelRead(RequestDecoder.java:103)
        at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338)
        at io.netty.channel.DefaultChannelHandlerContext.access$700(DefaultChannelHandlerContext.java:29)
        at io.netty.channel.DefaultChannelHandlerContext$8.run(DefaultChannelHandlerContext.java:329)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:354)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:353)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-server-exec-4" prio=10 tid=0x00007f9908661800 nid=0x77a6 waiting on condition [0x00007f99157ae000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000000ae4287a0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
        at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
        at io.netty.util.concurrent.SingleThreadEventExecutor.takeTask(SingleThreadEventExecutor.java:219)
        at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:34)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-server-exec-3" prio=10 tid=0x00007f98f0005800 nid=0x77a5 waiting on condition [0x00007f99158af000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000000ae452518> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
        at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
        at io.netty.util.concurrent.SingleThreadEventExecutor.takeTask(SingleThreadEventExecutor.java:219)
        at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:34)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-server-worker-5" prio=10 tid=0x00007f9910012800 nid=0x77a4 waiting for monitor entry [0x00007f99159b0000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.giraph.partition.DiskBackedPartitionStore.getOrCreatePartition(DiskBackedPartitionStore.java:234)
        - waiting to lock <0x00000000ae3c8458> (a org.apache.giraph.partition.DiskBackedPartitionStore$MetaPartition)
        at org.apache.giraph.comm.requests.SendWorkerVerticesRequest.doRequest(SendWorkerVerticesRequest.java:114)
        at org.apache.giraph.comm.netty.handler.WorkerRequestServerHandler.processRequest(WorkerRequestServerHandler.java:60)
        at org.apache.giraph.comm.netty.handler.WorkerRequestServerHandler.processRequest(WorkerRequestServerHandler.java:36)
        at org.apache.giraph.comm.netty.handler.RequestServerHandler.channelRead(RequestServerHandler.java:103)
        at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338)
        at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:324)
        at org.apache.giraph.comm.netty.handler.RequestDecoder.channelRead(RequestDecoder.java:103)
        at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338)
        at io.netty.channel.DefaultChannelHandlerContext.access$700(DefaultChannelHandlerContext.java:29)
        at io.netty.channel.DefaultChannelHandlerContext$8.run(DefaultChannelHandlerContext.java:329)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:354)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:353)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-server-worker-4" prio=10 tid=0x00007f9910010800 nid=0x77a3 runnable [0x00007f9915ab1000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x00000000adfadba0> (a io.netty.channel.nio.SelectedSelectionKeySet)
        - locked <0x00000000adfb86d8> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000000adfadab0> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:596)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:306)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-server-worker-3" prio=10 tid=0x00007f991000e800 nid=0x77a2 runnable [0x00007f9915bb1000]
   java.lang.Thread.State: RUNNABLE
        at java.io.FileInputStream.readBytes(Native Method)
        at java.io.FileInputStream.read(FileInputStream.java:242)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
        - locked <0x00000005be8e5778> (a java.io.BufferedInputStream)
        at java.io.DataInputStream.readFully(DataInputStream.java:195)
        at java.io.DataInputStream.readLong(DataInputStream.java:416)
        at org.apache.giraph.edge.LongNullArrayEdges.readFields(LongNullArrayEdges.java:161)
        at org.apache.giraph.partition.DiskBackedPartitionStore.readOutEdges(DiskBackedPartitionStore.java:387)
        at org.apache.giraph.partition.DiskBackedPartitionStore.loadPartition(DiskBackedPartitionStore.java:441)
        at org.apache.giraph.partition.DiskBackedPartitionStore.getPartition(DiskBackedPartitionStore.java:771)
        - locked <0x00000000ae3c8458> (a org.apache.giraph.partition.DiskBackedPartitionStore$MetaPartition)
        at org.apache.giraph.partition.DiskBackedPartitionStore.getOrCreatePartition(DiskBackedPartitionStore.java:238)
        - locked <0x00000000ae3c8458> (a org.apache.giraph.partition.DiskBackedPartitionStore$MetaPartition)
        at org.apache.giraph.comm.requests.SendWorkerVerticesRequest.doRequest(SendWorkerVerticesRequest.java:114)
        at org.apache.giraph.comm.netty.handler.WorkerRequestServerHandler.processRequest(WorkerRequestServerHandler.java:60)
        at org.apache.giraph.comm.netty.handler.WorkerRequestServerHandler.processRequest(WorkerRequestServerHandler.java:36)
        at org.apache.giraph.comm.netty.handler.RequestServerHandler.channelRead(RequestServerHandler.java:103)
        at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338)
        at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:324)
        at org.apache.giraph.comm.netty.handler.RequestDecoder.channelRead(RequestDecoder.java:103)
        at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338)
        at io.netty.channel.DefaultChannelHandlerContext.access$700(DefaultChannelHandlerContext.java:29)
        at io.netty.channel.DefaultChannelHandlerContext$8.run(DefaultChannelHandlerContext.java:329)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:354)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:353)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-server-exec-2" prio=10 tid=0x0000000000eb2800 nid=0x77a1 waiting on condition [0x00007f9915cb3000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000000ae453e70> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
        at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
        at io.netty.util.concurrent.SingleThreadEventExecutor.takeTask(SingleThreadEventExecutor.java:219)
        at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:34)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-server-exec-1" prio=10 tid=0x00007f992810c000 nid=0x77a0 waiting on condition [0x00007f9915db4000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000000ae45c728> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
        at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
        at io.netty.util.concurrent.SingleThreadEventExecutor.takeTask(SingleThreadEventExecutor.java:219)
        at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:34)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-server-worker-2" prio=10 tid=0x00007f991000c000 nid=0x779f waiting for monitor entry [0x00007f9915eb5000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.giraph.partition.DiskBackedPartitionStore.getOrCreatePartition(DiskBackedPartitionStore.java:234)
        - waiting to lock <0x00000000ae3c8458> (a org.apache.giraph.partition.DiskBackedPartitionStore$MetaPartition)
        at org.apache.giraph.comm.requests.SendWorkerVerticesRequest.doRequest(SendWorkerVerticesRequest.java:114)
        at org.apache.giraph.comm.netty.handler.WorkerRequestServerHandler.processRequest(WorkerRequestServerHandler.java:60)
        at org.apache.giraph.comm.netty.handler.WorkerRequestServerHandler.processRequest(WorkerRequestServerHandler.java:36)
        at org.apache.giraph.comm.netty.handler.RequestServerHandler.channelRead(RequestServerHandler.java:103)
        at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338)
        at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:324)
        at org.apache.giraph.comm.netty.handler.RequestDecoder.channelRead(RequestDecoder.java:103)
        at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338)
        at io.netty.channel.DefaultChannelHandlerContext.access$700(DefaultChannelHandlerContext.java:29)
        at io.netty.channel.DefaultChannelHandlerContext$8.run(DefaultChannelHandlerContext.java:329)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:354)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:353)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-server-worker-1" prio=10 tid=0x00007f991000a000 nid=0x779e waiting for monitor entry [0x00007f9915fb6000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.giraph.partition.DiskBackedPartitionStore.getOrCreatePartition(DiskBackedPartitionStore.java:234)
        - waiting to lock <0x00000000ae3c8458> (a org.apache.giraph.partition.DiskBackedPartitionStore$MetaPartition)
        at org.apache.giraph.comm.requests.SendWorkerVerticesRequest.doRequest(SendWorkerVerticesRequest.java:114)
        at org.apache.giraph.comm.netty.handler.WorkerRequestServerHandler.processRequest(WorkerRequestServerHandler.java:60)
        at org.apache.giraph.comm.netty.handler.WorkerRequestServerHandler.processRequest(WorkerRequestServerHandler.java:36)
        at org.apache.giraph.comm.netty.handler.RequestServerHandler.channelRead(RequestServerHandler.java:103)
        at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338)
        at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:324)
        at org.apache.giraph.comm.netty.handler.RequestDecoder.channelRead(RequestDecoder.java:103)
        at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338)
        at io.netty.channel.DefaultChannelHandlerContext.access$700(DefaultChannelHandlerContext.java:29)
        at io.netty.channel.DefaultChannelHandlerContext$8.run(DefaultChannelHandlerContext.java:329)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:354)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:353)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-client-exec-7" prio=10 tid=0x00007f9928109800 nid=0x779d waiting on condition [0x00007f99160b7000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000000ae19d018> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
        at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
        at io.netty.util.concurrent.SingleThreadEventExecutor.takeTask(SingleThreadEventExecutor.java:219)
        at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:34)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-client-exec-6" prio=10 tid=0x00007f98e8003800 nid=0x779c waiting on condition [0x00007f99161b8000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000000ae1ac6e0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
        at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
        at io.netty.util.concurrent.SingleThreadEventExecutor.takeTask(SingleThreadEventExecutor.java:219)
        at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:34)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-client-exec-5" prio=10 tid=0x00007f991800b800 nid=0x779b waiting on condition [0x00007f99162b9000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000000ae2367a8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
        at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
        at io.netty.util.concurrent.SingleThreadEventExecutor.takeTask(SingleThreadEventExecutor.java:219)
        at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:34)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-client-exec-4" prio=10 tid=0x00007f9900017000 nid=0x779a waiting on condition [0x00007f99163ba000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000000ae1adfe8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
        at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
        at io.netty.util.concurrent.SingleThreadEventExecutor.takeTask(SingleThreadEventExecutor.java:219)
        at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:34)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-client-exec-3" prio=10 tid=0x00007f9928107800 nid=0x7799 waiting on condition [0x00007f99164bb000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000000ae1af8f0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
        at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
        at io.netty.util.concurrent.SingleThreadEventExecutor.takeTask(SingleThreadEventExecutor.java:219)
        at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:34)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-client-exec-2" prio=10 tid=0x00007f98e8002000 nid=0x7798 waiting on condition [0x00007f99165bc000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000000ae231bf8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
        at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
        at io.netty.util.concurrent.SingleThreadEventExecutor.takeTask(SingleThreadEventExecutor.java:219)
        at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:34)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-client-exec-1" prio=10 tid=0x00007f9918002800 nid=0x7797 waiting on condition [0x00007f99166bd000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000000ae233500> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
        at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
        at io.netty.util.concurrent.SingleThreadEventExecutor.takeTask(SingleThreadEventExecutor.java:219)
        at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:34)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-client-exec-0" prio=10 tid=0x00007f9900015000 nid=0x7796 waiting on condition [0x00007f99167be000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000000ae234e08> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
        at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
        at io.netty.util.concurrent.SingleThreadEventExecutor.takeTask(SingleThreadEventExecutor.java:219)
        at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:34)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-client-worker-3" prio=10 tid=0x00007f9928036800 nid=0x7795 runnable [0x00007f99168bf000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x00000000ae3b2e10> (a io.netty.channel.nio.SelectedSelectionKeySet)
        - locked <0x00000000ae22fc08> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000000ae3b2d20> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:596)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:306)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-client-worker-2" prio=10 tid=0x00007f9928035000 nid=0x7794 runnable [0x00007f99169c0000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x00000000ae24cd00> (a io.netty.channel.nio.SelectedSelectionKeySet)
        - locked <0x00000000ae24ff20> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000000ae24cc00> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:596)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:306)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-client-worker-1" prio=10 tid=0x00007f9928034000 nid=0x7793 runnable [0x00007f9916ac1000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x00000000ae19c400> (a io.netty.channel.nio.SelectedSelectionKeySet)
        - locked <0x00000000ae460d48> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000000ae19c310> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:596)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:306)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-client-worker-0" prio=10 tid=0x00007f9928039800 nid=0x7792 runnable [0x00007f9916bc2000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x00000000ae2d0238> (a io.netty.channel.nio.SelectedSelectionKeySet)
        - locked <0x00000000ae2d0258> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000000ae2d01f0> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:596)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:306)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-server-exec-0" prio=10 tid=0x00007f98f800a000 nid=0x7791 waiting on condition [0x00007f9916cc3000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000000ae4577a8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
        at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
        at io.netty.util.concurrent.SingleThreadEventExecutor.takeTask(SingleThreadEventExecutor.java:219)
        at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:34)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"netty-server-worker-0" prio=10 tid=0x00007f9910009800 nid=0x7790 waiting for monitor entry [0x00007f9916ec5000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.giraph.partition.DiskBackedPartitionStore.getOrCreatePartition(DiskBackedPartitionStore.java:234)
        - waiting to lock <0x00000000ae3c8458> (a org.apache.giraph.partition.DiskBackedPartitionStore$MetaPartition)
        at org.apache.giraph.comm.requests.SendWorkerVerticesRequest.doRequest(SendWorkerVerticesRequest.java:114)
        at org.apache.giraph.comm.netty.handler.WorkerRequestServerHandler.processRequest(WorkerRequestServerHandler.java:60)
        at org.apache.giraph.comm.netty.handler.WorkerRequestServerHandler.processRequest(WorkerRequestServerHandler.java:36)
        at org.apache.giraph.comm.netty.handler.RequestServerHandler.channelRead(RequestServerHandler.java:103)
        at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338)
        at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:324)
        at org.apache.giraph.comm.netty.handler.RequestDecoder.channelRead(RequestDecoder.java:103)
        at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338)
        at io.netty.channel.DefaultChannelHandlerContext.access$700(DefaultChannelHandlerContext.java:29)
        at io.netty.channel.DefaultChannelHandlerContext$8.run(DefaultChannelHandlerContext.java:329)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:354)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:353)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"Thread-12" prio=10 tid=0x00007f9928a74800 nid=0x778c waiting on condition [0x00007f9916dc4000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.giraph.worker.WorkerProgressWriter$1.run(WorkerProgressWriter.java:55)
        at java.lang.Thread.run(Thread.java:722)

"netty-server-boss-0" prio=10 tid=0x00007f9928a73000 nid=0x778a runnable [0x00007f9916fc6000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x00000000ae3e94c8> (a io.netty.channel.nio.SelectedSelectionKeySet)
        - locked <0x00000000ae3ea7d0> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000000ae3e93c8> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:596)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:306)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)

"main-EventThread" daemon prio=10 tid=0x00007f99287f4000 nid=0x777e waiting on condition [0x00007f99170c7000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000000ae3d6f90> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
        at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:491)

"main-SendThread(cloud-18.dima.tu-berlin.de:22181)" daemon prio=10 tid=0x00007f99287d5800 nid=0x777d runnable [0x00007f99172c9000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x00000000ae3d6aa0> (a sun.nio.ch.Util$2)
        - locked <0x00000000ae3d6a90> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000000ae3d6830> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:338)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)

"LeaseChecker" daemon prio=10 tid=0x00007f99287cb800 nid=0x777c waiting on condition [0x00007f99171c8000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.hdfs.DFSClient$LeaseChecker.run(DFSClient.java:1379)
        at java.lang.Thread.run(Thread.java:722)

"metrics-meter-tick-thread-2" daemon prio=10 tid=0x00007f99287c0000 nid=0x777a waiting on condition [0x00007f99173ca000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000000ae3a2fb0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1090)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:807)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1043)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1103)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)

"metrics-meter-tick-thread-1" daemon prio=10 tid=0x00007f99287be000 nid=0x7779 waiting on condition [0x00007f99174cb000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000000ae3a2fb0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1085)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:807)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1043)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1103)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)

"communication thread" daemon prio=10 tid=0x00007f9928699800 nid=0x7771 waiting on condition [0x00007f99175cc000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.mapred.Task$TaskReporter.run(Task.java:654)
        at java.lang.Thread.run(Thread.java:722)

"Timer for 'MapTask' metrics system" daemon prio=10 tid=0x00007f9928673000 nid=0x776f in Object.wait() [0x00007f99179df000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000000ae393a08> (a java.util.TaskQueue)
        at java.util.TimerThread.mainLoop(Timer.java:552)
        - locked <0x00000000ae393a08> (a java.util.TaskQueue)
        at java.util.TimerThread.run(Timer.java:505)

"Thread for syncLogs" daemon prio=10 tid=0x00007f99284e6800 nid=0x776e waiting on condition [0x00007f9917ae0000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.mapred.Child$3.run(Child.java:139)

"IPC Client (47) connection to /127.0.0.1:36307 from job_201402201052_0002" daemon prio=10 tid=0x00007f99284e0800 nid=0x776d in Object.wait() [0x00007f9917be1000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000000ae325778> (a org.apache.hadoop.ipc.Client$Connection)
        at org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:706)
        - locked <0x00000000ae325778> (a org.apache.hadoop.ipc.Client$Connection)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:748)

"Service Thread" daemon prio=10 tid=0x00007f9928105000 nid=0x776b runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" daemon prio=10 tid=0x00007f9928102800 nid=0x776a waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" daemon prio=10 tid=0x00007f99280ff800 nid=0x7769 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x00007f99280fd800 nid=0x7768 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x00007f99280b0800 nid=0x7767 in Object.wait() [0x00007f991d50c000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000000ae4620d8> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
        - locked <0x00000000ae4620d8> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:177)

"Reference Handler" daemon prio=10 tid=0x00007f99280ae800 nid=0x7766 in Object.wait() [0x00007f991d60d000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000000ae461d20> (a java.lang.ref.Reference$Lock)
        at java.lang.Object.wait(Object.java:503)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
        - locked <0x00000000ae461d20> (a java.lang.ref.Reference$Lock)

"main" prio=10 tid=0x00007f992800b000 nid=0x7757 in Object.wait() [0x00007f993111b000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000000ae3b3740> (a java.util.concurrent.ConcurrentHashMap)
        at org.apache.giraph.comm.netty.NettyClient.waitSomeRequests(NettyClient.java:710)
        - locked <0x00000000ae3b3740> (a java.util.concurrent.ConcurrentHashMap)
        at org.apache.giraph.comm.netty.NettyClient.waitAllRequests(NettyClient.java:685)
        at org.apache.giraph.comm.netty.NettyWorkerClient.waitAllRequests(NettyWorkerClient.java:148)
        at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:291)
        at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:328)
        at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:509)
        at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:262)
        at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:91)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)

"VM Thread" prio=10 tid=0x00007f99280a7000 nid=0x7765 runnable

"GC task thread#0 (ParallelGC)" prio=10 tid=0x00007f9928018800 nid=0x7758 runnable

"GC task thread#1 (ParallelGC)" prio=10 tid=0x00007f992801a800 nid=0x7759 runnable

"GC task thread#2 (ParallelGC)" prio=10 tid=0x00007f992801c800 nid=0x775a runnable

"GC task thread#3 (ParallelGC)" prio=10 tid=0x00007f992801e800 nid=0x775b runnable

"GC task thread#4 (ParallelGC)" prio=10 tid=0x00007f9928020000 nid=0x775c runnable

"GC task thread#5 (ParallelGC)" prio=10 tid=0x00007f9928022000 nid=0x775d runnable

"GC task thread#6 (ParallelGC)" prio=10 tid=0x00007f9928024000 nid=0x775e runnable

"GC task thread#7 (ParallelGC)" prio=10 tid=0x00007f9928025800 nid=0x775f runnable

"GC task thread#8 (ParallelGC)" prio=10 tid=0x00007f9928027800 nid=0x7760 runnable

"GC task thread#9 (ParallelGC)" prio=10 tid=0x00007f9928029800 nid=0x7761 runnable

"GC task thread#10 (ParallelGC)" prio=10 tid=0x00007f992802b000 nid=0x7762 runnable

"GC task thread#11 (ParallelGC)" prio=10 tid=0x00007f992802d000 nid=0x7763 runnable

"GC task thread#12 (ParallelGC)" prio=10 tid=0x00007f992802f000 nid=0x7764 runnable

"VM Periodic Task Thread" prio=10 tid=0x00007f9928117800 nid=0x776c waiting on condition

JNI global references: 414


On 02/16/2014 12:01 AM, Armando Miraglia wrote:
Hi Sebastian.

I took a quick look at the source code of DiskPartition and realised
I forgot to handle concurrent access to one variable correctly.
The patch I am sending you now should fix this problem. However, I am
not sure whether this was the cause of your issue, so it is still
possible that your issue remains.

If you could test it, it would be great.

Cheers,
A.

