[ 
https://issues.apache.org/jira/browse/GIRAPH-462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15283660#comment-15283660
 ] 

ramesh krishnan m commented on GIRAPH-462:
------------------------------------------

Is this issue fixed? I am still getting this error in the latest release.

Exception logs:

2016-05-14 19:10:55,733 ERROR [ooc-io-0] org.apache.giraph.utils.LogStacktraceCallable: Execution of callable failed
java.lang.RuntimeException: java.io.EOFException
        at org.apache.giraph.ooc.OutOfCoreIOCallable.call(OutOfCoreIOCallable.java:76)
        at org.apache.giraph.ooc.OutOfCoreIOCallable.call(OutOfCoreIOCallable.java:30)
        at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:392)
        at org.apache.hadoop.io.IntWritable.readFields(IntWritable.java:47)
        at org.apache.giraph.ooc.data.DiskBackedPartitionStore.readOutEdges(DiskBackedPartitionStore.java:286)
        at org.apache.giraph.ooc.data.DiskBackedPartitionStore.loadInMemoryPartitionData(DiskBackedPartitionStore.java:329)
        at org.apache.giraph.ooc.data.OutOfCoreDataManager.loadPartitionData(OutOfCoreDataManager.java:195)
        at org.apache.giraph.ooc.data.DiskBackedPartitionStore.loadPartitionData(DiskBackedPartitionStore.java:360)
        at org.apache.giraph.ooc.io.LoadPartitionIOCommand.execute(LoadPartitionIOCommand.java:64)
        at org.apache.giraph.ooc.OutOfCoreIOCallable.call(OutOfCoreIOCallable.java:72)
        ... 6 more
2016-05-14 19:10:55,737 INFO [ooc-io-0] org.apache.giraph.ooc.OutOfCoreIOCallableFactory: afterExecute: an out-of-core thread terminated unexpectedly with java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.io.EOFException
2016-05-14 19:10:55,739 INFO [checkpoint-vertices-7] org.apache.giraph.ooc.FixedOutOfCoreEngine: getNextPartition: waiting until a partition becomes available!
2016-05-14 19:10:56,426 ERROR [checkpoint-vertices-6] org.apache.giraph.utils.LogStacktraceCallable: Execution of callable failed
java.lang.RuntimeException: Job Failed due to a failure in an out-of-core IO thread
        at org.apache.giraph.ooc.FixedOutOfCoreEngine.getNextPartition(FixedOutOfCoreEngine.java:81)
        at org.apache.giraph.ooc.data.DiskBackedPartitionStore.getNextPartition(DiskBackedPartitionStore.java:187)
        at org.apache.giraph.worker.BspServiceWorker$3$1.call(BspServiceWorker.java:1398)
        at org.apache.giraph.worker.BspServiceWorker$3$1.call(BspServiceWorker.java:1392)
        at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
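
For reference, the innermost "Caused by" frame is java.io.DataInputStream.readInt, which throws EOFException when fewer than four bytes remain in the stream. The sketch below reproduces only that JDK behavior on an in-memory buffer. It is not Giraph code, the names TruncatedReadSketch and readIntHitsEof are made up for illustration, and it says nothing about why the data read by readOutEdges was short in the log above.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

/** Hypothetical demo class; unrelated to Giraph's own classes. */
public class TruncatedReadSketch {

    /** Returns true if reading one int from the given bytes hits end-of-stream. */
    static boolean readIntHitsEof(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            in.readInt();          // needs exactly four bytes
            return false;
        } catch (EOFException e) {
            return true;           // fewer than four bytes were available
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readIntHitsEof(new byte[] {0, 0, 0, 1})); // complete int
        System.out.println(readIntHitsEof(new byte[] {0, 0, 1}));    // truncated int
    }
}
```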

> Multithreading breaks out-of-core graph
> ---------------------------------------
>
>                 Key: GIRAPH-462
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-462
>             Project: Giraph
>          Issue Type: Bug
>            Reporter: Alessandro Presta
>            Priority: Critical
>         Attachments: GIRAPH-461.patch
>
>
> [~cmartella] pointed out this issue: when multithreaded computation is used 
> together with the out-of-core graph, we run into a race condition. The compute 
> threads share the same DiskBackedPartitionStore, whose getPartition() method 
> is not meant to be thread-safe. When two threads request two out-of-core 
> partitions concurrently, they both try to load their partition into the same slot.
> As a result, we can lose the reference to one of the two partitions 
> (which will then not be written back to disk), and we can hit a 
> NullPointerException when both threads try to offload the currently 
> loaded partition to disk.
> I ran this test to confirm the issue:
> https://gist.github.com/4429628
> All tests pass except the one that uses both out-of-core graph and multiple 
> compute threads.
> The error is the following:
> https://gist.github.com/4429650
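
The race in the quoted description can be illustrated with a minimal sketch. It assumes a hypothetical store with a single in-memory slot and an in-memory map standing in for disk; SingleSlotStoreSketch, Partition, updatePartition, persistedUpdates and runDemo are illustrative names, not Giraph's DiskBackedPartitionStore API, and this is not the fix attached to the issue. The only point is that loading into the slot and offloading from it have to happen under one lock.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical single-slot partition store; not Giraph's implementation. */
public class SingleSlotStoreSketch {

    /** Stand-in for a partition: an id plus a counter of applied updates. */
    static final class Partition {
        final int id;
        long updates;
        Partition(int id, long updates) { this.id = id; this.updates = updates; }
    }

    private final Map<Integer, Long> disk = new HashMap<>(); // stand-in for on-disk state
    private Partition slot;                                   // the one in-memory slot

    /** Loads partition {@code id} into the slot if needed, then applies one update. */
    public synchronized void updatePartition(int id) {
        if (slot == null || slot.id != id) {
            if (slot != null) {
                disk.put(slot.id, slot.updates); // write back before reusing the slot
            }
            slot = new Partition(id, disk.getOrDefault(id, 0L));
        }
        slot.updates++;
    }

    /** Flushes the slot and returns the persisted update count for {@code id}. */
    public synchronized long persistedUpdates(int id) {
        if (slot != null) {
            disk.put(slot.id, slot.updates);
        }
        return disk.getOrDefault(id, 0L);
    }

    /** Runs several threads against two partitions and returns the total updates kept. */
    static long runDemo(int threads, int updatesPerThread) {
        SingleSlotStoreSketch store = new SingleSlotStoreSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int partitionId = t % 2; // threads contend for the same slot
            workers[t] = new Thread(() -> {
                for (int i = 0; i < updatesPerThread; i++) {
                    store.updatePartition(partitionId);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return store.persistedUpdates(0) + store.persistedUpdates(1);
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 1000));
    }
}
```

With both methods synchronized, every update is kept, so runDemo(4, 1000) returns 4000. Dropping the synchronized keyword from updatePartition lets two threads swap the slot at the same time, so updates can be lost, which is the kind of lost partition reference the description refers to.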



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
