[
https://issues.apache.org/jira/browse/APEXCORE-415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15224689#comment-15224689
]
ASF GitHub Bot commented on APEXCORE-415:
-----------------------------------------
Github user ilooner commented on the pull request:
https://github.com/apache/incubator-apex-core/pull/292#issuecomment-205426336
@davidyan74 There was a similar bug in GenericNode, which I fixed and got
merged earlier. The problem occurs when the operator's CheckpointWindowCount is
a multiple of the DAG's CheckpointWindowCount. In that case the operator is
checkpointed twice for the same window: once on receiving the endWindow tuple
and once on receiving the checkpoint tuple. When that happens,
AsyncFSStorageAgent throws an exception because two threads try to move the
same file.
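
The double trigger described above can be modeled roughly as follows. This is
only a simplified sketch, not the actual Apex code: the class
DoubleCheckpointSketch, the CheckpointGuard helper, and the simulate method are
hypothetical names, and the two trigger conditions are assumptions based on
the description in this comment. The guard simply remembers the last
checkpointed window and ignores a second request for it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model of the double checkpoint; not Apex's actual code. */
class DoubleCheckpointSketch {

    /** Remembers the last checkpointed window so a repeated request for it is ignored. */
    static final class CheckpointGuard {
        private long lastCheckpointedWindow = -1;

        /** Returns true only the first time a given window id is offered in a row. */
        boolean tryCheckpoint(long windowId) {
            if (windowId == lastCheckpointedWindow) {
                return false; // already checkpointed this window
            }
            lastCheckpointedWindow = windowId;
            return true;
        }
    }

    /**
     * Replays windows 1..windows and returns the ids of the windows that get checkpointed.
     * Assumed model: the endWindow path fires every operatorCount windows, and the
     * checkpoint tuple (sent every dagCount windows) fires when the operator's count
     * is also satisfied. With guarded == false, both paths checkpoint unconditionally.
     */
    static List<Long> simulate(int windows, int operatorCount, int dagCount, boolean guarded) {
        List<Long> checkpoints = new ArrayList<>();
        CheckpointGuard guard = new CheckpointGuard();
        for (long w = 1; w <= windows; w++) {
            boolean onEndWindow = w % operatorCount == 0;
            boolean onCheckpointTuple = w % dagCount == 0 && w % operatorCount == 0;
            if (onEndWindow && (!guarded || guard.tryCheckpoint(w))) {
                checkpoints.add(w);
            }
            if (onCheckpointTuple && (!guarded || guard.tryCheckpoint(w))) {
                checkpoints.add(w);
            }
        }
        return checkpoints;
    }

    public static void main(String[] args) {
        // Operator count 4 is a multiple of DAG count 2, so both paths fire on windows 4 and 8.
        System.out.println("unguarded: " + simulate(8, 4, 2, false));
        System.out.println("guarded:   " + simulate(8, 4, 2, true));
    }
}
```

In this model the unguarded run checkpoints windows 4 and 8 twice each, which
corresponds to two checkpoint tasks competing for the same file, while the
guarded run checkpoints each of them once.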
> Input Operator Can Double Checkpoint if Operator CheckpointWindowCount is
> greater than DAG CheckpointWindowCount
> ----------------------------------------------------------------------------------------------------------------
>
> Key: APEXCORE-415
> URL: https://issues.apache.org/jira/browse/APEXCORE-415
> Project: Apache Apex Core
> Issue Type: Bug
> Reporter: Timothy Farkas
> Assignee: Timothy Farkas
>
> An application that reproduces the issue is here:
> https://github.com/ilooner/streamcodec-bug/tree/asyncCheckpointBug
> java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.io.FileNotFoundException: /disk6/ndevyarn/nm/usercache/tim/appcache/application_1456485348783_3429/container_1456485348783_3429_01_000019/tmp/chkp3241218411712328004/1/6268662011559673861 (No such file or directory)
>     at com.datatorrent.netlet.util.DTThrowable.wrapIfChecked(DTThrowable.java:59)
>     at com.datatorrent.stram.engine.Node.reportStats(Node.java:465)
>     at com.datatorrent.stram.engine.InputNode.run(InputNode.java:156)
>     at com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1388)
> Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.io.FileNotFoundException: /disk6/ndevyarn/nm/usercache/tim/appcache/application_1456485348783_3429/container_1456485348783_3429_01_000019/tmp/chkp3241218411712328004/1/6268662011559673861 (No such file or directory)
>     at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:188)
>     at com.datatorrent.stram.engine.Node.reportStats(Node.java:458)
>     ... 2 more
> Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: /disk6/ndevyarn/nm/usercache/tim/appcache/application_1456485348783_3429/container_1456485348783_3429_01_000019/tmp/chkp3241218411712328004/1/6268662011559673861 (No such file or directory)
>     at com.datatorrent.netlet.util.DTThrowable.wrapIfChecked(DTThrowable.java:50)
>     at com.datatorrent.netlet.util.DTThrowable.rethrow(DTThrowable.java:31)
>     at com.datatorrent.common.util.AsyncFSStorageAgent.copyToHDFS(AsyncFSStorageAgent.java:126)
>     at com.datatorrent.stram.engine.Node$CheckpointHandler.call(Node.java:684)
>     at com.datatorrent.stram.engine.Node$CheckpointHandler.call(Node.java:673)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.FileNotFoundException: /disk6/ndevyarn/nm/usercache/tim/appcache/application_1456485348783_3429/container_1456485348783_3429_01_000019/tmp/chkp3241218411712328004/1/6268662011559673861 (No such file or directory)
>     at java.io.FileInputStream.open(Native Method)
>     at java.io.FileInputStream.<init>(FileInputStream.java:146)
>     at com.datatorrent.common.util.AsyncFSStorageAgent.copyToHDFS(AsyncFSStorageAgent.java:117)
>     ... 8 more