[ 
https://issues.apache.org/jira/browse/TEZ-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129680#comment-14129680
 ] 

Jeff Zhang commented on TEZ-1345:
---------------------------------



[~hitesh] Attached the new patch:
* Removed vertexName from VertexDataMovementEventsGeneratedEvent and used 
vertexId in the unit test instead
* bq. any reason for using synchronized as compared to using something like a 
LinkedBlockingQueue for the cached events? Does not need to be changed but just 
curious as to whether other options were considered?
Using LinkedBlockingQueue may still cause onRootVertexInitialized to return 
init_events from 2 inputs. On second thought, I think ConcurrentHashMap would 
be much better, so the new patch uses ConcurrentHashMap.

* bq. Regd. the test in TestDAGRecovery, the test should likely pass even if 
the caching fix is not applied. The issue only shows up in cases where there is 
a vertex which has an additional input as well as an inbound edge to it from 
another vertex. This can be addressed as part of the overall recovery 
end-to-end regression tests jira.
If the issue is not fixed, the test won't pass even when there's only one 
additional input in the root vertex. The init event would be written after 
VertexInitedEvent, which would cause the recovery issue.
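
To illustrate the ConcurrentHashMap point above, here is a minimal sketch of 
caching init events per input, so that initializing one input hands back only 
that input's events. This is not the code in the patch: the class name 
InitEventCache, the methods cacheEvents and takeEvents, and the use of String 
for input names and events are all hypothetical placeholders.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical sketch only; names and types are assumptions, not Tez APIs. */
class InitEventCache {
    // One event list per input name, so events from different inputs never mix.
    private final ConcurrentMap<String, List<String>> eventsByInput =
            new ConcurrentHashMap<>();

    /** Cache events produced by the initializer of the given input. */
    void cacheEvents(String inputName, List<String> events) {
        eventsByInput
                .computeIfAbsent(inputName,
                        k -> Collections.synchronizedList(new ArrayList<>()))
                .addAll(events);
    }

    /**
     * Remove and return the cached events of exactly one input.
     * Events cached for other inputs stay in the map.
     */
    List<String> takeEvents(String inputName) {
        List<String> events = eventsByInput.remove(inputName);
        if (events == null) {
            return new ArrayList<>();
        }
        return new ArrayList<>(events);
    }
}
```

With a single shared queue, draining on the first input's initialization could 
also return events that belong to a second input; keying the cache by input 
avoids that.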


> Add checks to guarantee all init events are written to recovery to consider 
> vertex initialized
> ----------------------------------------------------------------------------------------------
>
>                 Key: TEZ-1345
>                 URL: https://issues.apache.org/jira/browse/TEZ-1345
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Hitesh Shah
>            Assignee: Jeff Zhang
>         Attachments: Tez-1345-10.patch, Tez-1345-2.patch, Tez-1345-3.patch, 
> Tez-1345-4.patch, Tez-1345-5.patch, Tez-1345-6.patch, Tez-1345-7.patch, 
> Tez-1345-8.patch, Tez-1345-9.patch, Tez-1345.patch
>
>
> Related to issue discovered in TEZ-1033



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
