[ 
https://issues.apache.org/jira/browse/TEZ-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14252782#comment-14252782
 ] 

Jeff Zhang commented on TEZ-1019:
---------------------------------

bq. There is no guarantee that vertex running event was written in time ( given 
that it is not critical ) hence both the vertex start could have occurred as 
well tasks starting/finishing.
Yes, I know it may not be written in time. But if recoveredState is INITED, 
that means the VertexStartedEvent and the task-related events were not logged 
either, so there are no tasks to recover in this case.
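
A minimal sketch of this reasoning, using hypothetical names rather than the 
actual Tez classes. It assumes, as argued above, that task events are only ever 
logged after the VertexStartedEvent, so a recovered state of INITED implies 
there is nothing to restore:

```java
import java.util.List;

// Hypothetical illustration only; not the real VertexImpl recovery code.
final class RecoverySketch {
    // Hypothetical subset of vertex states mentioned in this discussion.
    enum VertexState { INITED, RUNNING, SUCCEEDED }

    // Returns how many logged task events need to be replayed on recovery.
    static int tasksToRecover(VertexState recoveredState, List<String> loggedTaskEvents) {
        if (recoveredState == VertexState.INITED) {
            // Assumption: no VertexStartedEvent was logged, hence no task events either.
            return 0;
        }
        // Otherwise every logged task event is a candidate for recovery.
        return loggedTaskEvents.size();
    }
}
```

If the assumption does not hold (task events can be flushed before the vertex 
start event), the INITED branch would have to inspect the logged events instead 
of returning 0.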

bq. That should be the case in most scenarios. However, with allowing of -1 on 
1:1 edges and waiting for an upstream parallelism to be set to define the 
downstream vertex parallelism, we may need to verify all such cases. Also, in 
case of a parallelism update ( after running ), numTasks need not be set to 0 
but this could just be a sanity check to verify the tasks array matches 
numTasks.
Why do we allow a vertex to go to the RUNNING state with taskNum set to -1? 
There is no benefit in that, since we still cannot start any tasks while 
taskNum is -1.

bq. numTasks 0 means vertex should go to a succeeded state. this might also 
happen if the vertex manager sets parallelism to 0
Is there any real case in Pig/Hive where the VM (vertex manager) would set 
parallelism to 0?
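
The three parallelism cases discussed above (-1, 0, and a positive value) can 
be sketched as follows. The class, enum, and method names are hypothetical and 
only illustrate the decision being debated; they do not mirror the actual Tez 
implementation:

```java
// Hypothetical illustration only; not the real Tez vertex state machine.
final class ParallelismSketch {
    enum Action { WAIT_FOR_PARALLELISM, SUCCEED_IMMEDIATELY, SCHEDULE_TASKS }

    // Decides what a vertex could do at start time for a given numTasks value.
    static Action onVertexStart(int numTasks) {
        if (numTasks < 0) {
            // -1: parallelism not yet determined, so no task can be started.
            return Action.WAIT_FOR_PARALLELISM;
        }
        if (numTasks == 0) {
            // 0: nothing to run, so the vertex can go straight to succeeded.
            return Action.SUCCEED_IMMEDIATELY;
        }
        // Positive value: tasks can be created and scheduled.
        return Action.SCHEDULE_TASKS;
    }
}
```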




> Re-factor routing of events to use common code path for normal and recovery 
> flow.
> ---------------------------------------------------------------------------------
>
>                 Key: TEZ-1019
>                 URL: https://issues.apache.org/jira/browse/TEZ-1019
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Hitesh Shah
>            Assignee: Jeff Zhang
>         Attachments: TEZ-1019-2.patch, Tez-1019.patch
>
>




