[ 
https://issues.apache.org/jira/browse/TEZ-1170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14041169#comment-14041169
 ] 

Siddharth Seth commented on TEZ-1170:
-------------------------------------

bq. Correct. But before any of that happens the source would have sent an INIT 
event to this vertex to make it start its initialization. So the vertex would 
not be in a NEW state.
I'm not sure that's always the case, especially for more complicated DAGs where 
the source has multiple Inputs - and one of them determines the parallelism.

bq. Not sure I understand. Until all the initializers are done, the shutdown is 
not invoked until this transition code is called.
The shutdown is invoked as soon as a KILL comes in (which could be via a user 
event). The Vertex processes the KILL and tries to shut down the initializers - 
however, some initialization-done events may already be in the queue, or may 
still be generated if an initializer does not respect interrupts or wins the 
race between the interrupt and completing.
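
To make the race concrete, here is a minimal sketch of an event queue in which 
a completion event arrives after the KILL has already been processed. This is 
not the Tez VertexImpl code: the class, state and event names (VertexSketch, 
INITIALIZER_DONE, etc.) are hypothetical and only model the ordering problem.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical, simplified model of a vertex; not the real Tez state machine.
class VertexSketch {
    enum State { INITIALIZING, INITED, KILLED }
    enum Event { INITIALIZER_DONE, KILL }

    private State state = State.INITIALIZING;
    private final Queue<Event> queue = new ArrayDeque<>();
    private int ignoredEvents = 0;

    // Events may be enqueued by a user (KILL) or by an initializer thread.
    void enqueue(Event e) { queue.add(e); }

    // Process everything currently queued, in arrival order.
    void drain() {
        Event e;
        while ((e = queue.poll()) != null) {
            handle(e);
        }
    }

    private void handle(Event e) {
        switch (state) {
            case INITIALIZING:
                state = (e == Event.KILL) ? State.KILLED : State.INITED;
                break;
            case INITED:
                if (e == Event.KILL) {
                    state = State.KILLED;
                } else {
                    ignoredEvents++;
                }
                break;
            case KILLED:
                // A late INITIALIZER_DONE can still show up here because the
                // initializer finished before (or despite) the interrupt.
                // The killed state has to tolerate it instead of failing.
                ignoredEvents++;
                break;
        }
    }

    State state() { return state; }
    int ignoredEvents() { return ignoredEvents; }
}
```

If KILL is enqueued first and INITIALIZER_DONE second, the vertex ends up 
KILLED and the late completion event is counted as ignored rather than 
treated as an invalid transition.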

bq. The other approach would be for setParallelism() to not check for 
canInitVertex() and always send a SET_PARALLELISM_INVOKED event which would be 
ignored in the state machine if canInitVertex() is not yet true. But that would 
mean sending extra events since those events would not trigger a state change. 
Right?
Right, avoiding the event is an optimization in setParallelism().
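
A rough sketch of the two variants being discussed, for comparison. It is not 
taken from the patch: ParallelismSketch, its fields and the body of 
canInitVertex() are assumptions, modelled on the issue description (parallelism 
set and no uninitialized edges). Only the names setParallelism(), 
canInitVertex() and SET_PARALLELISM_INVOKED come from the discussion above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model; the real vertex dispatches events to a state machine.
class ParallelismSketch {
    int parallelism = -1;             // -1 means "not set yet"
    boolean edgesInitialized = false; // assumed stand-in for edge state
    final List<String> sentEvents = new ArrayList<>();

    // Assumption: the vertex can initialize once parallelism is known
    // and it has no uninitialized edges.
    boolean canInitVertex() {
        return parallelism != -1 && edgesInitialized;
    }

    // Optimized variant: only send the event when it can cause a transition.
    void setParallelism(int p) {
        parallelism = p;
        if (canInitVertex()) {
            sentEvents.add("SET_PARALLELISM_INVOKED");
        }
    }

    // Alternative variant: always send, and let the state machine ignore
    // the event while canInitVertex() is still false.
    void setParallelismAlwaysSend(int p) {
        parallelism = p;
        sentEvents.add("SET_PARALLELISM_INVOKED");
    }
}
```

With an uninitialized edge, the first variant sends nothing while the second 
sends an event that would not trigger a state change.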

The same optimization should exist for the ONE_TO_ONE case as well - where we 
can skip events. It should, in fact, exist for all transitions to the START 
state - instead of explicitly generating an event to start.

Those optimizations / re-worked transitions can be a follow-up jira, but I 
think we need to fix at least the ROOT_INPUT_INITIALIZED state transitions 
which were changed by this patch.


> Simplify Vertex Initializing transition
> ---------------------------------------
>
>                 Key: TEZ-1170
>                 URL: https://issues.apache.org/jira/browse/TEZ-1170
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Siddharth Seth
>            Assignee: Bikas Saha
>             Fix For: 0.5.0
>
>         Attachments: TEZ-1170.1.patch, TEZ-1170.2.patch
>
>
> After TEZ-1145 and 1151, a vertex should only need to stay in INITIALIZING 
> state when it has an uninitialized edge, or when the parallelism is at -1 
> (not set yet). Waiting for all RootInputInitializers to complete should not 
> be required - as long as one of them sets parallelism (via a VertexManager).



--
This message was sent by Atlassian JIRA
(v6.2#6252)
