[
https://issues.apache.org/jira/browse/TEZ-1170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14041169#comment-14041169
]
Siddharth Seth commented on TEZ-1170:
-------------------------------------
bq. Correct. But before any of that happens the source would have sent an INIT
event to this vertex to make it start its initialization. So the vertex would
not be in a NEW state.
I'm not sure that's always the case, especially for more complicated DAGs where
the source has multiple Inputs - and one of them determines the parallelism.
bq. Not sure I understand. Until all the initializers are done, the shutdown is
not invoked until this transition code is called.
The shutdown is invoked as soon as a KILL comes in (which could be via a user
event). The Vertex processes the KILL and tries to shut down the initializers -
however, some initialization_DONE events may already be in the queue, or may
still be generated if an initializer does not respect interrupts or wins the
race between the interrupt and completing.
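To make the race concrete, here is a minimal Java sketch. It is not Tez code:
SketchVertex, its states and its event names are hypothetical stand-ins. It
models a KILL being processed while an initializer-completion event is still
queued behind it, so the KILLED state has to tolerate the late event instead of
treating it as an invalid transition.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical, simplified model; not the actual Tez vertex state machine.
class SketchVertex {
    enum State { INITIALIZING, INITED, KILLED }
    enum Event { ROOT_INPUT_INITIALIZED, KILL }

    State state = State.INITIALIZING;
    int ignoredLateEvents = 0;
    private final Queue<Event> queue = new ArrayDeque<>();

    void enqueue(Event e) {
        queue.add(e);
    }

    // Drain the queue one event at a time, like a single-threaded dispatcher.
    void drain() {
        Event e;
        while ((e = queue.poll()) != null) {
            handle(e);
        }
    }

    private void handle(Event e) {
        switch (state) {
            case INITIALIZING:
                state = (e == Event.KILL) ? State.KILLED : State.INITED;
                break;
            case KILLED:
                // An initializer that ignored the interrupt, or finished just
                // before it, can still deliver its completion event. It is
                // counted and otherwise ignored.
                if (e == Event.ROOT_INPUT_INITIALIZED) {
                    ignoredLateEvents++;
                }
                break;
            case INITED:
                if (e == Event.KILL) {
                    state = State.KILLED;
                }
                break;
        }
    }

    public static void main(String[] args) {
        SketchVertex v = new SketchVertex();
        v.enqueue(Event.KILL);                    // user kill arrives first
        v.enqueue(Event.ROOT_INPUT_INITIALIZED);  // initializer finished anyway
        v.drain();
        System.out.println(v.state + " ignoredLateEvents=" + v.ignoredLateEvents);
    }
}
```

If the KILLED case did not accept the completion event, the second event in
main() would be the kind of unexpected transition described above.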
bq. The other approach would be for setParallelism() to not check for
canInitVertex() and always send a SET_PARALLELISM_INVOKED event which would be
ignored in the state machine if canInitVertex() is not yet true. But that would
mean sending extra events since those events would not trigger a state change.
Right?
Right, avoiding the event is an optimization in set parallelism.
The same optimization should exist for the ONE_TO_ONE case as well - where we
can skip events. It should, in fact, exist for all transitions to the START
state - instead of explicitly generating an event to start.
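The two setParallelism() variants under discussion can be sketched roughly as
follows. This is an illustration only: ParallelismSketch, its fields and the
body of canInitVertex() are assumptions based on the issue description
(parallelism set and no uninitialized edge), and events are recorded as plain
strings rather than dispatched to a real state machine.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the two approaches; not the Tez implementation.
class ParallelismSketch {
    int parallelism = -1;              // -1 means "not set yet"
    boolean edgesInitialized = false;
    final List<String> sentEvents = new ArrayList<>();

    // Assumed condition: parallelism is known and no edge is uninitialized.
    boolean canInitVertex() {
        return parallelism != -1 && edgesInitialized;
    }

    // Variant A: check first, so the event is only sent when it can
    // actually trigger a state change (the optimization).
    void setParallelismChecked(int p) {
        parallelism = p;
        if (canInitVertex()) {
            sentEvents.add("SET_PARALLELISM_INVOKED");
        }
    }

    // Variant B: always send; the state machine would ignore the event
    // while canInitVertex() is still false, at the cost of extra events.
    void setParallelismAlwaysSend(int p) {
        parallelism = p;
        sentEvents.add("SET_PARALLELISM_INVOKED");
    }

    public static void main(String[] args) {
        ParallelismSketch a = new ParallelismSketch();
        a.setParallelismChecked(4);      // edges not ready: nothing sent
        ParallelismSketch b = new ParallelismSketch();
        b.setParallelismAlwaysSend(4);   // edges not ready: event sent anyway
        System.out.println(a.sentEvents.size() + " vs " + b.sentEvents.size());
    }
}
```

Both variants end up with the same parallelism; they differ only in how many
events reach the state machine before the vertex is ready to initialize.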
Those optimizations / transition re-working can be a follow-up jira, but I
think we need to fix at least the ROOT_INPUT_INITIALIZED state transitions
which were changed by this patch.
> Simplify Vertex Initializing transition
> ---------------------------------------
>
> Key: TEZ-1170
> URL: https://issues.apache.org/jira/browse/TEZ-1170
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Siddharth Seth
> Assignee: Bikas Saha
> Fix For: 0.5.0
>
> Attachments: TEZ-1170.1.patch, TEZ-1170.2.patch
>
>
> After TEZ-1145 and 1151, a vertex should only need to stay in INITIALIZING
> state when it has an uninitialized edge, or when the parallelism is at -1
> (not set yet). Waiting for all RootInputInitializers to complete should not
> be required - as long as one of them sets parallelism (via a VertexManager).