[ 
https://issues.apache.org/jira/browse/TEZ-2568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14598788#comment-14598788
 ] 

Jeff Zhang edited comment on TEZ-2568 at 6/24/15 3:20 AM:
----------------------------------------------------------

We can't control when or how many times the user calls 
VertexManagerPluginContextImpl#addRootInputEvent, but we do control when the 
RootInputDataInformationEvents are sent. Currently we send them in the callback 
of onRootVertexInitialized, so on each onRootVertexInitialized call we send 
RootInputDataInformationEvents back to the Vertex. The 
RootInputDataInformationEvents may be empty if the user didn't call 
VertexManagerPluginContextImpl#addRootInputEvent. We expose only 
VertexManagerPluginContextImpl#addRootInputEvent to the user and no API for 
sending RootInputDataInformationEvents back to the Vertex. Still, there is a 
risk if in the future we send RootInputDataInformationEvents from other 
places.
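
For illustration only, here is a minimal sketch of the buffering behaviour 
described above. It is a toy model, not the Tez code: the class name 
RootInputEventBufferSketch, the String event type and the method 
afterOnRootVertexInitialized are all assumptions made for the example. The 
user-facing method only collects events; the framework-side method sends 
whatever was collected, which may be nothing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of the behaviour described in the comment above.
// None of these names or signatures are taken from the Tez code base.
final class RootInputEventBufferSketch {
    // Events the user has added since the last send.
    private final List<String> pending = new ArrayList<>();
    // Every batch "sent" to the vertex, recorded so the behaviour can be inspected.
    private final List<List<String>> sentBatches = new ArrayList<>();

    // User-facing: may be called any number of times, at any point.
    void addRootInputEvent(String event) {
        pending.add(event);
    }

    // Framework-side: assumed to run once per onRootVertexInitialized callback.
    // It sends the pending events even when the list is empty.
    void afterOnRootVertexInitialized() {
        sentBatches.add(new ArrayList<>(pending));
        pending.clear();
    }

    List<List<String>> sentBatches() {
        return sentBatches;
    }

    public static void main(String[] args) {
        RootInputEventBufferSketch ctx = new RootInputEventBufferSketch();
        ctx.afterOnRootVertexInitialized();   // user added nothing: empty batch
        ctx.addRootInputEvent("split-0");
        ctx.addRootInputEvent("split-1");
        ctx.afterOnRootVertexInitialized();   // both events go out together
        System.out.println(ctx.sentBatches());
    }
}
```

Because sending happens in only one place, the vertex sees at most one batch 
per callback. Sending from a second place would remove that guarantee, which 
is the risk mentioned above.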


was (Author: zjffdu):
We can't control when or how many times the user calls 
VertexManagerPluginContextImpl#addRootInputEvent, but we do control when the 
RootInputDataInformationEvents are sent. Currently we send them in the callback 
of onRootVertexInitialized, so on each onRootVertexInitialized call we send 
RootInputDataInformationEvents back to the Vertex. The 
RootInputDataInformationEvents may be empty if the user didn't call 
VertexManagerPluginContextImpl#addRootInputEvent. We expose only 
VertexManagerPluginContextImpl#addRootInputEvent to the user and no API for 
sending RootInputDataInformationEvents back to the Vertex.

> V_INPUT_DATA_INFORMATION may happen after vertex is initialized
> ---------------------------------------------------------------
>
>                 Key: TEZ-2568
>                 URL: https://issues.apache.org/jira/browse/TEZ-2568
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang
>            Priority: Blocker
>             Fix For: 0.8.0, 0.7.1
>
>         Attachments: TEZ-2568-1.patch, TEZ-2568-2.patch, TEZ-2568-3.patch, 
> TEZ-2568-4.patch, TEZ-2568-5.patch, a.log
>
>
> {code}
> 2015-06-19 15:57:28,462 ERROR [Dispatcher thread: Central] impl.VertexImpl: Can't handle Invalid event V_INPUT_DATA_INFORMATION on vertex Map 2 with vertexId vertex_1434754502979_0002_2_00 at current state INITED
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: V_INPUT_DATA_INFORMATION at INITED
>         at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>         at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>         at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>         at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:57)
>         at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:1799)
>         at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:198)
>         at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:1963)
>         at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:1949)
>         at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
>         at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
>         at java.lang.Thread.run(Thread.java:722)
> {code}
> A vertex moves to INITED once its parallelism is determined, it has no null 
> edges, and its root inputs are initialized. Handling RootInputDataInformation 
> is not a precondition for the vertex moving to INITED. We can't wait in the 
> INITIALIZING state for all the V_INPUT_DATA_INFORMATION events to arrive, 
> because the number of V_INPUT_DATA_INFORMATION events is not known in 
> advance; it is determined by the VM. So we will allow 
> V_INPUT_DATA_INFORMATION to be handled after the vertex is initialized. 
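
As a rough illustration of the change described above, the sketch below 
models a two-state machine in which V_INPUT_DATA_INFORMATION is either 
rejected or accepted in INITED. It is a toy, not the VertexImpl state machine 
or any of the attached patches: the class name VertexStateSketch, the 
V_INIT_DONE event and the allowInputDataWhenInited flag are assumptions made 
for the example.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model; it does not use the YARN StateMachineFactory.
final class VertexStateSketch {
    enum State { INITIALIZING, INITED }
    // V_INIT_DONE is a made-up event standing in for "all init preconditions met".
    enum Event { V_INPUT_DATA_INFORMATION, V_INIT_DONE }

    private final Map<State, Set<Event>> allowed = new EnumMap<>(State.class);
    private State state = State.INITIALIZING;
    private int inputEventsSeen = 0;

    VertexStateSketch(boolean allowInputDataWhenInited) {
        allowed.put(State.INITIALIZING,
                EnumSet.of(Event.V_INPUT_DATA_INFORMATION, Event.V_INIT_DONE));
        allowed.put(State.INITED, allowInputDataWhenInited
                ? EnumSet.of(Event.V_INPUT_DATA_INFORMATION)
                : EnumSet.noneOf(Event.class));
    }

    // Rejects any event that has no transition registered for the current state.
    void handle(Event event) {
        if (!allowed.get(state).contains(event)) {
            throw new IllegalStateException("Invalid event: " + event + " at " + state);
        }
        if (event == Event.V_INIT_DONE) {
            state = State.INITED;
        } else {
            inputEventsSeen++;   // the state does not change for input data events
        }
    }

    State state() { return state; }
    int inputEventsSeen() { return inputEventsSeen; }

    public static void main(String[] args) {
        VertexStateSketch vertex = new VertexStateSketch(true);
        vertex.handle(Event.V_INIT_DONE);
        vertex.handle(Event.V_INPUT_DATA_INFORMATION);   // late event is accepted
        System.out.println(vertex.state() + " " + vertex.inputEventsSeen());
    }
}
```

Constructing the sketch with false mimics the failure in the log above: the 
same late event throws because INITED has no transition registered for it.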



