[ 
https://issues.apache.org/jira/browse/TEZ-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383490#comment-15383490
 ] 

Hitesh Shah commented on TEZ-3356:
----------------------------------

cc [~bikassaha] [~rajesh.balamohan]

> Fix initializing of stats when custom ShuffleVertexManager is used
> ------------------------------------------------------------------
>
>                 Key: TEZ-3356
>                 URL: https://issues.apache.org/jira/browse/TEZ-3356
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.8.4
>            Reporter: Peter Slawski
>            Assignee: Peter Slawski
>         Attachments: TEZ-3356.1.patch
>
>
> When a custom ShuffleVertexManager is used to set a vertex’s parallelism, the 
> partition stats field is left uninitialized even after the manager itself 
> has been initialized. This causes an IllegalStateException to be thrown, 
> because the stats field is still uninitialized when VertexManagerEvents are 
> processed at the start of the vertex. Note that these events contain 
> partition sizes, which are aggregated and stored in this stats field.
>  
> Apache Pig’s grace auto-parallelism feature uses a custom 
> ShuffleVertexManager which sets a vertex’s parallelism upon the completion of 
> one of its parent’s parents. Thus, this corner case is hit, and Pig scripts 
> with grace parallelism enabled fail if the DAG contains at least one 
> vertex that has grandparents.
>  
> The fix should be straightforward: update pending tasks before, rather than 
> after, VertexManagerEvents are processed, so that the partition stats field 
> is initialized by the time the events are handled.
>  
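
For illustration only, the ordering problem described above can be sketched
in a few lines of Java. This is not code from Tez or from the attached patch.
The class PartitionStatsSketch and all of its methods are hypothetical
stand-ins, and the sketch assumes only what the description states: the stats
field is allocated when pending tasks are updated, and event processing
fails if the field is still unset.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for a shuffle vertex manager. Not Tez code.
class PartitionStatsSketch {
    private final int numPartitions;
    private final List<long[]> queuedEvents = new ArrayList<>();
    private long[] partitionStats; // stays null until pending tasks are updated

    PartitionStatsSketch(int numPartitions) {
        this.numPartitions = numPartitions;
    }

    // Queue the partition sizes reported by one (hypothetical) event.
    void queueEvent(long[] partitionSizes) {
        queuedEvents.add(partitionSizes);
    }

    // Assumption: updating pending tasks is what allocates the stats field.
    void updatePendingTasks() {
        if (partitionStats == null) {
            partitionStats = new long[numPartitions];
        }
    }

    // Aggregates the queued partition sizes into the stats field.
    void processQueuedEvents() {
        for (long[] sizes : queuedEvents) {
            if (partitionStats == null) {
                throw new IllegalStateException("partition stats not initialized");
            }
            for (int i = 0; i < numPartitions; i++) {
                partitionStats[i] += sizes[i];
            }
        }
        queuedEvents.clear();
    }

    // Problematic order: events are processed before the stats field exists.
    void onVertexStartedBuggy() {
        processQueuedEvents();
        updatePendingTasks();
    }

    // Proposed order: update pending tasks first, then process the events.
    void onVertexStartedFixed() {
        updatePendingTasks();
        processQueuedEvents();
    }

    long[] stats() {
        return partitionStats;
    }

    public static void main(String[] args) {
        PartitionStatsSketch fixed = new PartitionStatsSketch(2);
        fixed.queueEvent(new long[] {10, 20});
        fixed.onVertexStartedFixed();
        System.out.println(fixed.stats()[0] + " " + fixed.stats()[1]);

        PartitionStatsSketch buggy = new PartitionStatsSketch(2);
        buggy.queueEvent(new long[] {10, 20});
        try {
            buggy.onVertexStartedBuggy();
        } catch (IllegalStateException e) {
            System.out.println("buggy order failed: " + e.getMessage());
        }
    }
}
```

In this sketch the only difference between the two start-up methods is the
order of the two calls, which mirrors the fix proposed in the description.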



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
