[ https://issues.apache.org/jira/browse/TEZ-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383490#comment-15383490 ]
Hitesh Shah commented on TEZ-3356:
----------------------------------

\cc [~bikassaha] [~rajesh.balamohan]

> Fix initializing of stats when custom ShuffleVertexManager is used
> ------------------------------------------------------------------
>
>                 Key: TEZ-3356
>                 URL: https://issues.apache.org/jira/browse/TEZ-3356
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.8.4
>            Reporter: Peter Slawski
>            Assignee: Peter Slawski
>         Attachments: TEZ-3356.1.patch
>
>
> When a custom ShuffleVertexManager is used to set a vertex's parallelism, the
> partition stats field is left uninitialized even after the manager itself has
> been initialized. An IllegalStateException is then thrown, because the stats
> field is still uninitialized when VertexManagerEvents are processed at the
> start of the vertex. Note that these events contain partition sizes, which
> are aggregated and stored in this stats field.
>
> Apache Pig's grace auto-parallelism feature uses a custom
> ShuffleVertexManager that sets a vertex's parallelism upon the completion of
> one of its parent's parents. This corner case is therefore hit, and Pig
> scripts with grace parallelism enabled fail if the DAG contains at least one
> vertex that has grandparents.
>
> The fix should be straightforward: update pending tasks before, rather than
> after, VertexManagerEvents are processed, so that the partition stats field
> is initialized by the time the events arrive.
>

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
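
The ordering problem described in the issue can be sketched as follows. This is only an illustration under stated assumptions, not the code in TEZ-3356.1.patch: the class StatsSketchManager and all of its methods are hypothetical stand-ins, not the Tez ShuffleVertexManager API. It assumes the stats array is allocated when pending tasks are updated and that each event carries one size per partition.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a vertex manager; NOT the Tez ShuffleVertexManager API. */
class StatsSketchManager {
    private final int numPartitions;
    private long[] partitionStats; // stays null until pending tasks are updated

    StatsSketchManager(int numPartitions) {
        this.numPartitions = numPartitions;
    }

    /** Assumed to be the step that allocates the stats array (first call only). */
    void updatePendingTasks() {
        if (partitionStats == null) {
            partitionStats = new long[numPartitions];
        }
    }

    /** Aggregates the partition sizes carried by one event into the stats array. */
    void onVertexManagerEvent(long[] partitionSizes) {
        if (partitionStats == null) {
            throw new IllegalStateException("partition stats not initialized");
        }
        for (int i = 0; i < partitionSizes.length && i < partitionStats.length; i++) {
            partitionStats[i] += partitionSizes[i];
        }
    }

    /** Order described as failing: events are processed before pending tasks are updated. */
    void startBuggy(long[][] pendingEvents) {
        for (long[] event : pendingEvents) {
            onVertexManagerEvent(event); // throws on the first event: stats still null
        }
        updatePendingTasks();
    }

    /** Order proposed in the issue: update pending tasks first, then process events. */
    void startFixed(long[][] pendingEvents) {
        updatePendingTasks();
        for (long[] event : pendingEvents) {
            onVertexManagerEvent(event);
        }
    }

    /** Returns a copy of the aggregated stats, or null if they were never initialized. */
    long[] stats() {
        return partitionStats == null ? null : partitionStats.clone();
    }

    public static void main(String[] args) {
        long[][] events = {{3, 5}, {2, 1}};

        StatsSketchManager fixed = new StatsSketchManager(2);
        fixed.startFixed(events);
        System.out.println(Arrays.toString(fixed.stats())); // prints [5, 6]

        try {
            new StatsSketchManager(2).startBuggy(events);
        } catch (IllegalStateException e) {
            System.out.println("buggy order failed: " + e.getMessage());
        }
    }
}
```

In this sketch the two start methods differ only in where updatePendingTasks() is called, which mirrors the one-line reordering the description proposes.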