[ 
https://issues.apache.org/jira/browse/TEZ-2412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated TEZ-2412:
----------------------------
    Description: 
Actually 2 issues
* When a vertex reruns, it moves to the RUNNING state, so it should be killed 
in DAGImpl#VertexRerunWhileCommitting
* A vertex only sends out DAGEventVertexReRunning when 
vertex.commitVertexOutputs is false, which means it is not necessary to check 
whether its outputs have already been committed as the shared output of a 
vertex group in DAGImpl#vertexReRunning
{code}
  private boolean vertexReRunning(Vertex vertex) {
    reRunningVertices.add(vertex.getVertexId());
    numSuccessfulVertices--;
    addDiagnostic("Vertex re-running"
      + ", vertexName=" + vertex.getName()
      + ", vertexId=" + vertex.getVertexId());

    if (!commitAllOutputsOnSuccess) {
      // partial output may already have been committed. fail if so
      List<VertexGroupInfo> groupList = vertexGroupInfo.get(vertex.getName());
      if (groupList != null) {
        for (VertexGroupInfo groupInfo : groupList) {
          if (groupInfo.committed) {
            String msg = "Aborting job as committed vertex: "
                + vertex.getLogIdentifier() + " is re-running";
            LOG.info(msg);
            addDiagnostic(msg);
            enactKill(DAGTerminationCause.VERTEX_RERUN_AFTER_COMMIT,
                VertexTerminationCause.VERTEX_RERUN_AFTER_COMMIT);
            return true;
          } else {
            groupInfo.successfulMembers--;
          }
        }
      }
    }
    return false;
  }
{code}

  was:When a vertex reruns, it moves to the RUNNING state, so it should be killed in 
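
The second point in the description can be sketched as follows. This is a
minimal, hypothetical model and not the actual Tez patch: DagRerunSketch and
its fields are stand-ins for the relevant DAGImpl state, and it assumes the
premise stated in the description, namely that DAGEventVertexReRunning is only
sent when vertex.commitVertexOutputs is false. Under that assumption the
vertex-group commit check never fires, so the handler reduces to bookkeeping:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the DAGImpl state touched by vertexReRunning.
// None of these names are taken from the real Tez classes.
final class DagRerunSketch {
  final Set<String> reRunningVertices = new LinkedHashSet<>();
  final List<String> diagnostics = new ArrayList<>();
  int numSuccessfulVertices;

  DagRerunSketch(int numSuccessfulVertices) {
    this.numSuccessfulVertices = numSuccessfulVertices;
  }

  // Assumption: this is only reached for vertices that do not commit their
  // own outputs, so no vertex-group output can already be committed and
  // there is nothing to abort here.
  void vertexReRunning(String vertexName, String vertexId) {
    reRunningVertices.add(vertexId);
    numSuccessfulVertices--;
    diagnostics.add("Vertex re-running"
        + ", vertexName=" + vertexName
        + ", vertexId=" + vertexId);
  }

  public static void main(String[] args) {
    DagRerunSketch dag = new DagRerunSketch(3);
    dag.vertexReRunning("v1", "vertex_1");
    System.out.println(dag.numSuccessfulVertices
        + " successful, re-running: " + dag.reRunningVertices);
  }
}
```

With the check gone the method has no kill path left, so in this sketch it
returns void instead of boolean. Whether the real method should change its
signature the same way is a decision for the actual patch.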


> Should kill vertex if vertex rerun is failed in 
> DAGImpl#VertexRerunWhileCommitting
> ----------------------------------------------------------------------------------
>
>                 Key: TEZ-2412
>                 URL: https://issues.apache.org/jira/browse/TEZ-2412
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang
>             Fix For: 0.7.0



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
