[ 
https://issues.apache.org/jira/browse/TEZ-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386113#comment-14386113
 ] 

Jeff Zhang edited comment on TEZ-714 at 3/30/15 2:45 AM:
---------------------------------------------------------

Regarding partial output: on second thought, I think we should handle it on a 
per-vertex basis rather than a per-output basis. That means either all of a 
vertex's outputs commit successfully, or all of them are aborted. I think the 
purpose of TEZ_AM_COMMIT_ALL_OUTPUTS_ON_DAG_SUCCESS is to allow an external 
system to check an intermediate vertex's output at the vertex level. For 
example, if a vertex has 2 outputs and TEZ_AM_COMMIT_ALL_OUTPUTS_ON_DAG_SUCCESS 
is false, and the first commit succeeds but the second commit fails, then we 
should abort both of them and mark the vertex as failed. 
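
The vertex-level all-or-nothing rule described above could be sketched roughly 
as follows. This is only an illustration: the Committer interface, the 
VertexState enum and commitVertex() are hypothetical names invented for the 
sketch, not the actual Tez OutputCommitter API or the code in the attached 
patches.

```java
import java.util.List;

final class VertexCommitSketch {

    /** Hypothetical stand-in for an output committer (not the Tez API). */
    interface Committer {
        void commit() throws Exception;
        void abort();
    }

    /** Hypothetical vertex states, used only in this sketch. */
    enum VertexState { SUCCEEDED, FAILED }

    /**
     * Commits every output of one vertex. If any commit fails, all outputs
     * are aborted, including the ones that had already committed, and the
     * vertex is reported as FAILED.
     */
    static VertexState commitVertex(List<Committer> outputs) {
        try {
            for (Committer output : outputs) {
                output.commit();
            }
            return VertexState.SUCCEEDED;
        } catch (Exception commitFailure) {
            for (Committer output : outputs) {
                try {
                    output.abort();
                } catch (RuntimeException ignored) {
                    // Keep aborting the remaining outputs even if one abort fails.
                }
            }
            return VertexState.FAILED;
        }
    }
}
```

With two outputs where the first commit succeeds and the second throws, 
commitVertex() aborts both and returns FAILED, which matches the example in 
the comment.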


was (Author: zjffdu):
Regarding partial output: on second thought, I think we should handle it on a 
per-vertex basis rather than a per-output basis. That means either all of a 
vertex's outputs commit successfully, or all of them are aborted. I think the 
purpose of TEZ_AM_COMMIT_ALL_OUTPUTS_ON_DAG_SUCCESS is to allow an external 
system to check an intermediate vertex's output. For example, if a vertex has 
2 outputs and TEZ_AM_COMMIT_ALL_OUTPUTS_ON_DAG_SUCCESS is false, and the first 
commit succeeds but the second commit fails, then we should abort both of them 
and mark the vertex as failed. 

> OutputCommitters should not run in the main AM dispatcher thread
> ----------------------------------------------------------------
>
>                 Key: TEZ-714
>                 URL: https://issues.apache.org/jira/browse/TEZ-714
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Siddharth Seth
>            Assignee: Jeff Zhang
>            Priority: Critical
>         Attachments: DAG_2.pdf, TEZ-714-1.patch, TEZ-714-2.patch, 
> TEZ-714-3.patch, TEZ-714-4.patch, TEZ-714-5.patch, Vertex_2.pdf
>
>
> Follow-up jira from TEZ-41.
> 1) If there are multiple OutputCommitters on a Vertex, they can be run in 
> parallel.
> 2) Running an OutputCommitter in the main thread blocks all other event 
> handling for the DAG and causes the event queue to back up.
> 3) This should also cover shared commits that happen in the DAG.
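
The idea in points 1) and 2) can be sketched as below: each commit is handed 
to a separate thread pool so the calling (dispatcher) thread returns 
immediately, and the commits of one vertex run in parallel. AsyncCommitSketch 
and commitAll() are hypothetical names for illustration, and modelling a 
commit as a plain Runnable is an assumption; this is not the Tez 
implementation.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

final class AsyncCommitSketch {

    /**
     * Submits every commit to the given pool and returns at once.
     * The returned future completes with true if all commits finished
     * normally, and with false if at least one of them threw.
     */
    static CompletableFuture<Boolean> commitAll(List<Runnable> commits,
                                                ExecutorService pool) {
        CompletableFuture<?>[] pending = commits.stream()
            .map(commit -> CompletableFuture.runAsync(commit, pool))
            .toArray(CompletableFuture[]::new);
        // handle() turns success or failure into a plain boolean result.
        return CompletableFuture.allOf(pending)
            .handle((ignored, error) -> error == null);
    }
}
```

In a real AM the completion callback would presumably post an event back to 
the dispatcher rather than have any thread wait on the future, so that event 
handling for the DAG is never blocked by a slow commit.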



