[ https://issues.apache.org/jira/browse/TEZ-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386113#comment-14386113 ]
Jeff Zhang edited comment on TEZ-714 at 3/30/15 2:46 AM:
---------------------------------------------------------

Regarding the partial output: on second thought, I think we should consider it on a per-vertex basis rather than a per-output basis. That means either all of a vertex's outputs commit successfully, or all of them are aborted. I think the purpose of TEZ_AM_COMMIT_ALL_OUTPUTS_ON_DAG_SUCCESS is to allow an external system to check an intermediate vertex's output at the vertex level. Say a vertex has 2 outputs and TEZ_AM_COMMIT_ALL_OUTPUTS_ON_DAG_SUCCESS is false: if the first commit succeeds but the second commit fails, then we should abort both of them and move the vertex to the failed state.

was (Author: zjffdu):
Regarding the partial output, After second thought, I think we should only consider it as vertex basis rather than output basis. That means either one vertex's all outputs commits successfully or abort all. I think the purpose of TEZ_AM_COMMIT_ALL_OUTPUTS_ON_DAG_SUCCESS to allow external system is to check the intermediate vertex's output at vertex level. Say if one vertex has 2 outputs and TEZ_AM_COMMIT_ALL_OUTPUTS_ON_DAG_SUCCESS is false, and the first commit succeeded but the second commit fails, then we should abort both of them and mark this vertex to failed state.

> OutputCommitters should not run in the main AM dispatcher thread
> ----------------------------------------------------------------
>
>                 Key: TEZ-714
>                 URL: https://issues.apache.org/jira/browse/TEZ-714
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Siddharth Seth
>            Assignee: Jeff Zhang
>            Priority: Critical
>         Attachments: DAG_2.pdf, TEZ-714-1.patch, TEZ-714-2.patch, TEZ-714-3.patch, TEZ-714-4.patch, TEZ-714-5.patch, Vertex_2.pdf
>
>
> Follow up jira from TEZ-41.
> 1) If there are multiple OutputCommitters on a Vertex, they can be run in parallel.
> 2) Running an OutputCommitter in the main thread blocks all other event handling, w.r.t the DAG, and causes the event queue to back up.
> 3) This should also cover shared commits that happen in the DAG.
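
The vertex-level, all-or-nothing commit discussed in the comment, combined with running committers in parallel off the dispatcher thread, could look roughly like the sketch below. This is only an illustration under stated assumptions: `Committer`, `FakeOutput`, and `commitVertexAsync` are hypothetical names invented for this example and are not the Tez `OutputCommitter` API or the code in the attached patches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class VertexCommitSketch {

  /** Hypothetical stand-in for an output committer; NOT the Tez API. */
  interface Committer {
    void commit() throws Exception;
    void abort();
  }

  /** Hypothetical test double that records what happened to one output. */
  static final class FakeOutput implements Committer {
    final boolean failOnCommit;
    volatile boolean committed;
    volatile boolean aborted;

    FakeOutput(boolean failOnCommit) {
      this.failOnCommit = failOnCommit;
    }

    @Override
    public void commit() throws Exception {
      if (failOnCommit) {
        throw new Exception("commit failed");
      }
      committed = true;
    }

    @Override
    public void abort() {
      committed = false;
      aborted = true;
    }
  }

  /**
   * Submits every committer of one vertex to the pool, so the caller (for
   * example a dispatcher thread) only schedules work and does not block.
   * The returned future completes with true if all commits succeeded;
   * otherwise every output of the vertex is aborted and it completes with false.
   */
  static CompletableFuture<Boolean> commitVertexAsync(
      List<Committer> committers, ExecutorService pool) {
    List<CompletableFuture<Boolean>> results = new ArrayList<>();
    for (Committer c : committers) {
      results.add(CompletableFuture.supplyAsync(() -> {
        try {
          c.commit();
          return true;
        } catch (Exception e) {
          return false; // record the failure; the decision is made per vertex
        }
      }, pool));
    }
    return CompletableFuture.allOf(results.toArray(new CompletableFuture[0]))
        .thenApplyAsync(ignored -> {
          boolean allCommitted = results.stream().allMatch(CompletableFuture::join);
          if (!allCommitted) {
            // Vertex-level semantics: one failed commit aborts all outputs,
            // including those whose commit already succeeded.
            committers.forEach(Committer::abort);
          }
          return allCommitted;
        }, pool);
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    FakeOutput first = new FakeOutput(false);
    FakeOutput second = new FakeOutput(true);
    boolean ok = commitVertexAsync(Arrays.asList(first, second), pool).join();
    System.out.println("vertex committed: " + ok + ", first aborted: " + first.aborted);
    pool.shutdown();
  }
}
```

In the two-output case from the comment, the first commit succeeds and the second fails, so the future completes with false and both outputs are aborted. A real AM would presumably turn the future's completion into an event for the vertex state machine rather than calling `join()`.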