[ https://issues.apache.org/jira/browse/HIVE-16295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16540730#comment-16540730 ]
Hive QA commented on HIVE-16295:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12931082/HIVE-16295.8.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14642 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12537/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12537/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12537/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12931082 - PreCommit-HIVE-Build

> Add support for using Hadoop's S3A OutputCommitter
> --------------------------------------------------
>
>                 Key: HIVE-16295
>                 URL: https://issues.apache.org/jira/browse/HIVE-16295
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>            Priority: Major
>         Attachments: HIVE-16295.1.WIP.patch, HIVE-16295.2.WIP.patch,
> HIVE-16295.3.WIP.patch, HIVE-16295.4.patch, HIVE-16295.5.patch,
> HIVE-16295.6.patch, HIVE-16295.7.patch, HIVE-16295.8.patch
>
>
> Hive doesn't have integration with Hadoop's {{OutputCommitter}}; it uses a
> {{NullOutputCommitter}} and uses its own commit logic spread across
> {{FileSinkOperator}}, {{MoveTask}}, and {{Hive}}.
> The Hadoop community is building an {{OutputCommitter}} that integrates with
> S3Guard and does a safe, coordinated commit of data on S3 inside individual
> tasks (HADOOP-13786). If Hive can integrate with this new {{OutputCommitter}}
> there would be a lot of benefits to Hive-on-S3:
> * Data is only written once; directly committing data at a task level means
> no renames are necessary
> * The commit is done safely, in a coordinated manner; duplicate tasks (from
> task retries or speculative execution) should not step on each other

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)