[ https://issues.apache.org/jira/browse/HIVE-10213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529452#comment-14529452 ]

Sushanth Sowmyan commented on HIVE-10213:
-----------------------------------------

This patch set off some warning flags for me with regard to the traditional 
M-R use case, but only because it had been a while since I looked at this 
piece of code. The traditional M-R use case is still fine: 
DynamicPartitionFileRecordWriterContainer.close() registers an appropriate 
TaskCommitterProxy, and the commit on the OutputCommitter is called in the 
same process scope, which makes it okay. Pig-based optimizations also 
continue to be okay, since the registry singleton retains the callback in 
memory.
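
For reference, here is a minimal sketch of the handshake I mean (a 
hypothetical simplification, not the actual Hive source; only the names 
TaskCommitContextRegistry and TaskCommitterProxy and the two call sites come 
from the code discussed above):

{code}
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical simplification of the singleton registry handshake.
// Registration (from DynamicPartitionFileRecordWriterContainer.close()) and
// lookup (from FileOutputCommitterContainer.commitTask()) share one in-memory
// map, so both must happen in the same process scope for the commit to work.
public class TaskCommitContextRegistry {

    public interface TaskCommitterProxy {
        void commitTask() throws IOException;
    }

    private static final TaskCommitContextRegistry instance =
            new TaskCommitContextRegistry();

    // Keyed by task-attempt ID + output path, as in the error message below.
    private final Map<String, TaskCommitterProxy> committers = new HashMap<>();

    public static TaskCommitContextRegistry getInstance() {
        return instance;
    }

    // Called from the RecordWriter's close(), inside the task JVM.
    public synchronized void register(String key, TaskCommitterProxy proxy) {
        committers.put(key, proxy);
    }

    // Called from the OutputCommitter's commitTask(), in the same task JVM.
    public synchronized void commitTask(String key) throws IOException {
        TaskCommitterProxy proxy = committers.get(key);
        if (proxy == null) {
            throw new IOException("No callback registered for " + key);
        }
        proxy.commitTask();
    }
}
{code}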

+1. I'm okay with committing this patch as-is; tests have already run on it, 
and this section of code has not changed since then.

> MapReduce jobs using dynamic-partitioning fail on commit.
> ---------------------------------------------------------
>
>                 Key: HIVE-10213
>                 URL: https://issues.apache.org/jira/browse/HIVE-10213
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog
>            Reporter: Mithun Radhakrishnan
>            Assignee: Mithun Radhakrishnan
>         Attachments: HIVE-10213.1.patch
>
>
> I recently ran into a problem in {{TaskCommitContextRegistry}} when using 
> dynamic partitioning.
> Consider a MapReduce program that reads HCatRecords from a table (using 
> {{HCatInputFormat}}), and then writes to another table (with an identical 
> schema) using {{HCatOutputFormat}}. The map task fails with the following 
> exception:
> {code}
> Error: java.io.IOException: No callback registered for 
> TaskAttemptID:attempt_1426589008676_509707_m_000000_0@hdfs://crystalmyth.myth.net:8020/user/mithunr/mythdb/target/_DYN0.6784154320609959/grid=__HIVE_DEFAULT_PARTITION__/dt=__HIVE_DEFAULT_PARTITION__
>         at 
> org.apache.hive.hcatalog.mapreduce.TaskCommitContextRegistry.commitTask(TaskCommitContextRegistry.java:56)
>         at 
> org.apache.hive.hcatalog.mapreduce.FileOutputCommitterContainer.commitTask(FileOutputCommitterContainer.java:139)
>         at org.apache.hadoop.mapred.Task.commit(Task.java:1163)
>         at org.apache.hadoop.mapred.Task.done(Task.java:1025)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:345)
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1694)
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> {code}
> {{TaskCommitContextRegistry::commitTask()}} uses call-backs registered from 
> {{DynamicPartitionFileRecordWriter}}. But when {{HCatInputFormat}} and 
> {{HCatOutputFormat}} are both used in the same job, the 
> {{DynamicPartitionFileRecordWriter}} might only be exercised in the Reducer. 
> The map-side {{commitTask()}} then finds no registered callback and fails 
> with the IOException above.
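> Here is a sketch of that setup (the class names {{IdentityHCatMapper}} and 
> {{HCatWritingReducer}} are illustrative, not from the patch):
> {code}
> // One MR job that reads and writes through HCatalog.
> // (Uses org.apache.hive.hcatalog.mapreduce.* and org.apache.hadoop.mapreduce.Job.)
> Configuration conf = new Configuration();
> Job job = Job.getInstance(conf, "hcat-copy");
> HCatInputFormat.setInput(job, "mythdb", "source");
> job.setInputFormatClass(HCatInputFormat.class);
> job.setMapperClass(IdentityHCatMapper.class);    // pass-through; no writer here
> job.setReducerClass(HCatWritingReducer.class);   // HCatRecords written here only
> job.setOutputFormatClass(HCatOutputFormat.class);
> // A null partition-value map requests dynamic partitioning on the target.
> HCatOutputFormat.setOutput(job, OutputJobInfo.create("mythdb", "target", null));
> HCatOutputFormat.setSchema(job, HCatOutputFormat.getTableSchema(job.getConfiguration()));
> {code}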
> I'm relaxing the IOException and logging a warning message instead of 
> failing outright.
> (I'll post the fix shortly.)
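> A minimal sketch of that relaxation (hypothetical; the actual change is in 
> the attached HIVE-10213.1.patch, and {{LOG}}, {{committers}}, and {{key}} 
> here are placeholders):
> {code}
> // Inside TaskCommitContextRegistry.commitTask(): tolerate a missing callback.
> TaskCommitterProxy proxy = committers.get(key);
> if (proxy == null) {
>     // Previously: throw new IOException("No callback registered for " + key);
>     LOG.warn("No callback registered for " + key + "; nothing to commit for this task.");
>     return;
> }
> proxy.commitTask();
> {code}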



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
