[jira] [Commented] (HIVE-17113) Duplicate bucket files can get written to table by runaway task

Jason Dere (JIRA) Mon, 17 Jul 2017 15:14:01 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-17113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16090666#comment-16090666
 ]


Jason Dere commented on HIVE-17113:
-----------------------------------

Talked to [~ashutoshc] and [~sseth] about this. According to Sid, this is 
normally handled in MR using the OutputCommitter. However, Ashutosh mentioned 
that Hive does not use the Hadoop OutputCommitter functionality and instead 
tries to handle duplicate task attempts by itself, hence the call to 
Utilities.removeTempOrDuplicateFiles().

A couple of solutions to this on the Hive side:
1) Changing Hive to properly use the OutputCommitter
2) Utilities.mvFileToFinalPath() should call 
Utilities.removeTempOrDuplicateFiles() after renaming the temp directory rather 
than before renaming. This is basically swapping the order of steps 6 and 8 in 
the Jira description, within Utilities.mvFileToFinalPath().
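
To illustrate why the ordering in option 2 matters, here is a minimal sketch 
on the local filesystem using java.nio.file. It is not Hive code: the class 
CommitOrderSketch, the helpers removeDuplicates(), lateWrite() and commit(), 
the directory names "_tmp" and "final", and the file names 000000_0 and 
000000_1 are all made up for illustration, and the dedupe rule (keep one file 
per prefix before the first underscore) is a simplification. How HDFS treats a 
late writer whose temp directory has been renamed is not modeled here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CommitOrderSketch {

    // Hypothetical stand-in for duplicate removal: keep the first file seen
    // (in sorted order) for each bucket prefix and delete the others.
    static void removeDuplicates(Path dir) throws IOException {
        List<Path> files;
        try (Stream<Path> s = Files.list(dir)) {
            files = s.sorted().collect(Collectors.toList());
        }
        Set<String> seenBuckets = new HashSet<>();
        for (Path f : files) {
            String bucket = f.getFileName().toString().split("_")[0];
            if (!seenBuckets.add(bucket)) {
                Files.delete(f);
            }
        }
    }

    // Simulates the runaway attempt A_0 writing its file late. If the temp
    // directory has already been renamed away, createFile throws and the
    // write is simply lost.
    static void lateWrite(Path tmpDir, String name) {
        try {
            Files.createFile(tmpDir.resolve(name));
        } catch (IOException e) {
            // temp directory no longer exists; nothing reaches the final dir
        }
    }

    // Runs one simulated commit and returns the number of files that end up
    // in the final directory.
    static long commit(Path base, boolean dedupeAfterRename) throws IOException {
        Path tmp = Files.createDirectory(base.resolve("_tmp"));
        Path fin = base.resolve("final");
        Files.createFile(tmp.resolve("000000_1")); // output of attempt A_1

        if (dedupeAfterRename) {
            Files.move(tmp, fin);          // rename first
            lateWrite(tmp, "000000_0");    // A_0 finishes late
            removeDuplicates(fin);         // then check for duplicates
        } else {
            removeDuplicates(tmp);         // check for duplicates first
            lateWrite(tmp, "000000_0");    // A_0 finishes late
            Files.move(tmp, fin);          // rename carries the duplicate along
        }
        try (Stream<Path> s = Files.list(fin)) {
            return s.count();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("dedupe before rename: "
                + commit(Files.createTempDirectory("before"), false) + " file(s)");
        System.out.println("dedupe after rename:  "
                + commit(Files.createTempDirectory("after"), true) + " file(s)");
    }
}
```

In this sketch, checking before the rename leaves a window in which the late 
file lands in the temp directory and is carried into the final location. 
Checking after the rename closes that window: a late file either misses the 
temp directory entirely or has already been moved and is caught by the check.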

Gonna try to do option 2 as it looks like a simpler fix.

> Duplicate bucket files can get written to table by runaway task
> ---------------------------------------------------------------
>
>                 Key: HIVE-17113
>                 URL: https://issues.apache.org/jira/browse/HIVE-17113
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Jason Dere
>            Assignee: Jason Dere
>
> Saw a table get a duplicate bucket file from a Hive query. It looks like the 
> following happened:
> 1. Task attempt A_0 starts, but then stops making progress
> 2. The job was running with speculative execution on, and task attempt A_1 is 
> started
> 3. Task attempt A_1 finishes execution and saves its output to the temp 
> directory.
> 5. A task kill is sent to A_0, though this does not appear to actually kill A_0
> 6. The job for the query finishes and Utilities.mvFileToFinalPath() calls 
> Utilities.removeTempOrDuplicateFiles() to check for duplicate bucket files
> 7. A_0 (still running) finally finishes and saves its file to the temp 
> directory. At this point we now have duplicate bucket files - oops!
> 8. Utilities.mvFileToFinalPath() moves the temp directory to the 
> final location, where it is later moved to the partition directory.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
