[jira] Updated: (HIVE-131) insert overwrite directory leaves behind uncommitted/tmp files from failed tasks
[ https://issues.apache.org/jira/browse/HIVE-131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-131:

       Resolution: Fixed
    Fix Version/s: 0.3.0
                   0.2.0
     Release Note: HIVE-131. Remove uncommitted files from failed tasks. (Joydeep Sen Sarma via zshao)
     Hadoop Flags: [Reviewed]
           Status: Resolved  (was: Patch Available)

trunk: Committed revision 745709.
branch-0.2: Committed revision 745710.

insert overwrite directory leaves behind uncommitted/tmp files from failed tasks

        Key: HIVE-131
        URL: https://issues.apache.org/jira/browse/HIVE-131
    Project: Hadoop Hive
 Issue Type: Bug
 Components: Query Processor
   Reporter: Joydeep Sen Sarma
   Assignee: Joydeep Sen Sarma
   Priority: Critical
    Fix For: 0.2.0, 0.3.0
Attachments: HIVE-131.patch.1, hive-131.patch.2

_tmp files are getting left behind on insert overwrite directory:

/user/jssarma/ctst1/40422_m_000195_0.deflate  r  3  13285  2008-12-07 01:47  rw-r--r--  jssarma  supergroup
/user/jssarma/ctst1/40422_m_000196_0.deflate  r  3  3055   2008-12-07 01:46  rw-r--r--  jssarma  supergroup
/user/jssarma/ctst1/_tmp.40422_m_33_0         r  3  0      2008-12-07 01:53  rw-r--r--  jssarma  supergroup
/user/jssarma/ctst1/_tmp.40422_m_37_1         r  3  0      2008-12-07 01:53  rw-r--r--  jssarma  supergroup

this happened with speculative execution. the code looks good (in fact in this case many speculative tasks were launched - and only a couple caused problems). Almost seems like these files did not appear in the namespace until after the map-reduce job finished and the movetask did a listing of the output dir ..

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-131) insert overwrite directory leaves behind uncommitted/tmp files from failed tasks
[ https://issues.apache.org/jira/browse/HIVE-131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joydeep Sen Sarma updated HIVE-131:
---

    Attachment: hive-131.patch.2

Dhruba said:

1. I see that execute returns values 1, 2, and 3. It would be good to document what these values mean.
2. Starting with Hadoop 0.19, it might make sense to set FileSystem.deleteOnExit() for files that are temporary.
3. It is interesting to note that there is now an extra step, jobClose(), that gets triggered on the client side after the job is complete. Prior to this patch, a job would be successful even if the client side had disappeared before the job completed. This patch requires that the client remain active and healthy until the entire job is complete. This is probably OK for Hive, especially because Hive requires job chaining anyway and I do not see any other way to do it.

- Incorporated the suggestion to use deleteOnExit where available.
- Return codes are always accompanied by a corresponding message on the console/log, so I don't see much point in creating additional documentation around them.
- Hive has always depended on a client-side code path for query completion.
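The client-side cleanup discussed above (a jobClose() step that runs after the job and removes uncommitted files before the output directory is listed) can be sketched roughly as follows. This is only an illustration on a local directory, not the code from the attached patches: the helper names commit_task_output, job_close and run_demo are hypothetical, and the only detail taken from the report is the "_tmp." prefix on the leftover files.

```python
import os
import tempfile

# Prefix seen on the leftover files in the report; assumed here to mark
# output that a task attempt never committed.
TMP_PREFIX = "_tmp."


def commit_task_output(output_dir, task_id):
    """Promote a task attempt's temporary file to its final name (hypothetical helper)."""
    tmp_path = os.path.join(output_dir, TMP_PREFIX + task_id)
    final_path = os.path.join(output_dir, task_id)
    os.rename(tmp_path, final_path)
    return final_path


def job_close(output_dir):
    """Client-side cleanup after the job: delete files that were never committed.

    Returns the sorted names of the files that were removed.
    """
    removed = []
    for name in sorted(os.listdir(output_dir)):
        path = os.path.join(output_dir, name)
        if name.startswith(TMP_PREFIX) and os.path.isfile(path):
            os.remove(path)
            removed.append(name)
    return removed


def run_demo():
    """Simulate two committed attempts and two abandoned speculative attempts."""
    with tempfile.TemporaryDirectory() as output_dir:
        task_ids = ("40422_m_000195_0", "40422_m_000196_0",
                    "40422_m_33_0", "40422_m_37_1")
        # Every attempt starts by writing to a _tmp. file.
        for task_id in task_ids:
            with open(os.path.join(output_dir, TMP_PREFIX + task_id), "w") as f:
                f.write("")
        # Only the first two attempts commit; the others stand in for failed tasks.
        commit_task_output(output_dir, "40422_m_000195_0")
        commit_task_output(output_dir, "40422_m_000196_0")
        # Cleanup runs once, on the client, after the job has finished.
        removed = job_close(output_dir)
        remaining = sorted(os.listdir(output_dir))
        return removed, remaining


if __name__ == "__main__":
    removed, remaining = run_demo()
    print("removed:", removed)
    print("remaining:", remaining)
```

Running the cleanup on the client is what makes the client's liveness matter, as noted in point 3: if the client goes away before job_close runs, the _tmp. files stay in the output directory.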
[jira] Updated: (HIVE-131) insert overwrite directory leaves behind uncommitted/tmp files from failed tasks
[ https://issues.apache.org/jira/browse/HIVE-131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joydeep Sen Sarma updated HIVE-131:
---

    Status: Patch Available  (was: Open)