[jira] Updated: (HIVE-131) insert overwrite directory leaves behind uncommitted/tmp files from failed tasks

2009-02-18 Thread Zheng Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Shao updated HIVE-131:


   Resolution: Fixed
Fix Version/s: 0.3.0, 0.2.0
 Release Note: HIVE-131. Remove uncommitted files from failed tasks. (Joydeep Sen Sarma via zshao)
 Hadoop Flags: [Reviewed]
       Status: Resolved  (was: Patch Available)

trunk: Committed revision 745709.
branch-0.2: Committed revision 745710.



 insert overwrite directory leaves behind uncommitted/tmp files from failed tasks
 

 Key: HIVE-131
 URL: https://issues.apache.org/jira/browse/HIVE-131
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Joydeep Sen Sarma
Assignee: Joydeep Sen Sarma
Priority: Critical
 Fix For: 0.2.0, 0.3.0

 Attachments: HIVE-131.patch.1, hive-131.patch.2


 _tmp files are being left behind by insert overwrite directory:
 /user/jssarma/ctst1/40422_m_000195_0.deflate  r 3 13285 2008-12-07 01:47  rw-r--r-- jssarma supergroup
 /user/jssarma/ctst1/40422_m_000196_0.deflate  r 3 3055  2008-12-07 01:46  rw-r--r-- jssarma supergroup
 /user/jssarma/ctst1/_tmp.40422_m_33_0 r 3 0 2008-12-07 01:53  rw-r--r-- jssarma supergroup
 /user/jssarma/ctst1/_tmp.40422_m_37_1 r 3 0 2008-12-07 01:53  rw-r--r-- jssarma supergroup
 This happened with speculative execution. The code looks good (in fact, many speculative tasks were launched in this case, and only a couple caused problems). It almost seems as if these files did not appear in the namespace until after the map-reduce job had finished and the movetask had listed the output dir.
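
 For illustration, the leftover files in the listing above can be told apart from committed output by name alone. The following is a minimal Python sketch, not code from the patch: the helper name uncommitted_files is hypothetical, and the rule that a "_tmp." prefix marks uncommitted task output is an assumption inferred from the file names in this report.

```python
from posixpath import basename

# Assumption: uncommitted task output carries this prefix, as in the report.
TMP_PREFIX = "_tmp."


def uncommitted_files(paths):
    """Return the paths whose file name starts with the temporary prefix."""
    return [p for p in paths if basename(p).startswith(TMP_PREFIX)]


# Paths copied from the directory listing in the issue description.
listing = [
    "/user/jssarma/ctst1/40422_m_000195_0.deflate",
    "/user/jssarma/ctst1/40422_m_000196_0.deflate",
    "/user/jssarma/ctst1/_tmp.40422_m_33_0",
    "/user/jssarma/ctst1/_tmp.40422_m_37_1",
]

for path in uncommitted_files(listing):
    print(path)
```

 Matching on the base name rather than the full path keeps a directory that happens to contain "_tmp." in its name from being flagged.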

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-131) insert overwrite directory leaves behind uncommitted/tmp files from failed tasks

2009-02-11 Thread Joydeep Sen Sarma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joydeep Sen Sarma updated HIVE-131:
---

Attachment: hive-131.patch.2

Dhruba said:

 1. I see that execute returns values 1, 2, and 3. It would be good to document what these values mean.
 2. Starting with Hadoop 0.19, it might make sense to set FileSystem.deleteOnExit() for files that are temporary.
 3. It is interesting to note that there is now an extra step, jobClose(), that is triggered on the client side after the job is complete. Prior to this patch, a job would succeed even if the client had disappeared before the job completed. This patch requires that the client remain active and healthy until the entire job is complete. This is probably OK for Hive, especially because Hive requires job chaining anyway and I do not see any other way to do it.

- Incorporated the suggestion to use deleteOnExit where available.
- Return codes are always accompanied by a corresponding message on the console/log, so I don't see much point in creating additional documentation for them.
- Hive has always depended on a client-side code path for query completion.
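
The client-side cleanup discussed in this exchange can be sketched roughly as follows. This is a hypothetical Python illustration on a local directory, not the Java code from the attached patch: the function name job_close only echoes the jobClose() step mentioned above, and the "_tmp." prefix rule is assumed from the file names in the issue description.

```python
import os
import tempfile

# Assumption: uncommitted task output carries this prefix, as in the report.
TMP_PREFIX = "_tmp."


def job_close(output_dir):
    """Client-side step run after the job ends: delete uncommitted files.

    Returns the names of the files that were removed, in sorted order.
    """
    removed = []
    for name in sorted(os.listdir(output_dir)):
        path = os.path.join(output_dir, name)
        if name.startswith(TMP_PREFIX) and os.path.isfile(path):
            os.remove(path)
            removed.append(name)
    return removed


def demo():
    """Build a throwaway output directory, clean it, and report the result."""
    with tempfile.TemporaryDirectory() as out:
        for name in ("40422_m_000195_0.deflate",
                     "_tmp.40422_m_33_0",
                     "_tmp.40422_m_37_1"):
            open(os.path.join(out, name), "w").close()
        removed = job_close(out)
        return removed, sorted(os.listdir(out))


if __name__ == "__main__":
    print(demo())
```

Because the cleanup runs in the client after the job finishes, it only happens if the client is still alive at that point, which is the trade-off raised in point 3 above.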

 insert overwrite directory leaves behind uncommitted/tmp files from failed tasks
 

 Key: HIVE-131
 URL: https://issues.apache.org/jira/browse/HIVE-131
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Joydeep Sen Sarma
Assignee: Joydeep Sen Sarma
Priority: Critical
 Attachments: HIVE-131.patch.1, hive-131.patch.2






[jira] Updated: (HIVE-131) insert overwrite directory leaves behind uncommitted/tmp files from failed tasks

2009-02-08 Thread Joydeep Sen Sarma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joydeep Sen Sarma updated HIVE-131:
---

Status: Patch Available  (was: Open)

 insert overwrite directory leaves behind uncommitted/tmp files from failed tasks
 

 Key: HIVE-131
 URL: https://issues.apache.org/jira/browse/HIVE-131
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Joydeep Sen Sarma
Assignee: Joydeep Sen Sarma
Priority: Critical
 Attachments: HIVE-131.patch.1


