[ https://issues.apache.org/jira/browse/HIVE-1463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12889328#action_12889328 ]
Joydeep Sen Sarma commented on HIVE-1463:
-----------------------------------------

Thanks for the review.

1) I checked this out. Hadoop from 0.17 onwards always uses <prefix>_<jtid>_[mr]_<taskid>_<attemptid>. In 0.17 the prefix was 'task'; in 0.18 and later it was changed to 'attempt'. jt = 'local' for local mode; otherwise there's no difference between local and regular jobs. I think 0.15 was different (that is where Hive was initially started), which is why there were comments to the effect that jobs have _map_ in local mode.

One thing I can do is add tests under shim to make sure of this. If I am unable to add a test, I will at least confirm the naming under 0.17.

2) Good point! Dropping the leading prefix is not necessary, since repeated strings are factored out by HDFS now (it uses String.intern()). I can take that part out.

Will upload a modified diff.

> hive output file names are unnecessarily large
> ----------------------------------------------
>
>                 Key: HIVE-1463
>                 URL: https://issues.apache.org/jira/browse/HIVE-1463
>             Project: Hadoop Hive
>          Issue Type: Improvement
>            Reporter: Joydeep Sen Sarma
>         Attachments: hive-1463.1.patch
>
>
> Hive's output files are named like this:
> attempt_201006221843_431854_r_000000_0
> out of all of this goop - only one character '0' would have sufficed. we
> should fix this. This would help environments with namenode memory
> constraints.
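
As a rough illustration of the naming pattern described in the comment, the Python sketch below splits an attempt-style file name into its fields. The six-field layout (a job-number field between the jobtracker id and the m/r marker) is inferred from the example name in the issue, not from the Hadoop source. parse_attempt_name and short_name are hypothetical helpers, and the shortened form shown is only one possible choice, not necessarily what the attached patch does.

```python
import re

# Assumed layout, inferred from the comment and the example file name:
#   <prefix>_<jtid>_<jobnum>_<m|r>_<taskid>_<attemptid>
# prefix is 'task' (Hadoop 0.17) or 'attempt' (0.18 and later);
# jtid is 'local' in local mode.
ATTEMPT_RE = re.compile(
    r"^(?P<prefix>task|attempt)_(?P<jtid>[^_]+)_(?P<job>\d+)"
    r"_(?P<kind>[mr])_(?P<task>\d+)_(?P<attempt>\d+)$"
)


def parse_attempt_name(name):
    """Return the fields of an attempt-style name as a dict, or None."""
    match = ATTEMPT_RE.match(name)
    return match.groupdict() if match else None


def short_name(name):
    """Hypothetical shortening: keep only the task number and attempt number.

    Names that do not match the assumed layout are returned unchanged.
    """
    parts = parse_attempt_name(name)
    if parts is None:
        return name
    return "{}_{}".format(int(parts["task"]), parts["attempt"])


if __name__ == "__main__":
    example = "attempt_201006221843_431854_r_000000_0"
    print(parse_attempt_name(example))
    print(short_name(example))
```

Falling back to the original name on a non-match is a deliberately conservative choice here, given that the exact format is reported to differ in older Hadoop versions.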