[jira] Updated: (HADOOP-2596) add SequenceFile.createWriter() method that takes block size as parameter
[ https://issues.apache.org/jira/browse/HADOOP-2596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2596: -- Status: Patch Available (was: Open) add SequenceFile.createWriter() method that takes block size as parameter - Key: HADOOP-2596 URL: https://issues.apache.org/jira/browse/HADOOP-2596 Project: Hadoop Issue Type: Improvement Components: io Environment: all Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Priority: Minor Fix For: 0.16.0 Attachments: patch2596.txt Currently it is not possible to create a SequenceFile.Writer using a block size other than the default. An overloaded createWriter() method with a signature that receives the block size as a parameter should be added to the SequenceFile class. With all the current signatures for this method there is significant code duplication; if possible, the createWriter() methods should be refactored to avoid such duplication. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
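The duplication concern is typically addressed by having every convenience overload delegate to one fully-parameterized method. A minimal sketch of that pattern, using plain-Java stand-ins rather than the real SequenceFile/FileSystem types (class name, default values, and the String return are illustrative assumptions, not Hadoop's API):

```java
// Hypothetical, simplified stand-in for SequenceFile.createWriter():
// every convenience overload funnels into one fully-parameterized method,
// so adding a blockSize variant does not duplicate construction logic.
public class WriterFactory {
    static final long DEFAULT_BLOCK_SIZE = 64L * 1024 * 1024; // assumed 64 MB default
    static final int DEFAULT_BUFFER_SIZE = 4096;              // assumed default

    public static String createWriter(String path) {
        return createWriter(path, DEFAULT_BUFFER_SIZE, DEFAULT_BLOCK_SIZE);
    }

    // The new overload proposed in the issue: caller-chosen block size,
    // same delegation target as every other signature.
    public static String createWriter(String path, long blockSize) {
        return createWriter(path, DEFAULT_BUFFER_SIZE, blockSize);
    }

    // The single fully-parameterized method all overloads delegate to.
    public static String createWriter(String path, int bufferSize, long blockSize) {
        return path + " bufferSize=" + bufferSize + " blockSize=" + blockSize;
    }
}
```

With this shape, the refactoring asked for in the description falls out naturally: each existing signature becomes a one-line delegation.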
[jira] Updated: (HADOOP-2596) add SequenceFile.createWriter() method that takes block size as parameter
[ https://issues.apache.org/jira/browse/HADOOP-2596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2596: -- Status: Open (was: Patch Available) Re-submitting to hudson since an unrelated test failed... add SequenceFile.createWriter() method that takes block size as parameter - Key: HADOOP-2596 URL: https://issues.apache.org/jira/browse/HADOOP-2596 Project: Hadoop Issue Type: Improvement Components: io Environment: all Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Priority: Minor Fix For: 0.16.0 Attachments: patch2596.txt Currently it is not possible to create a SequenceFile.Writer using a block size other than the default. An overloaded createWriter() method with a signature that receives the block size as a parameter should be added to the SequenceFile class. With all the current signatures for this method there is significant code duplication; if possible, the createWriter() methods should be refactored to avoid such duplication. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2402) Lzo compression compresses each write from TextOutputFormat
[ https://issues.apache.org/jira/browse/HADOOP-2402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2402: -- Status: Open (was: Patch Available) Mostly looks ok, but there are too many unrelated white-space changes - hence, I'm cancelling this patch. Lzo compression compresses each write from TextOutputFormat --- Key: HADOOP-2402 URL: https://issues.apache.org/jira/browse/HADOOP-2402 Project: Hadoop Issue Type: Bug Components: io, mapred, native Reporter: Chris Douglas Assignee: Chris Douglas Fix For: 0.16.0 Attachments: 2402-0.patch, 2402-1.patch Outputting with TextOutputFormat and Lzo compression generates a file such that each key, tab delimiter, and value are compressed separately. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2574) bugs in mapred tutorial
[ https://issues.apache.org/jira/browse/HADOOP-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2574: -- Attachment: mapred_tutorial.html HADOOP-2574_1_20080114.patch Updated to incorporate Phu's original ask and Amar's feedback... again, I've attached the generated mapred_tutorial.html for folks to review it without having to figure out forrest. bugs in mapred tutorial --- Key: HADOOP-2574 URL: https://issues.apache.org/jira/browse/HADOOP-2574 Project: Hadoop Issue Type: Bug Components: documentation Reporter: Doug Cutting Assignee: Arun C Murthy Fix For: 0.15.3 Attachments: HADOOP-2574_0_20080110.patch, HADOOP-2574_1_20080114.patch, mapred_tutorial.html, mapred_tutorial.html Sam Pullara sends me: {noformat} Phu was going through the WordCount example... lines 52 and 53 should have args[0] and args[1]: http://lucene.apache.org/hadoop/docs/current/mapred_tutorial.html The javac and jar command are also wrong, they don't include the directories for the packages, should be: $ javac -classpath ${HADOOP_HOME}/hadoop-${HADOOP_VERSION}-core.jar -d classes WordCount.java $ jar -cvf /usr/joe/wordcount.jar WordCount.class -C classes . {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2574) bugs in mapred tutorial
[ https://issues.apache.org/jira/browse/HADOOP-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2574: -- Fix Version/s: (was: 0.16.0) Status: Patch Available (was: Open) bugs in mapred tutorial --- Key: HADOOP-2574 URL: https://issues.apache.org/jira/browse/HADOOP-2574 Project: Hadoop Issue Type: Bug Components: documentation Reporter: Doug Cutting Assignee: Arun C Murthy Fix For: 0.15.3 Attachments: HADOOP-2574_0_20080110.patch, HADOOP-2574_1_20080114.patch, mapred_tutorial.html, mapred_tutorial.html Sam Pullara sends me: {noformat} Phu was going through the WordCount example... lines 52 and 53 should have args[0] and args[1]: http://lucene.apache.org/hadoop/docs/current/mapred_tutorial.html The javac and jar command are also wrong, they don't include the directories for the packages, should be: $ javac -classpath ${HADOOP_HOME}/hadoop-${HADOOP_VERSION}-core.jar -d classes WordCount.java $ jar -cvf /usr/joe/wordcount.jar WordCount.class -C classes . {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-2574) bugs in mapred tutorial
[ https://issues.apache.org/jira/browse/HADOOP-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558569#action_12558569 ] Arun C Murthy commented on HADOOP-2574: --- Phu - does this patch (http://issues.apache.org/jira/secure/attachment/12373083/mapred_tutorial.html) address your concerns? bugs in mapred tutorial --- Key: HADOOP-2574 URL: https://issues.apache.org/jira/browse/HADOOP-2574 Project: Hadoop Issue Type: Bug Components: documentation Reporter: Doug Cutting Assignee: Arun C Murthy Fix For: 0.15.3 Attachments: HADOOP-2574_0_20080110.patch, HADOOP-2574_1_20080114.patch, mapred_tutorial.html, mapred_tutorial.html Sam Pullara sends me: {noformat} Phu was going through the WordCount example... lines 52 and 53 should have args[0] and args[1]: http://lucene.apache.org/hadoop/docs/current/mapred_tutorial.html The javac and jar command are also wrong, they don't include the directories for the packages, should be: $ javac -classpath ${HADOOP_HOME}/hadoop-${HADOOP_VERSION}-core.jar -d classes WordCount.java $ jar -cvf /usr/joe/wordcount.jar WordCount.class -C classes . {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-2516) HADOOP-1819 removed a public api JobTracker.getTracker in 0.15.0
[ https://issues.apache.org/jira/browse/HADOOP-2516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558574#action_12558574 ] Arun C Murthy commented on HADOOP-2516: --- I'd still go with Owen's comment: mark it as won't fix and then move HADOOP-1819 to the *INCOMPATIBLE CHANGES* section... HADOOP-1819 removed a public api JobTracker.getTracker in 0.15.0 Key: HADOOP-2516 URL: https://issues.apache.org/jira/browse/HADOOP-2516 Project: Hadoop Issue Type: Bug Affects Versions: 0.15.1 Reporter: Arun C Murthy Assignee: Arun C Murthy Fix For: 0.15.3 HADOOP-1819 removed a 0.14.0 public api {{JobTracker.getTracker}} in 0.15.0. http://svn.apache.org/viewvc?view=rev&revision=575438 and http://svn.apache.org/viewvc/lucene/hadoop/branches/branch-0.15/src/java/org/apache/hadoop/mapred/JobTracker.java?r1=573708&r2=575438&diff_format=h There is a simple work-around i.e. use the return value of {{JobTracker.startTracker}} ... yet, is this a 0.15.2 blocker? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2077) Logging version number (and compiled date) at STARTUP_MSG
[ https://issues.apache.org/jira/browse/HADOOP-2077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2077: -- Attachment: HADOOP-2077_1_20080114.patch Slight modification to the log-msg: {noformat} 08/01/14 17:33:25 INFO dfs.NameNode: STARTUP_MSG: / STARTUP_MSG: Starting NameNode STARTUP_MSG: host = neo/127.0.0.1 STARTUP_MSG: args = [-format] STARTUP_MSG: version = 0.16.0-dev STARTUP_MSG: build = http://svn.apache.org/repos/asf/lucene/hadoop/trunk -r 611760; compiled by 'arun' on Mon Jan 14 17:33:13 IST 2008 / {noformat} Logging version number (and compiled date) at STARTUP_MSG --- Key: HADOOP-2077 URL: https://issues.apache.org/jira/browse/HADOOP-2077 Project: Hadoop Issue Type: Improvement Components: dfs, mapred Reporter: Koji Noguchi Assignee: Arun C Murthy Priority: Trivial Fix For: 0.16.0 Attachments: HADOOP-2077_0_20080110.patch, HADOOP-2077_0_20080110.patch, HADOOP-2077_1_20080114.patch This will help us figure out which version of hadoop we were running when looking back the logs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
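The banner in the log excerpt above is just a multi-line string assembled from host, args, version, and build information. A rough sketch of how such a STARTUP_MSG block could be assembled (plain Java, hypothetical helper class; not the actual Hadoop implementation):

```java
// Hypothetical helper that builds a STARTUP_MSG banner similar to the one
// shown in the log excerpt; all inputs are supplied by the caller.
public class StartupBanner {
    public static String build(String service, String host, String args,
                               String version, String build) {
        String stars = "************************************************************";
        StringBuilder sb = new StringBuilder();
        sb.append("STARTUP_MSG: \n/").append(stars).append('\n');
        sb.append("STARTUP_MSG:   Starting ").append(service).append('\n');
        sb.append("STARTUP_MSG:   host = ").append(host).append('\n');
        sb.append("STARTUP_MSG:   args = ").append(args).append('\n');
        sb.append("STARTUP_MSG:   version = ").append(version).append('\n');
        sb.append("STARTUP_MSG:   build = ").append(build).append('\n');
        sb.append(stars).append('/');
        return sb.toString();
    }
}
```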
[jira] Updated: (HADOOP-2077) Logging version number (and compiled date) at STARTUP_MSG
[ https://issues.apache.org/jira/browse/HADOOP-2077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2077: -- Resolution: Fixed Status: Resolved (was: Patch Available) I just committed this. Logging version number (and compiled date) at STARTUP_MSG --- Key: HADOOP-2077 URL: https://issues.apache.org/jira/browse/HADOOP-2077 Project: Hadoop Issue Type: Improvement Components: dfs, mapred Reporter: Koji Noguchi Assignee: Arun C Murthy Priority: Trivial Fix For: 0.16.0 Attachments: HADOOP-2077_0_20080110.patch, HADOOP-2077_0_20080110.patch, HADOOP-2077_1_20080114.patch This will help us figure out which version of hadoop we were running when looking back the logs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2574) bugs in mapred tutorial
[ https://issues.apache.org/jira/browse/HADOOP-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2574: -- Resolution: Fixed Status: Resolved (was: Patch Available) I just committed this. bugs in mapred tutorial --- Key: HADOOP-2574 URL: https://issues.apache.org/jira/browse/HADOOP-2574 Project: Hadoop Issue Type: Bug Components: documentation Reporter: Doug Cutting Assignee: Arun C Murthy Fix For: 0.15.3 Attachments: HADOOP-2574_0_20080110.patch, HADOOP-2574_1_20080114.patch, mapred_tutorial.html, mapred_tutorial.html Sam Pullara sends me: {noformat} Phu was going through the WordCount example... lines 52 and 53 should have args[0] and args[1]: http://lucene.apache.org/hadoop/docs/current/mapred_tutorial.html The javac and jar command are also wrong, they don't include the directories for the packages, should be: $ javac -classpath ${HADOOP_HOME}/hadoop-${HADOOP_VERSION}-core.jar -d classes WordCount.java $ jar -cvf /usr/joe/wordcount.jar WordCount.class -C classes . {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-2574) bugs in mapred tutorial
[ https://issues.apache.org/jira/browse/HADOOP-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558917#action_12558917 ] Arun C Murthy commented on HADOOP-2574: --- I've clarified in the tutorial that WordCount v1 works with local, pseudo-distributed and fully-distributed modes while v2 needs HDFS to be up and running (pseudo-distributed or fully-distributed) - primarily due to the usage of the DistributedCache. Works? bugs in mapred tutorial --- Key: HADOOP-2574 URL: https://issues.apache.org/jira/browse/HADOOP-2574 Project: Hadoop Issue Type: Bug Components: documentation Reporter: Doug Cutting Assignee: Arun C Murthy Fix For: 0.15.3 Attachments: HADOOP-2574_0_20080110.patch, HADOOP-2574_1_20080114.patch, mapred_tutorial.html, mapred_tutorial.html Sam Pullara sends me: {noformat} Phu was going through the WordCount example... lines 52 and 53 should have args[0] and args[1]: http://lucene.apache.org/hadoop/docs/current/mapred_tutorial.html The javac and jar command are also wrong, they don't include the directories for the packages, should be: $ javac -classpath ${HADOOP_HOME}/hadoop-${HADOOP_VERSION}-core.jar -d classes WordCount.java $ jar -cvf /usr/joe/wordcount.jar WordCount.class -C classes . {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2574) bugs in mapred tutorial
[ https://issues.apache.org/jira/browse/HADOOP-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2574: -- Status: Open (was: Patch Available) Uh, I missed: bq. The quickstart tutorial does not make it clear which examples work under which scenarios (Stand alone, Pseudo-Distributed, or Fully-Distributed). bugs in mapred tutorial --- Key: HADOOP-2574 URL: https://issues.apache.org/jira/browse/HADOOP-2574 Project: Hadoop Issue Type: Bug Components: documentation Reporter: Doug Cutting Assignee: Arun C Murthy Fix For: 0.15.3, 0.16.0 Attachments: HADOOP-2574_0_20080110.patch, mapred_tutorial.html Sam Pullara sends me: {noformat} Phu was going through the WordCount example... lines 52 and 53 should have args[0] and args[1]: http://lucene.apache.org/hadoop/docs/current/mapred_tutorial.html The javac and jar command are also wrong, they don't include the directories for the packages, should be: $ javac -classpath ${HADOOP_HOME}/hadoop-${HADOOP_VERSION}-core.jar -d classes WordCount.java $ jar -cvf /usr/joe/wordcount.jar WordCount.class -C classes . {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2570) streaming jobs fail after HADOOP-2227
[ https://issues.apache.org/jira/browse/HADOOP-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2570: -- Status: Open (was: Patch Available) Re-trying hudson... streaming jobs fail after HADOOP-2227 - Key: HADOOP-2570 URL: https://issues.apache.org/jira/browse/HADOOP-2570 Project: Hadoop Issue Type: Bug Components: contrib/streaming Affects Versions: 0.15.2 Reporter: lohit vijayarenu Assignee: Amareshwari Sri Ramadasu Priority: Blocker Fix For: 0.15.3 Attachments: HADOOP-2570_1_20080112.patch, patch-2570.txt HADOOP-2227 changes jobCacheDir. In streaming, jobCacheDir was constructed like this {code} File jobCacheDir = new File(currentDir.getParentFile().getParent(), "work"); {code} We should change this to get it working. Referring to the changes made in HADOOP-2227, I see that the APIs used in there to construct the path are not public. And hard coding the path in streaming does not look good. Thoughts? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2570) streaming jobs fail after HADOOP-2227
[ https://issues.apache.org/jira/browse/HADOOP-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2570: -- Status: Patch Available (was: Open) streaming jobs fail after HADOOP-2227 - Key: HADOOP-2570 URL: https://issues.apache.org/jira/browse/HADOOP-2570 Project: Hadoop Issue Type: Bug Components: contrib/streaming Affects Versions: 0.15.2 Reporter: lohit vijayarenu Assignee: Amareshwari Sri Ramadasu Priority: Blocker Fix For: 0.15.3 Attachments: HADOOP-2570_1_20080112.patch, patch-2570.txt HADOOP-2227 changes jobCacheDir. In streaming, jobCacheDir was constructed like this {code} File jobCacheDir = new File(currentDir.getParentFile().getParent(), "work"); {code} We should change this to get it working. Referring to the changes made in HADOOP-2227, I see that the APIs used in there to construct the path are not public. And hard coding the path in streaming does not look good. Thoughts? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1876) Persisting completed jobs status
[ https://issues.apache.org/jira/browse/HADOOP-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-1876: -- Resolution: Fixed Status: Resolved (was: Patch Available) I just committed this. Thanks, Alejandro! Persisting completed jobs status Key: HADOOP-1876 URL: https://issues.apache.org/jira/browse/HADOOP-1876 Project: Hadoop Issue Type: Improvement Components: mapred Environment: all Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Priority: Critical Fix For: 0.16.0 Attachments: patch1876.txt, patch1876.txt, patch1876.txt Currently the JobTracker keeps information about completed jobs in memory. This information is flushed from the cache when it has outlived the retire interval (#RETIRE_JOB_INTERVAL) or because the limit of completed jobs in memory has been reached (#MAX_COMPLETE_USER_JOBS_IN_MEMORY). Also, if the JobTracker is restarted (due to being recycled or due to a crash) information about completed jobs is lost. If any of the above scenarios happens before the job information is queried by a hadoop client (normally the job submitter or a monitoring component) there is no way to obtain such information. A way to avoid this is for the JobTracker to persist in DFS the completed jobs information upon job completion. This would be done at the time the job is moved to the completed jobs queue. Then when querying the JobTracker for information about a completed job, if it is not found in the memory queue, a lookup in DFS would be done to retrieve the completed job information. A directory in DFS (under mapred/system) would be used to persist completed job information; for each completed job there would be a directory with the job ID, and within that directory all the information about the job: status, jobprofile, counters and completion events. A configuration property will indicate for how long persisted job information should be kept in DFS. After such period it will be cleaned up automatically. 
This improvement would not introduce API changes. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
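The memory-first, DFS-fallback lookup described in the issue can be sketched with the local filesystem standing in for DFS (the class name, directory layout, and "status" file name below are illustrative assumptions, not Hadoop's actual implementation):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Sketch: persist completed-job status under <base>/<jobId>/ when the job
// moves to the completed queue; serve queries from memory first and fall
// back to the persisted copy when the cache was flushed or the tracker
// restarted.
public class CompletedJobStore {
    private final Path baseDir;
    private final Map<String, String> inMemory = new HashMap<>();

    public CompletedJobStore(Path baseDir) { this.baseDir = baseDir; }

    // Called at job completion time: cache in memory and persist to "DFS".
    public void jobCompleted(String jobId, String status) throws IOException {
        inMemory.put(jobId, status);
        Path jobDir = baseDir.resolve(jobId);
        Files.createDirectories(jobDir);
        Files.write(jobDir.resolve("status"), status.getBytes(StandardCharsets.UTF_8));
    }

    // Simulates the cache flush (retire interval or max-jobs limit reached).
    public void retire(String jobId) { inMemory.remove(jobId); }

    // Memory first; on a miss, look the job up on disk.
    public String getStatus(String jobId) throws IOException {
        String cached = inMemory.get(jobId);
        if (cached != null) return cached;
        Path f = baseDir.resolve(jobId).resolve("status");
        return Files.exists(f)
            ? new String(Files.readAllBytes(f), StandardCharsets.UTF_8)
            : null;
    }
}
```

The retention property mentioned in the description would map to a periodic sweep deleting job directories older than the configured age; that part is omitted here.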
[jira] Commented: (HADOOP-2570) streaming jobs fail after HADOOP-2227
[ https://issues.apache.org/jira/browse/HADOOP-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558080#action_12558080 ] Arun C Murthy commented on HADOOP-2570: --- All tests fail with: {noformat} 2008-01-11 17:35:53,433 INFO mapred.TaskTracker (TaskTracker.java:launchTaskForJob(703)) - org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_20080735_0001/work in any of the configured local directories at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:359) at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138) at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.localizeTask(TaskTracker.java:1395) at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.launchTask(TaskTracker.java:1469) at org.apache.hadoop.mapred.TaskTracker.launchTaskForJob(TaskTracker.java:693) at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:686) at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1279) at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:920) at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1315) at org.apache.hadoop.mapred.MiniMRCluster$TaskTrackerRunner.run(MiniMRCluster.java:144) at java.lang.Thread.run(Thread.java:595) {noformat} The problem is that LocalDirAllocator.getLocalPathToRead throws an exception when the path is not found - this patch should handle that exception and go ahead to create the symlink... streaming jobs fail after HADOOP-2227 - Key: HADOOP-2570 URL: https://issues.apache.org/jira/browse/HADOOP-2570 Project: Hadoop Issue Type: Bug Components: contrib/streaming Affects Versions: 0.15.2 Reporter: lohit vijayarenu Assignee: Amareshwari Sri Ramadasu Priority: Blocker Fix For: 0.15.3 Attachments: patch-2570.txt HADOOP-2227 changes jobCacheDir. 
In streaming, jobCacheDir was constructed like this {code} File jobCacheDir = new File(currentDir.getParentFile().getParent(), "work"); {code} We should change this to get it working. Referring to the changes made in HADOOP-2227, I see that the APIs used in there to construct the path are not public. And hard coding the path in streaming does not look good. Thoughts? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
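The suggested handling, catch the lookup failure and then create the directory before making the symlink, can be sketched with java.nio standing in for the Hadoop classes (the helper below mimics, rather than reuses, LocalDirAllocator's throw-on-missing behaviour; all names are illustrative):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;

// Sketch of the proposed fix: if the allocator-style lookup throws because
// the work directory does not exist yet, create it and go ahead with the
// symlink instead of failing the task launch.
public class WorkDirSymlink {
    // Mimics LocalDirAllocator.getLocalPathToRead: throws when missing.
    static Path getLocalPathToRead(Path p) throws IOException {
        if (!Files.exists(p)) throw new IOException("Could not find " + p);
        return p;
    }

    public static Path ensureWorkDirLink(Path workDir, Path link) throws IOException {
        Path target;
        try {
            target = getLocalPathToRead(workDir);
        } catch (IOException notFound) {
            // Path not known yet: create the work directory and continue.
            target = Files.createDirectories(workDir);
        }
        if (!Files.exists(link, LinkOption.NOFOLLOW_LINKS)) {
            Files.createSymbolicLink(link, target);
        }
        return link;
    }
}
```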
[jira] Updated: (HADOOP-2574) bugs in mapred tutorial
[ https://issues.apache.org/jira/browse/HADOOP-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2574: -- Attachment: HADOOP-2574_0_20080110.patch Here is a patch which addresses most of Phu's concerns... bugs in mapred tutorial --- Key: HADOOP-2574 URL: https://issues.apache.org/jira/browse/HADOOP-2574 Project: Hadoop Issue Type: Bug Components: documentation Reporter: Doug Cutting Assignee: Arun C Murthy Fix For: 0.15.3, 0.16.0 Attachments: HADOOP-2574_0_20080110.patch Sam Pullara sends me: {noformat} Phu was going through the WordCount example... lines 52 and 53 should have args[0] and args[1]: http://lucene.apache.org/hadoop/docs/current/mapred_tutorial.html The javac and jar command are also wrong, they don't include the directories for the packages, should be: $ javac -classpath ${HADOOP_HOME}/hadoop-${HADOOP_VERSION}-core.jar -d classes WordCount.java $ jar -cvf /usr/joe/wordcount.jar WordCount.class -C classes . {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HADOOP-2574) bugs in mapred tutorial
[ https://issues.apache.org/jira/browse/HADOOP-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy reassigned HADOOP-2574: - Assignee: Arun C Murthy bugs in mapred tutorial --- Key: HADOOP-2574 URL: https://issues.apache.org/jira/browse/HADOOP-2574 Project: Hadoop Issue Type: Bug Components: documentation Reporter: Doug Cutting Assignee: Arun C Murthy Fix For: 0.15.3, 0.16.0 Attachments: HADOOP-2574_0_20080110.patch Sam Pullara sends me: {noformat} Phu was going through the WordCount example... lines 52 and 53 should have args[0] and args[1]: http://lucene.apache.org/hadoop/docs/current/mapred_tutorial.html The javac and jar command are also wrong, they don't include the directories for the packages, should be: $ javac -classpath ${HADOOP_HOME}/hadoop-${HADOOP_VERSION}-core.jar -d classes WordCount.java $ jar -cvf /usr/joe/wordcount.jar WordCount.class -C classes . {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2574) bugs in mapred tutorial
[ https://issues.apache.org/jira/browse/HADOOP-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2574: -- Attachment: mapred_tutorial.html Here is how the tutorial looks with this patch... bugs in mapred tutorial --- Key: HADOOP-2574 URL: https://issues.apache.org/jira/browse/HADOOP-2574 Project: Hadoop Issue Type: Bug Components: documentation Reporter: Doug Cutting Assignee: Arun C Murthy Fix For: 0.15.3, 0.16.0 Attachments: HADOOP-2574_0_20080110.patch, mapred_tutorial.html Sam Pullara sends me: {noformat} Phu was going through the WordCount example... lines 52 and 53 should have args[0] and args[1]: http://lucene.apache.org/hadoop/docs/current/mapred_tutorial.html The javac and jar command are also wrong, they don't include the directories for the packages, should be: $ javac -classpath ${HADOOP_HOME}/hadoop-${HADOOP_VERSION}-core.jar -d classes WordCount.java $ jar -cvf /usr/joe/wordcount.jar WordCount.class -C classes . {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-2570) streaming jobs fail after HADOOP-2227
[ https://issues.apache.org/jira/browse/HADOOP-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558087#action_12558087 ] Arun C Murthy commented on HADOOP-2570: --- Sigh, this exception seems to stem from the fact that the LocalDirAllocator is not used to create the *taskTracker/jobcache/jobid/work* directory at all. It is always created in the same partition as the *taskTracker/jobcache/jobid/* directory. This means LocalDirAllocator doesn't know about the *taskTracker/jobcache/jobid/work* directory at all and hence the DiskErrorException. streaming jobs fail after HADOOP-2227 - Key: HADOOP-2570 URL: https://issues.apache.org/jira/browse/HADOOP-2570 Project: Hadoop Issue Type: Bug Components: contrib/streaming Affects Versions: 0.15.2 Reporter: lohit vijayarenu Assignee: Amareshwari Sri Ramadasu Priority: Blocker Fix For: 0.15.3 Attachments: patch-2570.txt HADOOP-2227 changes jobCacheDir. In streaming, jobCacheDir was constructed like this {code} File jobCacheDir = new File(currentDir.getParentFile().getParent(), "work"); {code} We should change this to get it working. Referring to the changes made in HADOOP-2227, I see that the APIs used in there to construct the path are not public. And hard coding the path in streaming does not look good. Thoughts? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2574) bugs in mapred tutorial
[ https://issues.apache.org/jira/browse/HADOOP-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2574: -- Status: Patch Available (was: Open) bugs in mapred tutorial --- Key: HADOOP-2574 URL: https://issues.apache.org/jira/browse/HADOOP-2574 Project: Hadoop Issue Type: Bug Components: documentation Reporter: Doug Cutting Assignee: Arun C Murthy Fix For: 0.15.3, 0.16.0 Attachments: HADOOP-2574_0_20080110.patch, mapred_tutorial.html Sam Pullara sends me: {noformat} Phu was going through the WordCount example... lines 52 and 53 should have args[0] and args[1]: http://lucene.apache.org/hadoop/docs/current/mapred_tutorial.html The javac and jar command are also wrong, they don't include the directories for the packages, should be: $ javac -classpath ${HADOOP_HOME}/hadoop-${HADOOP_VERSION}-core.jar -d classes WordCount.java $ jar -cvf /usr/joe/wordcount.jar WordCount.class -C classes . {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-2570) streaming jobs fail after HADOOP-2227
[ https://issues.apache.org/jira/browse/HADOOP-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558133#action_12558133 ] Arun C Murthy commented on HADOOP-2570: --- Please ignore my previous comments... it's been a long day (maybe the following ones too! *smile*) It seems like the test cases don't have a jar and hence there is an 'if' check in TaskTracker.localizeJob which fails and hence the work directory isn't created. This explains the exception seen in the TaskTracker.launchTaskForJob function. I didn't make any headway after that... streaming jobs fail after HADOOP-2227 - Key: HADOOP-2570 URL: https://issues.apache.org/jira/browse/HADOOP-2570 Project: Hadoop Issue Type: Bug Components: contrib/streaming Affects Versions: 0.15.2 Reporter: lohit vijayarenu Assignee: Amareshwari Sri Ramadasu Priority: Blocker Fix For: 0.15.3 Attachments: patch-2570.txt HADOOP-2227 changes jobCacheDir. In streaming, jobCacheDir was constructed like this {code} File jobCacheDir = new File(currentDir.getParentFile().getParent(), "work"); {code} We should change this to get it working. Referring to the changes made in HADOOP-2227, I see that the APIs used in there to construct the path are not public. And hard coding the path in streaming does not look good. Thoughts? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1965) Handle map output buffers better
[ https://issues.apache.org/jira/browse/HADOOP-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-1965: -- Resolution: Fixed Status: Resolved (was: Patch Available) I just committed this. Thanks, Amar - this was a long-drawn affair! Handle map output buffers better Key: HADOOP-1965 URL: https://issues.apache.org/jira/browse/HADOOP-1965 Project: Hadoop Issue Type: Improvement Components: mapred Affects Versions: 0.16.0 Reporter: Devaraj Das Assignee: Amar Kamat Fix For: 0.16.0 Attachments: 1965_single_proc_150mb_gziped.jpeg, 1965_single_proc_150mb_gziped.pdf, 1965_single_proc_150mb_gziped_breakup.png, HADOOP-1965-1.patch, HADOOP-1965-Benchmark.patch, HADOOP-1965-Benchmark.patch, HADOOP-1965-Benchmark.patch, HADOOP-1965-Benchmark.patch, HADOOP-1965-Benchmark.patch, HADOOP-2419.patch, HADOOP-2419.patch, HADOOP-2419.patch, HADOOP-2419.patch Today, the map task stops calling the map method while sort/spill is using the (single instance of) map output buffer. One improvement that can be done to improve performance of the map task is to have another buffer for writing the map outputs to, while sort/spill is using the first buffer. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
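The improvement described in HADOOP-1965 is classic double buffering: the map task keeps collecting output into one buffer while sort/spill drains the other. A single-threaded sketch of the buffer swap (hypothetical names; the real change coordinates the collector and spill threads rather than spilling inline):

```java
import java.util.ArrayList;
import java.util.List;

// Double-buffering sketch: records go into 'collecting' while 'spilling'
// is being drained; when a buffer fills, the roles swap so collection
// never has to stop for the spill.
public class DoubleBuffer {
    private List<String> collecting = new ArrayList<>();
    private List<String> spilling = new ArrayList<>();
    private final List<String> spilled = new ArrayList<>();
    private final int threshold;

    public DoubleBuffer(int threshold) { this.threshold = threshold; }

    public void collect(String record) {
        collecting.add(record);
        if (collecting.size() >= threshold) {
            swapAndSpill();
        }
    }

    // Swap the two buffers so collect() continues into the empty one,
    // then drain the full buffer (stands in for the sort/spill pass).
    private void swapAndSpill() {
        List<String> full = collecting;
        collecting = spilling;
        spilling = full;
        spilled.addAll(spilling);
        spilling.clear();
    }

    public List<String> spilledRecords() { return spilled; }
}
```

In the real task the second buffer is what lets the map method keep running during a spill; here the spill is inline purely to keep the sketch single-threaded.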
[jira] Updated: (HADOOP-2570) streaming jobs fail after HADOOP-2227
[ https://issues.apache.org/jira/browse/HADOOP-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2570: -- Attachment: HADOOP-2570_1_20080112.patch bq. It seems like the test cases don't have a jar and hence there is an 'if' check in TaskTracker.localizeJob which fails and hence the work directory isn't created. This explains the exception seen in the TaskTracker.launchTaskForJob function. Here is a patch which fixes TaskTracker.localizeJob to fix the problem described above, along with Amareshwari's original fix. streaming jobs fail after HADOOP-2227 - Key: HADOOP-2570 URL: https://issues.apache.org/jira/browse/HADOOP-2570 Project: Hadoop Issue Type: Bug Components: contrib/streaming Affects Versions: 0.15.2 Reporter: lohit vijayarenu Assignee: Amareshwari Sri Ramadasu Priority: Blocker Fix For: 0.15.3 Attachments: HADOOP-2570_1_20080112.patch, patch-2570.txt HADOOP-2227 changes jobCacheDir. In streaming, jobCacheDir was constructed like this {code} File jobCacheDir = new File(currentDir.getParentFile().getParent(), "work"); {code} We should change this to get it working. Referring to the changes made in HADOOP-2227, I see that the APIs used in there to construct the path are not public. And hard coding the path in streaming does not look good. Thoughts? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2570) streaming jobs fail after HADOOP-2227
[ https://issues.apache.org/jira/browse/HADOOP-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2570: -- Status: Patch Available (was: Open) streaming jobs fail after HADOOP-2227 - Key: HADOOP-2570 URL: https://issues.apache.org/jira/browse/HADOOP-2570 Project: Hadoop Issue Type: Bug Components: contrib/streaming Affects Versions: 0.15.2 Reporter: lohit vijayarenu Assignee: Amareshwari Sri Ramadasu Priority: Blocker Fix For: 0.15.3 Attachments: HADOOP-2570_1_20080112.patch, patch-2570.txt HADOOP-2227 changes jobCacheDir. In streaming, jobCacheDir was constructed like this {code} File jobCacheDir = new File(currentDir.getParentFile().getParent(), work); {code} We should change this to get it working. Referring to the changes made in HADOOP-2227, I see that the APIs used in there to construct the path are not public. And hard coding the path in streaming does not look good. thought? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-2570) streaming jobs fail after HADOOP-2227
[ https://issues.apache.org/jira/browse/HADOOP-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12557636#action_12557636 ] Arun C Murthy commented on HADOOP-2570: --- Sigh, the only way I see to fix this post HADOOP-2227 is to symlink the work directory from the partition on which the task's cwd is present; this is because user scripts could just use the ../work/ path and there is no way for us to pass them extra configuration parameters etc. Thoughts? streaming jobs fail after HADOOP-2227 - Key: HADOOP-2570 URL: https://issues.apache.org/jira/browse/HADOOP-2570 Project: Hadoop Issue Type: Bug Components: contrib/streaming Affects Versions: 0.15.2 Reporter: lohit vijayarenu Assignee: Amareshwari Sri Ramadasu Fix For: 0.15.3 HADOOP-2227 changes jobCacheDir. In streaming, jobCacheDir was constructed like this {code} File jobCacheDir = new File(currentDir.getParentFile().getParent(), work); {code} We should change this to get it working. Referring to the changes made in HADOOP-2227, I see that the APIs used in there to construct the path are not public. And hard-coding the path in streaming does not look good. Thoughts? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-2573) limit running tasks per job
[ https://issues.apache.org/jira/browse/HADOOP-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12557757#action_12557757 ] Arun C Murthy commented on HADOOP-2573: --- I'd like to throw *job priority* into this festering pool... At the least, changing the job priority (done by the cluster admin) should result in a change in the number of max_slots... thoughts? limit running tasks per job --- Key: HADOOP-2573 URL: https://issues.apache.org/jira/browse/HADOOP-2573 Project: Hadoop Issue Type: New Feature Components: mapred Reporter: Doug Cutting Fix For: 0.17.0 It should be possible to specify a limit to the number of tasks per job permitted to run simultaneously. If, for example, you have a cluster of 50 nodes, with 100 map task slots and 100 reduce task slots, and the configured limit is 25 simultaneous tasks/job, then four or more jobs will be able to run at a time. This will permit short jobs to pass longer-running jobs. This also avoids some problems we've seen with HOD, where nodes are underutilized in their tail, and it should permit improved input locality. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
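The arithmetic in the description (100 map slots with a 25-task cap per job gives at least four concurrent jobs) is just integer division; a trivial helper, purely for illustration and not part of any Hadoop API:

```java
// Illustrative helper for the slot math in HADOOP-2573's description.
class SlotMath {
    // Minimum number of jobs that can run concurrently when each job is
    // capped at perJobLimit simultaneous tasks on a cluster exposing
    // totalSlots task slots (e.g. 100 slots / 25 per job = 4 jobs).
    public static int concurrentJobs(int totalSlots, int perJobLimit) {
        if (totalSlots <= 0 || perJobLimit <= 0) {
            throw new IllegalArgumentException("slots and limit must be positive");
        }
        return totalSlots / perJobLimit;
    }
}
```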
[jira] Commented: (HADOOP-2510) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/HADOOP-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12557754#action_12557754 ] Arun C Murthy commented on HADOOP-2510: --- bq. I would, however, argue that the JobScheduler should not be part of MapReduce itself and rather a separate component. Sure, that is _precisely_ the idea. I guess we are on the same page now. JobScheduler is the big-daddy of the cluster. As Eric alludes, the gravy is that by moving MR into the client code (JobManager) we can support multiple parallel-computation paradigms, in addition to MR itself. Clearly, we are a long way ... Map-Reduce 2.0 -- Key: HADOOP-2510 URL: https://issues.apache.org/jira/browse/HADOOP-2510 Project: Hadoop Issue Type: Improvement Components: mapred Reporter: Arun C Murthy We, at Yahoo!, have been using Hadoop-On-Demand as the resource provisioning/scheduling mechanism. With HoD the user uses a self-service system to ask for a set of nodes. HoD allocates these from a global pool and also provisions a private Map-Reduce cluster for the user. She then runs her jobs and shuts the cluster down via HoD when done. All user-private clusters use the same humongous, static HDFS (e.g. a 2k-node HDFS). More details about HoD are available here: HADOOP-1301. h3. Motivation The current deployment (Hadoop + HoD) has a couple of implications: * _Non-optimal Cluster Utilization_ 1. Job-private Map-Reduce clusters imply that the user cluster could potentially be *idle* for at least a while before being detected and shut down. 2. Elastic Jobs: Map-Reduce jobs typically have lots of maps and a much smaller number of reduces, with maps being light and quick and reduces being i/o heavy and longer-running. Users typically allocate clusters depending on the number of maps (i.e. input size), which leads to the scenario where all the maps are done (idle nodes in the cluster) and the few reduces are chugging along.
Right now, we do not have the ability to shrink the HoD'ed Map-Reduce clusters, which would alleviate this issue. * _Impact on data-locality_ With the current setup of a static, large HDFS and much smaller (5/10/20/50 node) clusters there is a good chance of losing one of Map-Reduce's primary features: the ability to execute tasks on the datanodes where the input splits are located. In fact, we have seen the data-local tasks go down to 20-25 percent in the GridMix benchmarks, from the 95-98 percent we see on the randomwriter+sort runs run as part of the hadoopqa benchmarks (admittedly a synthetic benchmark, but still). Admittedly, HADOOP-1985 (rack-aware Map-Reduce) helps significantly here. Primarily, the notion of *job-level scheduling* leading to private clusters, as opposed to *task-level scheduling*, is a good peg on which to hang the majority of the blame. Keeping the above factors in mind, here are some thoughts on how to re-structure Hadoop Map-Reduce to solve some of these issues. h3. State of the Art As it exists today, a large, static Hadoop Map-Reduce cluster (forget HoD for a bit) does provide task-level scheduling; however, its scalability to tens of thousands of user jobs per week is in question. Let's review its current architecture and main components: * JobTracker: It does both *task-scheduling* and *task-monitoring* (tasktrackers send task statuses via periodic heartbeats), which implies it is fairly loaded. It is also a _single point of failure_ in the Map-Reduce framework, i.e. its failure implies that all the jobs in the system fail. This means a static, large Map-Reduce cluster is fairly susceptible and a definite suspect. Clearly HoD solves this by having per-job clusters, albeit with the above drawbacks. * TaskTracker: The slave in the system which executes one task at a time under directions from the JobTracker. * JobClient: The per-job client which just submits the job and polls the JobTracker for status. h3.
Proposal - Map-Reduce 2.0 The primary idea is to move to task-level scheduling and static Map-Reduce clusters (so as to maintain the same storage-cluster and compute-cluster paradigm) as a way to directly tackle the two main issues illustrated above. Clearly, we will have to get around the existing problems, especially w.r.t. scalability and reliability. The proposal is to re-work Hadoop Map-Reduce to make it suitable for a large, static cluster. Here is an overview of how its main components would look: * JobTracker: Turn the JobTracker into a pure task-scheduler, a global one. Let's call this the *JobScheduler* henceforth. Clearly (data-locality aware) Maui/Moab are candidates for being the scheduler, in which case the JobScheduler is just a thin wrapper
[jira] Updated: (HADOOP-2131) Speculative execution should be allowed for reducers only
[ https://issues.apache.org/jira/browse/HADOOP-2131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2131: -- Resolution: Fixed Status: Resolved (was: Patch Available) I just committed this. Thanks, Amareshwari! Speculative execution should be allowed for reducers only - Key: HADOOP-2131 URL: https://issues.apache.org/jira/browse/HADOOP-2131 Project: Hadoop Issue Type: Improvement Components: mapred Environment: Hadoop job, map fetches data from external systems Reporter: Srikanth Kakani Assignee: Amareshwari Sri Ramadasu Priority: Critical Fix For: 0.16.0 Attachments: patch-2131.txt, patch-2131.txt, patch-2131.txt Consider hadoop jobs where maps fetch data from external systems, and emit the data. The reducers in this are identity reducers. The data processed by these jobs is huge. There could be slow nodes in this cluster and some of the reducers run twice as slow as their counterparts. This could result in a long tail. Speculative execution would help greatly in such cases. However given the current hadoop, we have to select speculative execution for both maps and reducers. In this case hurting the map performance as they are fetching data from external systems thereby overloading the external systems. Speculative execution only on reducers would be a great way to solve this problem. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-2510) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/HADOOP-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12557781#action_12557781 ] Arun C Murthy commented on HADOOP-2510: --- bq. What I meant was more of a SW organization point of view. The JobScheduler should not be part of the MapReduce sub-project. Ah, point taken. I misunderstood your previous comment... Map-Reduce 2.0 -- Key: HADOOP-2510 URL: https://issues.apache.org/jira/browse/HADOOP-2510 Project: Hadoop Issue Type: Improvement Components: mapred Reporter: Arun C Murthy We, at Yahoo!, have been using Hadoop-On-Demand as the resource provisioning/scheduling mechanism. With HoD the user uses a self-service system to ask for a set of nodes. HoD allocates these from a global pool and also provisions a private Map-Reduce cluster for the user. She then runs her jobs and shuts the cluster down via HoD when done. All user-private clusters use the same humongous, static HDFS (e.g. a 2k-node HDFS). More details about HoD are available here: HADOOP-1301. h3. Motivation The current deployment (Hadoop + HoD) has a couple of implications: * _Non-optimal Cluster Utilization_ 1. Job-private Map-Reduce clusters imply that the user cluster could potentially be *idle* for at least a while before being detected and shut down. 2. Elastic Jobs: Map-Reduce jobs typically have lots of maps and a much smaller number of reduces, with maps being light and quick and reduces being i/o heavy and longer-running. Users typically allocate clusters depending on the number of maps (i.e. input size), which leads to the scenario where all the maps are done (idle nodes in the cluster) and the few reduces are chugging along. Right now, we do not have the ability to shrink the HoD'ed Map-Reduce clusters, which would alleviate this issue.
* _Impact on data-locality_ With the current setup of a static, large HDFS and much smaller (5/10/20/50 node) clusters there is a good chance of losing one of Map-Reduce's primary features: the ability to execute tasks on the datanodes where the input splits are located. In fact, we have seen the data-local tasks go down to 20-25 percent in the GridMix benchmarks, from the 95-98 percent we see on the randomwriter+sort runs run as part of the hadoopqa benchmarks (admittedly a synthetic benchmark, but still). Admittedly, HADOOP-1985 (rack-aware Map-Reduce) helps significantly here. Primarily, the notion of *job-level scheduling* leading to private clusters, as opposed to *task-level scheduling*, is a good peg on which to hang the majority of the blame. Keeping the above factors in mind, here are some thoughts on how to re-structure Hadoop Map-Reduce to solve some of these issues. h3. State of the Art As it exists today, a large, static Hadoop Map-Reduce cluster (forget HoD for a bit) does provide task-level scheduling; however, its scalability to tens of thousands of user jobs per week is in question. Let's review its current architecture and main components: * JobTracker: It does both *task-scheduling* and *task-monitoring* (tasktrackers send task statuses via periodic heartbeats), which implies it is fairly loaded. It is also a _single point of failure_ in the Map-Reduce framework, i.e. its failure implies that all the jobs in the system fail. This means a static, large Map-Reduce cluster is fairly susceptible and a definite suspect. Clearly HoD solves this by having per-job clusters, albeit with the above drawbacks. * TaskTracker: The slave in the system which executes one task at a time under directions from the JobTracker. * JobClient: The per-job client which just submits the job and polls the JobTracker for status. h3.
Proposal - Map-Reduce 2.0 The primary idea is to move to task-level scheduling and static Map-Reduce clusters (so as to maintain the same storage-cluster and compute-cluster paradigm) as a way to directly tackle the two main issues illustrated above. Clearly, we will have to get around the existing problems, especially w.r.t. scalability and reliability. The proposal is to re-work Hadoop Map-Reduce to make it suitable for a large, static cluster. Here is an overview of how its main components would look: * JobTracker: Turn the JobTracker into a pure task-scheduler, a global one. Let's call this the *JobScheduler* henceforth. Clearly (data-locality aware) Maui/Moab are candidates for being the scheduler, in which case the JobScheduler is just a thin wrapper around them. * TaskTracker: These stay as before, save for some minor changes illustrated later in the piece. * JobClient: Fatten up the JobClient by putting a lot more intelligence into it. Enhance it to talk to the JobTracker to ask for available
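The JobScheduler/JobManager split sketched in the proposal could look roughly like the following. Every name here is invented for illustration and is not part of any Hadoop API; the toy implementation just shows the intended contract (a pure slot allocator that honors locality hints, with all Map-Reduce smarts living client-side):

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical pure task-scheduler: it only hands out and reclaims slots.
interface JobScheduler {
    // Grant up to numSlots hosts, preferring preferredHosts (locality hint);
    // returns the hosts actually granted (possibly fewer than requested).
    List<String> allocateSlots(String jobId, int numSlots, List<String> preferredHosts);

    // Return slots when the job's tasks finish or the job completes.
    void releaseSlots(String jobId, List<String> hosts);
}

// Toy in-memory scheduler over a fixed host pool, for illustration only.
class SimpleJobScheduler implements JobScheduler {
    private final Set<String> free = new LinkedHashSet<>();

    SimpleJobScheduler(List<String> hosts) { free.addAll(hosts); }

    public List<String> allocateSlots(String jobId, int numSlots, List<String> preferredHosts) {
        List<String> granted = new ArrayList<>();
        for (String h : preferredHosts) {            // locality hints first
            if (granted.size() < numSlots && free.remove(h)) granted.add(h);
        }
        for (String h : new ArrayList<>(free)) {     // then any free host
            if (granted.size() >= numSlots) break;
            free.remove(h);
            granted.add(h);
        }
        return granted;
    }

    public void releaseSlots(String jobId, List<String> hosts) { free.addAll(hosts); }
}
```

Under this contract a per-job JobManager would ask for slots near its input splits, run tasks on whatever it is granted, and release the slots as tasks complete, which is exactly the task-level scheduling the proposal argues for.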
[jira] Commented: (HADOOP-2570) streaming jobs fail after HADOOP-2227
[ https://issues.apache.org/jira/browse/HADOOP-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12557906#action_12557906 ] Arun C Murthy commented on HADOOP-2570: --- bq. the 2 places where jobcache dir was used in streaming was to 'chmod' the executable and to lookup this directory in PATH. Would it be OK to construct jobCacheDir as done in HADOOP-2227? Lohit, that still won't help scripts which use ../work/myscript - so this is the best approach for 0.15.3. In light of this bug, HADOOP-2116 is a little more complicated than originally thought; I have a few thoughts about this which I'll put up there. bq. Submitting the patch with symlinks to ../work from taskdir +1 streaming jobs fail after HADOOP-2227 - Key: HADOOP-2570 URL: https://issues.apache.org/jira/browse/HADOOP-2570 Project: Hadoop Issue Type: Bug Components: contrib/streaming Affects Versions: 0.15.2 Reporter: lohit vijayarenu Assignee: Amareshwari Sri Ramadasu Priority: Blocker Fix For: 0.15.3 Attachments: patch-2570.txt HADOOP-2227 changes jobCacheDir. In streaming, jobCacheDir was constructed like this {code} File jobCacheDir = new File(currentDir.getParentFile().getParent(), work); {code} We should change this to get it working. Referring to the changes made in HADOOP-2227, I see that the APIs used in there to construct the path are not public. And hard-coding the path in streaming does not look good. Thoughts? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2116) Job.local.dir to be exposed to tasks
[ https://issues.apache.org/jira/browse/HADOOP-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2116: -- Status: Open (was: Patch Available) In light of HADOOP-2570, I'm cancelling this patch. Reasoning: The *-file* option works by putting the script into the job's jar file by unjar-ing, copying and then jar-ing it again. (yuck!) This means that on the TaskTracker the script has moved from jobCache/work to jobCache/job_jar_xml (I propose we rename that to *private*, heh). Clearly user scripts which rely on ../work/script_name will break again... Having said that, we need to debate whether this feature is an incompatible change; what do folks think? If people say otherwise, we need to ensure all files in jobCache/private are symlinked into jobCache/work... ugh! I'd like to take this opportunity to take a hard look at streaming's *-file* option too. The unjar/jar way is completely backwards! We _should_ rework the -file option to use the DistributedCache and the symlink option it provides. So, user scripts can simply use ./script rather than ../work/script. Yes, the way to maintain compatibility (if we want it) is to use the previous option of symlinking files into jobCache/work also. I'd strongly vote for this option. Thoughts? Job.local.dir to be exposed to tasks Key: HADOOP-2116 URL: https://issues.apache.org/jira/browse/HADOOP-2116 Project: Hadoop Issue Type: Improvement Components: mapred Affects Versions: 0.14.3 Environment: All Reporter: Milind Bhandarkar Assignee: Amareshwari Sri Ramadasu Fix For: 0.16.0 Attachments: patch-2116.txt, patch-2116.txt Currently, since all task cwds are created under a jobcache directory, users that need a job-specific shared directory for use as scratch space create ../work. This is hacky, and will break when HADOOP-2115 is addressed. For such jobs, hadoop mapred should expose job.local.dir via localized configuration. -- This message is automatically generated by JIRA. 
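The DistributedCache-plus-symlink rework suggested above boils down to exposing each localized file in the task cwd under its base name, so user code can invoke ./script. A self-contained stand-in for that symlinking step, using java.nio.file for brevity (this is an illustrative sketch, not the actual DistributedCache implementation, which in Hadoop of this era shelled out to `ln -s`):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Stand-in for the DistributedCache symlink step: expose a localized
// cache file in the task cwd under its base name, so streaming user
// code can invoke ./script instead of ../work/script.
class CacheSymlink {
    public static Path linkIntoCwd(Path localizedFile, Path taskCwd) throws IOException {
        Path link = taskCwd.resolve(localizedFile.getFileName());
        if (!Files.exists(link)) {
            Files.createSymbolicLink(link, localizedFile);
        }
        return link;
    }
}
```

The same trick, applied to a whole directory, is how symlinking jobCache/private into jobCache/work would preserve compatibility for scripts that still use the ../work/ path.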
- You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1876) Persisting completed jobs status
[ https://issues.apache.org/jira/browse/HADOOP-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-1876: -- Assignee: Alejandro Abdelnur Status: Open (was: Patch Available) bq. It seems to me that it would be much easier to retrofit the JobHistory to use info out of the files the patch is writing than the other way around. I guess we should consider the fact that we might be better off, in the long run, moving away from the custom, textual format used today by the {{JobHistory}} and going the {{Writable}} way - much less, and more standard, code. I don't believe the textual format buys us much, and it is a pain to maintain. If folks agree, I'm okay with this patch going in as-is (oh, and yes, this is a very different use-case) and then fixing {{JobHistory}} to use {{Writable}} to serialize the necessary data structures. Thoughts? That said, some comments about the patch: Alejandro, could you please ensure that the {{completedJobsStoreThread}} isn't _started at all_ if the feature is switched off? Maybe we could add a boolean {{mapred.job.tracker.persist.jobstatus}} flag to turn the feature on/off. Persisting completed jobs status Key: HADOOP-1876 URL: https://issues.apache.org/jira/browse/HADOOP-1876 Project: Hadoop Issue Type: Improvement Components: mapred Environment: all Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Priority: Critical Fix For: 0.16.0 Attachments: patch1876.txt, patch1876.txt Currently the JobTracker keeps information about completed jobs in memory. This information is flushed from the cache when it has outlived the retention period (#RETIRE_JOB_INTERVAL) or because the limit of completed jobs in memory has been reached (#MAX_COMPLETE_USER_JOBS_IN_MEMORY). Also, if the JobTracker is restarted (due to being recycled or due to a crash) information about completed jobs is lost. 
If any of the above scenarios happens before the job information is queried by a hadoop client (normally the job submitter or a monitoring component) there is no way to obtain such information. A way to avoid this is for the JobTracker to persist the completed jobs information in DFS upon job completion. This would be done at the time the job is moved to the completed jobs queue. Then, when querying the JobTracker for information about a completed job, if it is not found in the memory queue, a lookup in DFS would be done to retrieve the completed job information. A directory in DFS (under mapred/system) would be used to persist completed job information; for each completed job there would be a directory with the job ID, and within that directory all the information about the job: status, jobprofile, counters and completion events. A configuration property will indicate for how long persisted job information should be kept in DFS. After such a period it will be cleaned up automatically. This improvement would not introduce API changes. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
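The per-job record described above would naturally serialize via Writable-style write/readFields methods. A minimal, self-contained round-trip sketch with invented fields (the real proposal persists the job's status, jobprofile, counters and completion events, and Hadoop's actual Writable contract uses DataOutput/DataInput):

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Minimal Writable-style record for a completed job. Field names are
// illustrative; the real patch would persist status, jobprofile,
// counters and completion events under mapred/system in DFS.
class PersistedJobStatus {
    String jobId;
    int runState;      // e.g. a SUCCEEDED/FAILED code
    long finishTime;

    public void write(DataOutputStream out) throws IOException {
        out.writeUTF(jobId);
        out.writeInt(runState);
        out.writeLong(finishTime);
    }

    public void readFields(DataInputStream in) throws IOException {
        jobId = in.readUTF();
        runState = in.readInt();
        finishTime = in.readLong();
    }
}
```

The appeal over the custom textual JobHistory format is exactly this symmetry: the same field list is written and read back with no hand-written parser.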
[jira] Updated: (HADOOP-1281) Speculative map tasks aren't getting killed although the TIP completed
[ https://issues.apache.org/jira/browse/HADOOP-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-1281: -- Attachment: HADOOP-1281_2_20080109.patch Exact same patch as before, but added comments rationalizing the fix... Speculative map tasks aren't getting killed although the TIP completed -- Key: HADOOP-1281 URL: https://issues.apache.org/jira/browse/HADOOP-1281 Project: Hadoop Issue Type: Bug Components: mapred Affects Versions: 0.15.0 Reporter: Arun C Murthy Assignee: Arun C Murthy Priority: Critical Fix For: 0.16.0 Attachments: HADOOP-1281_1_20071117.patch, HADOOP-1281_2_20071123.patch, HADOOP-1281_2_20080109.patch The speculative map tasks run to completion although the TIP succeeded since the other task completed elsewhere. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1281) Speculative map tasks aren't getting killed although the TIP completed
[ https://issues.apache.org/jira/browse/HADOOP-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-1281: -- Resolution: Fixed Status: Resolved (was: Patch Available) I just committed this. Speculative map tasks aren't getting killed although the TIP completed -- Key: HADOOP-1281 URL: https://issues.apache.org/jira/browse/HADOOP-1281 Project: Hadoop Issue Type: Bug Components: mapred Affects Versions: 0.15.0 Reporter: Arun C Murthy Assignee: Arun C Murthy Priority: Critical Fix For: 0.16.0 Attachments: HADOOP-1281_1_20071117.patch, HADOOP-1281_2_20071123.patch, HADOOP-1281_2_20080109.patch The speculative map tasks run to completion although the TIP succeeded since the other task completed elsewhere. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-1876) Persisting completed jobs status
[ https://issues.apache.org/jira/browse/HADOOP-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12557397#action_12557397 ] Arun C Murthy commented on HADOOP-1876: --- bq. Can this patch make JobHistory log obsolete? Or at least is that intended? I hate to see the same information logged at different places in different forms using different code paths. This patch doesn't do that, but that is definitely the direction I'd go too... +1. Should we broaden the scope of HADOOP-2178 to re-work JobHistory to use Writables rather than the custom format? Or is that a new jira? bq. Other than being in text format (which has its pros and cons), job history log is event based [...] Yes, moving to Writable wouldn't hurt the _job analysis_ part since, as you point out, it's event-based - we just need to use Writable.readFields rather than the custom text parsing... does anyone see other issues? Persisting completed jobs status Key: HADOOP-1876 URL: https://issues.apache.org/jira/browse/HADOOP-1876 Project: Hadoop Issue Type: Improvement Components: mapred Environment: all Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Priority: Critical Fix For: 0.16.0 Attachments: patch1876.txt, patch1876.txt Currently the JobTracker keeps information about completed jobs in memory. This information is flushed from the cache when it has outlived the retention period (#RETIRE_JOB_INTERVAL) or because the limit of completed jobs in memory has been reached (#MAX_COMPLETE_USER_JOBS_IN_MEMORY). Also, if the JobTracker is restarted (due to being recycled or due to a crash) information about completed jobs is lost. If any of the above scenarios happens before the job information is queried by a hadoop client (normally the job submitter or a monitoring component) there is no way to obtain such information. A way to avoid this is for the JobTracker to persist the completed jobs information in DFS upon job completion. 
This would be done at the time the job is moved to the completed jobs queue. Then, when querying the JobTracker for information about a completed job, if it is not found in the memory queue, a lookup in DFS would be done to retrieve the completed job information. A directory in DFS (under mapred/system) would be used to persist completed job information; for each completed job there would be a directory with the job ID, and within that directory all the information about the job: status, jobprofile, counters and completion events. A configuration property will indicate for how long persisted job information should be kept in DFS. After such a period it will be cleaned up automatically. This improvement would not introduce API changes. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2268) JobControl classes should use interfaces rather than implementations
[ https://issues.apache.org/jira/browse/HADOOP-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2268: -- Resolution: Fixed Status: Resolved (was: Patch Available) I just committed this. Thanks, Adrian! JobControl classes should use interfaces rather than implementations --- Key: HADOOP-2268 URL: https://issues.apache.org/jira/browse/HADOOP-2268 Project: Hadoop Issue Type: Improvement Components: mapred Affects Versions: 0.15.0 Reporter: Adrian Woodhead Assignee: Adrian Woodhead Priority: Minor Fix For: 0.16.0 Attachments: HADOOP-2268-1.patch, HADOOP-2268-2.patch, HADOOP-2268-3.patch, HADOOP-2268-4.patch See HADOOP-2202 for background on this issue. Arun C. Murthy agrees that when possible it is preferable to program against the interface rather than a concrete implementation (more flexible, allows for changes of the implementation in future, etc.). JobControl currently exposes running, waiting, ready, successful and dependent jobs as ArrayList rather than List. I propose to change this to List. I will code up a patch for this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HADOOP-2077) Logging version number (and compiled date) at STARTUP_MSG
[ https://issues.apache.org/jira/browse/HADOOP-2077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy reassigned HADOOP-2077: - Assignee: Arun C Murthy Logging version number (and compiled date) at STARTUP_MSG --- Key: HADOOP-2077 URL: https://issues.apache.org/jira/browse/HADOOP-2077 Project: Hadoop Issue Type: Improvement Components: dfs, mapred Reporter: Koji Noguchi Assignee: Arun C Murthy Priority: Trivial Fix For: 0.16.0 This will help us figure out which version of hadoop we were running when looking back the logs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2077) Logging version number (and compiled date) at STARTUP_MSG
[ https://issues.apache.org/jira/browse/HADOOP-2077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2077: -- Attachment: HADOOP-2077_0_20080110.patch Simple fix. I haven't gotten around to testing it much since svn.apache.org is _super_ slow... Logging version number (and compiled date) at STARTUP_MSG --- Key: HADOOP-2077 URL: https://issues.apache.org/jira/browse/HADOOP-2077 Project: Hadoop Issue Type: Improvement Components: dfs, mapred Reporter: Koji Noguchi Assignee: Arun C Murthy Priority: Trivial Fix For: 0.16.0 Attachments: HADOOP-2077_0_20080110.patch This will help us figure out which version of hadoop we were running when looking back the logs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2077) Logging version number (and compiled date) at STARTUP_MSG
[ https://issues.apache.org/jira/browse/HADOOP-2077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2077: -- Attachment: HADOOP-2077_0_20080110.patch Minor change in formatting of the output, which now looks like:
{noformat}
2008-01-10 03:57:15,143 INFO org.apache.hadoop.dfs.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = neo/127.0.0.1
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.16.0-dev
STARTUP_MSG:   subversion = http://svn.apache.org/repos/asf/lucene/hadoop/trunk -r 610541
STARTUP_MSG:   compiled-by = arun on Thu Jan 10 03:57:03 IST 2008
************************************************************/
{noformat}
Logging version number (and compiled date) at STARTUP_MSG --- Key: HADOOP-2077 URL: https://issues.apache.org/jira/browse/HADOOP-2077 Project: Hadoop Issue Type: Improvement Components: dfs, mapred Reporter: Koji Noguchi Assignee: Arun C Murthy Priority: Trivial Fix For: 0.16.0 Attachments: HADOOP-2077_0_20080110.patch, HADOOP-2077_0_20080110.patch This will help us figure out which version of hadoop we were running when looking back at the logs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2077) Logging version number (and compiled date) at STARTUP_MSG
[ https://issues.apache.org/jira/browse/HADOOP-2077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2077: -- Status: Patch Available (was: Open) Logging version number (and compiled date) at STARTUP_MSG --- Key: HADOOP-2077 URL: https://issues.apache.org/jira/browse/HADOOP-2077 Project: Hadoop Issue Type: Improvement Components: dfs, mapred Reporter: Koji Noguchi Assignee: Arun C Murthy Priority: Trivial Fix For: 0.16.0 Attachments: HADOOP-2077_0_20080110.patch, HADOOP-2077_0_20080110.patch This will help us figure out which version of hadoop we were running when looking back the logs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2131) Speculative execution should be allowed for reducers only
[ https://issues.apache.org/jira/browse/HADOOP-2131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2131: -- Status: Open (was: Patch Available) Please go ahead and deprecate the old {{mapred.speculative.execution}} config in favour of the new ones which should be set to *true* in hadoop-default.xml. For 0.16.0 we should let {{mapred.speculative.execution}} over-ride the new ones since it means folks actually went ahead and cared about the non-default value and hence set it in their hadoop-site.xml. Speculative execution should be allowed for reducers only - Key: HADOOP-2131 URL: https://issues.apache.org/jira/browse/HADOOP-2131 Project: Hadoop Issue Type: Improvement Components: mapred Environment: Hadoop job, map fetches data from external systems Reporter: Srikanth Kakani Assignee: Amareshwari Sri Ramadasu Priority: Critical Fix For: 0.16.0 Attachments: patch-2131.txt Consider hadoop jobs where maps fetch data from external systems, and emit the data. The reducers in this are identity reducers. The data processed by these jobs is huge. There could be slow nodes in this cluster and some of the reducers run twice as slow as their counterparts. This could result in a long tail. Speculative execution would help greatly in such cases. However given the current hadoop, we have to select speculative execution for both maps and reducers. In this case hurting the map performance as they are fetching data from external systems thereby overloading the external systems. Speculative execution only on reducers would be a great way to solve this problem. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
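The override rule described here (an explicitly set mapred.speculative.execution wins over the new per-phase keys, which should default to true) can be sketched with java.util.Properties standing in for JobConf. The per-phase key name used in the test is an assumption, not taken from this comment:

```java
import java.util.Properties;

// Config-precedence sketch for HADOOP-2131, with Properties standing in
// for JobConf: if the old mapred.speculative.execution key was explicitly
// set, it overrides the new per-phase flag; otherwise the per-phase flag,
// defaulting to true, decides.
class SpeculativeConfig {
    public static boolean speculative(Properties conf, String phaseKey) {
        String old = conf.getProperty("mapred.speculative.execution");
        if (old != null) {
            return Boolean.parseBoolean(old);
        }
        return Boolean.parseBoolean(conf.getProperty(phaseKey, "true"));
    }
}
```

This matches the rationale in the comment: a user who bothered to set the old key in hadoop-site.xml cared about the non-default value, so it should keep winning through 0.16.0.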
[jira] Updated: (HADOOP-2285) TextInputFormat is slow compared to reading files.
[ https://issues.apache.org/jira/browse/HADOOP-2285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2285: -- Status: Open (was: Patch Available) Minor nit: this patch removes a public constructor rather than adding a new one: {noformat} - public LineRecordReader(InputStream in, long offset, long endOffset) + public LineRecordReader(InputStream in, long offset, long endOffset, + Configuration job) {noformat} TextInputFormat is slow compared to reading files. -- Key: HADOOP-2285 URL: https://issues.apache.org/jira/browse/HADOOP-2285 Project: Hadoop Issue Type: Bug Components: mapred Affects Versions: 0.15.0 Reporter: Owen O'Malley Assignee: Owen O'Malley Fix For: 0.16.0 Attachments: fast-line.patch The LineRecordReader reads from the source byte by byte, which seems to be about half as fast as it would be if the readLine method were defined directly on the memory buffer instead of on an InputStream. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
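The compatibility concern in the nit above is the classic overload-and-delegate pattern: keep the old public constructor and have it forward to the new one. A minimal sketch, with Reader and Config as toy stand-ins for LineRecordReader and Configuration (they are not the real Hadoop classes):

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;

// Hedged sketch of the overload-and-delegate pattern the review asks for:
// add the Configuration-taking constructor without removing the old public
// one. Reader and Config are illustrative stand-ins, not Hadoop classes.
public class Reader {
    static class Config { }  // placeholder for a configuration object

    final InputStream in;
    final long offset, endOffset;
    final Config conf;

    // old public constructor kept for compatibility, delegating with a default
    public Reader(InputStream in, long offset, long endOffset) {
        this(in, offset, endOffset, new Config());
    }

    // new overload that the patch introduces
    public Reader(InputStream in, long offset, long endOffset, Config conf) {
        this.in = in;
        this.offset = offset;
        this.endOffset = endOffset;
        this.conf = conf;
    }

    public static void main(String[] args) {
        // existing callers of the three-argument constructor keep working
        Reader r = new Reader(new ByteArrayInputStream(new byte[0]), 0L, 100L);
        System.out.println(r.conf != null);
    }
}
```

Delegation keeps one initialization path, so the two constructors cannot drift apart.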
[jira] Updated: (HADOOP-2285) TextInputFormat is slow compared to reading files.
[ https://issues.apache.org/jira/browse/HADOOP-2285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2285: -- Attachment: fast-line2.patch Attaching a simple fix for my previous comment on Owen's behalf... TextInputFormat is slow compared to reading files. -- Key: HADOOP-2285 URL: https://issues.apache.org/jira/browse/HADOOP-2285 Project: Hadoop Issue Type: Bug Components: mapred Affects Versions: 0.15.0 Reporter: Owen O'Malley Assignee: Owen O'Malley Fix For: 0.16.0 Attachments: fast-line.patch, fast-line2.patch The LineRecordReader reads from the source byte by byte, which seems to be about half as fast as it would be if the readLine method were defined directly on the memory buffer instead of on an InputStream. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2285) TextInputFormat is slow compared to reading files.
[ https://issues.apache.org/jira/browse/HADOOP-2285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2285: -- Status: Patch Available (was: Open) TextInputFormat is slow compared to reading files. -- Key: HADOOP-2285 URL: https://issues.apache.org/jira/browse/HADOOP-2285 Project: Hadoop Issue Type: Bug Components: mapred Affects Versions: 0.15.0 Reporter: Owen O'Malley Assignee: Owen O'Malley Fix For: 0.16.0 Attachments: fast-line.patch, fast-line2.patch The LineRecordReader reads from the source byte by byte, which seems to be about half as fast as it would be if the readLine method were defined directly on the memory buffer instead of on an InputStream. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-2487) Provide an option to get job status for all jobs run by or submitted to a job tracker
[ https://issues.apache.org/jira/browse/HADOOP-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12556886#action_12556886 ] Arun C Murthy commented on HADOOP-2487: --- You are right, I withdraw my earlier comments... +1 Provide an option to get job status for all jobs run by or submitted to a job tracker - Key: HADOOP-2487 URL: https://issues.apache.org/jira/browse/HADOOP-2487 Project: Hadoop Issue Type: New Feature Components: mapred Reporter: Hemanth Yamijala Assignee: Amareshwari Sri Ramadasu Fix For: 0.16.0 Attachments: patch-2487.txt This is an RFE for providing an RPC in Hadoop that can expose status information for jobs submitted to a JobTracker. Such a feature can be used to develop tools that analyse jobs. It is possible that other information is also useful - such as running times of jobs, etc. Comments? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1281) Speculative map tasks aren't getting killed although the TIP completed
[ https://issues.apache.org/jira/browse/HADOOP-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-1281: -- Status: Patch Available (was: Reopened) I finally got around to testing this patch thoroughly, hence marking it PA. Speculative map tasks aren't getting killed although the TIP completed -- Key: HADOOP-1281 URL: https://issues.apache.org/jira/browse/HADOOP-1281 Project: Hadoop Issue Type: Bug Components: mapred Affects Versions: 0.15.0 Reporter: Arun C Murthy Assignee: Arun C Murthy Priority: Critical Fix For: 0.16.0 Attachments: HADOOP-1281_1_20071117.patch, HADOOP-1281_2_20071123.patch The speculative map tasks run to completion although the TIP succeeded since the other task completed elsewhere. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1660) add support for native library toDistributedCache
[ https://issues.apache.org/jira/browse/HADOOP-1660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-1660: -- Resolution: Fixed Status: Resolved (was: Patch Available) I just committed this. add support for native library toDistributedCache -- Key: HADOOP-1660 URL: https://issues.apache.org/jira/browse/HADOOP-1660 Project: Hadoop Issue Type: Improvement Components: mapred Environment: unix (different handling would be required for windows) Reporter: Alejandro Abdelnur Assignee: Arun C Murthy Fix For: 0.16.0 Attachments: HADOOP-1660_0_20080108.patch Currently, if an M/R job depends on a JNI-based component, the dynamic library must be available on all the task nodes. This is not possible, especially when you have no control over the cluster machines and are just using the cluster as a service. It should be possible to specify, using the DistributedCache, which native libraries a job needs. For example, via a new method 'public void addLibrary(Path libraryPath, JobConf conf)'. The added libraries would make it to the local FS of the task nodes (the same way as cached resources), but instead of being part of the classpath they would be copied to a lib directory, and that lib directory would be added to the LD_LIBRARY_PATH of the task JVM. An alternative would be to set the '-Djava.library.path=' task JVM parameter to the lib directory above. However, this would break for libraries that depend on other libraries, as the dependent one would not be on the LD_LIBRARY_PATH and the OS would fail to find it, since it is the OS loader, not the JVM, that loads the dependent library. For uncached usage of native libraries, a special directory in the JAR could be used. But I'd argue that the DistributedCache enhancement would be enough, and anyone who wants to use a native library should use the DistributedCache, or a JobConf addLibrary method that uses the DistributedCache under the hood at submission time.
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
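The LD_LIBRARY_PATH vs. '-Djava.library.path=' distinction discussed in HADOOP-1660 is worth making concrete: System.loadLibrary consults only java.library.path, while a native library's own dependencies are resolved by the OS loader via LD_LIBRARY_PATH. The small check below is illustrative only, not Hadoop code; the helper name is an assumption.

```java
// Hedged sketch around the HADOOP-1660 discussion: System.loadLibrary only
// searches the entries of java.library.path, so a cached native library's
// directory must be on that path. Transitive native dependencies, however,
// are loaded by the OS loader, which consults LD_LIBRARY_PATH instead.
public class NativeLibPath {
    // Check whether a directory appears on the JVM's java.library.path.
    static boolean onLibraryPath(String dir) {
        String path = System.getProperty("java.library.path", "");
        for (String entry : path.split(java.io.File.pathSeparator)) {
            if (entry.equals(dir)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // A task JVM started with -Djava.library.path=<cache lib dir> would
        // pass this check for that directory.
        System.out.println(System.getProperty("java.library.path", ""));
    }
}
```

This is why the committed approach of adding the task's cwd to java.library.path covers loadLibrary, while transitive native dependencies still need the OS-level path.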
[jira] Resolved: (HADOOP-622) Users should be able to change the environment in which their maps/reduces run.
[ https://issues.apache.org/jira/browse/HADOOP-622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy resolved HADOOP-622. -- Resolution: Duplicate Fixed as a part of HADOOP-1660. Users should be able to change the environment in which their maps/reduces run. --- Key: HADOOP-622 URL: https://issues.apache.org/jira/browse/HADOOP-622 Project: Hadoop Issue Type: Improvement Components: mapred Reporter: Mahadev konar Assignee: Owen O'Malley Priority: Minor This would be useful with caching. So you would be able to say, cache file X, and then be able to change environment variables like PATH/LD_LIBRARY_PATH to include the local path where the file was cached. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2131) Speculative execution should be allowed for reducers only
[ https://issues.apache.org/jira/browse/HADOOP-2131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2131: -- Attachment: patch-2131.txt Attaching an updated patch (with a couple of minor javadoc fixes) on Amareshwari's behalf so that Hudson can pick this up right away... Speculative execution should be allowed for reducers only - Key: HADOOP-2131 URL: https://issues.apache.org/jira/browse/HADOOP-2131 Project: Hadoop Issue Type: Improvement Components: mapred Environment: Hadoop job, map fetches data from external systems Reporter: Srikanth Kakani Assignee: Amareshwari Sri Ramadasu Priority: Critical Fix For: 0.16.0 Attachments: patch-2131.txt, patch-2131.txt, patch-2131.txt Consider hadoop jobs where maps fetch data from external systems and emit the data, and the reducers are identity reducers. The data processed by these jobs is huge. There could be slow nodes in the cluster, and some of the reducers may run twice as slow as their counterparts, resulting in a long tail. Speculative execution would help greatly in such cases. However, with the current hadoop we have to enable speculative execution for both maps and reducers, which hurts map performance: the maps fetch data from external systems, so re-running them speculatively overloads those systems. Speculative execution only on reducers would be a great way to solve this problem. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2131) Speculative execution should be allowed for reducers only
[ https://issues.apache.org/jira/browse/HADOOP-2131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2131: -- Status: Open (was: Patch Available) Need to update this patch to reflect recent changes to trunk... Speculative execution should be allowed for reducers only - Key: HADOOP-2131 URL: https://issues.apache.org/jira/browse/HADOOP-2131 Project: Hadoop Issue Type: Improvement Components: mapred Environment: Hadoop job, map fetches data from external systems Reporter: Srikanth Kakani Assignee: Amareshwari Sri Ramadasu Priority: Critical Fix For: 0.16.0 Attachments: patch-2131.txt, patch-2131.txt Consider hadoop jobs where maps fetch data from external systems and emit the data, and the reducers are identity reducers. The data processed by these jobs is huge. There could be slow nodes in the cluster, and some of the reducers may run twice as slow as their counterparts, resulting in a long tail. Speculative execution would help greatly in such cases. However, with the current hadoop we have to enable speculative execution for both maps and reducers, which hurts map performance: the maps fetch data from external systems, so re-running them speculatively overloads those systems. Speculative execution only on reducers would be a great way to solve this problem. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2131) Speculative execution should be allowed for reducers only
[ https://issues.apache.org/jira/browse/HADOOP-2131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2131: -- Status: Patch Available (was: Open) Speculative execution should be allowed for reducers only - Key: HADOOP-2131 URL: https://issues.apache.org/jira/browse/HADOOP-2131 Project: Hadoop Issue Type: Improvement Components: mapred Environment: Hadoop job, map fetches data from external systems Reporter: Srikanth Kakani Assignee: Amareshwari Sri Ramadasu Priority: Critical Fix For: 0.16.0 Attachments: patch-2131.txt, patch-2131.txt, patch-2131.txt Consider hadoop jobs where maps fetch data from external systems and emit the data, and the reducers are identity reducers. The data processed by these jobs is huge. There could be slow nodes in the cluster, and some of the reducers may run twice as slow as their counterparts, resulting in a long tail. Speculative execution would help greatly in such cases. However, with the current hadoop we have to enable speculative execution for both maps and reducers, which hurts map performance: the maps fetch data from external systems, so re-running them speculatively overloads those systems. Speculative execution only on reducers would be a great way to solve this problem. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2535) Remove support for deprecated mapred.child.heap.size and indentation fix in TaskRunner.java
[ https://issues.apache.org/jira/browse/HADOOP-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2535: -- Resolution: Fixed Status: Resolved (was: Patch Available) I just committed this. Remove support for deprecated mapred.child.heap.size and indentation fix in TaskRunner.java --- Key: HADOOP-2535 URL: https://issues.apache.org/jira/browse/HADOOP-2535 Project: Hadoop Issue Type: Bug Components: mapred Affects Versions: 0.15.2 Reporter: Arun C Murthy Assignee: Arun C Murthy Priority: Minor Fix For: 0.16.0 Attachments: HADOOP-2535_0_20080107.patch, HADOOP-2535_1_20080107.patch, HADOOP-2535_2_20080107.patch TaskRunner.java (289-344) have wrong indentation - 4 spaces rather than the standard 2. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1660) add support for native library toDistributedCache
[ https://issues.apache.org/jira/browse/HADOOP-1660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-1660: -- Fix Version/s: 0.16.0 Status: Patch Available (was: Open) add support for native library toDistributedCache -- Key: HADOOP-1660 URL: https://issues.apache.org/jira/browse/HADOOP-1660 Project: Hadoop Issue Type: Improvement Components: mapred Environment: unix (different handling would be required for windows) Reporter: Alejandro Abdelnur Assignee: Arun C Murthy Fix For: 0.16.0 Attachments: HADOOP-1660_0_20080108.patch Currently, if an M/R job depends on a JNI-based component, the dynamic library must be available on all the task nodes. This is not possible, especially when you have no control over the cluster machines and are just using the cluster as a service. It should be possible to specify, using the DistributedCache, which native libraries a job needs. For example, via a new method 'public void addLibrary(Path libraryPath, JobConf conf)'. The added libraries would make it to the local FS of the task nodes (the same way as cached resources), but instead of being part of the classpath they would be copied to a lib directory, and that lib directory would be added to the LD_LIBRARY_PATH of the task JVM. An alternative would be to set the '-Djava.library.path=' task JVM parameter to the lib directory above. However, this would break for libraries that depend on other libraries, as the dependent one would not be on the LD_LIBRARY_PATH and the OS would fail to find it, since it is the OS loader, not the JVM, that loads the dependent library. For uncached usage of native libraries, a special directory in the JAR could be used. But I'd argue that the DistributedCache enhancement would be enough, and anyone who wants to use a native library should use the DistributedCache, or a JobConf addLibrary method that uses the DistributedCache under the hood at submission time. -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1660) add support for native library toDistributedCache
[ https://issues.apache.org/jira/browse/HADOOP-1660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-1660: -- Attachment: HADOOP-1660_0_20080108.patch Here is a candidate patch which adds the child task's cwd to its {{java.library.path}}; I've also updated the forrest docs to reflect this. add support for native library toDistributedCache -- Key: HADOOP-1660 URL: https://issues.apache.org/jira/browse/HADOOP-1660 Project: Hadoop Issue Type: Improvement Components: mapred Environment: unix (different handling would be required for windows) Reporter: Alejandro Abdelnur Assignee: Arun C Murthy Fix For: 0.16.0 Attachments: HADOOP-1660_0_20080108.patch Currently, if an M/R job depends on a JNI-based component, the dynamic library must be available on all the task nodes. This is not possible, especially when you have no control over the cluster machines and are just using the cluster as a service. It should be possible to specify, using the DistributedCache, which native libraries a job needs. For example, via a new method 'public void addLibrary(Path libraryPath, JobConf conf)'. The added libraries would make it to the local FS of the task nodes (the same way as cached resources), but instead of being part of the classpath they would be copied to a lib directory, and that lib directory would be added to the LD_LIBRARY_PATH of the task JVM. An alternative would be to set the '-Djava.library.path=' task JVM parameter to the lib directory above. However, this would break for libraries that depend on other libraries, as the dependent one would not be on the LD_LIBRARY_PATH and the OS would fail to find it, since it is the OS loader, not the JVM, that loads the dependent library. For uncached usage of native libraries, a special directory in the JAR could be used. But I'd argue that the DistributedCache enhancement would be enough, and anyone who wants to use a native library should use the DistributedCache, or a JobConf addLibrary method that uses the DistributedCache under the hood at submission time. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-2487) Provide an option to get job status for all jobs run by or submitted to a job tracker
[ https://issues.apache.org/jira/browse/HADOOP-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12556677#action_12556677 ] Arun C Murthy commented on HADOOP-2487: --- This patch enhances the _bin/hadoop job_ command like so: {noformat} $ bin/hadoop job -list all {noformat} However, I think it's better to keep the _job_ command per-job specific and add a *listAllJobs* switch to the _bin/hadoop jobtracker_ command: {noformat} $ bin/hadoop jobtracker -listalljobs {noformat} Provide an option to get job status for all jobs run by or submitted to a job tracker - Key: HADOOP-2487 URL: https://issues.apache.org/jira/browse/HADOOP-2487 Project: Hadoop Issue Type: New Feature Components: mapred Reporter: Hemanth Yamijala Assignee: Amareshwari Sri Ramadasu Fix For: 0.16.0 Attachments: patch-2487.txt This is an RFE for providing an RPC in Hadoop that can expose status information for jobs submitted to a JobTracker. Such a feature can be used to develop tools that analyse jobs. It is possible that other information is also useful - such as running times of jobs, etc. Comments? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HADOOP-1660) add support for native library toDistributedCache
[ https://issues.apache.org/jira/browse/HADOOP-1660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy reassigned HADOOP-1660: - Assignee: Arun C Murthy add support for native library toDistributedCache -- Key: HADOOP-1660 URL: https://issues.apache.org/jira/browse/HADOOP-1660 Project: Hadoop Issue Type: Improvement Components: mapred Environment: unix (different handling would be required for windows) Reporter: Alejandro Abdelnur Assignee: Arun C Murthy Currently, if an M/R job depends on a JNI-based component, the dynamic library must be available on all the task nodes. This is not possible, especially when you have no control over the cluster machines and are just using the cluster as a service. It should be possible to specify, using the DistributedCache, which native libraries a job needs. For example, via a new method 'public void addLibrary(Path libraryPath, JobConf conf)'. The added libraries would make it to the local FS of the task nodes (the same way as cached resources), but instead of being part of the classpath they would be copied to a lib directory, and that lib directory would be added to the LD_LIBRARY_PATH of the task JVM. An alternative would be to set the '-Djava.library.path=' task JVM parameter to the lib directory above. However, this would break for libraries that depend on other libraries, as the dependent one would not be on the LD_LIBRARY_PATH and the OS would fail to find it, since it is the OS loader, not the JVM, that loads the dependent library. For uncached usage of native libraries, a special directory in the JAR could be used. But I'd argue that the DistributedCache enhancement would be enough, and anyone who wants to use a native library should use the DistributedCache, or a JobConf addLibrary method that uses the DistributedCache under the hood at submission time. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HADOOP-2535) Indentation fix in TaskRunner.java
Indentation fix in TaskRunner.java -- Key: HADOOP-2535 URL: https://issues.apache.org/jira/browse/HADOOP-2535 Project: Hadoop Issue Type: Bug Components: mapred Affects Versions: 0.15.2 Reporter: Arun C Murthy Assignee: Arun C Murthy Priority: Trivial Fix For: 0.16.0 TaskRunner.java (289-344) have wrong indentation - 4 spaces rather than the standard 2. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2535) Remove support for deprecated mapred.child.heap.size and indentation fix in TaskRunner.java
[ https://issues.apache.org/jira/browse/HADOOP-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2535: -- Priority: Minor (was: Trivial) Summary: Remove support for deprecated mapred.child.heap.size and indentation fix in TaskRunner.java (was: Indentation fix in TaskRunner.java) Remove support for deprecated mapred.child.heap.size and indentation fix in TaskRunner.java --- Key: HADOOP-2535 URL: https://issues.apache.org/jira/browse/HADOOP-2535 Project: Hadoop Issue Type: Bug Components: mapred Affects Versions: 0.15.2 Reporter: Arun C Murthy Assignee: Arun C Murthy Priority: Minor Fix For: 0.16.0 TaskRunner.java (289-344) have wrong indentation - 4 spaces rather than the standard 2. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2535) Remove support for deprecated mapred.child.heap.size and indentation fix in TaskRunner.java
[ https://issues.apache.org/jira/browse/HADOOP-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2535: -- Attachment: HADOOP-2535_0_20080107.patch Candidate patch which removes support for deprecated {{mapred.child.heap.size}} and fixes the indentation irritants. I've also updated some comments... Remove support for deprecated mapred.child.heap.size and indentation fix in TaskRunner.java --- Key: HADOOP-2535 URL: https://issues.apache.org/jira/browse/HADOOP-2535 Project: Hadoop Issue Type: Bug Components: mapred Affects Versions: 0.15.2 Reporter: Arun C Murthy Assignee: Arun C Murthy Priority: Minor Fix For: 0.16.0 Attachments: HADOOP-2535_0_20080107.patch TaskRunner.java (289-344) have wrong indentation - 4 spaces rather than the standard 2. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2535) Remove support for deprecated mapred.child.heap.size and indentation fix in TaskRunner.java
[ https://issues.apache.org/jira/browse/HADOOP-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2535: -- Status: Patch Available (was: Open) Remove support for deprecated mapred.child.heap.size and indentation fix in TaskRunner.java --- Key: HADOOP-2535 URL: https://issues.apache.org/jira/browse/HADOOP-2535 Project: Hadoop Issue Type: Bug Components: mapred Affects Versions: 0.15.2 Reporter: Arun C Murthy Assignee: Arun C Murthy Priority: Minor Fix For: 0.16.0 Attachments: HADOOP-2535_0_20080107.patch TaskRunner.java (289-344) have wrong indentation - 4 spaces rather than the standard 2. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2535) Remove support for deprecated mapred.child.heap.size and indentation fix in TaskRunner.java
[ https://issues.apache.org/jira/browse/HADOOP-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2535: -- Status: Open (was: Patch Available) I noticed, a little too late, that another old comment is _wrong_ ... *smile* Remove support for deprecated mapred.child.heap.size and indentation fix in TaskRunner.java --- Key: HADOOP-2535 URL: https://issues.apache.org/jira/browse/HADOOP-2535 Project: Hadoop Issue Type: Bug Components: mapred Affects Versions: 0.15.2 Reporter: Arun C Murthy Assignee: Arun C Murthy Priority: Minor Fix For: 0.16.0 Attachments: HADOOP-2535_0_20080107.patch TaskRunner.java (289-344) have wrong indentation - 4 spaces rather than the standard 2. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2535) Remove support for deprecated mapred.child.heap.size and indentation fix in TaskRunner.java
[ https://issues.apache.org/jira/browse/HADOOP-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2535: -- Attachment: HADOOP-2535_1_20080107.patch Updated patch to fix: {noformat} - // <name>mapred.child.optional.jvm.args</name> {noformat} as {noformat} - // <name>mapred.child.java.opts</name> {noformat} Remove support for deprecated mapred.child.heap.size and indentation fix in TaskRunner.java --- Key: HADOOP-2535 URL: https://issues.apache.org/jira/browse/HADOOP-2535 Project: Hadoop Issue Type: Bug Components: mapred Affects Versions: 0.15.2 Reporter: Arun C Murthy Assignee: Arun C Murthy Priority: Minor Fix For: 0.16.0 Attachments: HADOOP-2535_0_20080107.patch, HADOOP-2535_1_20080107.patch TaskRunner.java (289-344) have wrong indentation - 4 spaces rather than the standard 2. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2535) Remove support for deprecated mapred.child.heap.size and indentation fix in TaskRunner.java
[ https://issues.apache.org/jira/browse/HADOOP-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2535: -- Status: Patch Available (was: Open) Remove support for deprecated mapred.child.heap.size and indentation fix in TaskRunner.java --- Key: HADOOP-2535 URL: https://issues.apache.org/jira/browse/HADOOP-2535 Project: Hadoop Issue Type: Bug Components: mapred Affects Versions: 0.15.2 Reporter: Arun C Murthy Assignee: Arun C Murthy Priority: Minor Fix For: 0.16.0 Attachments: HADOOP-2535_0_20080107.patch, HADOOP-2535_1_20080107.patch TaskRunner.java (289-344) have wrong indentation - 4 spaces rather than the standard 2. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2535) Remove support for deprecated mapred.child.heap.size and indentation fix in TaskRunner.java
[ https://issues.apache.org/jira/browse/HADOOP-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2535: -- Status: Open (was: Patch Available) I need to fix some more docs for {{mapred.child.heap.size}} ... Remove support for deprecated mapred.child.heap.size and indentation fix in TaskRunner.java --- Key: HADOOP-2535 URL: https://issues.apache.org/jira/browse/HADOOP-2535 Project: Hadoop Issue Type: Bug Components: mapred Affects Versions: 0.15.2 Reporter: Arun C Murthy Assignee: Arun C Murthy Priority: Minor Fix For: 0.16.0 Attachments: HADOOP-2535_0_20080107.patch, HADOOP-2535_1_20080107.patch TaskRunner.java (289-344) have wrong indentation - 4 spaces rather than the standard 2. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-2528) check permissions for job inputs and outputs
[ https://issues.apache.org/jira/browse/HADOOP-2528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12556271#action_12556271 ] Arun C Murthy commented on HADOOP-2528: --- I'm with Doug on the need for failing early if either of the input or output directories aren't readable/writable. I'm wondering if it makes sense to add utility APIs in fs which, given a directory name, check for existence, validate against a given set of permissions, etc. We could then use this to validate the job inputs via a single RPC, rather than one per file as today (without this patch). Thoughts? check permissions for job inputs and outputs Key: HADOOP-2528 URL: https://issues.apache.org/jira/browse/HADOOP-2528 Project: Hadoop Issue Type: Improvement Components: mapred Reporter: Doug Cutting Fix For: 0.16.0 Attachments: HADOOP-2528-0.patch, HADOOP-2528-1.patch On job submission, filesystem permissions should be checked to ensure that the input directory is readable and that the output directory is writable. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
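The utility API floated in that comment might look like the following sketch: given a directory, check existence and required access up front so a job can fail fast at submission time. This uses java.nio.file on the local filesystem purely as an illustration; the checkDir name is an assumption, and a real version would go through the Hadoop FileSystem API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hedged sketch of a fail-fast permission check for job submission, in the
// spirit of HADOOP-2528. Illustrative only: checkDir is a hypothetical
// helper operating on the local filesystem, not a Hadoop fs API.
public class PathChecks {
    static void checkDir(Path dir, boolean needRead, boolean needWrite) throws IOException {
        if (!Files.isDirectory(dir)) {
            throw new IOException(dir + " does not exist or is not a directory");
        }
        if (needRead && !Files.isReadable(dir)) {
            throw new IOException(dir + " is not readable");
        }
        if (needWrite && !Files.isWritable(dir)) {
            throw new IOException(dir + " is not writable");
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("job-input");
        checkDir(tmp, true, false);   // input must be readable
        System.out.println("ok");
    }
}
```

Batching these checks behind one call is what would let the JobTracker validate inputs with a single RPC instead of one per file.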
[jira] Commented: (HADOOP-2516) HADOOP-1819 removed a public api JobTracker.getTracker in 0.15.0
[ https://issues.apache.org/jira/browse/HADOOP-2516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12555850#action_12555850 ] Arun C Murthy commented on HADOOP-2516: --- bq. There isn't a way to get the old functionality without leaving the static variable. I think the static variable and the usage of it was causing trouble because the reference was visible before the object was finished being constructed. Fair point. Should we just mark HADOOP-1819 as an *incompatible change* for reference? bq. But whatever the outcome, it certainly shouldn't be marked for fixing in 0.16. +1 HADOOP-1819 removed a public api JobTracker.getTracker in 0.15.0 Key: HADOOP-2516 URL: https://issues.apache.org/jira/browse/HADOOP-2516 Project: Hadoop Issue Type: Bug Affects Versions: 0.15.1 Reporter: Arun C Murthy Assignee: Arun C Murthy HADOOP-1819 removed a 0.14.0 public api {{JobTracker.getTracker}} in 0.15.0. http://svn.apache.org/viewvc?view=rev&revision=575438 and http://svn.apache.org/viewvc/lucene/hadoop/branches/branch-0.15/src/java/org/apache/hadoop/mapred/JobTracker.java?r1=573708&r2=575438&diff_format=h There is a simple work-around i.e. use the return value of {{JobTracker.startTracker}} ... yet, is this a 0.15.2 blocker? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2106) Hadoop daemons should support generic command-line options by implementing the Tool interface
[ https://issues.apache.org/jira/browse/HADOOP-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2106: -- Component/s: mapred dfs I'm marking this for 0.17.0 after a discussion with Hemanth, the original requestor of this feature. Hadoop daemons should support generic command-line options by implementing the Tool interface - Key: HADOOP-2106 URL: https://issues.apache.org/jira/browse/HADOOP-2106 Project: Hadoop Issue Type: Improvement Components: dfs, mapred Reporter: Arun C Murthy Assignee: Arun C Murthy Priority: Critical Fix For: 0.16.0 Hadoop daemons (NN/DN/JT/TT) should support generic command-line options (i.e. -nn / -jt/ -conf / -D) by implementing the Tool interface. This is particularly useful for cases where the masters (NN/JT) are to be configured dynamically, e.g. via HoD. (I suspect we may need to tweak some of the hadoop scripts too.) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
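What "implementing the Tool interface" buys a daemon is that generic options like -D key=value get parsed into configuration before the daemon-specific arguments are handled. A toy sketch of that split, assuming nothing about Hadoop's real GenericOptionsParser beyond the -D convention:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hedged sketch for HADOOP-2106: separate generic -D key=value options from
// the remaining daemon-specific arguments. GenericOptions is a toy stand-in
// for Hadoop's option-parsing machinery, not the real class.
public class GenericOptions {
    final Map<String, String> conf = new HashMap<>();
    final List<String> remaining = new ArrayList<>();

    GenericOptions(String[] args) {
        for (int i = 0; i < args.length; i++) {
            if (args[i].equals("-D") && i + 1 < args.length) {
                // consume the following key=value pair into the configuration
                String[] kv = args[++i].split("=", 2);
                conf.put(kv[0], kv.length > 1 ? kv[1] : "");
            } else {
                remaining.add(args[i]);   // left for the daemon itself
            }
        }
    }
}
```

With this split, something like HoD could point a freshly started daemon at a dynamic master address purely from the command line, without editing config files.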
[jira] Updated: (HADOOP-1622) Hadoop should provide a way to allow the user to specify jar file(s) the user job depends on
[ https://issues.apache.org/jira/browse/HADOOP-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-1622: -- Component/s: mapred Description: More likely than not, a user's job may depend on multiple jars. Right now, when submitting a job through bin/hadoop, there is no way for the user to specify that. A workaround for that is to re-package all the dependent jars into a new jar or put the dependent jar files in the lib dir of the new jar. This workaround causes unnecessary inconvenience to the user. Furthermore, if the user does not own the main function (like the case when the user uses Aggregate, or datajoin, streaming), the user has to re-package those system jar files too. It is much desired that hadoop provides a clean and simple way for the user to specify a list of dependent jar files at the time of job submission. Something like: bin/hadoop --depending_jars j1.jar:j2.jar was: More likely than not, a user's job may depend on multiple jars. Right now, when submitting a job through bin/hadoop, there is no way for the user to specify that. A workaround for that is to re-package all the dependent jars into a new jar or put the dependent jar files in the lib dir of the new jar. This workaround causes unnecessary inconvenience to the user. Furthermore, if the user does not own the main function (like the case when the user uses Aggregate, or datajoin, streaming), the user has to re-package those system jar files too. It is much desired that hadoop provides a clean and simple way for the user to specify a list of dependent jar files at the time of job submission. 
Something like: bin/hadoop --depending_jars j1.jar:j2.jar Hadoop should provide a way to allow the user to specify jar file(s) the user job depends on Key: HADOOP-1622 URL: https://issues.apache.org/jira/browse/HADOOP-1622 Project: Hadoop Issue Type: Improvement Components: mapred Reporter: Runping Qi Assignee: Dennis Kubes Fix For: 0.16.0 Attachments: hadoop-1622-4-20071008.patch, HADOOP-1622-5.patch, HADOOP-1622-6.patch, HADOOP-1622-7.patch, HADOOP-1622-8.patch, HADOOP-1622-9.patch, multipleJobJars.patch, multipleJobResources.patch, multipleJobResources2.patch More likely than not, a user's job may depend on multiple jars. Right now, when submitting a job through bin/hadoop, there is no way for the user to specify that. A workaround for that is to re-package all the dependent jars into a new jar or put the dependent jar files in the lib dir of the new jar. This workaround causes unnecessary inconvenience to the user. Furthermore, if the user does not own the main function (like the case when the user uses Aggregate, or datajoin, streaming), the user has to re-package those system jar files too. It is much desired that hadoop provides a clean and simple way for the user to specify a list of dependent jar files at the time of job submission. Something like: bin/hadoop --depending_jars j1.jar:j2.jar -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2099) Pending, running, completed tasks should also be shown as percentage
[ https://issues.apache.org/jira/browse/HADOOP-2099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2099: -- Component/s: mapred I'm marking this for 0.17.0. Pending, running, completed tasks should also be shown as percentage Key: HADOOP-2099 URL: https://issues.apache.org/jira/browse/HADOOP-2099 Project: Hadoop Issue Type: Improvement Components: mapred Affects Versions: 0.14.0 Reporter: Amar Kamat Assignee: Amar Kamat Priority: Minor Fix For: 0.16.0 Attachments: HADOOP-2099.patch, percent.png -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2106) Hadoop daemons should support generic command-line options by implementing the Tool interface
[ https://issues.apache.org/jira/browse/HADOOP-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2106: -- Fix Version/s: (was: 0.16.0) 0.17.0 Hadoop daemons should support generic command-line options by implementing the Tool interface - Key: HADOOP-2106 URL: https://issues.apache.org/jira/browse/HADOOP-2106 Project: Hadoop Issue Type: Improvement Components: dfs, mapred Reporter: Arun C Murthy Assignee: Arun C Murthy Priority: Critical Fix For: 0.17.0 Hadoop daemons (NN/DN/JT/TT) should support generic command-line options (i.e. -nn / -jt / -conf / -D) by implementing the Tool interface. This is particularly useful for cases where the masters (NN/JT) are to be configured dynamically e.g. via HoD. (I suspect we will need to tweak some of the hadoop scripts too.) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2148) Inefficient FSDataset.getBlockFile()
[ https://issues.apache.org/jira/browse/HADOOP-2148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2148: -- Component/s: dfs Inefficient FSDataset.getBlockFile() Key: HADOOP-2148 URL: https://issues.apache.org/jira/browse/HADOOP-2148 Project: Hadoop Issue Type: Improvement Components: dfs Affects Versions: 0.14.0 Reporter: Konstantin Shvachko Fix For: 0.16.0 FSDataset.getBlockFile() first verifies that the block is valid and then returns the file name corresponding to the block. Doing that it performs the data-node blockMap lookup twice. Only one lookup is needed here. This is important since the data-node blockMap is big. Another observation is that data-nodes do not need the blockMap at all. File names can be derived from the block IDs, there is no need to hold Block to File mapping in memory. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
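The fix the report suggests is to collapse the "validate block" step and the "get file" step into a single blockMap access. A minimal sketch of that idea, using an illustrative stand-in class rather than the actual FSDataset code:

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only (not FSDataset itself): a single map lookup both
// validates the block and yields its file, instead of two separate lookups.
class BlockFileMap {
    private final Map<Long, File> blockMap = new HashMap<>();

    void add(long blockId, File f) {
        blockMap.put(blockId, f);
    }

    // One blockMap access: a non-null result proves the block is valid
    // and returns the corresponding file in the same step.
    File getBlockFile(long blockId) {
        File f = blockMap.get(blockId);
        if (f == null) {
            throw new IllegalArgumentException("Invalid block id " + blockId);
        }
        return f;
    }
}
```

The second observation in the ticket (deriving file names from block IDs) would remove even this single lookup, at the cost of fixing the name-to-ID mapping scheme.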
[jira] Updated: (HADOOP-2206) Design/implement a general log-aggregation framework for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2206: -- Component/s: mapred dfs Fix Version/s: (was: 0.16.0) 0.17.0 I'm marking for 0.17.0. Design/implement a general log-aggregation framework for Hadoop --- Key: HADOOP-2206 URL: https://issues.apache.org/jira/browse/HADOOP-2206 Project: Hadoop Issue Type: New Feature Components: dfs, mapred Reporter: Arun C Murthy Assignee: Arun C Murthy Fix For: 0.17.0 I'd like to propose a log-aggregation framework which facilitates collection, aggregation and storage of the logs of the Hadoop Map-Reduce framework and user-jobs in HDFS. Clearly the design/implementation of this framework is heavily influenced and limited by Hadoop itself, e.g. lack of appends, not too many small files (think: stdout/stderr/syslog of each map/reduce task) and so on. This framework will be especially useful once HoD (HADOOP-1301) is used to provision dynamic, per-user, Map-Reduce clusters. h4. Requirements: * Store the various logs to a configurable location in the Hadoop Distributed FileSystem ** User task logs (stdout, stderr, syslog) ** Map-Reduce daemons' logs (JobTracker and TaskTracker) * Integrate well with Hadoop and ensure no adverse performance impact on the Map-Reduce framework. * It must not use an HDFS file (or more!) per task, which would swamp the NameNode capabilities. * The aggregation system must be distributed and reliable. * Facilities/tools to read the aggregated logs. * The aggregated logs should be compressed. h4. Architecture: Here is a high-level overview of the log-aggregation framework: h5. Logging * Provision a cloud of log-aggregators in the cluster (outside of the Hadoop cluster, running on a subset of nodes in the cluster). Let's call each one in the cloud a Log Aggregator, i.e. an LA. * Each LA writes out 2 files per Map-Reduce cluster: an index file and a data file. 
The LA maintains one directory per Map-Reduce cluster on HDFS. * The index file format is simple: ** streamid (_streamid_ is either a daemon identifier e.g. tasktracker_foo.bar.com:57891 or $jobid-$taskid-(stdout|stderr|syslog) or individual task-logs) ** timestamp ** logs-data start offset ** no. of bytes * Each Hadoop daemon (JT/TT) is given the entire list of LAs in the cluster. * Each daemon picks one LA (at random) from the list, opens an exclusive stream with the LA after identifying itself (i.e. ${daemonid}) and sends its logs. In case of error/failure to log it just connects to another LA as above and starts logging to it. * The logs are sent to the LA by a new log4j appender. The appender provides some amount of buffering on the client-side. * Implement a feature in the TaskTracker which lets it use the same appender to send out the userlogs (stdout/stderr/syslog) to the LA after task completion. This is important to ensure that logging to the LA at runtime doesn't hurt the task's performance (see HADOOP-1553). The TaskTracker picks an LA per task in a manner similar to the one it uses for its own logs, identifies itself (${jobid}, ${taskid}, {stdout|stderr|syslog}) and streams the entire task-log at one go. In fact we can pick different LAs for each of the task's stdout, stderr and syslog logs - each an exclusive stream to a single LA. * The LA buffers some amount of data in memory (say 16K) and then flushes that data to the HDFS file (per LA per cluster) after writing out an entry to the index file. * The LA periodically purges old logs (monthly, fortnightly or weekly as today). h5. Getting the logged information The main requirement is to implement a simple set of tools to query the LA (i.e. the index/data files on HDFS) to glean the logged information. If we can think of each Map-Reduce cluster's logs as a set of archives (i.e. 
one file per cluster per LA used) we need the ability to query the log-archive to figure out the available streams and the ability to get one entire stream or a subset based on timestamp-ranges. Essentially these are simple tools which parse the index files of each LA (for a given Hadoop cluster) and return the required information. h6. Query for available streams The query just returns all the available streams in a cluster-log archive identified by the HDFS path. It looks something like this for a cluster with 3 nodes which ran 2 jobs, the first of which had 2 maps, 1 reduce and the second had 1 map, 1 reduce: {noformat} $ la -query /log-aggregation/cluster-20071113 Available streams: jobtracker_foo.bar.com:57893 tasktracker_baz.bar.com:57841
[jira] Updated: (HADOOP-2447) HDFS should be capable of limiting the total number of inodes in the system
[ https://issues.apache.org/jira/browse/HADOOP-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2447: -- Component/s: dfs HDFS should be capable of limiting the total number of inodes in the system --- Key: HADOOP-2447 URL: https://issues.apache.org/jira/browse/HADOOP-2447 Project: Hadoop Issue Type: New Feature Components: dfs Reporter: Sameer Paranjpye Assignee: dhruba borthakur Fix For: 0.16.0 Attachments: fileLimit.patch, fileLimit2.patch The HDFS Namenode should be capable of limiting the total number of Inodes (files + directories). This can be done through a config variable, settable in hadoop-site.xml. The default should be no limit. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
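A config variable like the one described would presumably be wired up as an ordinary hadoop-site.xml property. A hypothetical example (the property name and value here are illustrative, not taken from the attached patches):

```xml
<property>
  <name>dfs.max.objects</name>
  <!-- Upper bound on total inodes (files + directories); 0 means no limit. -->
  <value>500000</value>
</property>
```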
[jira] Updated: (HADOOP-2054) Improve memory model for map-side sorts
[ https://issues.apache.org/jira/browse/HADOOP-2054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2054: -- Fix Version/s: (was: 0.16.0) Pushing this to 0.17.0 and beyond... Improve memory model for map-side sorts --- Key: HADOOP-2054 URL: https://issues.apache.org/jira/browse/HADOOP-2054 Project: Hadoop Issue Type: Improvement Components: mapred Reporter: Arun C Murthy Assignee: Arun C Murthy {{MapTask#MapOutputBuffer}} uses a plain-jane {{DataOutputBuffer}} which defaults to a buffer of size 32-bytes, and the {{DataOutputBuffer#write}} call doubles the underlying byte-array when it needs more space. However for maps which output any decent amount of data (e.g. 128MB in examples/Sort.java) this means the buffer grows painfully slowly from 2^6 to 2^28, and each time this results in a new array being created, followed by an array-copy: {noformat} public void write(DataInput in, int len) throws IOException { int newcount = count + len; if (newcount > buf.length) { byte newbuf[] = new byte[Math.max(buf.length << 1, newcount)]; System.arraycopy(buf, 0, newbuf, 0, count); buf = newbuf; } in.readFully(buf, count, len); count = newcount; } {noformat} I reckon we could do much better in the {{MapTask}}, specifically... For e.g. we start with a buffer of size 1/4KB and quadruple, rather than double, up to, say 4/8/16MB. Then we resume doubling (or less). This means that it quickly ramps up to minimize the no. of {{System.arrayCopy}} calls and small-sized buffers to GC; and later starts doubling to ensure we don't ramp-up too quickly, to minimize memory wastage due to fragmentation. Of course, this issue is about benchmarking and figuring if all this is worth it, and, if so, what are the right set of trade-offs to make. Thoughts? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
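The quadruple-then-double ramp-up proposed above can be sketched as a pure sizing function. This is a sketch of the policy only, not the MapTask code; the 4 MB crossover point is an assumption picked from the ticket's "4/8/16MB" range:

```java
// Sketch of the proposed buffer growth policy: quadruple while the buffer is
// small (to cut down on arraycopy calls and short-lived garbage), then fall
// back to doubling past an assumed 4 MB crossover (to limit wasted memory).
class BufferGrowth {
    static final int QUADRUPLE_LIMIT = 4 * 1024 * 1024; // assumed crossover: 4 MB

    // Returns the new capacity for a buffer of size `current` that must
    // hold at least `needed` bytes.
    static int nextCapacity(int current, int needed) {
        int grown = current < QUADRUPLE_LIMIT ? current * 4 : current * 2;
        return Math.max(grown, needed);
    }
}
```

As the ticket says, whether this actually beats plain doubling is a benchmarking question; the sketch only pins down the shape of the trade-off being discussed.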
[jira] Updated: (HADOOP-2099) Pending, running, completed tasks should also be shown as percentage
[ https://issues.apache.org/jira/browse/HADOOP-2099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2099: -- Fix Version/s: (was: 0.16.0) 0.17.0 Pending, running, completed tasks should also be shown as percentage Key: HADOOP-2099 URL: https://issues.apache.org/jira/browse/HADOOP-2099 Project: Hadoop Issue Type: Improvement Components: mapred Affects Versions: 0.14.0 Reporter: Amar Kamat Assignee: Amar Kamat Priority: Minor Fix For: 0.17.0 Attachments: HADOOP-2099.patch, percent.png -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2120) dfs -getMerge does not do what it says it does
[ https://issues.apache.org/jira/browse/HADOOP-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2120: -- Component/s: (was: mapred) fs dfs -getMerge does not do what it says it does -- Key: HADOOP-2120 URL: https://issues.apache.org/jira/browse/HADOOP-2120 Project: Hadoop Issue Type: Bug Components: fs Affects Versions: 0.14.3 Environment: All Reporter: Milind Bhandarkar Fix For: 0.16.0 dfs -getMerge, which calls FileUtil.CopyMerge, contains this javadoc: {code} Get all the files in the directories that match the source file pattern * and merge and sort them to only one file on local fs * srcf is kept. {code} However, it only concatenates the set of input files, rather than merging them in sorted order. Ideally, the copyMerge should be equivalent to a map-reduce job with IdentityMapper and IdentityReducer with numReducers = 1. However, not having to run this as a map-reduce job has some advantages, since it increases cluster utilization during reduce phase. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
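The gap the bug describes is between concatenation and a true sorted merge. A minimal local sketch of the merge semantics the javadoc promises, using plain java.nio files rather than Hadoop's actual FileUtil code (and a full re-sort where a real implementation would do a k-way merge):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Sketch of "merge and sort them to only one output" for line-oriented text
// inputs - the behavior the copyMerge javadoc describes, as opposed to the
// plain concatenation it actually performs.
class SortedMerge {
    static List<String> merge(List<Path> inputs) throws IOException {
        List<String> all = new ArrayList<>();
        for (Path p : inputs) {
            all.addAll(Files.readAllLines(p)); // gather every input line
        }
        // Natural-order sort; a real merge of already-sorted inputs would
        // k-way merge instead of re-sorting everything.
        all.sort(null);
        return all;
    }
}
```

Concatenation would instead yield all of the first file's lines followed by all of the second's, which is only sorted when the inputs happen to arrive in order.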
[jira] Updated: (HADOOP-2141) speculative execution start up condition based on completion time
[ https://issues.apache.org/jira/browse/HADOOP-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2141: -- Moving this to 0.17.0 as discussions are still on... speculative execution start up condition based on completion time - Key: HADOOP-2141 URL: https://issues.apache.org/jira/browse/HADOOP-2141 Project: Hadoop Issue Type: Improvement Components: mapred Affects Versions: 0.15.0 Reporter: Koji Noguchi Assignee: Arun C Murthy Fix For: 0.17.0 We had one job with speculative execution hang. 4 reduce tasks were stuck with 95% completion because of a bad disk. Devaraj pointed out bq. One of the conditions that must be met for launching a speculative instance of a task is that it must be at least 20% behind the average progress, and this is not true here. It would be nice if speculative execution also starts up when tasks stop making progress. Devaraj suggested bq. Maybe, we should introduce a condition for average completion time for tasks in the speculative execution check. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1986) Add support for a general serialization mechanism for Map Reduce
[ https://issues.apache.org/jira/browse/HADOOP-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-1986: -- Fix Version/s: (was: 0.16.0) 0.17.0 I'm moving this to 0.17.0 while we continue discussions here... Add support for a general serialization mechanism for Map Reduce Key: HADOOP-1986 URL: https://issues.apache.org/jira/browse/HADOOP-1986 Project: Hadoop Issue Type: New Feature Components: mapred Reporter: Tom White Assignee: Tom White Fix For: 0.17.0 Attachments: hadoop-serializer-v2.tar.gz, SerializableWritable.java, serializer-v1.patch, serializer-v2.patch Currently Map Reduce programs have to use WritableComparable-Writable key-value pairs. While it's possible to write Writable wrappers for other serialization frameworks (such as Thrift), this is not very convenient: it would be nicer to be able to use arbitrary types directly, without explicit wrapping and unwrapping. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2125) Exception thrown for URL.openConnection used in the shuffle phase should be caught thus making it possible to reuse the connection for future use
[ https://issues.apache.org/jira/browse/HADOOP-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2125: -- Fix Version/s: (was: 0.16.0) 0.17.0 I'm moving it for 0.17.0, we need more investigation into HTTP keep-alive etc. Exception thrown for URL.openConnection used in the shuffle phase should be caught thus making it possible to reuse the connection for future use - Key: HADOOP-2125 URL: https://issues.apache.org/jira/browse/HADOOP-2125 Project: Hadoop Issue Type: Improvement Components: mapred Affects Versions: 0.16.0 Reporter: Amar Kamat Assignee: Amar Kamat Fix For: 0.17.0 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (HADOOP-2167) Reduce tips complete 100%, but job does not complete saying reduces still running.
[ https://issues.apache.org/jira/browse/HADOOP-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy resolved HADOOP-2167. --- Resolution: Cannot Reproduce We haven't seen this nor can we seem to repro it. Also HADOOP-2216 led us astray... I'm closing this for now, please re-open if required. Reduce tips complete 100%, but job does not complete saying reduces still running. -- Key: HADOOP-2167 URL: https://issues.apache.org/jira/browse/HADOOP-2167 Project: Hadoop Issue Type: Bug Components: mapred Reporter: Amareshwari Sri Ramadasu Assignee: Arun C Murthy Priority: Critical Fix For: 0.16.0 Job's reduces are stuck at 99.43% progress and 2 reduces in running state and Job is not complete. But the reduce task list on the job tracker shows they are complete 100% and marked as SUCCEEDED and Finishtime is available jobtasks.jsp and jobhistory also. With ipc.client.timeout = 60, the exceptions on TT's running the reduces are On one of the TTs, the logs show the following: 2007-11-07 08:34:16,092 INFO org.apache.hadoop.mapred.TaskTracker: Task task_200711070637_0001_r_000150_0 is done. 2007-11-07 08:35:34,013 INFO org.apache.hadoop.mapred.TaskTracker: Task task_200711070637_0001_r_000156_0 is done. 2007-11-07 08:42:44,751 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.SocketTimeoutException: timedout waiting for rpc response at org.apache.hadoop.ipc.Client.call(Client.java:484) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184) at org.apache.hadoop.mapred.$Proxy0.heartbeat(Unknown Source) at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:897) at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:799) at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1193) at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:2055) 2007-11-07 08:42:44,767 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to . 
On the other TT, 2007-11-07 08:40:30,484 INFO org.apache.hadoop.mapred.TaskTracker: Task task_200711070637_0001_r_000160_0 is done. 2007-11-07 08:42:45,508 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.SocketTimeoutException: timedout waiting for rpc response at org.apache.hadoop.ipc.Client.call(Client.java:484) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184) at org.apache.hadoop.mapred.$Proxy0.heartbeat(Unknown Source) at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:897) at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:799) at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1193) at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:2055) 2007-11-07 08:42:45,508 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to .. On JT logs, the reduce tasks are done successfully: 2007-11-07 06:39:09,151 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'task_200711070637_0001_r_000160_0' to tip tip_200711070637_0001_r_000160, for tracker 'x' 2007-11-07 08:42:45,708 INFO org.apache.hadoop.mapred.TaskRunner: Saved output of task 'task_200711070637_0001_r_000160_0' to 'y' 2007-11-07 08:42:45,708 INFO org.apache.hadoop.mapred.JobInProgress: Task 'task_200711070637_0001_r_000160_0' has completed tip_200711070637_0001_r_000160 successfully. This would suggest that if tasks are done before the timeout, the problem occurs in progress update. This is also not consistent since other reduce tasks in the same situation are successful. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (HADOOP-1733) LocalJobRunner uses old-style job/tip ids
[ https://issues.apache.org/jira/browse/HADOOP-1733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy resolved HADOOP-1733. --- Resolution: Won't Fix HADOOP-544 will make this moot... either way, as Doug notes: {quote} I don't see a need for job ids to be identical between localrunner and jobtracker. User code should not rely on the format of job ids. Having them different helps enforce that! We should never parse job ids, but only require them to be sufficiently unique. {quote} LocalJobRunner uses old-style job/tip ids - Key: HADOOP-1733 URL: https://issues.apache.org/jira/browse/HADOOP-1733 Project: Hadoop Issue Type: Bug Components: mapred Affects Versions: 0.14.0 Reporter: Arun C Murthy Assignee: Arun C Murthy Fix For: 0.16.0 We should rework LocalJobRunner to use the new style job/tip ids (post HADOOP-1473). Is this a *blocker*? This isn't a functionality bug, yet ... -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2221) Configuration.toString is broken
[ https://issues.apache.org/jira/browse/HADOOP-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2221: -- Fix Version/s: (was: 0.16.0) 0.17.0 Moving this to 0.17.0. Configuration.toString is broken Key: HADOOP-2221 URL: https://issues.apache.org/jira/browse/HADOOP-2221 Project: Hadoop Issue Type: Bug Components: conf Affects Versions: 0.15.0 Reporter: Arun C Murthy Assignee: Arun C Murthy Fix For: 0.17.0 Attachments: HADOOP-2221_1_2007117.patch {{Configuration.toString}} doesn't string-ify the {{Configuration.resources}} field which was added in HADOOP-785. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2141) speculative execution start up condition based on completion time
[ https://issues.apache.org/jira/browse/HADOOP-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2141: -- Fix Version/s: (was: 0.16.0) 0.17.0 speculative execution start up condition based on completion time - Key: HADOOP-2141 URL: https://issues.apache.org/jira/browse/HADOOP-2141 Project: Hadoop Issue Type: Improvement Components: mapred Affects Versions: 0.15.0 Reporter: Koji Noguchi Assignee: Arun C Murthy Fix For: 0.17.0 We had one job with speculative execution hang. 4 reduce tasks were stuck with 95% completion because of a bad disk. Devaraj pointed out bq. One of the conditions that must be met for launching a speculative instance of a task is that it must be at least 20% behind the average progress, and this is not true here. It would be nice if speculative execution also starts up when tasks stop making progress. Devaraj suggested bq. Maybe, we should introduce a condition for average completion time for tasks in the speculative execution check. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2165) Augment JobHistory to store tasks' userlogs
[ https://issues.apache.org/jira/browse/HADOOP-2165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2165: -- Fix Version/s: (was: 0.16.0) 0.17.0 Pushing this to 0.17.0. Augment JobHistory to store tasks' userlogs --- Key: HADOOP-2165 URL: https://issues.apache.org/jira/browse/HADOOP-2165 Project: Hadoop Issue Type: Improvement Components: mapred Reporter: Arun C Murthy Fix For: 0.17.0 It will be very useful to be able to see the job's userlogs (the stdout/stderr/syslog of the tasks) from the JobHistory page. It will greatly aid in debugging etc. At the very minimum we should have links from the JobHistory to the logs on the TT. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1281) Speculative map tasks aren't getting killed although the TIP completed
[ https://issues.apache.org/jira/browse/HADOOP-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-1281: -- Priority: Critical (was: Major) I'm marking up the priority to reflect that this is an important bug to fix for 0.16.0, we are losing lots of cycles due to this. Speculative map tasks aren't getting killed although the TIP completed -- Key: HADOOP-1281 URL: https://issues.apache.org/jira/browse/HADOOP-1281 Project: Hadoop Issue Type: Bug Components: mapred Affects Versions: 0.15.0 Reporter: Arun C Murthy Assignee: Arun C Murthy Priority: Critical Fix For: 0.16.0 Attachments: HADOOP-1281_1_20071117.patch, HADOOP-1281_2_20071123.patch The speculative map tasks run to completion although the TIP succeeded since the other task completed elsewhere. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-2510) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/HADOOP-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12555883#action_12555883 ] Arun C Murthy commented on HADOOP-2510: --- bq. [...] as opposed to a regular heartbeat from the TaskTracker. Probes can be done intelligently depending on the state of the overall Job and could significantly reduce network RPC traffic. Does this matter in practice on large clusters? Yes. To clarify, the idea is that the JobManager pings the TaskTrackers (today the TaskTracker pings the JobTracker) for status-updates for its tasks. Clearly it only pings the TaskTrackers which are _currently_ running its tasks. bq. currently, SPECULATIVE_GAP and SPECULATIVE_LAG control speculative execution at the task level. As with heartbeats versus probes, wouldn't this be better handled at the JobManager/MapReduce master level? Either way, this should be a JobConf param. Yes. Again the idea is that the JobManager decides to schedule speculative-tasks via SPECULATIVE_{LAG|GAP} etc., same as the normal tasks. It then asks the JobScheduler for free TaskTrackers. Thus _which_ task needs to run (normal/failed/speculative) is decided by the JobManager, whereas _where_ the task should be run (i.e. TaskTracker) is decided by the JobScheduler, it doesn't care about the nature of the task (it does care about the job's priorities etc.). bq. We set this to off for our cluster because it has caused severe instability when running many jobs simultaneously. Which version of Hadoop are you running? Things have improved a fair bit recently; further improvements are underway (HADOOP-2141). Map-Reduce 2.0 -- Key: HADOOP-2510 URL: https://issues.apache.org/jira/browse/HADOOP-2510 Project: Hadoop Issue Type: Improvement Components: mapred Reporter: Arun C Murthy We, at Yahoo!, have been using Hadoop-On-Demand as the resource provisioning/scheduling mechanism. With HoD the user uses a self-service system to ask for a set of nodes. 
HoD allocates these from a global pool and also provisions a private Map-Reduce cluster for the user. She then runs her jobs and shuts the cluster down via HoD when done. All user-private clusters use the same humongous, static HDFS (e.g. 2k node HDFS). More details about HoD are available here: HADOOP-1301. h3. Motivation The current deployment (Hadoop + HoD) has a couple of implications: * _Non-optimal Cluster Utilization_ 1. Job-private Map-Reduce clusters imply that the user-cluster potentially could be *idle* for at least a while before being detected and shut-down. 2. Elastic Jobs: Map-Reduce jobs, typically, have lots of maps with a much-smaller no. of reduces; with maps being light and quick and reduces being i/o heavy and longer-running. Users typically allocate clusters depending on the no. of maps (i.e. input size) which leads to the scenario where all the maps are done (idle nodes in the cluster) and the few reduces are chugging along. Right now, we do not have the ability to shrink the HoD'ed Map-Reduce clusters which would alleviate this issue. * _Impact on data-locality_ With the current setup of a static, large HDFS and much smaller (5/10/20/50 node) clusters there is a good chance of losing one of Map-Reduce's primary features: the ability to execute tasks on the datanodes where the input splits are located. In fact, we have seen the data-local tasks go down to 20-25 percent in the GridMix benchmarks, from the 95-98 percent we see on the randomwriter+sort runs run as part of the hadoopqa benchmarks (admittedly a synthetic benchmark, but yet). Admittedly, HADOOP-1985 (rack-aware Map-Reduce) helps significantly here. Primarily, the notion of *job-level scheduling* leading to private clusters, as opposed to *task-level scheduling*, is a good peg on which to hang the majority of the blame. Keeping the above factors in mind, here are some thoughts on how to re-structure Hadoop Map-Reduce to solve some of these issues. h3. 
State of the Art As it exists today, a large, static, Hadoop Map-Reduce cluster (forget HoD for a bit) does provide task-level scheduling; however as it exists today, its scalability to tens-of-thousands of user-jobs, per-week, is in question. Let's review its current architecture and main components: * JobTracker: It does both *task-scheduling* and *task-monitoring* (tasktrackers send task-statuses via periodic heartbeats), which implies it is fairly loaded. It is also a _single-point of failure_ in the Map-Reduce framework i.e. its failure implies that all the jobs in the system fail. This means a static, large Map-Reduce cluster is fairly susceptible and a definite suspect. Clearly HoD solves this by having per-job clusters, albeit with the above drawbacks. *
[jira] Updated: (HADOOP-2390) Document the user-controls for intermediate/output compression via forrest
[ https://issues.apache.org/jira/browse/HADOOP-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2390: -- Resolution: Fixed Status: Resolved (was: Patch Available) I just committed this. Document the user-controls for intermediate/output compression via forrest -- Key: HADOOP-2390 URL: https://issues.apache.org/jira/browse/HADOOP-2390 Project: Hadoop Issue Type: Improvement Components: documentation, mapred Reporter: Arun C Murthy Assignee: Arun C Murthy Fix For: 0.16.0 Attachments: HADOOP-2390_1_20071221.patch, HADOOP-2390_2_20080102.patch We should document the user-controls for compressing the intermediate and job outputs, including the types (record/block) and the various codecs in the hadoop website via forrest (mapred_tutorial.html). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HADOOP-2516) HADOOP-1819 removed a public api JobTracker.getTracker in 0.15.0
HADOOP-1819 removed a public api JobTracker.getTracker in 0.15.0 Key: HADOOP-2516 URL: https://issues.apache.org/jira/browse/HADOOP-2516 Project: Hadoop Issue Type: Bug Affects Versions: 0.15.1 Reporter: Arun C Murthy Assignee: Arun C Murthy Fix For: 0.16.0 HADOOP-1819 removed a 0.14.0 public api {{JobTracker.getTracker}} in 0.15.0. http://svn.apache.org/viewvc?view=rev&revision=575438 and http://svn.apache.org/viewvc/lucene/hadoop/branches/branch-0.15/src/java/org/apache/hadoop/mapred/JobTracker.java?r1=573708&r2=575438&diff_format=h There is a simple work-around, i.e. use the return value of {{JobTracker.startTracker}} ... yet, is this a 0.15.2 blocker? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
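The workaround amounts to holding on to the instance returned by the factory method instead of fetching it later through the removed static getter. A minimal sketch with stand-in classes (not the real Hadoop {{JobTracker}} API, whose signatures are not reproduced here):

```java
// Stand-in classes, NOT the real Hadoop API: they only model the shape of
// the workaround (keep the factory's return value instead of a static getter).
class MiniTracker {
    static MiniTracker startTracker() {   // analogous to JobTracker.startTracker(conf)
        return new MiniTracker();
    }
    String status() { return "RUNNING"; }
    // a 0.14-style static getTracker() accessor deliberately does not exist here
}

public class GetTrackerWorkaround {
    public static void main(String[] args) {
        MiniTracker tracker = MiniTracker.startTracker(); // hold the reference
        System.out.println(tracker.status());
    }
}
```

Callers who previously reached the tracker through the static accessor would instead pass this saved reference around themselves.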
[jira] Commented: (HADOOP-2510) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/HADOOP-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12555634#action_12555634 ] Arun C Murthy commented on HADOOP-2510: --- bq. 2) One of our problems [...] Right, this will not affect your special case at all... you can continue to run multiple clusters on the same machines with different configs, ports etc. bq. I'm not totally sure [...] Yep. The point is to get people to think about ways of improving Map-Reduce to be scalable/reliable and maintain the single static MR cluster and do away with the notion of job-private clusters i.e. HoD; as expounded in the Motivation section. The stretch is to see if we can enhance it to support other, non-MR paradigms too. bq. You discuss the jobtracker being a single point of failure, but the namenode is already a more serious point of failure, since it is much more work to rebuild a namenode if it dies. Sure, that is at least as important; however I believe it's unrelated to this discussion. Map-Reduce 2.0 -- Key: HADOOP-2510 URL: https://issues.apache.org/jira/browse/HADOOP-2510 Project: Hadoop Issue Type: Improvement Components: mapred Reporter: Arun C Murthy We, at Yahoo!, have been using Hadoop-On-Demand as the resource provisioning/scheduling mechanism. With HoD the user uses a self-service system to ask for a set of nodes. HoD allocates these from a global pool and also provisions a private Map-Reduce cluster for the user. She then runs her jobs and shuts the cluster down via HoD when done. All user-private clusters use the same humongous, static HDFS (e.g. 2k node HDFS). More details about HoD are available here: HADOOP-1301. h3. Motivation The current deployment (Hadoop + HoD) has a couple of implications: * _Non-optimal Cluster Utilization_ 1. Job-private Map-Reduce clusters imply that the user-cluster potentially could be *idle* for at least a while before being detected and shut down. 2.
Elastic Jobs: Map-Reduce jobs typically have lots of maps with a much smaller no. of reduces, with maps being light and quick and reduces being i/o heavy and longer-running. Users typically allocate clusters depending on the no. of maps (i.e. input size), which leads to the scenario where all the maps are done (idle nodes in the cluster) and the few reduces are chugging along. Right now, we do not have the ability to shrink the HoD'ed Map-Reduce clusters, which would alleviate this issue. * _Impact on data-locality_ With the current setup of a static, large HDFS and much smaller (5/10/20/50 node) clusters there is a good chance of losing one of Map-Reduce's primary features: the ability to execute tasks on the datanodes where the input splits are located. In fact, we have seen the data-local tasks go down to 20-25 percent in the GridMix benchmarks, from the 95-98 percent we see on the randomwriter+sort runs run as part of the hadoopqa benchmarks (admittedly a synthetic benchmark, but still). Admittedly, HADOOP-1985 (rack-aware Map-Reduce) helps significantly here. Primarily, the notion of *job-level scheduling* leading to private clusters, as opposed to *task-level scheduling*, is a good peg on which to hang the majority of the blame. Keeping the above factors in mind, here are some thoughts on how to re-structure Hadoop Map-Reduce to solve some of these issues. h3. State of the Art As it exists today, a large, static Hadoop Map-Reduce cluster (forget HoD for a bit) does provide task-level scheduling; however, its scalability to tens of thousands of user-jobs per week is in question. Let's review its current architecture and main components: * JobTracker: It does both *task-scheduling* and *task-monitoring* (tasktrackers send task-statuses via periodic heartbeats), which implies it is fairly loaded. It is also a _single point of failure_ in the Map-Reduce framework, i.e. its failure implies that all the jobs in the system fail.
This means a static, large Map-Reduce cluster is fairly susceptible and a definite suspect. Clearly HoD solves this by having per-job clusters, albeit with the above drawbacks. * TaskTracker: The slave in the system which executes one task at-a-time under directions from the JobTracker. * JobClient: The per-job client which just submits the job and polls the JobTracker for status. h3. Proposal - Map-Reduce 2.0 The primary idea is to move to task-level scheduling and static Map-Reduce clusters (so as to maintain the same storage cluster and compute cluster paradigm) as a way to directly tackle the two main issues illustrated above. Clearly, we will have to get around the existing problems, especially w.r.t. scalability and reliability. The proposal is to re-work Hadoop
[jira] Updated: (HADOOP-2344) Free up the buffers (input and error) while executing a shell command before waiting for it to finish.
[ https://issues.apache.org/jira/browse/HADOOP-2344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2344: -- Resolution: Fixed Status: Resolved (was: Patch Available) I just committed this. Thanks, Amar! Free up the buffers (input and error) while executing a shell command before waiting for it to finish. -- Key: HADOOP-2344 URL: https://issues.apache.org/jira/browse/HADOOP-2344 Project: Hadoop Issue Type: Bug Affects Versions: 0.16.0 Reporter: Amar Kamat Assignee: Amar Kamat Fix For: 0.16.0 Attachments: HADOOP-2231.patch, HADOOP-2344.patch, HADOOP-2344.patch, HADOOP-2344.patch, HADOOP-2344.patch, HADOOP-2344.patch, HADOOP-2344.patch Process.waitFor() should be invoked after freeing up the input and error stream. While fixing https://issues.apache.org/jira/browse/HADOOP-2231 we found that this might be a possible cause. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
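The hang the patch guards against is the classic pipe-buffer deadlock: if the parent calls {{Process.waitFor()}} while the child's stdout/stderr pipes are full, both block forever. This is an illustrative sketch of the fix pattern in plain Java (drain the child's streams before {{waitFor()}}), not the Hadoop patch itself:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class DrainBeforeWait {
    public static void main(String[] args) throws Exception {
        // Consume stdout fully *before* waitFor(), so a full OS pipe buffer
        // cannot block the child and deadlock the parent on waitFor().
        Process p = new ProcessBuilder("echo", "hello").start();
        StringBuilder out = new StringBuilder();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) out.append(line);
        }
        p.getErrorStream().close();   // nothing expected on stderr here
        int rc = p.waitFor();         // safe: buffers are already drained
        System.out.println(out + " exit=" + rc);
    }
}
```

For commands that write to both streams concurrently, each stream should be drained on its own thread; this single-threaded version is only safe when one stream is known to be quiet.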
[jira] Updated: (HADOOP-2390) Document the user-controls for intermediate/output compression via forrest
[ https://issues.apache.org/jira/browse/HADOOP-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2390: -- Status: Open (was: Patch Available) Devaraj pointed out that the current patch doesn't talk about native-hadoop compression libraries... Unfortunately it would mean that we need to port http://wiki.apache.org/lucene-hadoop/NativeHadoop to forrest and then link off it; a new patch is forthcoming. Document the user-controls for intermediate/output compression via forrest -- Key: HADOOP-2390 URL: https://issues.apache.org/jira/browse/HADOOP-2390 Project: Hadoop Issue Type: Improvement Components: documentation, mapred Reporter: Arun C Murthy Assignee: Arun C Murthy Fix For: 0.16.0 Attachments: HADOOP-2390_1_20071221.patch We should document the user-controls for compressing the intermediate and job outputs, including the types (record/block) and the various codecs in the hadoop website via forrest (mapred_tutorial.html). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2390) Document the user-controls for intermediate/output compression via forrest
[ https://issues.apache.org/jira/browse/HADOOP-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2390: -- Status: Patch Available (was: Open) Document the user-controls for intermediate/output compression via forrest -- Key: HADOOP-2390 URL: https://issues.apache.org/jira/browse/HADOOP-2390 Project: Hadoop Issue Type: Improvement Components: documentation, mapred Reporter: Arun C Murthy Assignee: Arun C Murthy Fix For: 0.16.0 Attachments: HADOOP-2390_1_20071221.patch, HADOOP-2390_2_20080102.patch We should document the user-controls for compressing the intermediate and job outputs, including the types (record/block) and the various codecs in the hadoop website via forrest (mapred_tutorial.html). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2390) Document the user-controls for intermediate/output compression via forrest
[ https://issues.apache.org/jira/browse/HADOOP-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2390: -- Attachment: HADOOP-2390_2_20080102.patch Promised patch incorporating details found in http://wiki.apache.org/lucene-hadoop/NativeHadoop into forrest-based native_libraries.html. Document the user-controls for intermediate/output compression via forrest -- Key: HADOOP-2390 URL: https://issues.apache.org/jira/browse/HADOOP-2390 Project: Hadoop Issue Type: Improvement Components: documentation, mapred Reporter: Arun C Murthy Assignee: Arun C Murthy Fix For: 0.16.0 Attachments: HADOOP-2390_1_20071221.patch, HADOOP-2390_2_20080102.patch We should document the user-controls for compressing the intermediate and job outputs, including the types (record/block) and the various codecs in the hadoop website via forrest (mapred_tutorial.html). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HADOOP-2510) Map-Reduce 2.0
Map-Reduce 2.0 -- Key: HADOOP-2510 URL: https://issues.apache.org/jira/browse/HADOOP-2510 Project: Hadoop Issue Type: Improvement Components: mapred Reporter: Arun C Murthy We, at Yahoo!, have been using Hadoop-On-Demand as the resource provisioning/scheduling mechanism. With HoD the user uses a self-service system to ask for a set of nodes. HoD allocates these from a global pool and also provisions a private Map-Reduce cluster for the user. She then runs her jobs and shuts the cluster down via HoD when done. All user-private clusters use the same humongous, static HDFS (e.g. 2k node HDFS). More details about HoD are available here: HADOOP-1301. h3. Motivation The current deployment (Hadoop + HoD) has a couple of implications: * _Non-optimal Cluster Utilization_ 1. Job-private Map-Reduce clusters imply that the user-cluster potentially could be *idle* for at least a while before being detected and shut down. 2. Elastic Jobs: Map-Reduce jobs typically have lots of maps with a much smaller no. of reduces, with maps being light and quick and reduces being i/o heavy and longer-running. Users typically allocate clusters depending on the no. of maps (i.e. input size), which leads to the scenario where all the maps are done (idle nodes in the cluster) and the few reduces are chugging along. Right now, we do not have the ability to shrink the HoD'ed Map-Reduce clusters, which would alleviate this issue. * _Impact on data-locality_ With the current setup of a static, large HDFS and much smaller (5/10/20/50 node) clusters there is a good chance of losing one of Map-Reduce's primary features: the ability to execute tasks on the datanodes where the input splits are located. In fact, we have seen the data-local tasks go down to 20-25 percent in the GridMix benchmarks, from the 95-98 percent we see on the randomwriter+sort runs run as part of the hadoopqa benchmarks (admittedly a synthetic benchmark, but still). Admittedly, HADOOP-1985 (rack-aware Map-Reduce) helps significantly here.
Primarily, the notion of *job-level scheduling* leading to private clusters, as opposed to *task-level scheduling*, is a good peg on which to hang the majority of the blame. Keeping the above factors in mind, here are some thoughts on how to re-structure Hadoop Map-Reduce to solve some of these issues. h3. State of the Art As it exists today, a large, static Hadoop Map-Reduce cluster (forget HoD for a bit) does provide task-level scheduling; however, its scalability to tens of thousands of user-jobs per week is in question. Let's review its current architecture and main components: * JobTracker: It does both *task-scheduling* and *task-monitoring* (tasktrackers send task-statuses via periodic heartbeats), which implies it is fairly loaded. It is also a _single point of failure_ in the Map-Reduce framework, i.e. its failure implies that all the jobs in the system fail. This means a static, large Map-Reduce cluster is fairly susceptible and a definite suspect. Clearly HoD solves this by having per-job clusters, albeit with the above drawbacks. * TaskTracker: The slave in the system which executes one task at a time under directions from the JobTracker. * JobClient: The per-job client which just submits the job and polls the JobTracker for status. h3. Proposal - Map-Reduce 2.0 The primary idea is to move to task-level scheduling and static Map-Reduce clusters (so as to maintain the same storage cluster and compute cluster paradigm) as a way to directly tackle the two main issues illustrated above. Clearly, we will have to get around the existing problems, especially w.r.t. scalability and reliability. The proposal is to re-work Hadoop Map-Reduce to make it suitable for a large, static cluster. Here is an overview of how its main components would look: * JobTracker: Turn the JobTracker into a pure task-scheduler, a global one. Let's call this the *JobScheduler* henceforth.
Clearly (data-locality aware) Maui/Moab are candidates for being the scheduler, in which case the JobScheduler is just a thin wrapper around them. * TaskTracker: These stay as before, with some minor changes as illustrated later in the piece. * JobClient: Fatten up the JobClient by putting a lot more intelligence into it. Enhance it to talk to the JobTracker to ask for available TaskTrackers and then contact them to schedule and monitor the tasks. So we'll have lots of per-job clients talking to the JobScheduler and the relevant TaskTrackers for their respective jobs, a big change from today. Let's call this the *JobManager* henceforth. A broad sketch of how things would work: h4. Deployment There is a single, static, large Map-Reduce cluster, and no per-job clusters. Essentially there is one global JobScheduler with thousands of independent
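The proposed split could be sketched as a pair of interfaces. Every name and signature below is invented for illustration; the proposal does not pin down an API, and these are not real Hadoop types:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the proposed split: a slimmed-down global scheduler
// that only hands out TaskTracker slots, and a fattened per-job client that
// schedules and monitors its own tasks against those slots.
interface JobScheduler {
    List<String> allocateTaskTrackers(String jobId, int numSlots);
}

interface JobManager {
    void runJob(String jobId, JobScheduler scheduler);
}

public class MapReduce2Sketch {
    public static void main(String[] args) {
        // Toy scheduler: hands out the first n of a fixed set of trackers.
        JobScheduler scheduler =
            (jobId, n) -> Arrays.asList("tt1", "tt2").subList(0, n);
        // Toy per-job manager: asks for slots, then would contact each tracker.
        JobManager manager = (jobId, s) ->
            System.out.println(jobId + " -> " + s.allocateTaskTrackers(jobId, 2));
        manager.runJob("job_0001", scheduler);
    }
}
```

The design point this illustrates is that job state and task monitoring live in many independent JobManagers, so the global JobScheduler carries far less load and its failure no longer kills every running job.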
[jira] Updated: (HADOOP-2511) HADOOP-2344 introduced a javadoc warning
[ https://issues.apache.org/jira/browse/HADOOP-2511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2511: -- Attachment: HADOOP-2511_1_20080103.patch Straight-forward fix. HADOOP-2344 introduced a javadoc warning Key: HADOOP-2511 URL: https://issues.apache.org/jira/browse/HADOOP-2511 Project: Hadoop Issue Type: Bug Components: documentation Affects Versions: 0.16.0 Reporter: Arun C Murthy Assignee: Arun C Murthy Fix For: 0.16.0 Attachments: HADOOP-2511_1_20080103.patch {noformat} [javadoc] /export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/src/java/org/apache/hadoop/util/Shell.java:70: warning - @param argument Interval is not a parameter name. {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
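The warning class is easy to reproduce: javadoc flags any @param whose name does not exactly match a declared parameter. A sketch with a hypothetical method (not the actual Shell.java code):

```java
public class JavadocParamDemo {
    /**
     * Runs a command periodically.
     *
     * @param interval refresh interval in milliseconds; a tag written as
     *                 "Interval" (capital I) would not match the declared
     *                 parameter and would trigger the same
     *                 "is not a parameter name" javadoc warning.
     */
    static long setRefreshInterval(long interval) {
        return interval;
    }

    public static void main(String[] args) {
        System.out.println(setRefreshInterval(1000L));
    }
}
```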
[jira] Updated: (HADOOP-2511) HADOOP-2344 introduced a javadoc warning
[ https://issues.apache.org/jira/browse/HADOOP-2511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2511: -- Status: Patch Available (was: Open) HADOOP-2344 introduced a javadoc warning Key: HADOOP-2511 URL: https://issues.apache.org/jira/browse/HADOOP-2511 Project: Hadoop Issue Type: Bug Components: documentation Affects Versions: 0.16.0 Reporter: Arun C Murthy Assignee: Arun C Murthy Fix For: 0.16.0 Attachments: HADOOP-2511_1_20080103.patch {noformat} [javadoc] /export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/src/java/org/apache/hadoop/util/Shell.java:70: warning - @param argument Interval is not a parameter name. {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-2344) Free up the buffers (input and error) while executing a shell command before waiting for it to finish.
[ https://issues.apache.org/jira/browse/HADOOP-2344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12555424#action_12555424 ] Arun C Murthy commented on HADOOP-2344: --- Ugh, the long story is that a similar patch didn't generate a javadoc warning... sad excuse. My bad. I filed/fixed HADOOP-2511 to fix the javadoc warning. Free up the buffers (input and error) while executing a shell command before waiting for it to finish. -- Key: HADOOP-2344 URL: https://issues.apache.org/jira/browse/HADOOP-2344 Project: Hadoop Issue Type: Bug Affects Versions: 0.16.0 Reporter: Amar Kamat Assignee: Amar Kamat Fix For: 0.16.0 Attachments: HADOOP-2231.patch, HADOOP-2344.patch, HADOOP-2344.patch, HADOOP-2344.patch, HADOOP-2344.patch, HADOOP-2344.patch, HADOOP-2344.patch Process.waitFor() should be invoked after freeing up the input and error stream. While fixing https://issues.apache.org/jira/browse/HADOOP-2231 we found that this might be a possible cause. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HADOOP-2501) Implement utility-tools for working with SequenceFiles
Implement utility-tools for working with SequenceFiles -- Key: HADOOP-2501 URL: https://issues.apache.org/jira/browse/HADOOP-2501 Project: Hadoop Issue Type: New Feature Components: io Reporter: Arun C Murthy It would be nice to implement a bunch of utilities to work with SequenceFiles: * info (print-out header information such as key/value types, compression type/codec etc.) * cat * head/tail * merge multiple seq-files into one * ... I'd imagine this would look like: {noformat} $ bin/hadoop seq -info /user/joe/blah.seq $ bin/hadoop seq -head -n 10 /user/joe/blah.seq {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2344) Free up the buffers (input and error) while executing a shell command before waiting for it to finish.
[ https://issues.apache.org/jira/browse/HADOOP-2344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HADOOP-2344: -- Status: Patch Available (was: Open) Free up the buffers (input and error) while executing a shell command before waiting for it to finish. -- Key: HADOOP-2344 URL: https://issues.apache.org/jira/browse/HADOOP-2344 Project: Hadoop Issue Type: Bug Affects Versions: 0.16.0 Reporter: Amar Kamat Assignee: Amar Kamat Fix For: 0.16.0 Attachments: HADOOP-2231.patch, HADOOP-2344.patch, HADOOP-2344.patch, HADOOP-2344.patch, HADOOP-2344.patch, HADOOP-2344.patch, HADOOP-2344.patch Process.waitFor() should be invoked after freeing up the input and error stream. While fixing https://issues.apache.org/jira/browse/HADOOP-2231 we found that this might be a possible cause. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.