[jira] [Commented] (MAPREDUCE-6066) Speculative attempts should not run on the same node as their original attempt
[ https://issues.apache.org/jira/browse/MAPREDUCE-6066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14909840#comment-14909840 ] Chen He commented on MAPREDUCE-6066: This problem is interesting. I believe academic publications already offer many solutions to it. One corner case we need to be careful about: if the AM only gets containers from a single NM, then we should allow speculative tasks to run on the same node. Categorizing nodes becomes very important. We should first determine what is causing the task (map or reduce) to be slow. Then we can make a more reasonable decision. > Speculative attempts should not run on the same node as their original attempt > -- > > Key: MAPREDUCE-6066 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6066 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster, scheduler >Affects Versions: 2.5.0, 2.6.0 >Reporter: Todd Lipcon > Attachments: conf.xml > > > I'm seeing a behavior on trunk with fair scheduler enabled where a > speculative reduce attempt is getting run on the same node as its original > attempt. This doesn't make sense -- the main reason for speculative execution > is to deal with a slow node, so scheduling a second attempt on the same node > would just make the problem worse if anything. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
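The placement rule discussed in the comment above (avoid the original attempt's node, but allow it when the AM only has containers from a single NM) can be sketched roughly as follows. This is only an illustration: SpeculativeNodeChooser and choose are hypothetical names, nodes are represented as plain host-name strings, and none of this is the actual ApplicationMaster or scheduler code.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical helper for illustration; not part of the MapReduce AM API. */
final class SpeculativeNodeChooser {
    private SpeculativeNodeChooser() {}

    /**
     * Picks a node for a speculative attempt, preferring any candidate other
     * than the node that runs the original attempt. Falls back to the
     * original node only when no other candidate exists (the single-NM case).
     */
    static Optional<String> choose(List<String> candidateNodes, String originalNode) {
        for (String node : candidateNodes) {
            if (!node.equals(originalNode)) {
                return Optional.of(node);
            }
        }
        // Either the list is empty, or every candidate is the original node.
        return candidateNodes.isEmpty() ? Optional.empty() : Optional.of(originalNode);
    }
}
```

With candidates nm1 and nm2 and the original attempt on nm1, the sketch picks nm2; with nm1 as the only candidate it still returns nm1 rather than blocking speculation.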
[jira] [Updated] (MAPREDUCE-6066) Speculative attempts should not run on the same node as their original attempt
[ https://issues.apache.org/jira/browse/MAPREDUCE-6066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-6066: --- Affects Version/s: 2.6.0 > Speculative attempts should not run on the same node as their original attempt > -- > > Key: MAPREDUCE-6066 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6066 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster, scheduler >Affects Versions: 2.5.0, 2.6.0 >Reporter: Todd Lipcon > Attachments: conf.xml > > > I'm seeing a behavior on trunk with fair scheduler enabled where a > speculative reduce attempt is getting run on the same node as its original > attempt. This doesn't make sense -- the main reason for speculative execution > is to deal with a slow node, so scheduling a second attempt on the same node > would just make the problem worse if anything. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-3182) loadgen ignores -m command line when writing random data
[ https://issues.apache.org/jira/browse/MAPREDUCE-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535906#comment-14535906 ] Chen He commented on MAPREDUCE-3182: Thank you for the review, [~ajisakaa]. I will update the patch this weekend. loadgen ignores -m command line when writing random data Key: MAPREDUCE-3182 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3182 Project: Hadoop Map/Reduce Issue Type: Bug Components: documentation, mrv2, test Affects Versions: 0.23.0, 2.3.0 Reporter: Jonathan Eagles Assignee: Chen He Priority: Minor Labels: BB2015-05-TBR Attachments: MAPREDUCE-3182.patch If no input directories are specified, loadgen goes into a special mode where random data is generated and written. In that mode, setting the number of mappers (-m command line option) is overridden by a calculation. Instead, it should take into consideration the user specified number of mappers and fall back to the calculation. In addition, update the documentation as well to match the new behavior in the code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
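The behavior requested in the description above (honor the user-specified number of mappers, otherwise fall back to the calculation) might look roughly like this. The class name, the UNSET sentinel, and the size-based fallback formula are all assumptions made for illustration; the calculation loadgen actually performs is not shown in this thread.

```java
/** Hypothetical sketch; names and the fallback formula are assumptions. */
final class LoadgenMapCount {
    /** Assumed sentinel meaning "-m was not given on the command line". */
    static final int UNSET = -1;

    private LoadgenMapCount() {}

    static int resolve(int userMaps, long totalBytes, long bytesPerMap) {
        if (bytesPerMap <= 0) {
            throw new IllegalArgumentException("bytesPerMap must be positive");
        }
        if (userMaps > 0) {
            // The user asked for a specific number of mappers: honor it.
            return userMaps;
        }
        // Placeholder fallback: ceil(totalBytes / bytesPerMap), at least one map.
        long computed = (totalBytes + bytesPerMap - 1) / bytesPerMap;
        return (int) Math.max(1L, computed);
    }
}
```

The point of the sketch is the ordering: the explicit option wins, and the calculation is consulted only when the option is absent.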
[jira] [Commented] (MAPREDUCE-6082) Excessive logging by org.apache.hadoop.util.Progress when value is NaN
[ https://issues.apache.org/jira/browse/MAPREDUCE-6082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130038#comment-14130038 ] Chen He commented on MAPREDUCE-6082: +1 lgtm Excessive logging by org.apache.hadoop.util.Progress when value is NaN -- Key: MAPREDUCE-6082 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6082 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai Attachments: MAPREDUCE-6082.patch MAPREDUCE-5671 fixed the illegal progress values that do not fall into (0,1) interval when the progress value is given. Whenever illegal value was encountered, LOG.warn would log that incident. As a result, each of the task's syslog will be full of WARN [main] org.apache.hadoop.util.Progress: Illegal progress value found, progress is Float.NaN. Progress will be changed to 0 Each input record will contribute to one line of such log, leading to most of the tasks' syslog 1GB. We will need to change the log level to debug to avoid such excessive logging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
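The change proposed in the description above (keep correcting illegal progress values, but log the incident at debug level instead of warn) can be sketched as below. This is not the code of org.apache.hadoop.util.Progress: ProgressSanitizer is a hypothetical name, and java.util.logging with Level.FINE stands in for the debug level so that the example has no external dependencies.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical stand-in for the value check; not the Hadoop Progress class. */
final class ProgressSanitizer {
    private static final Logger LOG = Logger.getLogger(ProgressSanitizer.class.getName());

    private ProgressSanitizer() {}

    /** Forces a progress value into the range 0 to 1, logging only at debug (FINE) level. */
    static float sanitize(float progress) {
        if (Float.isNaN(progress)) {
            debug("Illegal progress value found, progress is Float.NaN. Progress will be changed to 0");
            return 0f;
        }
        if (progress < 0f) { // also covers negative infinity
            debug("Illegal progress value found, progress is less than 0. Progress will be changed to 0");
            return 0f;
        }
        if (progress > 1f) { // also covers positive infinity
            debug("Illegal progress value found, progress is larger than 1. Progress will be changed to 1");
            return 1f;
        }
        return progress;
    }

    private static void debug(String message) {
        // With the default configuration FINE is not emitted, so a message per
        // input record no longer fills the task log.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine(message);
        }
    }
}
```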
[jira] [Commented] (MAPREDUCE-6066) Speculative attempts should not run on the same node as their original attempt
[ https://issues.apache.org/jira/browse/MAPREDUCE-6066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119914#comment-14119914 ] Chen He commented on MAPREDUCE-6066: I will take a look. Speculative attempts should not run on the same node as their original attempt -- Key: MAPREDUCE-6066 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6066 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster, scheduler Affects Versions: 3.0.0 Reporter: Todd Lipcon I'm seeing a behavior on trunk with fair scheduler enabled where a speculative reduce attempt is getting run on the same node as its original attempt. This doesn't make sense -- the main reason for speculative execution is to deal with a slow node, so scheduling a second attempt on the same node would just make the problem worse if anything. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings
[ https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-5885: --- Status: Patch Available (was: Open) build/test/test.mapred.spill causes release audit warnings -- Key: MAPREDUCE-5885 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5885 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: trunk Reporter: Jason Lowe Assignee: Chen He Attachments: MAPREDUCE-5885-1.patch, MAPREDUCE-5885.patch, MAPREDUCE-5885.patch, MAPREDUCE-5885.patch Multiple unit tests are creating files under hadoop-mapreduce-client-jobclient/build/test/test.mapred.spill which are causing release audit warnings during Jenkins patch precommit builds. In addition to being in a poor location for test output and not cleaning up after the test, there are multiple tests using this location which will cause conflicts if tests are run in parallel. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (MAPREDUCE-6050) Upgrade JUnit3 TestCase to JUnit 4
Chen He created MAPREDUCE-6050: -- Summary: Upgrade JUnit3 TestCase to JUnit 4 Key: MAPREDUCE-6050 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6050 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Reporter: Chen He Priority: Trivial There are still test classes that extend junit.framework.TestCase. Upgrade them to JUnit 4. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings
[ https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14109257#comment-14109257 ] Chen He commented on MAPREDUCE-5885: I am working on updating the patch. I also created MAPREDUCE-6050 for updating test classes from JUnit 3 to JUnit 4 in the mapreduce project. build/test/test.mapred.spill causes release audit warnings -- Key: MAPREDUCE-5885 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5885 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: trunk Reporter: Jason Lowe Assignee: Chen He Attachments: MAPREDUCE-5885.patch, MAPREDUCE-5885.patch, MAPREDUCE-5885.patch Multiple unit tests are creating files under hadoop-mapreduce-client-jobclient/build/test/test.mapred.spill which are causing release audit warnings during Jenkins patch precommit builds. In addition to being in a poor location for test output and not cleaning up after the test, there are multiple tests using this location which will cause conflicts if tests are run in parallel. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings
[ https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-5885: --- Attachment: MAPREDUCE-5885-1.patch Patch updated. Regarding the temporary directory name issue, I guess they just want to use class.getName + "-mapred" to avoid directory collisions when many tests are running in parallel. build/test/test.mapred.spill causes release audit warnings -- Key: MAPREDUCE-5885 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5885 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: trunk Reporter: Jason Lowe Assignee: Chen He Attachments: MAPREDUCE-5885-1.patch, MAPREDUCE-5885.patch, MAPREDUCE-5885.patch, MAPREDUCE-5885.patch Multiple unit tests are creating files under hadoop-mapreduce-client-jobclient/build/test/test.mapred.spill which are causing release audit warnings during Jenkins patch precommit builds. In addition to being in a poor location for test output and not cleaning up after the test, there are multiple tests using this location which will cause conflicts if tests are run in parallel. -- This message was sent by Atlassian JIRA (v6.2#6252)
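The naming scheme mentioned in the comment above (the test class name plus a "-mapred" suffix, so that parallel tests do not share a directory) could be sketched as follows. TestDirs and spillDirFor are hypothetical names, and the use of the test.build.data system property with a fallback to java.io.tmpdir is an assumption about how the base directory is chosen; the attached patch is not shown in this thread and may differ.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper; the attached patch may name and place things differently. */
final class TestDirs {
    private TestDirs() {}

    /** Returns a per-test-class directory such as <base>/org.example.MyTest-mapred. */
    static Path spillDirFor(Class<?> testClass) {
        // Assumption: honor test.build.data when set, else use the JVM temp dir.
        String base = System.getProperty("test.build.data",
                System.getProperty("java.io.tmpdir"));
        // The fully qualified class name keeps directories of different tests apart.
        return Paths.get(base, testClass.getName() + "-mapred");
    }
}
```

Because each test class resolves to its own directory, a test can create and delete that directory in its setup and teardown without affecting tests that run concurrently.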
[jira] [Commented] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings
[ https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14109896#comment-14109896 ] Chen He commented on MAPREDUCE-5885: Sorry, my bad. I guess they assumed the tests would never be run in parallel? build/test/test.mapred.spill causes release audit warnings -- Key: MAPREDUCE-5885 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5885 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: trunk Reporter: Jason Lowe Assignee: Chen He Attachments: MAPREDUCE-5885-1.patch, MAPREDUCE-5885.patch, MAPREDUCE-5885.patch, MAPREDUCE-5885.patch Multiple unit tests are creating files under hadoop-mapreduce-client-jobclient/build/test/test.mapred.spill which are causing release audit warnings during Jenkins patch precommit builds. In addition to being in a poor location for test output and not cleaning up after the test, there are multiple tests using this location which will cause conflicts if tests are run in parallel. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-4818) Easier identification of tasks that timeout during localization
[ https://issues.apache.org/jira/browse/MAPREDUCE-4818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14108266#comment-14108266 ] Chen He commented on MAPREDUCE-4818: Does the yarn-localization-log introduce extra overhead to the system (memory, disks, etc.)? I mean, there are thousands of containers localizing data in a large, busy cluster. How about we only record the failed ones? Easier identification of tasks that timeout during localization --- Key: MAPREDUCE-4818 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4818 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mr-am Affects Versions: 0.23.3, 2.0.3-alpha Reporter: Jason Lowe Labels: usability When a task is taking too long to localize and is killed by the AM due to task timeout, the job UI/history is not very helpful. The attempt simply lists a diagnostic stating it was killed due to timeout, but there are no logs for the attempt since it never actually got started. There are log messages on the NM that show the container never made it past localization by the time it was killed, but users often do not have access to those logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings
[ https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-5885: --- Attachment: MAPREDUCE-5885.patch retrigger QA build/test/test.mapred.spill causes release audit warnings -- Key: MAPREDUCE-5885 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5885 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: trunk Reporter: Jason Lowe Assignee: Chen He Attachments: MAPREDUCE-5885.patch, MAPREDUCE-5885.patch, MAPREDUCE-5885.patch Multiple unit tests are creating files under hadoop-mapreduce-client-jobclient/build/test/test.mapred.spill which are causing release audit warnings during Jenkins patch precommit builds. In addition to being in a poor location for test output and not cleaning up after the test, there are multiple tests using this location which will cause conflicts if tests are run in parallel. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-5961) Job start time setting to Thu Jan 01 05:29:59 IST 1970
[ https://issues.apache.org/jira/browse/MAPREDUCE-5961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He resolved MAPREDUCE-5961. Resolution: Duplicate Job start time setting to Thu Jan 01 05:29:59 IST 1970 Key: MAPREDUCE-5961 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5961 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver Affects Versions: 2.4.1 Reporter: Nishan Shetty, Huawei Priority: Minor Induce RM switchover while job is in progress Observe that job start time setting to Thu Jan 01 05:29:59 IST 1970 saying below error {code} 2014-07-05 21:38:12,415 INFO org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager: Moving hdfs://mycluster:8020/home/testos/staging-dir/history/done_intermediate/testos/job_1404572770516_0056_conf.xml to hdfs://mycluster:8020/home/testos/staging-dir/history/done/2014/07/05/00/job_1404572770516_0056_conf.xml 2014-07-05 21:41:12,289 INFO org.apache.hadoop.mapreduce.v2.hs.JobHistory: Starting scan to move intermediate done files 2014-07-05 21:41:12,294 WARN org.apache.hadoop.mapreduce.v2.jobhistory.FileNameIndexUtils: Unable to parse launch time from job history file job_1404572770516_0057-1404576372149-testos-word+count-1404576499406-85-10-SUCCEEDED-default--1.jhist : java.lang.NumberFormatException: For input string: 2014-07-05 21:41:12,297 INFO org.apache.hadoop.mapreduce.jobhistory.JobSummary: jobId=job_1404572770516_0057,submitTime=1404576372149,launchTime=-1,firstMapTaskLaunchTime=1404576442635,firstReduceTaskLaunchTime=1404576492243,finishTime=1404576499406,resourcesPerMap=1024,resourcesPerReduce=1024,numMaps=85,numReduces=10,user=testos,queue=default,status=SUCCEEDED,mapSlotSeconds=690,reduceSlotSeconds=39,jobName=word count 2014-07-05 21:41:12,298 INFO org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager: Deleting JobSummary file: {code} AM LOG {code} 2014-07-05 21:38:19,432 INFO [Thread-74] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: JobHistoryEventHandler 
notified that forceJobCompletion is true 2014-07-05 21:38:19,432 INFO [Thread-74] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Calling stop for all the services 2014-07-05 21:38:19,433 INFO [Thread-74] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Stopping JobHistoryEventHandler. Size of the outstanding queue size is 0 2014-07-05 21:38:19,556 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copying hdfs://mycluster/home/testos/staging-dir/testos/.staging/job_1404572770516_0057/job_1404572770516_0057_2.jhist to hdfs://mycluster/home/testos/staging-dir/history/done_intermediate/testos/job_1404572770516_0057-1404576372149-testos-word+count-1404576499406-85-10-SUCCEEDED-default--1.jhist_tmp 2014-07-05 21:38:19,770 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copied to done location: hdfs://mycluster/home/testos/staging-dir/history/done_intermediate/testos/job_1404572770516_0057-1404576372149-testos-word+count-1404576499406-85-10-SUCCEEDED-default--1.jhist_tmp 2014-07-05 21:38:19,785 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copying hdfs://mycluster/home/testos/staging-dir/testos/.staging/job_1404572770516_0057/job_1404572770516_0057_2_conf.xml to hdfs://mycluster/home/testos/staging-dir/history/done_intermediate/testos/job_1404572770516_0057_conf.xml_tmp 2014-07-05 21:38:19,862 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copied to done location: hdfs://mycluster/home/testos/staging-dir/history/done_intermediate/testos/job_1404572770516_0057_conf.xml_tmp 2014-07-05 21:38:19,886 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done: hdfs://mycluster/home/testos/staging-dir/history/done_intermediate/testos/job_1404572770516_0057.summary_tmp to 
hdfs://mycluster/home/testos/staging-dir/history/done_intermediate/testos/job_1404572770516_0057.summary 2014-07-05 21:38:19,898 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done: hdfs://mycluster/home/testos/staging-dir/history/done_intermediate/testos/job_1404572770516_0057_conf.xml_tmp to hdfs://mycluster/home/testos/staging-dir/history/done_intermediate/testos/job_1404572770516_0057_conf.xml 2014-07-05 21:38:19,910 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done:
[jira] [Commented] (MAPREDUCE-5961) Job start time setting to Thu Jan 01 05:29:59 IST 1970
[ https://issues.apache.org/jira/browse/MAPREDUCE-5961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053752#comment-14053752 ] Chen He commented on MAPREDUCE-5961: This is a duplicate of MAPREDUCE-5939. Job start time setting to Thu Jan 01 05:29:59 IST 1970 Key: MAPREDUCE-5961 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5961 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver Affects Versions: 2.4.1 Reporter: Nishan Shetty, Huawei Priority: Minor Induce RM switchover while job is in progress Observe that job start time setting to Thu Jan 01 05:29:59 IST 1970 saying below error {code} 2014-07-05 21:38:12,415 INFO org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager: Moving hdfs://mycluster:8020/home/testos/staging-dir/history/done_intermediate/testos/job_1404572770516_0056_conf.xml to hdfs://mycluster:8020/home/testos/staging-dir/history/done/2014/07/05/00/job_1404572770516_0056_conf.xml 2014-07-05 21:41:12,289 INFO org.apache.hadoop.mapreduce.v2.hs.JobHistory: Starting scan to move intermediate done files 2014-07-05 21:41:12,294 WARN org.apache.hadoop.mapreduce.v2.jobhistory.FileNameIndexUtils: Unable to parse launch time from job history file job_1404572770516_0057-1404576372149-testos-word+count-1404576499406-85-10-SUCCEEDED-default--1.jhist : java.lang.NumberFormatException: For input string: 2014-07-05 21:41:12,297 INFO org.apache.hadoop.mapreduce.jobhistory.JobSummary: jobId=job_1404572770516_0057,submitTime=1404576372149,launchTime=-1,firstMapTaskLaunchTime=1404576442635,firstReduceTaskLaunchTime=1404576492243,finishTime=1404576499406,resourcesPerMap=1024,resourcesPerReduce=1024,numMaps=85,numReduces=10,user=testos,queue=default,status=SUCCEEDED,mapSlotSeconds=690,reduceSlotSeconds=39,jobName=word count 2014-07-05 21:41:12,298 INFO org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager: Deleting JobSummary file: {code} AM LOG {code} 2014-07-05 21:38:19,432 INFO [Thread-74] 
org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: JobHistoryEventHandler notified that forceJobCompletion is true 2014-07-05 21:38:19,432 INFO [Thread-74] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Calling stop for all the services 2014-07-05 21:38:19,433 INFO [Thread-74] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Stopping JobHistoryEventHandler. Size of the outstanding queue size is 0 2014-07-05 21:38:19,556 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copying hdfs://mycluster/home/testos/staging-dir/testos/.staging/job_1404572770516_0057/job_1404572770516_0057_2.jhist to hdfs://mycluster/home/testos/staging-dir/history/done_intermediate/testos/job_1404572770516_0057-1404576372149-testos-word+count-1404576499406-85-10-SUCCEEDED-default--1.jhist_tmp 2014-07-05 21:38:19,770 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copied to done location: hdfs://mycluster/home/testos/staging-dir/history/done_intermediate/testos/job_1404572770516_0057-1404576372149-testos-word+count-1404576499406-85-10-SUCCEEDED-default--1.jhist_tmp 2014-07-05 21:38:19,785 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copying hdfs://mycluster/home/testos/staging-dir/testos/.staging/job_1404572770516_0057/job_1404572770516_0057_2_conf.xml to hdfs://mycluster/home/testos/staging-dir/history/done_intermediate/testos/job_1404572770516_0057_conf.xml_tmp 2014-07-05 21:38:19,862 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copied to done location: hdfs://mycluster/home/testos/staging-dir/history/done_intermediate/testos/job_1404572770516_0057_conf.xml_tmp 2014-07-05 21:38:19,886 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done: 
hdfs://mycluster/home/testos/staging-dir/history/done_intermediate/testos/job_1404572770516_0057.summary_tmp to hdfs://mycluster/home/testos/staging-dir/history/done_intermediate/testos/job_1404572770516_0057.summary 2014-07-05 21:38:19,898 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done: hdfs://mycluster/home/testos/staging-dir/history/done_intermediate/testos/job_1404572770516_0057_conf.xml_tmp to hdfs://mycluster/home/testos/staging-dir/history/done_intermediate/testos/job_1404572770516_0057_conf.xml 2014-07-05 21:38:19,910 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done:
[jira] [Updated] (MAPREDUCE-5939) StartTime showing up as the epoch time in JHS UI after upgrade
[ https://issues.apache.org/jira/browse/MAPREDUCE-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-5939: --- Attachment: MAPREDUCE-5939-v3.patch Thank you for the comments, [~jlowe]. I updated patch following your suggestion. StartTime showing up as the epoch time in JHS UI after upgrade -- Key: MAPREDUCE-5939 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5939 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.5.0 Reporter: Kihwal Lee Assignee: Chen He Attachments: MAPREDUCE-5939-v2.patch, MAPREDUCE-5939-v3.patch, MAPREDUCE-5939.patch After upgrading from 0.23.x to 2.5, the start time of old apps are showing up as the epoch time. It looks like 2.5 expects start time to be encoded at the end of the jhist file name (-[timestamp].jhist). It should have been made backward compatible. -- This message was sent by Atlassian JIRA (v6.2#6252)
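The backward-compatibility concern in the description above (2.5 expects a start time at the end of the jhist file name, while older file names do not carry one) can be illustrated with a tolerant parser that falls back to a caller-supplied default instead of failing. This is a simplified sketch, not the attached patch: JhistStartTime and parseOrDefault are hypothetical names, and the sketch only inspects the last dash-separated field rather than the full file-name layout.

```java
/** Hypothetical sketch; the real parsing lives in Hadoop and may differ. */
final class JhistStartTime {
    private static final String SUFFIX = ".jhist";

    private JhistStartTime() {}

    /**
     * Returns the numeric value of the trailing "-[timestamp]" field of a
     * jhist file name, or the fallback when the field is missing or not a number.
     */
    static long parseOrDefault(String fileName, long fallback) {
        if (!fileName.endsWith(SUFFIX)) {
            return fallback;
        }
        String stem = fileName.substring(0, fileName.length() - SUFFIX.length());
        int dash = stem.lastIndexOf('-');
        if (dash < 0 || dash == stem.length() - 1) {
            return fallback; // no trailing field at all
        }
        try {
            return Long.parseLong(stem.substring(dash + 1));
        } catch (NumberFormatException e) {
            // Older-style name: the last field is not a timestamp.
            return fallback;
        }
    }
}
```

A caller could pass the job's submit time as the fallback so that old history files show a plausible time instead of the epoch. One limitation of this simplification: an old-style name whose last field happens to be numeric would be misread, so a real fix would more likely check the number of fields.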
[jira] [Updated] (MAPREDUCE-5939) StartTime showing up as the epoch time in JHS UI after upgrade
[ https://issues.apache.org/jira/browse/MAPREDUCE-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-5939: --- Target Version/s: 0.23.11, 2.5.0 Status: Patch Available (was: Open) StartTime showing up as the epoch time in JHS UI after upgrade -- Key: MAPREDUCE-5939 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5939 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.5.0 Reporter: Kihwal Lee Assignee: Chen He Attachments: MAPREDUCE-5939.patch After upgrading from 0.23.x to 2.5, the start time of old apps are showing up as the epoch time. It looks like 2.5 expects start time to be encoded at the end of the jhist file name (-[timestamp].jhist). It should have been made backward compatible. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5939) StartTime showing up as the epoch time in JHS UI after upgrade
[ https://issues.apache.org/jira/browse/MAPREDUCE-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-5939: --- Attachment: MAPREDUCE-5939.patch StartTime showing up as the epoch time in JHS UI after upgrade -- Key: MAPREDUCE-5939 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5939 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.5.0 Reporter: Kihwal Lee Assignee: Chen He Attachments: MAPREDUCE-5939.patch After upgrading from 0.23.x to 2.5, the start time of old apps are showing up as the epoch time. It looks like 2.5 expects start time to be encoded at the end of the jhist file name (-[timestamp].jhist). It should have been made backward compatible. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5939) StartTime showing up as the epoch time in JHS UI after upgrade
[ https://issues.apache.org/jira/browse/MAPREDUCE-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-5939: --- Attachment: MAPREDUCE-5939-v2.patch StartTime showing up as the epoch time in JHS UI after upgrade -- Key: MAPREDUCE-5939 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5939 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.5.0 Reporter: Kihwal Lee Assignee: Chen He Attachments: MAPREDUCE-5939-v2.patch, MAPREDUCE-5939.patch After upgrading from 0.23.x to 2.5, the start time of old apps are showing up as the epoch time. It looks like 2.5 expects start time to be encoded at the end of the jhist file name (-[timestamp].jhist). It should have been made backward compatible. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (MAPREDUCE-5939) StartTime showing up as the epoch time in JHS UI after upgrade
[ https://issues.apache.org/jira/browse/MAPREDUCE-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He reassigned MAPREDUCE-5939: -- Assignee: Chen He StartTime showing up as the epoch time in JHS UI after upgrade -- Key: MAPREDUCE-5939 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5939 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.5.0 Reporter: Kihwal Lee Assignee: Chen He After upgrading from 0.23.x to 2.5, the start time of old apps are showing up as the epoch time. It looks like 2.5 expects start time to be encoded at the end of the jhist file name (-[timestamp].jhist). It should have been made backward compatible. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-3182) loadgen ignores -m command line when writing random data
[ https://issues.apache.org/jira/browse/MAPREDUCE-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034061#comment-14034061 ] Chen He commented on MAPREDUCE-3182: Hi [~jeagles], would you mind taking a look at this patch? Thank you very much! loadgen ignores -m command line when writing random data Key: MAPREDUCE-3182 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3182 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, test Affects Versions: 0.23.0, 2.3.0 Reporter: Jonathan Eagles Assignee: Chen He Attachments: MAPREDUCE-3182.patch If no input directories are specified, loadgen goes into a special mode where random data is generated and written. In that mode, setting the number of mappers (-m command line option) is overridden by a calculation. Instead, it should take into consideration the user specified number of mappers and fall back to the calculation. In addition, update the documentation as well to match the new behavior in the code. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings
[ https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030608#comment-14030608 ] Chen He commented on MAPREDUCE-5885: The test failure is caused by MAPREDUCE-5868. build/test/test.mapred.spill causes release audit warnings -- Key: MAPREDUCE-5885 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5885 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: trunk Reporter: Jason Lowe Assignee: Chen He Attachments: MAPREDUCE-5885.patch, MAPREDUCE-5885.patch Multiple unit tests are creating files under hadoop-mapreduce-client-jobclient/build/test/test.mapred.spill which are causing release audit warnings during Jenkins patch precommit builds. In addition to being in a poor location for test output and not cleaning up after the test, there are multiple tests using this location which will cause conflicts if tests are run in parallel. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings
[ https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-5885: --- Attachment: MAPREDUCE-5885.patch attach patch again and trigger HadoopQA build/test/test.mapred.spill causes release audit warnings -- Key: MAPREDUCE-5885 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5885 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: trunk Reporter: Jason Lowe Assignee: Chen He Attachments: MAPREDUCE-5885.patch, MAPREDUCE-5885.patch Multiple unit tests are creating files under hadoop-mapreduce-client-jobclient/build/test/test.mapred.spill which are causing release audit warnings during Jenkins patch precommit builds. In addition to being in a poor location for test output and not cleaning up after the test, there are multiple tests using this location which will cause conflicts if tests are run in parallel. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (MAPREDUCE-5923) org.apache.hadoop.mapred.pipes.TestPipeApplication timeouts intermittently
Chen He created MAPREDUCE-5923: -- Summary: org.apache.hadoop.mapred.pipes.TestPipeApplication timeouts intermittently Key: MAPREDUCE-5923 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5923 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: trunk Reporter: Chen He Priority: Minor -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings
[ https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14029914#comment-14029914 ] Chen He commented on MAPREDUCE-5885: The test failure is related to MAPREDUCE-5923. build/test/test.mapred.spill causes release audit warnings -- Key: MAPREDUCE-5885 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5885 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: trunk Reporter: Jason Lowe Assignee: Chen He Attachments: MAPREDUCE-5885.patch Multiple unit tests are creating files under hadoop-mapreduce-client-jobclient/build/test/test.mapred.spill which are causing release audit warnings during Jenkins patch precommit builds. In addition to being in a poor location for test output and not cleaning up after the test, there are multiple tests using this location which will cause conflicts if tests are run in parallel. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings
[ https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-5885: --- Attachment: (was: MAPREDUCE-5885.patch) build/test/test.mapred.spill causes release audit warnings -- Key: MAPREDUCE-5885 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5885 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: trunk Reporter: Jason Lowe Assignee: Chen He Attachments: MAPREDUCE-5885.patch Multiple unit tests are creating files under hadoop-mapreduce-client-jobclient/build/test/test.mapred.spill which are causing release audit warnings during Jenkins patch precommit builds. In addition to being in a poor location for test output and not cleaning up after the test, there are multiple tests using this location which will cause conflicts if tests are run in parallel. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings
[ https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-5885: --- Attachment: MAPREDUCE-5885.patch build/test/test.mapred.spill causes release audit warnings -- Key: MAPREDUCE-5885 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5885 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: trunk Reporter: Jason Lowe Assignee: Chen He Attachments: MAPREDUCE-5885.patch, MAPREDUCE-5885.patch Multiple unit tests are creating files under hadoop-mapreduce-client-jobclient/build/test/test.mapred.spill which are causing release audit warnings during Jenkins patch precommit builds. In addition to being in a poor location for test output and not cleaning up after the test, there are multiple tests using this location which will cause conflicts if tests are run in parallel. -- This message was sent by Atlassian JIRA (v6.2#6252)
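The issue description above names three problems with the shared build/test/test.mapred.spill location: poor placement, no cleanup, and collisions when tests run in parallel. A minimal sketch of one way to avoid all three is shown here, assuming each test can create its own scratch directory and delete it afterwards. The class and method names (SpillDirSketch, createScratchDir, deleteRecursively) are hypothetical. They are not taken from the attached patch or from Hadoop.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for illustration; not part of Hadoop or of MAPREDUCE-5885.patch.
class SpillDirSketch {

    // Create a uniquely named scratch directory per test so parallel tests never share a path.
    static Path createScratchDir(String testName) {
        try {
            return Files.createTempDirectory(testName + "-spill-");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Delete the directory tree, visiting the deepest entries first.
    static void deleteRecursively(Path root) {
        if (!Files.exists(root)) {
            return;
        }
        try (Stream<Path> walk = Files.walk(root)) {
            List<Path> deepestFirst =
                walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
            for (Path p : deepestFirst) {
                Files.delete(p);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path a = createScratchDir("TestA");
        Path b = createScratchDir("TestB");
        Files.write(a.resolve("spill0.out"), new byte[] {1, 2, 3});
        System.out.println("distinct dirs: " + !a.equals(b));
        deleteRecursively(a);
        deleteRecursively(b);
        System.out.println("cleaned up: " + (!Files.exists(a) && !Files.exists(b)));
    }
}
```

In a JUnit test the two helpers would typically be called from setup and teardown methods, so that nothing is left behind for the release audit to flag.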
[jira] [Commented] (MAPREDUCE-5758) Reducer local data is not deleted until job completes
[ https://issues.apache.org/jira/browse/MAPREDUCE-5758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003852#comment-14003852 ]

Chen He commented on MAPREDUCE-5758:

There are several issues we need to consider if we allow reducers to use the container-local directory:
1) The MapReduce framework should get the container-local dir from YARN.
2) We need to let the YARN framework know that the MapReduce framework created some dirs under the container-local dir for reducers.
Any suggestions, [~vinodkv]?

Reducer local data is not deleted until job completes

Key: MAPREDUCE-5758
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5758
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.10, 2.2.0
Reporter: Jason Lowe
Assignee: Chen He

Ran into an instance where a reducer shuffled a large amount of data and subsequently failed, but the local data was purged not when the task failed but only after the entire job completed. This wastes disk space unnecessarily since the data is no longer relevant after the task-attempt exits.
[jira] [Commented] (MAPREDUCE-5002) AM could potentially allocate a reduce container to a map attempt
[ https://issues.apache.org/jira/browse/MAPREDUCE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13996836#comment-13996836 ] Chen He commented on MAPREDUCE-5002: Is this JIRA fixed? If so, could we close it? AM could potentially allocate a reduce container to a map attempt - Key: MAPREDUCE-5002 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5002 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am Affects Versions: 2.0.3-alpha, 0.23.7 Reporter: Jason Lowe As discussed in MAPREDUCE-4982, after MAPREDUCE-4893 it is theoretically possible for the AM to accidentally assign a reducer container to a map attempt if the AM doesn't find a reduce attempt actively looking for the container (e.g.: the RM accidentally allocated too many reducer containers). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-4071) NPE while executing MRAppMaster shutdown hook
[ https://issues.apache.org/jira/browse/MAPREDUCE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996832#comment-13996832 ]

Chen He commented on MAPREDUCE-4071:

ping

NPE while executing MRAppMaster shutdown hook

Key: MAPREDUCE-4071
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4071
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mr-am, mrv2
Affects Versions: 0.23.3, 2.0.0-alpha, trunk
Reporter: Bhallamudi Venkata Siva Kamesh
Attachments: MAPREDUCE-4071-1.patch, MAPREDUCE-4071-2.patch, MAPREDUCE-4071-2.patch, MAPREDUCE-4071.patch

While running the shutdown hook of MRAppMaster, hit NPE
{noformat}
Exception in thread Thread-1 java.lang.NullPointerException
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.setSignalled(MRAppMaster.java:668)
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1004)
{noformat}
[jira] [Updated] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings
[ https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-5885: --- Attachment: MAPREDUCE-5885.patch patch submitted. build/test/test.mapred.spill causes release audit warnings -- Key: MAPREDUCE-5885 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5885 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: trunk Reporter: Jason Lowe Assignee: Chen He Attachments: MAPREDUCE-5885.patch Multiple unit tests are creating files under hadoop-mapreduce-client-jobclient/build/test/test.mapred.spill which are causing release audit warnings during Jenkins patch precommit builds. In addition to being in a poor location for test output and not cleaning up after the test, there are multiple tests using this location which will cause conflicts if tests are run in parallel. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings
[ https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13996518#comment-13996518 ] Chen He commented on MAPREDUCE-5885: As well as TestMapReduce.java. build/test/test.mapred.spill causes release audit warnings -- Key: MAPREDUCE-5885 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5885 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: trunk Reporter: Jason Lowe Assignee: Chen He Multiple unit tests are creating files under hadoop-mapreduce-client-jobclient/build/test/test.mapred.spill which are causing release audit warnings during Jenkins patch precommit builds. In addition to being in a poor location for test output and not cleaning up after the test, there are multiple tests using this location which will cause conflicts if tests are run in parallel. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings
[ https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-5885: --- Target Version/s: 0.23.11, 2.5.0 Status: Patch Available (was: Open) build/test/test.mapred.spill causes release audit warnings -- Key: MAPREDUCE-5885 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5885 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: trunk Reporter: Jason Lowe Assignee: Chen He Attachments: MAPREDUCE-5885.patch Multiple unit tests are creating files under hadoop-mapreduce-client-jobclient/build/test/test.mapred.spill which are causing release audit warnings during Jenkins patch precommit builds. In addition to being in a poor location for test output and not cleaning up after the test, there are multiple tests using this location which will cause conflicts if tests are run in parallel. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5889) Deprecate FileInputFormat.setInputPaths(JobConf, String) and FileInputFormat.addInputPaths(JobConf, String)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13996815#comment-13996815 ] Chen He commented on MAPREDUCE-5889: Agreed. +1 for the idea Deprecate FileInputFormat.setInputPaths(JobConf, String) and FileInputFormat.addInputPaths(JobConf, String) --- Key: MAPREDUCE-5889 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5889 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Akira AJISAKA Priority: Minor Labels: newbie {{FileInputFormat.setInputPaths(JobConf conf, String commaSeparatedPaths)}} and {{FileInputFormat.addInputPaths(JobConf conf, String commaSeparatedPaths)}} fail to parse commaSeparatedPaths if a comma is included in the file path. (e.g. Path: {{/path/file,with,comma}}) We should deprecate these methods and document to use {{setInputPaths(JobConf conf, Path... inputPaths)}} and {{addInputPaths(JobConf conf, Path... inputPaths)}} instead. -- This message was sent by Atlassian JIRA (v6.2#6252)
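To make the parsing problem in the description above concrete, here is a simplified sketch of what goes wrong when a comma-separated path string contains a path with commas in it. The splitOnCommas method is a hypothetical stand-in that treats every comma as a separator. It is not Hadoop's actual parser, which may handle more cases, but the ambiguity it shows is the one the issue reports.

```java
import java.util.Arrays;
import java.util.List;

// Simplified illustration only; this is not FileInputFormat's real implementation.
class CommaPathSketch {

    // Naive parsing: every comma is assumed to separate two paths.
    static List<String> splitOnCommas(String commaSeparatedPaths) {
        return Arrays.asList(commaSeparatedPaths.split(","));
    }

    public static void main(String[] args) {
        // Two ordinary paths are split as intended.
        System.out.println(splitOnCommas("/path/a,/path/b"));        // [/path/a, /path/b]

        // A single path that contains commas is wrongly split into three pieces.
        System.out.println(splitOnCommas("/path/file,with,comma"));  // [/path/file, with, comma]

        // Passing each path as its own element leaves nothing to guess.
        List<String> explicit = Arrays.asList("/path/file,with,comma");
        System.out.println(explicit.size());                         // 1
    }
}
```

This is why the issue recommends the Path... overloads: each path arrives as a separate argument, so no delimiter has to be inferred from the string.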
[jira] [Assigned] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings
[ https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He reassigned MAPREDUCE-5885: -- Assignee: Chen He build/test/test.mapred.spill causes release audit warnings -- Key: MAPREDUCE-5885 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5885 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: trunk Reporter: Jason Lowe Assignee: Chen He Multiple unit tests are creating files under hadoop-mapreduce-client-jobclient/build/test/test.mapred.spill which are causing release audit warnings during Jenkins patch precommit builds. In addition to being in a poor location for test output and not cleaning up after the test, there are multiple tests using this location which will cause conflicts if tests are run in parallel. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5883) Total megabyte-seconds in job counters is slightly misleading
[ https://issues.apache.org/jira/browse/MAPREDUCE-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13994684#comment-13994684 ]

Chen He commented on MAPREDUCE-5883:

+1 non-binding.

Total megabyte-seconds in job counters is slightly misleading

Key: MAPREDUCE-5883
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5883
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 3.0.0, 2.4.0
Reporter: Nathan Roberts
Assignee: Nathan Roberts
Priority: Minor
Attachments: MAPREDUCE-5883.patch

The following counters are in milliseconds, so megabyte-seconds might be better stated as megabyte-milliseconds:
MB_MILLIS_MAPS.name= Total megabyte-seconds taken by all map tasks
MB_MILLIS_REDUCES.name=Total megabyte-seconds taken by all reduce tasks
VCORES_MILLIS_MAPS.name= Total vcore-seconds taken by all map tasks
VCORES_MILLIS_REDUCES.name=Total vcore-seconds taken by all reduce tasks
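A small worked example shows how far off the label can be. It assumes, as the counter names suggest, that the value is the memory allocation in MB multiplied by the run time in milliseconds. The class and method names are hypothetical and the figures (1024 MB, 10 seconds) are made up for illustration.

```java
// Hypothetical arithmetic sketch; not Hadoop's counter code.
class MegabyteMillisSketch {

    // Resource usage in megabyte-milliseconds: memory (MB) times duration (ms).
    static long megabyteMillis(long memoryMb, long durationMillis) {
        return memoryMb * durationMillis;
    }

    // Convert a megabyte-millisecond value to megabyte-seconds.
    static long toMegabyteSeconds(long mbMillis) {
        return mbMillis / 1000;
    }

    public static void main(String[] args) {
        long mbMillis = megabyteMillis(1024, 10_000);   // a 1024 MB task running for 10 s
        System.out.println(mbMillis);                   // 10240000 megabyte-milliseconds
        System.out.println(toMegabyteSeconds(mbMillis)); // 10240 megabyte-seconds
    }
}
```

Reading the raw counter as megabyte-seconds would overstate usage by a factor of 1000, which is the mismatch the issue asks to correct in the display name.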
[jira] [Commented] (MAPREDUCE-3677) If hadoop.security.authorization is set to true, NM is not starting.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981221#comment-13981221 ]

Chen He commented on MAPREDUCE-3677:

Since there has been no response for 3 days, I will close this issue. Reopen if necessary.

If hadoop.security.authorization is set to true, NM is not starting.

Key: MAPREDUCE-3677
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3677
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: nodemanager
Affects Versions: 0.23.0
Reporter: Ramgopal N
Assignee: Chen He

I have the Hadoop cluster set up with the root user. Accidentally, I set hadoop.security.authorization to true. I have not set any permissions in policy.xml. When I try to start the NM as the root user, it throws the following error:
Exception in thread main java.lang.NoClassDefFoundError: nodemanager
Caused by: java.lang.ClassNotFoundException: nodemanager
	at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:303)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:316)
Could not find the main class: nodemanager. Program will exit.
[jira] [Resolved] (MAPREDUCE-3677) If hadoop.security.authorization is set to true, NM is not starting.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He resolved MAPREDUCE-3677. Resolution: Not a Problem If hadoop.security.authorization is set to true, NM is not starting. -- Key: MAPREDUCE-3677 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3677 Project: Hadoop Map/Reduce Issue Type: Bug Components: nodemanager Affects Versions: 0.23.0 Reporter: Ramgopal N Assignee: Chen He I have the hadoop cluster setup with root user.Accidentally i have set hadoop.security.authorization to true.I have not set any permissions in policy.xml.When i am trying to start the NM with root user ...it is throwing the following error Exception in thread main java.lang.NoClassDefFoundError: nodemanager Caused by: java.lang.ClassNotFoundException: nodemanager at java.net.URLClassLoader$1.run(URLClassLoader.java:200) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:188) at java.lang.ClassLoader.loadClass(ClassLoader.java:303) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:248) at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:316) Could not find the main class: nodemanager. Program will exit. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-4718) MapReduce fails If I pass a parameter as a S3 folder
[ https://issues.apache.org/jira/browse/MAPREDUCE-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-4718: --- Target Version/s: 1.0.3 (was: 1.0.3, 0.23.3) MapReduce fails If I pass a parameter as a S3 folder Key: MAPREDUCE-4718 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4718 Project: Hadoop Map/Reduce Issue Type: Bug Components: job submission Affects Versions: 1.0.0, 1.0.3 Environment: Hadoop with default configurations Reporter: Benjamin Kim I'm running a wordcount MR as follows hadoop jar WordCount.jar wordcount.WordCountDriver s3n://bucket/wordcount/input s3n://bucket/wordcount/output s3n://bucket/wordcount/input is a s3 object that contains other input files. However I get following NPE error 12/10/02 18:56:23 INFO mapred.JobClient: map 0% reduce 0% 12/10/02 18:56:54 INFO mapred.JobClient: map 50% reduce 0% 12/10/02 18:56:56 INFO mapred.JobClient: Task Id : attempt_201210021853_0001_m_01_0, Status : FAILED java.lang.NullPointerException at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.close(NativeS3FileSystem.java:106) at java.io.BufferedInputStream.close(BufferedInputStream.java:451) at java.io.FilterInputStream.close(FilterInputStream.java:155) at org.apache.hadoop.util.LineReader.close(LineReader.java:83) at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.close(LineRecordReader.java:144) at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.close(MapTask.java:497) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:765) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121) at org.apache.hadoop.mapred.Child.main(Child.java:249) MR runs fine if i specify more specific input path such as 
s3n://bucket/wordcount/input/file.txt MR fails if I pass s3 folder as a parameter In summary, This works hadoop jar ./hadoop-examples-1.0.3.jar wordcount /user/hadoop/wordcount/input/ s3n://bucket/wordcount/output/ This doesn't work hadoop jar ./hadoop-examples-1.0.3.jar wordcount s3n://bucket/wordcount/input/ s3n://bucket/wordcount/output/ (both input path are directories) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-3476) Optimize YARN API calls
[ https://issues.apache.org/jira/browse/MAPREDUCE-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He resolved MAPREDUCE-3476. Resolution: Later Optimize YARN API calls --- Key: MAPREDUCE-3476 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3476 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: mrv2 Affects Versions: 0.23.0 Reporter: Ravi Prakash Assignee: Vinod Kumar Vavilapalli Several YARN API calls are taking inordinately long. This might be a performance blocker. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-3476) Optimize YARN API calls
[ https://issues.apache.org/jira/browse/MAPREDUCE-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977231#comment-13977231 ] Chen He commented on MAPREDUCE-3476: Close it, and reopen it if necessary. Thank you [~raviprak] and [~vinodkv] Optimize YARN API calls --- Key: MAPREDUCE-3476 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3476 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: mrv2 Affects Versions: 0.23.0 Reporter: Ravi Prakash Assignee: Vinod Kumar Vavilapalli Several YARN API calls are taking inordinately long. This might be a performance blocker. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-4734) The history server should link back to NM logs if aggregation is incomplete / disabled
[ https://issues.apache.org/jira/browse/MAPREDUCE-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977234#comment-13977234 ] Chen He commented on MAPREDUCE-4734: retarget it to 3.0 The history server should link back to NM logs if aggregation is incomplete / disabled -- Key: MAPREDUCE-4734 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4734 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobhistoryserver, mrv2 Affects Versions: 0.23.4 Reporter: Siddharth Seth Assignee: Siddharth Seth Attachments: MR4734_WIP.txt -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-4734) The history server should link back to NM logs if aggregation is incomplete / disabled
[ https://issues.apache.org/jira/browse/MAPREDUCE-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-4734: --- Target Version/s: 3.0.0 (was: 3.0.0, 0.23.11) The history server should link back to NM logs if aggregation is incomplete / disabled -- Key: MAPREDUCE-4734 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4734 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobhistoryserver, mrv2 Affects Versions: 0.23.4 Reporter: Siddharth Seth Assignee: Siddharth Seth Attachments: MR4734_WIP.txt -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-4931) Add user-APIs for classpath precedence control
[ https://issues.apache.org/jira/browse/MAPREDUCE-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-4931: --- Target Version/s: 3.0.0 (was: 3.0.0, 0.23.11) Add user-APIs for classpath precedence control -- Key: MAPREDUCE-4931 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4931 Project: Hadoop Map/Reduce Issue Type: Improvement Components: client Affects Versions: 1.0.0 Reporter: Harsh J Priority: Minor The feature config from MAPREDUCE-1938 of allowing tasks to start with user-classes-first is fairly popular and can use its own API hooks in Job/JobConf classes, making it easier to discover and use it rather than continuing to keep it as an advanced param. I propose to add two APIs to Job/JobConf: {code} void setUserClassesTakesPrecedence(boolean) boolean userClassesTakesPrecedence() {code} Both of which, depending on their branch of commit, set the property {{mapreduce.user.classpath.first}} (1.x) or {{mapreduce.job.user.classpath.first}} (trunk, 2.x and if needed, in 0.23.x). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-4931) Add user-APIs for classpath precedence control
[ https://issues.apache.org/jira/browse/MAPREDUCE-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977244#comment-13977244 ]

Chen He commented on MAPREDUCE-4931:

RETARGET TO 3.0

Add user-APIs for classpath precedence control

Key: MAPREDUCE-4931
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4931
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: client
Affects Versions: 1.0.0
Reporter: Harsh J
Priority: Minor

The feature config from MAPREDUCE-1938 of allowing tasks to start with user-classes-first is fairly popular and can use its own API hooks in Job/JobConf classes, making it easier to discover and use it rather than continuing to keep it as an advanced param. I propose to add two APIs to Job/JobConf:
{code}
void setUserClassesTakesPrecedence(boolean)
boolean userClassesTakesPrecedence()
{code}
Both of which, depending on their branch of commit, set the property {{mapreduce.user.classpath.first}} (1.x) or {{mapreduce.job.user.classpath.first}} (trunk, 2.x and if needed, in 0.23.x).
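The proposal above can be sketched as a pair of typed accessors over a single configuration property. The code below is only an illustration under stated assumptions: JobConfSketch is a hypothetical stand-in backed by a plain map, not Hadoop's Job or JobConf, and the default of false for an unset property is assumed. The two method names and the 2.x property key are the ones given in the issue text.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Job/JobConf; only the method names and property key come from the proposal.
class JobConfSketch {

    // Property key named in the issue for trunk and 2.x.
    static final String USER_CLASSPATH_FIRST = "mapreduce.job.user.classpath.first";

    private final Map<String, String> props = new HashMap<>();

    // Typed setter: stores the flag under the property key.
    void setUserClassesTakesPrecedence(boolean value) {
        props.put(USER_CLASSPATH_FIRST, Boolean.toString(value));
    }

    // Typed getter: an unset property is treated as false (an assumption of this sketch).
    boolean userClassesTakesPrecedence() {
        return Boolean.parseBoolean(props.getOrDefault(USER_CLASSPATH_FIRST, "false"));
    }

    // Raw lookup, to show what the setter actually wrote.
    String get(String key) {
        return props.get(key);
    }

    public static void main(String[] args) {
        JobConfSketch conf = new JobConfSketch();
        System.out.println(conf.userClassesTakesPrecedence()); // false
        conf.setUserClassesTakesPrecedence(true);
        System.out.println(conf.userClassesTakesPrecedence()); // true
        System.out.println(conf.get(USER_CLASSPATH_FIRST));    // true
    }
}
```

The point of the typed pair is discoverability: callers find the option through the class's methods instead of having to know the property string.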
[jira] [Commented] (MAPREDUCE-4718) MapReduce fails If I pass a parameter as a S3 folder
[ https://issues.apache.org/jira/browse/MAPREDUCE-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975567#comment-13975567 ]

Chen He commented on MAPREDUCE-4718:

Hi [~benkimkimben], thank you for the reply. Since it is not a problem for 2.x, would you mind removing 2.x from the target version?

MapReduce fails If I pass a parameter as a S3 folder

Key: MAPREDUCE-4718
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4718
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: job submission
Affects Versions: 1.0.0, 1.0.3
Environment: Hadoop with default configurations
Reporter: Benjamin Kim

I'm running a wordcount MR as follows:
hadoop jar WordCount.jar wordcount.WordCountDriver s3n://bucket/wordcount/input s3n://bucket/wordcount/output
s3n://bucket/wordcount/input is an S3 object that contains other input files.
However, I get the following NPE error:
12/10/02 18:56:23 INFO mapred.JobClient: map 0% reduce 0%
12/10/02 18:56:54 INFO mapred.JobClient: map 50% reduce 0%
12/10/02 18:56:56 INFO mapred.JobClient: Task Id : attempt_201210021853_0001_m_01_0, Status : FAILED
java.lang.NullPointerException
	at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.close(NativeS3FileSystem.java:106)
	at java.io.BufferedInputStream.close(BufferedInputStream.java:451)
	at java.io.FilterInputStream.close(FilterInputStream.java:155)
	at org.apache.hadoop.util.LineReader.close(LineReader.java:83)
	at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.close(LineRecordReader.java:144)
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.close(MapTask.java:497)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:765)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
	at org.apache.hadoop.mapred.Child.main(Child.java:249)
MR runs fine if I specify a more specific input path such as s3n://bucket/wordcount/input/file.txt
MR fails if I pass an S3 folder as a parameter.
In summary,
This works:
hadoop jar ./hadoop-examples-1.0.3.jar wordcount /user/hadoop/wordcount/input/ s3n://bucket/wordcount/output/
This doesn't work:
hadoop jar ./hadoop-examples-1.0.3.jar wordcount s3n://bucket/wordcount/input/ s3n://bucket/wordcount/output/
(both input paths are directories)
[jira] [Commented] (MAPREDUCE-4718) MapReduce fails If I pass a parameter as a S3 folder
[ https://issues.apache.org/jira/browse/MAPREDUCE-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13975570#comment-13975570 ] Chen He commented on MAPREDUCE-4718: Or close it if it is not a problem for 1.x either. MapReduce fails If I pass a parameter as a S3 folder Key: MAPREDUCE-4718 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4718 Project: Hadoop Map/Reduce Issue Type: Bug Components: job submission Affects Versions: 1.0.0, 1.0.3 Environment: Hadoop with default configurations Reporter: Benjamin Kim I'm running a wordcount MR as follows hadoop jar WordCount.jar wordcount.WordCountDriver s3n://bucket/wordcount/input s3n://bucket/wordcount/output s3n://bucket/wordcount/input is a s3 object that contains other input files. However I get following NPE error 12/10/02 18:56:23 INFO mapred.JobClient: map 0% reduce 0% 12/10/02 18:56:54 INFO mapred.JobClient: map 50% reduce 0% 12/10/02 18:56:56 INFO mapred.JobClient: Task Id : attempt_201210021853_0001_m_01_0, Status : FAILED java.lang.NullPointerException at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.close(NativeS3FileSystem.java:106) at java.io.BufferedInputStream.close(BufferedInputStream.java:451) at java.io.FilterInputStream.close(FilterInputStream.java:155) at org.apache.hadoop.util.LineReader.close(LineReader.java:83) at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.close(LineRecordReader.java:144) at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.close(MapTask.java:497) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:765) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121) at org.apache.hadoop.mapred.Child.main(Child.java:249) MR runs fine 
if i specify more specific input path such as s3n://bucket/wordcount/input/file.txt MR fails if I pass s3 folder as a parameter In summary, This works hadoop jar ./hadoop-examples-1.0.3.jar wordcount /user/hadoop/wordcount/input/ s3n://bucket/wordcount/output/ This doesn't work hadoop jar ./hadoop-examples-1.0.3.jar wordcount s3n://bucket/wordcount/input/ s3n://bucket/wordcount/output/ (both input path are directories) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (MAPREDUCE-3677) If hadoop.security.authorization is set to true, NM is not starting.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He reassigned MAPREDUCE-3677: -- Assignee: Chen He If hadoop.security.authorization is set to true, NM is not starting. -- Key: MAPREDUCE-3677 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3677 Project: Hadoop Map/Reduce Issue Type: Bug Components: nodemanager Affects Versions: 0.23.0 Reporter: Ramgopal N Assignee: Chen He I have the hadoop cluster setup with root user.Accidentally i have set hadoop.security.authorization to true.I have not set any permissions in policy.xml.When i am trying to start the NM with root user ...it is throwing the following error Exception in thread main java.lang.NoClassDefFoundError: nodemanager Caused by: java.lang.ClassNotFoundException: nodemanager at java.net.URLClassLoader$1.run(URLClassLoader.java:200) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:188) at java.lang.ClassLoader.loadClass(ClassLoader.java:303) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:248) at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:316) Could not find the main class: nodemanager. Program will exit. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-4339) pi example job hangs on when run on hadoop 0.23.0 when capacity scheduler is included in the setting environment.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He resolved MAPREDUCE-4339. Resolution: Cannot Reproduce pi example job hangs on when run on hadoop 0.23.0 when capacity scheduler is included in the setting environment. - Key: MAPREDUCE-4339 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4339 Project: Hadoop Map/Reduce Issue Type: Bug Components: examples, job submission, mrv2, scheduler Affects Versions: 0.23.0 Environment: Ubuntu Server 11.04, Hadoop 0.23.0, Reporter: srikanth ayalasomayajulu Labels: hadoop Fix For: 0.23.0 Original Estimate: 48h Remaining Estimate: 48h Tried to include default capacity scheduler in hadoop and tried to run an example pi program. The job hangs and no more output is getting displayed. Starting Job 2012-06-12 22:10:02,524 INFO ipc.YarnRPC (YarnRPC.java:create(47)) - Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC 2012-06-12 22:10:02,538 INFO mapred.ResourceMgrDelegate (ResourceMgrDelegate.java:init(95)) - Connecting to ResourceManager at localhost/127.0.0.1:8030 2012-06-12 22:10:02,539 INFO ipc.HadoopYarnRPC (HadoopYarnProtoRPC.java:getProxy(48)) - Creating a HadoopYarnProtoRpc proxy for protocol interface org.apache.hadoop.yarn.api.ClientRMProtocol 2012-06-12 22:10:02,665 INFO mapred.ResourceMgrDelegate (ResourceMgrDelegate.java:init(99)) - Connected to ResourceManager at localhost/127.0.0.1:8030 2012-06-12 22:10:02,727 WARN conf.Configuration (Configuration.java:handleDeprecation(326)) - fs.default.name is deprecated. Instead, use fs.defaultFS 2012-06-12 22:10:02,728 WARN conf.Configuration (Configuration.java:handleDeprecation(343)) - mapred.used.genericoptionsparser is deprecated. 
Instead, use mapreduce.client.genericoptionsparser.used 2012-06-12 22:10:02,831 INFO input.FileInputFormat (FileInputFormat.java:listStatus(245)) - Total input paths to process : 10 2012-06-12 22:10:02,900 INFO mapreduce.JobSubmitter (JobSubmitter.java:submitJobInternal(362)) - number of splits:10 2012-06-12 22:10:03,044 INFO mapred.YARNRunner (YARNRunner.java:createApplicationSubmissionContext(279)) - AppMaster capability = memory: 2048 2012-06-12 22:10:03,286 INFO mapred.YARNRunner (YARNRunner.java:createApplicationSubmissionContext(355)) - Command to launch container for ApplicationMaster is : $JAVA_HOME/bin/java -Dlog4j.configuration=container-log4j.properties -Dyarn.app.mapreduce.container.log.dir=LOG_DIR -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Xmx1536m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1LOG_DIR/stdout 2LOG_DIR/stderr 2012-06-12 22:10:03,370 INFO mapred.ResourceMgrDelegate (ResourceMgrDelegate.java:submitApplication(304)) - Submitted application application_1339507608976_0002 to ResourceManager 2012-06-12 22:10:03,432 INFO mapreduce.Job (Job.java:monitorAndPrintJob(1207)) - Running job: job_1339507608976_0002 2012-06-12 22:10:04,443 INFO mapreduce.Job (Job.java:monitorAndPrintJob(1227)) - map 0% reduce 0% -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-4339) pi example job hangs on when run on hadoop 0.23.0 when capacity scheduler is included in the setting environment.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975646#comment-13975646 ] Chen He commented on MAPREDUCE-4339: I will close this issue since it cannot be reproduced. Reopen it if necessary. pi example job hangs on when run on hadoop 0.23.0 when capacity scheduler is included in the setting environment. - Key: MAPREDUCE-4339 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4339 Project: Hadoop Map/Reduce Issue Type: Bug Components: examples, job submission, mrv2, scheduler Affects Versions: 0.23.0 Environment: Ubuntu Server 11.04, Hadoop 0.23.0, Reporter: srikanth ayalasomayajulu Labels: hadoop Fix For: 0.23.0 Original Estimate: 48h Remaining Estimate: 48h Tried to include default capacity scheduler in hadoop and tried to run an example pi program. The job hangs and no more output is getting displayed. Starting Job 2012-06-12 22:10:02,524 INFO ipc.YarnRPC (YarnRPC.java:create(47)) - Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC 2012-06-12 22:10:02,538 INFO mapred.ResourceMgrDelegate (ResourceMgrDelegate.java:init(95)) - Connecting to ResourceManager at localhost/127.0.0.1:8030 2012-06-12 22:10:02,539 INFO ipc.HadoopYarnRPC (HadoopYarnProtoRPC.java:getProxy(48)) - Creating a HadoopYarnProtoRpc proxy for protocol interface org.apache.hadoop.yarn.api.ClientRMProtocol 2012-06-12 22:10:02,665 INFO mapred.ResourceMgrDelegate (ResourceMgrDelegate.java:init(99)) - Connected to ResourceManager at localhost/127.0.0.1:8030 2012-06-12 22:10:02,727 WARN conf.Configuration (Configuration.java:handleDeprecation(326)) - fs.default.name is deprecated. Instead, use fs.defaultFS 2012-06-12 22:10:02,728 WARN conf.Configuration (Configuration.java:handleDeprecation(343)) - mapred.used.genericoptionsparser is deprecated. 
Instead, use mapreduce.client.genericoptionsparser.used 2012-06-12 22:10:02,831 INFO input.FileInputFormat (FileInputFormat.java:listStatus(245)) - Total input paths to process : 10 2012-06-12 22:10:02,900 INFO mapreduce.JobSubmitter (JobSubmitter.java:submitJobInternal(362)) - number of splits:10 2012-06-12 22:10:03,044 INFO mapred.YARNRunner (YARNRunner.java:createApplicationSubmissionContext(279)) - AppMaster capability = memory: 2048 2012-06-12 22:10:03,286 INFO mapred.YARNRunner (YARNRunner.java:createApplicationSubmissionContext(355)) - Command to launch container for ApplicationMaster is : $JAVA_HOME/bin/java -Dlog4j.configuration=container-log4j.properties -Dyarn.app.mapreduce.container.log.dir=<LOG_DIR> -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Xmx1536m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr 2012-06-12 22:10:03,370 INFO mapred.ResourceMgrDelegate (ResourceMgrDelegate.java:submitApplication(304)) - Submitted application application_1339507608976_0002 to ResourceManager 2012-06-12 22:10:03,432 INFO mapreduce.Job (Job.java:monitorAndPrintJob(1207)) - Running job: job_1339507608976_0002 2012-06-12 22:10:04,443 INFO mapreduce.Job (Job.java:monitorAndPrintJob(1227)) - map 0% reduce 0% -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-3677) If hadoop.security.authorization is set to true, NM is not starting.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976262#comment-13976262 ] Chen He commented on MAPREDUCE-3677: hadoop.security.authorization is for securing RPC access. I am not sure why you need to start the nodemanager as root, but I investigated on Hadoop 0.23.10 with Java 1.7.0_45, which reports -jvm as an illegal argument. I changed hadoop/bin/yarn slightly by commenting out the following lines, so that only the -server assignment remains active:
#if [[ $EUID -eq 0 ]]; then
#  YARN_OPTS="$YARN_OPTS -jvm server $YARN_NODEMANAGER_OPTS"
#else
YARN_OPTS="$YARN_OPTS -server $YARN_NODEMANAGER_OPTS"
#fi
It works fine. Feel free to make any comments. If there is no response, I will close this JIRA in 3 days. If hadoop.security.authorization is set to true, NM is not starting. -- Key: MAPREDUCE-3677 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3677 Project: Hadoop Map/Reduce Issue Type: Bug Components: nodemanager Affects Versions: 0.23.0 Reporter: Ramgopal N Assignee: Chen He I have the hadoop cluster set up with the root user. Accidentally I have set hadoop.security.authorization to true. I have not set any permissions in policy.xml. When I try to start the NM as the root user, it throws the following error:
Exception in thread "main" java.lang.NoClassDefFoundError: nodemanager
Caused by: java.lang.ClassNotFoundException: nodemanager
at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
at java.lang.ClassLoader.loadClass(ClassLoader.java:303)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:316)
Could not find the main class: nodemanager. Program will exit. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-4931) Add user-APIs for classpath precedence control
[ https://issues.apache.org/jira/browse/MAPREDUCE-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973197#comment-13973197 ] Chen He commented on MAPREDUCE-4931: Hi [~qwertymaniac], is this JIRA still an issue for 2.x? If so, could you retarget it to 2.x? Add user-APIs for classpath precedence control -- Key: MAPREDUCE-4931 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4931 Project: Hadoop Map/Reduce Issue Type: Improvement Components: client Affects Versions: 1.0.0 Reporter: Harsh J Priority: Minor The feature config from MAPREDUCE-1938 of allowing tasks to start with user-classes-first is fairly popular and can use its own API hooks in Job/JobConf classes, making it easier to discover and use it rather than continuing to keep it as an advanced param. I propose to add two APIs to Job/JobConf: {code} void setUserClassesTakesPrecedence(boolean) boolean userClassesTakesPrecedence() {code} Both of which, depending on their branch of commit, set the property {{mapreduce.user.classpath.first}} (1.x) or {{mapreduce.job.user.classpath.first}} (trunk, 2.x and if needed, in 0.23.x). -- This message was sent by Atlassian JIRA (v6.2#6252)
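The two API hooks proposed in MAPREDUCE-4931 could look roughly like the sketch below. This is only an illustration under stated assumptions: `SketchJobConf` is a hypothetical stand-in backed by a plain map, not the real Job/JobConf class, and the default of false for an unset property is assumed. The method names and the property name are the ones quoted in the issue.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for Job/JobConf; not the real Hadoop class. */
final class SketchJobConf {
    // Property name quoted in the issue for trunk and 2.x.
    static final String USER_CLASSPATH_FIRST = "mapreduce.job.user.classpath.first";

    private final Map<String, String> props = new HashMap<>();

    /** Proposed setter: stores the preference under the property above. */
    void setUserClassesTakesPrecedence(boolean value) {
        props.put(USER_CLASSPATH_FIRST, Boolean.toString(value));
    }

    /** Proposed getter: returns false when the property is unset (assumed default). */
    boolean userClassesTakesPrecedence() {
        return Boolean.parseBoolean(props.getOrDefault(USER_CLASSPATH_FIRST, "false"));
    }

    /** Raw lookup, to show which property the two helpers touch. */
    String get(String key) {
        return props.get(key);
    }
}
```

The point of the hooks is discoverability: callers use a typed method instead of remembering the property string, while the stored value stays the same advanced parameter.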
[jira] [Commented] (MAPREDUCE-3089) Augment TestRMContainerAllocator to verify MAPREDUCE-2646
[ https://issues.apache.org/jira/browse/MAPREDUCE-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973208#comment-13973208 ] Chen He commented on MAPREDUCE-3089: Hi [~acmurthy], since both MAPREDUCE-3078 and MAPREDUCE-2646 are resolved, is this JIRA still an issue in 2.x? If so, could you retarget it to 2.x? Augment TestRMContainerAllocator to verify MAPREDUCE-2646 - Key: MAPREDUCE-3089 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3089 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster, mrv2 Affects Versions: 0.23.0 Reporter: Arun C Murthy Assignee: Vinod Kumar Vavilapalli Fix For: 0.24.0 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-4718) MapReduce fails If I pass a parameter as a S3 folder
[ https://issues.apache.org/jira/browse/MAPREDUCE-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973223#comment-13973223 ] Chen He commented on MAPREDUCE-4718: Hi [~benkimkimben], this JIRA has had no updates since 11/Oct/12. Is it still a problem? Right now, it is time to clean up 0.23 JIRAs. If it is still a problem in 2.x, please retarget it to 2.x. Thanks! MapReduce fails If I pass a parameter as a S3 folder Key: MAPREDUCE-4718 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4718 Project: Hadoop Map/Reduce Issue Type: Bug Components: job submission Affects Versions: 1.0.0, 1.0.3 Environment: Hadoop with default configurations Reporter: Benjamin Kim I'm running a wordcount MR as follows hadoop jar WordCount.jar wordcount.WordCountDriver s3n://bucket/wordcount/input s3n://bucket/wordcount/output s3n://bucket/wordcount/input is a s3 object that contains other input files. However I get following NPE error 12/10/02 18:56:23 INFO mapred.JobClient: map 0% reduce 0% 12/10/02 18:56:54 INFO mapred.JobClient: map 50% reduce 0% 12/10/02 18:56:56 INFO mapred.JobClient: Task Id : attempt_201210021853_0001_m_01_0, Status : FAILED java.lang.NullPointerException at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.close(NativeS3FileSystem.java:106) at java.io.BufferedInputStream.close(BufferedInputStream.java:451) at java.io.FilterInputStream.close(FilterInputStream.java:155) at org.apache.hadoop.util.LineReader.close(LineReader.java:83) at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.close(LineRecordReader.java:144) at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.close(MapTask.java:497) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:765) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121) at org.apache.hadoop.mapred.Child.main(Child.java:249) MR runs fine if i specify more specific input path such as s3n://bucket/wordcount/input/file.txt MR fails if I pass s3 folder as a parameter In summary, This works hadoop jar ./hadoop-examples-1.0.3.jar wordcount /user/hadoop/wordcount/input/ s3n://bucket/wordcount/output/ This doesn't work hadoop jar ./hadoop-examples-1.0.3.jar wordcount s3n://bucket/wordcount/input/ s3n://bucket/wordcount/output/ (both input path are directories) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-4711) Append time elapsed since job-start-time for finished tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971540#comment-13971540 ] Chen He commented on MAPREDUCE-4711: Hi [~raviprak], thank you for the patch. Would you mind updating it so that it can be applied to trunk? Also, would you mind if we retarget this JIRA to 2.5? Append time elapsed since job-start-time for finished tasks --- Key: MAPREDUCE-4711 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4711 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver Affects Versions: 0.23.3 Reporter: Ravi Prakash Assignee: Ravi Prakash Attachments: MAPREDUCE-4711.branch-0.23.patch In 0.20.x/1.x, the analyze job link gave this information bq. The last Map task task_sometask finished at (relative to the Job launch time): 5/10 20:23:10 (1hrs, 27mins, 54sec) The time it took for the last task to finish needs to be calculated mentally in 0.23. I believe we should print it next to the finish time. -- This message was sent by Atlassian JIRA (v6.2#6252)
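As a rough illustration of what MAPREDUCE-4711 asks for, the elapsed time relative to job launch can be derived from two timestamps and printed in the "(1hrs, 27mins, 54sec)" style quoted from 0.20.x/1.x. The class and method names below are hypothetical, and the rule of omitting zero hours and minutes is an assumption; this is a sketch, not the code from the attached patch.

```java
/** Hypothetical helper; formats the time elapsed between job start and task finish. */
final class ElapsedSinceJobStart {
    /** Both arguments are epoch milliseconds; a negative difference is clamped to zero. */
    static String format(long jobStartMillis, long taskFinishMillis) {
        long diff = Math.max(0L, taskFinishMillis - jobStartMillis);
        long hours = diff / 3_600_000L;
        long minutes = (diff % 3_600_000L) / 60_000L;
        long seconds = (diff % 60_000L) / 1_000L;

        StringBuilder sb = new StringBuilder();
        // Assumed convention: leave out hours and minutes when they are zero.
        if (hours != 0) {
            sb.append(hours).append("hrs, ");
        }
        if (minutes != 0) {
            sb.append(minutes).append("mins, ");
        }
        sb.append(seconds).append("sec");
        return sb.toString();
    }
}
```

Printing this string next to the finish time would spare the reader the mental subtraction the issue describes.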
[jira] [Commented] (MAPREDUCE-3406) Add node information to bin/mapred job -list-attempt-ids and other improvements
[ https://issues.apache.org/jira/browse/MAPREDUCE-3406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971543#comment-13971543 ] Chen He commented on MAPREDUCE-3406: Hi [~raviprak], as you commented, this is a duplicate JIRA and it has been fixed. I will close this one. Add node information to bin/mapred job -list-attempt-ids and other improvements --- Key: MAPREDUCE-3406 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3406 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 0.23.0 Reporter: Ravi Prakash Assignee: Ravi Prakash Fix For: 0.24.0 From [~rramya] Providing the NM information where the containers are scheduled in bin/mapred job -list-attempt-ids will be helpful in automation, debugging and to avoid grepping through the AM logs. From my own observation, the list-attempt-ids should list the attempt ids and not require the arguments. The arguments if given, can be used to filter the results. From the usage: bq. [-list-attempt-ids <job-id> <task-type> <task-state>]. Valid values for <task-type> are MAP REDUCE JOB_SETUP JOB_CLEANUP TASK_CLEANUP. Valid values for <task-state> are running, completed -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-3406) Add node information to bin/mapred job -list-attempt-ids and other improvements
[ https://issues.apache.org/jira/browse/MAPREDUCE-3406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He resolved MAPREDUCE-3406. Resolution: Duplicate Target Version/s: 2.0.0-alpha, 0.23.3, 3.0.0 (was: 0.23.3, 2.0.0-alpha, 3.0.0) Add node information to bin/mapred job -list-attempt-ids and other improvements --- Key: MAPREDUCE-3406 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3406 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 0.23.0 Reporter: Ravi Prakash Assignee: Ravi Prakash Fix For: 0.24.0 From [~rramya] Providing the NM information where the containers are scheduled in bin/mapred job -list-attempt-ids will be helpful in automation, debugging and to avoid grepping through the AM logs. From my own observation, the list-attempt-ids should list the attempt ids and not require the arguments. The arguments if given, can be used to filter the results. From the usage: bq. [-list-attempt-ids <job-id> <task-type> <task-state>]. Valid values for <task-type> are MAP REDUCE JOB_SETUP JOB_CLEANUP TASK_CLEANUP. Valid values for <task-state> are running, completed -- This message was sent by Atlassian JIRA (v6.2#6252)
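The behaviour requested in MAPREDUCE-3406 (list every attempt with its node, and treat the task-type and task-state arguments as optional filters) can be sketched as follows. Everything here is hypothetical: the `Attempt` record, the tab-separated output, and the sample values used to exercise it are assumptions, not the actual bin/mapred implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of -list-attempt-ids with optional filters. */
final class AttemptListing {
    /** Minimal attempt record: id, task type, task state, and the NM that ran it. */
    static final class Attempt {
        final String id;
        final String type;
        final String state;
        final String node;

        Attempt(String id, String type, String state, String node) {
            this.id = id;
            this.type = type;
            this.state = state;
            this.node = node;
        }
    }

    /** A null filter means "not given on the command line", so it matches everything. */
    static List<String> list(List<Attempt> attempts, String typeFilter, String stateFilter) {
        List<String> lines = new ArrayList<>();
        for (Attempt a : attempts) {
            if (typeFilter != null && !typeFilter.equalsIgnoreCase(a.type)) {
                continue;
            }
            if (stateFilter != null && !stateFilter.equalsIgnoreCase(a.state)) {
                continue;
            }
            // Emit the node next to the attempt id so callers need not grep AM logs.
            lines.add(a.id + "\t" + a.node);
        }
        return lines;
    }
}
```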
[jira] [Commented] (MAPREDUCE-3476) Optimize YARN API calls
[ https://issues.apache.org/jira/browse/MAPREDUCE-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971548#comment-13971548 ] Chen He commented on MAPREDUCE-3476: Hi [~vinodkv], is this still an issue in 2.x? If so, could you retarget it to 2.5? If not, would you mind closing it? Optimize YARN API calls --- Key: MAPREDUCE-3476 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3476 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: mrv2 Affects Versions: 0.23.0 Reporter: Ravi Prakash Assignee: Vinod Kumar Vavilapalli Several YARN API calls are taking inordinately long. This might be a performance blocker. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-3191) docs for map output compression incorrectly reference SequenceFile
[ https://issues.apache.org/jira/browse/MAPREDUCE-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-3191: --- Attachment: MAPREDUCE-3191-v2.patch docs for map output compression incorrectly reference SequenceFile -- Key: MAPREDUCE-3191 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3191 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.0 Reporter: Todd Lipcon Priority: Trivial Labels: noob Attachments: MAPREDUCE-3191-v2.patch, MAPREDUCE-3191.patch The documentation currently says that map output compression uses SequenceFile compression. This hasn't been true in several years, since we use IFile for intermediate data now. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-3191) docs for map output compression incorrectly reference SequenceFile
[ https://issues.apache.org/jira/browse/MAPREDUCE-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971554#comment-13971554 ] Chen He commented on MAPREDUCE-3191: Hi [~qwertymaniac], thank you for the comment. I have updated the patch. docs for map output compression incorrectly reference SequenceFile -- Key: MAPREDUCE-3191 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3191 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.0 Reporter: Todd Lipcon Priority: Trivial Labels: noob Attachments: MAPREDUCE-3191-v2.patch, MAPREDUCE-3191.patch The documentation currently says that map output compression uses SequenceFile compression. This hasn't been true in several years, since we use IFile for intermediate data now. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-3191) docs for map output compression incorrectly reference SequenceFile
[ https://issues.apache.org/jira/browse/MAPREDUCE-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971559#comment-13971559 ] Chen He commented on MAPREDUCE-3191: Hi [~tlipcon], would you mind retargeting this issue to 2.5? docs for map output compression incorrectly reference SequenceFile -- Key: MAPREDUCE-3191 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3191 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.0 Reporter: Todd Lipcon Priority: Trivial Labels: documentation, noob Attachments: MAPREDUCE-3191-v2.patch, MAPREDUCE-3191.patch The documentation currently says that map output compression uses SequenceFile compression. This hasn't been true in several years, since we use IFile for intermediate data now. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-3191) docs for map output compression incorrectly reference SequenceFile
[ https://issues.apache.org/jira/browse/MAPREDUCE-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-3191: --- Labels: documentation noob (was: noob) docs for map output compression incorrectly reference SequenceFile -- Key: MAPREDUCE-3191 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3191 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.0 Reporter: Todd Lipcon Priority: Trivial Labels: documentation, noob Attachments: MAPREDUCE-3191-v2.patch, MAPREDUCE-3191.patch The documentation currently says that map output compression uses SequenceFile compression. This hasn't been true in several years, since we use IFile for intermediate data now. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (MAPREDUCE-3191) docs for map output compression incorrectly reference SequenceFile
[ https://issues.apache.org/jira/browse/MAPREDUCE-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He reassigned MAPREDUCE-3191: -- Assignee: Chen He docs for map output compression incorrectly reference SequenceFile -- Key: MAPREDUCE-3191 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3191 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Chen He Priority: Trivial Labels: documentation, noob Attachments: MAPREDUCE-3191-v2.patch, MAPREDUCE-3191.patch The documentation currently says that map output compression uses SequenceFile compression. This hasn't been true in several years, since we use IFile for intermediate data now. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-4339) pi example job hangs on when run on hadoop 0.23.0 when capacity scheduler is included in the setting environment.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971571#comment-13971571 ] Chen He commented on MAPREDUCE-4339: Hi [~srikraj8341], is this still an issue for 2.x? If not, would you mind if we close it? pi example job hangs on when run on hadoop 0.23.0 when capacity scheduler is included in the setting environment. - Key: MAPREDUCE-4339 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4339 Project: Hadoop Map/Reduce Issue Type: Bug Components: examples, job submission, mrv2, scheduler Affects Versions: 0.23.0 Environment: Ubuntu Server 11.04, Hadoop 0.23.0, Reporter: srikanth ayalasomayajulu Labels: hadoop Fix For: 0.23.0 Original Estimate: 48h Remaining Estimate: 48h Tried to include default capacity scheduler in hadoop and tried to run an example pi program. The job hangs and no more output is getting displayed. Starting Job 2012-06-12 22:10:02,524 INFO ipc.YarnRPC (YarnRPC.java:create(47)) - Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC 2012-06-12 22:10:02,538 INFO mapred.ResourceMgrDelegate (ResourceMgrDelegate.java:init(95)) - Connecting to ResourceManager at localhost/127.0.0.1:8030 2012-06-12 22:10:02,539 INFO ipc.HadoopYarnRPC (HadoopYarnProtoRPC.java:getProxy(48)) - Creating a HadoopYarnProtoRpc proxy for protocol interface org.apache.hadoop.yarn.api.ClientRMProtocol 2012-06-12 22:10:02,665 INFO mapred.ResourceMgrDelegate (ResourceMgrDelegate.java:init(99)) - Connected to ResourceManager at localhost/127.0.0.1:8030 2012-06-12 22:10:02,727 WARN conf.Configuration (Configuration.java:handleDeprecation(326)) - fs.default.name is deprecated. Instead, use fs.defaultFS 2012-06-12 22:10:02,728 WARN conf.Configuration (Configuration.java:handleDeprecation(343)) - mapred.used.genericoptionsparser is deprecated. 
Instead, use mapreduce.client.genericoptionsparser.used 2012-06-12 22:10:02,831 INFO input.FileInputFormat (FileInputFormat.java:listStatus(245)) - Total input paths to process : 10 2012-06-12 22:10:02,900 INFO mapreduce.JobSubmitter (JobSubmitter.java:submitJobInternal(362)) - number of splits:10 2012-06-12 22:10:03,044 INFO mapred.YARNRunner (YARNRunner.java:createApplicationSubmissionContext(279)) - AppMaster capability = memory: 2048 2012-06-12 22:10:03,286 INFO mapred.YARNRunner (YARNRunner.java:createApplicationSubmissionContext(355)) - Command to launch container for ApplicationMaster is : $JAVA_HOME/bin/java -Dlog4j.configuration=container-log4j.properties -Dyarn.app.mapreduce.container.log.dir=<LOG_DIR> -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Xmx1536m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr 2012-06-12 22:10:03,370 INFO mapred.ResourceMgrDelegate (ResourceMgrDelegate.java:submitApplication(304)) - Submitted application application_1339507608976_0002 to ResourceManager 2012-06-12 22:10:03,432 INFO mapreduce.Job (Job.java:monitorAndPrintJob(1207)) - Running job: job_1339507608976_0002 2012-06-12 22:10:04,443 INFO mapreduce.Job (Job.java:monitorAndPrintJob(1227)) - map 0% reduce 0% -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-4734) The history server should link back to NM logs if aggregation is incomplete / disabled
[ https://issues.apache.org/jira/browse/MAPREDUCE-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971573#comment-13971573 ] Chen He commented on MAPREDUCE-4734: Hi [~sseth], thank you for working on this. Would you mind if we retarget it to 2.x? The history server should link back to NM logs if aggregation is incomplete / disabled -- Key: MAPREDUCE-4734 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4734 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobhistoryserver, mrv2 Affects Versions: 0.23.4 Reporter: Siddharth Seth Assignee: Siddharth Seth Attachments: MR4734_WIP.txt -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-3191) docs for map output compression incorrectly reference SequenceFile
[ https://issues.apache.org/jira/browse/MAPREDUCE-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971630#comment-13971630 ] Chen He commented on MAPREDUCE-3191: Thank you for the reminder, [~jeagles]. I checked [~phatak.dev]'s activity; the latest was in July 2012. Next time, I will ask the patch owner and wait 3 days before taking an issue over. docs for map output compression incorrectly reference SequenceFile -- Key: MAPREDUCE-3191 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3191 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Chen He Priority: Trivial Labels: documentation, noob Attachments: MAPREDUCE-3191-v2.patch, MAPREDUCE-3191.patch The documentation currently says that map output compression uses SequenceFile compression. This hasn't been true in several years, since we use IFile for intermediate data now. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-3191) docs for map output compression incorrectly reference SequenceFile
[ https://issues.apache.org/jira/browse/MAPREDUCE-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-3191: --- Target Version/s: 0.23.0, 2.5.0 (was: 0.23.0) docs for map output compression incorrectly reference SequenceFile -- Key: MAPREDUCE-3191 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3191 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Chen He Priority: Trivial Labels: documentation, noob Attachments: MAPREDUCE-3191-v2.patch, MAPREDUCE-3191.patch The documentation currently says that map output compression uses SequenceFile compression. This hasn't been true in several years, since we use IFile for intermediate data now. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-3191) docs for map output compression incorrectly reference SequenceFile
[ https://issues.apache.org/jira/browse/MAPREDUCE-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-3191: --- Target Version/s: 0.23.0 (was: 0.23.0, 2.5.0) docs for map output compression incorrectly reference SequenceFile -- Key: MAPREDUCE-3191 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3191 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Chen He Priority: Trivial Labels: documentation, noob Attachments: MAPREDUCE-3191-v2.patch, MAPREDUCE-3191.patch The documentation currently says that map output compression uses SequenceFile compression. This hasn't been true in several years, since we use IFile for intermediate data now. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-3191) docs for map output compression incorrectly reference SequenceFile
[ https://issues.apache.org/jira/browse/MAPREDUCE-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971850#comment-13971850 ] Chen He commented on MAPREDUCE-3191: Hi [~phatak.dev], feel free to take it at any time. docs for map output compression incorrectly reference SequenceFile -- Key: MAPREDUCE-3191 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3191 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Chen He Priority: Trivial Labels: documentation, noob Fix For: 3.0.0, 0.23.11, 2.5.0, 2.4.1 Attachments: MAPREDUCE-3191-v2.patch, MAPREDUCE-3191.patch The documentation currently says that map output compression uses SequenceFile compression. This hasn't been true in several years, since we use IFile for intermediate data now. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-4711) Append time elapsed since job-start-time for finished tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13972246#comment-13972246 ] Chen He commented on MAPREDUCE-4711: Hi [~raviprak], thank you for the reply. Right now, it is time to clean up 0.23 JIRAs and retarget them to 2.x if they still exist in 2.x. Would you mind retargeting this issue to 2.5? Append time elapsed since job-start-time for finished tasks --- Key: MAPREDUCE-4711 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4711 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver Affects Versions: 0.23.3 Reporter: Ravi Prakash Assignee: Ravi Prakash Attachments: MAPREDUCE-4711.branch-0.23.patch In 0.20.x/1.x, the analyze job link gave this information bq. The last Map task task_sometask finished at (relative to the Job launch time): 5/10 20:23:10 (1hrs, 27mins, 54sec) The time it took for the last task to finish needs to be calculated mentally in 0.23. I believe we should print it next to the finish time. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-3182) loadgen ignores -m command line when writing random data
[ https://issues.apache.org/jira/browse/MAPREDUCE-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968473#comment-13968473 ] Chen He commented on MAPREDUCE-3182: There are two GenericLoadGenerator classes in the current Hadoop source code. One is in the org.apache.hadoop.mapreduce package. It has two documentation problems. First, it does not actually parse the -m command line option but still shows the option in the usage message. Second, if the user does not specify an input directory, it creates input data using RandomWriter with the default settings (10GB per map task and 10 map tasks per node), but the usage message does not mention this behavior. The other is in the org.apache.hadoop.mapred package; it is an older version of GenericLoadGenerator and has the second documentation problem described above. loadgen ignores -m command line when writing random data Key: MAPREDUCE-3182 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3182 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, test Affects Versions: 0.23.0, 0.24.0, 2.3.0 Reporter: Jonathan Eagles Assignee: Chen He If no input directories are specified, loadgen goes into a special mode where random data is generated and written. In that mode, setting the number of mappers (-m command line option) is overridden by a calculation. Instead, it should take into consideration the user specified number of mappers and fall back to the calculation. In addition, update the documentation as well to match the new behavior in the code. -- This message was sent by Atlassian JIRA (v6.2#6252)
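The behaviour MAPREDUCE-3182 asks for (use the -m value when the user gives one, otherwise fall back to the calculated count) reduces to a small decision, sketched below. The class and parameter names are hypothetical, and the fallback shown is only the "map tasks per node times nodes" idea mentioned in the comment above, not the actual GenericLoadGenerator or RandomWriter code.

```java
/** Hypothetical sketch of choosing the map count in loadgen's random-data mode. */
final class LoadGenMapCount {
    /**
     * @param requestedMaps value of -m; zero or negative means the option was not given (assumed encoding)
     * @param nodes         number of nodes available to the job
     * @param mapsPerNode   default map tasks per node (the comment above cites 10)
     */
    static int resolve(int requestedMaps, int nodes, int mapsPerNode) {
        if (requestedMaps > 0) {
            // The user's explicit choice takes precedence over the calculation.
            return requestedMaps;
        }
        // Fall back to the calculated default, never returning fewer than one map.
        return Math.max(1, nodes * mapsPerNode);
    }
}
```

With this ordering, the usage message can describe both cases: -m sets the count directly, and omitting it selects the calculated default.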
[jira] [Updated] (MAPREDUCE-3182) loadgen ignores -m command line when writing random data
[ https://issues.apache.org/jira/browse/MAPREDUCE-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-3182: --- Affects Version/s: 2.3.0 loadgen ignores -m command line when writing random data Key: MAPREDUCE-3182 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3182 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, test Affects Versions: 0.23.0, 0.24.0, 2.3.0 Reporter: Jonathan Eagles Assignee: Chen He If no input directories are specified, loadgen goes into a special mode where random data is generated and written. In that mode, setting the number of mappers (-m command line option) is overridden by a calculation. Instead, it should take into consideration the user specified number of mappers and fall back to the calculation. In addition, update the documentation as well to match the new behavior in the code. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-3182) loadgen ignores -m command line when writing random data
[ https://issues.apache.org/jira/browse/MAPREDUCE-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-3182: --- Target Version/s: 2.5.0 (was: 0.23.0, 0.24.0) loadgen ignores -m command line when writing random data Key: MAPREDUCE-3182 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3182 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, test Affects Versions: 0.23.0, 0.24.0, 2.3.0 Reporter: Jonathan Eagles Assignee: Chen He If no input directories are specified, loadgen goes into a special mode where random data is generated and written. In that mode, setting the number of mappers (-m command line option) is overridden by a calculation. Instead, it should take into consideration the user specified number of mappers and fall back to the calculation. In addition, update the documentation as well to match the new behavior in the code. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-3182) loadgen ignores -m command line when writing random data
[ https://issues.apache.org/jira/browse/MAPREDUCE-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-3182: --- Attachment: MAPREDUCE-3182.patch loadgen ignores -m command line when writing random data Key: MAPREDUCE-3182 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3182 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, test Affects Versions: 0.23.0, 0.24.0, 2.3.0 Reporter: Jonathan Eagles Assignee: Chen He Attachments: MAPREDUCE-3182.patch If no input directories are specified, loadgen goes into a special mode where random data is generated and written. In that mode, setting the number of mappers (-m command line option) is overridden by a calculation. Instead, it should take into consideration the user specified number of mappers and fall back to the calculation. In addition, update the documentation as well to match the new behavior in the code. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-3418) If map output is not found, shuffle runs in tight loop
[ https://issues.apache.org/jira/browse/MAPREDUCE-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-3418: --- Target Version/s: 2.5.0 (was: 0.23.0, 2.5.0) If map output is not found, shuffle runs in tight loop -- Key: MAPREDUCE-3418 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3418 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.0 Reporter: John George Sharad Agarwal bumped into this while simulating fetch failures. Removed the map output directory. Shuffle runs in tight loop throwing : 2011-06-01 09:02:20,511 WARN org.apache.hadoop.mapreduce.task.reduce.Fetcher: Invalid map id java.lang.IllegalArgumentException: TaskAttemptId string : TTP/1.1 500 Internal Server Error Content-Type: text/plain; charset=UTF is not properly formed at org.apache.hadoop.mapreduce.TaskAttemptID.forName(TaskAttemptID.java:174) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:284) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:251) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:149) Fetch failure is not triggered. -- This message was sent by Atlassian JIRA (v6.2#6252)
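One way to avoid the tight loop described above is to treat a response that does not look like a task attempt ID as a failed fetch and to stop retrying after a bounded number of such failures. The sketch below is only an illustration of that idea, not the Fetcher implementation: FetchFailureGuard, its methods, and the failure limit are hypothetical, and the only fact it relies on is that attempt ID strings start with "attempt_".

```java
// Hypothetical guard, not part of org.apache.hadoop.mapreduce.task.reduce.Fetcher.
final class FetchFailureGuard {
    private final int maxFailures; // assumed limit before the caller reports a fetch failure
    private int consecutiveFailures;

    FetchFailureGuard(int maxFailures) {
        if (maxFailures < 1) {
            throw new IllegalArgumentException("maxFailures must be at least 1");
        }
        this.maxFailures = maxFailures;
    }

    /** Cheap sanity check done before handing the string to a real parser. */
    static boolean looksLikeAttemptId(String header) {
        return header != null && header.startsWith("attempt_");
    }

    /**
     * Records one response header.
     *
     * @return true when the caller should stop retrying and report a fetch failure
     */
    boolean shouldReportFailure(String header) {
        if (looksLikeAttemptId(header)) {
            consecutiveFailures = 0; // a well-formed header resets the count
            return false;
        }
        consecutiveFailures++;
        return consecutiveFailures >= maxFailures;
    }
}
```

With a limit of 2, the second malformed header in a row (for example the "TTP/1.1 500 Internal Server Error" text from the log above) makes shouldReportFailure return true instead of letting the loop continue.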
[jira] [Updated] (MAPREDUCE-3418) If map output is not found, shuffle runs in tight loop
[ https://issues.apache.org/jira/browse/MAPREDUCE-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-3418: --- Affects Version/s: 2.3.0 If map output is not found, shuffle runs in tight loop -- Key: MAPREDUCE-3418 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3418 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.0, 2.3.0 Reporter: John George Sharad Agarwal bumped into this while simulating fetch failures. Removed the map output directory. Shuffle runs in tight loop throwing : 2011-06-01 09:02:20,511 WARN org.apache.hadoop.mapreduce.task.reduce.Fetcher: Invalid map id java.lang.IllegalArgumentException: TaskAttemptId string : TTP/1.1 500 Internal Server Error Content-Type: text/plain; charset=UTF is not properly formed at org.apache.hadoop.mapreduce.TaskAttemptID.forName(TaskAttemptID.java:174) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:284) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:251) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:149) Fetch failure is not triggered. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-3174) app master UI goes away when app finishes - not very user friendly
[ https://issues.apache.org/jira/browse/MAPREDUCE-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13966493#comment-13966493 ] Chen He commented on MAPREDUCE-3174: According to [~tlipcon]'s comments, this is only a problem in 0.23. Is this correct? If so, do we need to file a similar issue for 2.x? app master UI goes away when app finishes - not very user friendly -- Key: MAPREDUCE-3174 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3174 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 0.23.0 Reporter: Thomas Graves A user can go to the application master UI to see the stats on the app, but as soon as the app finishes that UI goes away and user is left with nothing. A redirect to history server or similar would be much better. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-3174) app master UI goes away when app finishes - not very user friendly
[ https://issues.apache.org/jira/browse/MAPREDUCE-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13967010#comment-13967010 ] Chen He commented on MAPREDUCE-3174: Hi [~tgraves], Thank you for the comments. I will retarget this issue to 2.x. app master UI goes away when app finishes - not very user friendly -- Key: MAPREDUCE-3174 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3174 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 0.23.0 Reporter: Thomas Graves A user can go to the application master UI to see the stats on the app, but as soon as the app finishes that UI goes away and user is left with nothing. A redirect to history server or similar would be much better. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-3174) app master UI goes away when app finishes - not very user friendly
[ https://issues.apache.org/jira/browse/MAPREDUCE-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-3174: --- Target Version/s: 2.5.0 (was: 0.23.0) app master UI goes away when app finishes - not very user friendly -- Key: MAPREDUCE-3174 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3174 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 0.23.0 Reporter: Thomas Graves A user can go to the application master UI to see the stats on the app, but as soon as the app finishes that UI goes away and user is left with nothing. A redirect to history server or similar would be much better. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (MAPREDUCE-3182) loadgen ignores -m command line when writing random data
[ https://issues.apache.org/jira/browse/MAPREDUCE-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He reassigned MAPREDUCE-3182: -- Assignee: Chen He loadgen ignores -m command line when writing random data Key: MAPREDUCE-3182 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3182 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, test Affects Versions: 0.23.0, 0.24.0 Reporter: Jonathan Eagles Assignee: Chen He If no input directories are specified, loadgen goes into a special mode where random data is generated and written. In that mode, setting the number of mappers (-m command line option) is overridden by a calculation. Instead, it should take into consideration the user specified number of mappers and fall back to the calculation. In addition, update the documentation as well to match the new behavior in the code. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-3182) loadgen ignores -m command line when writing random data
[ https://issues.apache.org/jira/browse/MAPREDUCE-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13967012#comment-13967012 ] Chen He commented on MAPREDUCE-3182: I will take a look at this issue. loadgen ignores -m command line when writing random data Key: MAPREDUCE-3182 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3182 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, test Affects Versions: 0.23.0, 0.24.0 Reporter: Jonathan Eagles If no input directories are specified, loadgen goes into a special mode where random data is generated and written. In that mode, setting the number of mappers (-m command line option) is overridden by a calculation. Instead, it should take into consideration the user specified number of mappers and fall back to the calculation. In addition, update the documentation as well to match the new behavior in the code. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-3418) If map output is not found, shuffle runs in tight loop
[ https://issues.apache.org/jira/browse/MAPREDUCE-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13967014#comment-13967014 ] Chen He commented on MAPREDUCE-3418: Is this still an issue for Hadoop 2.x? If not, I will close it on April 14th, 2014. If map output is not found, shuffle runs in tight loop -- Key: MAPREDUCE-3418 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3418 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.0 Reporter: John George Sharad Agarwal bumped into this while simulating fetch failures. Removed the map output directory. Shuffle runs in tight loop throwing : 2011-06-01 09:02:20,511 WARN org.apache.hadoop.mapreduce.task.reduce.Fetcher: Invalid map id java.lang.IllegalArgumentException: TaskAttemptId string : TTP/1.1 500 Internal Server Error Content-Type: text/plain; charset=UTF is not properly formed at org.apache.hadoop.mapreduce.TaskAttemptID.forName(TaskAttemptID.java:174) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:284) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:251) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:149) Fetch failure is not triggered. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-3418) If map output is not found, shuffle runs in tight loop
[ https://issues.apache.org/jira/browse/MAPREDUCE-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13967039#comment-13967039 ] Chen He commented on MAPREDUCE-3418: Thank you for the reply, [~vinodkv]. I will retarget this issue towards 2.x. If map output is not found, shuffle runs in tight loop -- Key: MAPREDUCE-3418 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3418 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.0 Reporter: John George Sharad Agarwal bumped into this while simulating fetch failures. Removed the map output directory. Shuffle runs in tight loop throwing : 2011-06-01 09:02:20,511 WARN org.apache.hadoop.mapreduce.task.reduce.Fetcher: Invalid map id java.lang.IllegalArgumentException: TaskAttemptId string : TTP/1.1 500 Internal Server Error Content-Type: text/plain; charset=UTF is not properly formed at org.apache.hadoop.mapreduce.TaskAttemptID.forName(TaskAttemptID.java:174) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:284) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:251) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:149) Fetch failure is not triggered. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-3418) If map output is not found, shuffle runs in tight loop
[ https://issues.apache.org/jira/browse/MAPREDUCE-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-3418: --- Target Version/s: 0.23.0, 2.5.0 (was: 0.23.0) If map output is not found, shuffle runs in tight loop -- Key: MAPREDUCE-3418 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3418 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.0 Reporter: John George Sharad Agarwal bumped into this while simulating fetch failures. Removed the map output directory. Shuffle runs in tight loop throwing : 2011-06-01 09:02:20,511 WARN org.apache.hadoop.mapreduce.task.reduce.Fetcher: Invalid map id java.lang.IllegalArgumentException: TaskAttemptId string : TTP/1.1 500 Internal Server Error Content-Type: text/plain; charset=UTF is not properly formed at org.apache.hadoop.mapreduce.TaskAttemptID.forName(TaskAttemptID.java:174) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:284) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:251) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:149) Fetch failure is not triggered. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (MAPREDUCE-519) Fix capacity scheduler's documentation
[ https://issues.apache.org/jira/browse/MAPREDUCE-519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He reassigned MAPREDUCE-519: - Assignee: Chen He Fix capacity scheduler's documentation -- Key: MAPREDUCE-519 URL: https://issues.apache.org/jira/browse/MAPREDUCE-519 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Vinod Kumar Vavilapalli Assignee: Chen He Parent jira for all documentation related issues in capacity scheduler. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (MAPREDUCE-5758) Reducer local data is not deleted until job completes
[ https://issues.apache.org/jira/browse/MAPREDUCE-5758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He reassigned MAPREDUCE-5758: -- Assignee: Chen He Reducer local data is not deleted until job completes - Key: MAPREDUCE-5758 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5758 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.10, 2.2.0 Reporter: Jason Lowe Assignee: Chen He Ran into an instance where a reducer shuffled a large amount of data and subsequently failed, but the local data is not purged when the task fails but only after the entire job completes. This wastes disk space unnecessarily since the data is no longer relevant after the task-attempt exits. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5804) TestMRJobsWithProfiler#testProfiler times out
[ https://issues.apache.org/jira/browse/MAPREDUCE-5804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943063#comment-13943063 ] Chen He commented on MAPREDUCE-5804: +1. I downloaded the patch, applied it to trunk, and ran the test; no timeout was reported. TestMRJobsWithProfiler#testProfiler times out Key: MAPREDUCE-5804 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5804 Project: Hadoop Map/Reduce Issue Type: Test Affects Versions: 2.4.0 Reporter: Mit Desai Assignee: Mit Desai Attachments: LOG.txt, MAPREDUCE-5804.patch {noformat} testProfiler(org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler) Time elapsed: 154.972 sec ERROR! java.lang.Exception: test timed out after 12 milliseconds at java.io.UnixFileSystem.getBooleanAttributes0(Native Method) at java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:242) at java.io.File.exists(File.java:813) at sun.misc.URLClassPath$FileLoader.getResource(URLClassPath.java:1080) at sun.misc.URLClassPath.getResource(URLClassPath.java:199) at java.net.URLClassLoader$1.run(URLClassLoader.java:358) at java.net.URLClassLoader$1.run(URLClassLoader.java:355) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:354) at java.lang.ClassLoader.loadClass(ClassLoader.java:425) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) at java.lang.ClassLoader.loadClass(ClassLoader.java:358) at org.apache.log4j.spi.LoggingEvent.init(LoggingEvent.java:165) at org.apache.log4j.Category.forcedLog(Category.java:391) at org.apache.log4j.Category.log(Category.java:856) at org.apache.commons.logging.impl.Log4JLogger.warn(Log4JLogger.java:208) at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:338) at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:419) at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:532) at 
org.apache.hadoop.mapreduce.Job$1.run(Job.java:314) at org.apache.hadoop.mapreduce.Job$1.run(Job.java:311) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1570) at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:311) at org.apache.hadoop.mapreduce.Job.isComplete(Job.java:599) at org.apache.hadoop.mapreduce.Job.monitorAndPrintJob(Job.java:1344) at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1306) at org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler.testProfiler(TestMRJobsWithProfiler.java:138) Results : Tests in error: TestMRJobsWithProfiler.testProfiler:138 » test timed out after 12 millise... {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5688) MRAppMaster causes TestStagingCleanup to fail intermittently with JDK7
[ https://issues.apache.org/jira/browse/MAPREDUCE-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894825#comment-13894825 ] Chen He commented on MAPREDUCE-5688: Thank you, Mit. +1, the patch is good. It would be great if you could submit an updated version and get a +1 from Hadoop QA. MRAppMaster causes TestStagingCleanup to fail intermittently with JDK7 -- Key: MAPREDUCE-5688 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5688 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 3.0.0, 2.3.0 Reporter: Mit Desai Assignee: Mit Desai Labels: java7 Attachments: MAPREDUCE-5688.patch Due to random ordering in JDK7, the test TestStagingCleanup#testDeletionofStagingOnKillLastTry is failing {noformat} Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.231 sec FAILURE! test(org.apache.hadoop.mapreduce.v2.app.TestStagingCleanup) Time elapsed: 3882 sec ERROR! java.lang.NullPointerException at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.serviceStop(JobHistoryEventHandler.java:349) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:159) at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:132) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1399) at org.apache.hadoop.mapreduce.v2.app.TestStagingCleanup.testDeletionofStagingOnKillLastTry(TestStagingCleanup.java:239) at org.apache.hadoop.mapreduce.v2.app.TestStagingCleanup.test(TestStagingCleanup.java:82) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at junit.framework.TestCase.runTest(TestCase.java:168) at junit.framework.TestCase.runBare(TestCase.java:134) at junit.framework.TestResult$1.protect(TestResult.java:110) at junit.framework.TestResult.runProtected(TestResult.java:128) at junit.framework.TestResult.run(TestResult.java:113) at junit.framework.TestCase.run(TestCase.java:124) at junit.framework.TestSuite.runTest(TestSuite.java:243) at junit.framework.TestSuite.run(TestSuite.java:238) at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:242) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:137) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189) at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165) at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75) {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (MAPREDUCE-1380) Adaptive Scheduler
[ https://issues.apache.org/jira/browse/MAPREDUCE-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894966#comment-13894966 ] Chen He commented on MAPREDUCE-1380: This patch may need to be updated against Hadoop 1.x or 2.x. Adaptive Scheduler -- Key: MAPREDUCE-1380 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1380 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: Jordà Polo Priority: Minor Attachments: MAPREDUCE-1380_0.1.patch, MAPREDUCE-1380_1.1.patch, MAPREDUCE-1380_1.1.pdf The Adaptive Scheduler is a pluggable Hadoop scheduler that automatically adjusts the amount of used resources depending on the performance of jobs and on user-defined high-level business goals. Existing Hadoop schedulers are focused on managing large, static clusters in which nodes are added or removed manually. On the other hand, the goal of this scheduler is to improve the integration of Hadoop and the applications that run on top of it with environments that allow a more dynamic provisioning of resources. The current implementation is quite straightforward. Users specify a deadline at job submission time, and the scheduler adjusts the resources to meet that deadline (at the moment, the scheduler can be configured to either minimize or maximize the amount of resources). If multiple jobs are run simultaneously, the scheduler prioritizes them by deadline. Note that the current approach to estimate the completion time of jobs is quite simplistic: it is based on the time it takes to finish each task, so it works well with regular jobs, but there is still room for improvement for unpredictable jobs. The idea is to further integrate it with cloud-like and virtual environments (such as Amazon EC2, Emotive, etc.) so that if, for instance, a job isn't able to meet its deadline, the scheduler automatically requests more resources. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
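The description says the scheduler estimates completion from per-task duration and adjusts resources to meet a deadline. A very rough sketch of such an estimate follows; it is not taken from the attached patch. DeadlineEstimator, slotsNeeded, and the assumption that every remaining task takes the observed average time are all hypothetical simplifications made for illustration.

```java
// Hypothetical estimator; the real Adaptive Scheduler patch may work differently.
final class DeadlineEstimator {
    private DeadlineEstimator() {}

    /**
     * Estimates how many slots a job needs to finish before its deadline,
     * assuming each remaining task takes avgTaskMillis and tasks on one slot run back to back.
     *
     * @return 0 if nothing is left; remainingTasks if not even one task fits before the
     *         deadline (full parallelism is the best the job can do); otherwise the
     *         smallest slot count that fits all tasks before the deadline
     */
    static long slotsNeeded(long remainingTasks, long avgTaskMillis, long millisUntilDeadline) {
        if (avgTaskMillis <= 0) {
            throw new IllegalArgumentException("avgTaskMillis must be positive");
        }
        if (remainingTasks <= 0) {
            return 0;
        }
        // How many tasks a single slot can complete sequentially before the deadline.
        long tasksPerSlot = millisUntilDeadline / avgTaskMillis;
        if (tasksPerSlot <= 0) {
            return remainingTasks;
        }
        // Ceiling division: spread the remaining tasks over as few slots as possible.
        return (remainingTasks + tasksPerSlot - 1) / tasksPerSlot;
    }
}
```

For instance, 10 remaining tasks of about 1000 ms each with 5000 ms left need 2 slots under this model. The estimate is only as good as the average, which matches the description's caveat about unpredictable jobs.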
[jira] [Updated] (MAPREDUCE-3486) All jobs of all queues will be returned, whether a particular queueName is specified or not
[ https://issues.apache.org/jira/browse/MAPREDUCE-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-3486: --- Affects Version/s: (was: 0.24.0) (was: 0.23.0) 1.2.2 1.3.0 1.1.3 All jobs of all queues will be returned, whether a particular queueName is specified or not --- Key: MAPREDUCE-3486 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3486 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 1.1.3, 1.3.0, 1.2.2 Reporter: XieXianshan Assignee: XieXianshan Priority: Minor Attachments: MAPREDUCE-3486.patch JobTracker.getJobsFromQueue(queueName) returns all jobs from all queues of the JobTracker, even when I specify a queueName. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
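The expected behavior in the report above is that only jobs belonging to the named queue come back. A minimal sketch of that filtering is shown here; it does not reproduce the JobTracker code or the attached patch, and QueueJobFilter, JobInfo, and jobsInQueue are hypothetical names used purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in types; the real JobTracker data structures differ.
final class QueueJobFilter {
    private QueueJobFilter() {}

    /** Minimal job record: an ID plus the queue the job was submitted to. */
    static final class JobInfo {
        final String jobId;
        final String queueName;

        JobInfo(String jobId, String queueName) {
            this.jobId = jobId;
            this.queueName = queueName;
        }
    }

    /** Returns only the jobs whose queue matches queueName, preserving input order. */
    static List<JobInfo> jobsInQueue(List<JobInfo> allJobs, String queueName) {
        List<JobInfo> result = new ArrayList<>();
        for (JobInfo job : allJobs) {
            if (job.queueName.equals(queueName)) {
                result.add(job);
            }
        }
        return result;
    }
}
```

The bug as reported is equivalent to skipping the equals check, so every job is returned regardless of the argument.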
[jira] [Commented] (MAPREDUCE-3486) All jobs of all queues will be returned, whether a particular queueName is specified or not
[ https://issues.apache.org/jira/browse/MAPREDUCE-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892199#comment-13892199 ] Chen He commented on MAPREDUCE-3486: Changed the affects version to 1.x. All jobs of all queues will be returned, whether a particular queueName is specified or not --- Key: MAPREDUCE-3486 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3486 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 1.1.3, 1.3.0, 1.2.2 Reporter: XieXianshan Assignee: XieXianshan Priority: Minor Attachments: MAPREDUCE-3486.patch JobTracker.getJobsFromQueue(queueName) returns all jobs from all queues of the JobTracker, even when I specify a queueName. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (MAPREDUCE-5643) DynamicMR: A Dynamic Slot Utilization Optimization Framework for Hadoop MRv1
[ https://issues.apache.org/jira/browse/MAPREDUCE-5643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892219#comment-13892219 ] Chen He commented on MAPREDUCE-5643: This is interesting. I would suggest you upload your design documents, including those for DHFS, DSTS, and DLMS. I have the following questions about your scheduler. 1) If map and reduce slots can be exchanged, it is possible that some small jobs cannot finish in time. 2) Is there any load-balancing feature in your scheduling for the map and reduce stages? 3) If reduce tasks steal map slots, some local map tasks will become non-local tasks because of the shortage of map slots. DynamicMR: A Dynamic Slot Utilization Optimization Framework for Hadoop MRv1 Key: MAPREDUCE-5643 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5643 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/fair-share Affects Versions: 1.2.1 Reporter: tang shanjiang Assignee: tang shanjiang Labels: performance Attachments: DynamicMR-0.1.1-patch, README Hadoop MRv1 uses the slot-based resource model with the static configuration of map/reduce slots. There is a strict utility constraint that map tasks can only run on map slots and reduce tasks can only use reduce slots. Due to the rigid execution order between map and reduce tasks in a MapReduce environment, slots can be severely under-utilized, which significantly degrades the performance. In contrast to YARN, which gives up the slot-based resource model and proposes a container-based model that maximizes resource utilization by ignoring the distinction between map and reduce tasks, we keep the slot-based model and propose a dynamic slot utilization optimization system called DynamicMR to improve the performance of Hadoop by maximizing slot utilization as well as slot utilization efficiency while guaranteeing the fairness across pools. 
It consists of three types of scheduling components, namely, Dynamic Hadoop Fair Scheduler (DHFS), Dynamic Speculative Task Scheduler (DSTS), and Data Locality Maximization Scheduler (DLMS). Our tests show that DynamicMR outperforms YARN for MapReduce workloads with multiple jobs, especially when the number of jobs is large. The explanation is that, given a fixed amount of resources, performance is better when the ratio of concurrently running map and reduce tasks is controlled than when it is not: without control, too many reduce tasks can easily run at once, making the network a serious bottleneck. For YARN, both map and reduce tasks can run on any idle container. There is no control mechanism for the ratio of resource allocation between map and reduce tasks. This means that when there are pending reduce tasks, an idle container will most likely be taken by them. In contrast, DynamicMR follows the traditional slot-based model. In contrast to the 'hard' constraint of slot allocation, under which map slots have to be allocated to map tasks and reduce slots to reduce tasks, DynamicMR obeys a 'soft' constraint that allows a map slot to be allocated to a reduce task and vice versa. However, whenever there are pending map tasks, a map slot should be given to map tasks first, and the rule is similar for reduce tasks. This means that the traditional static map/reduce slot configuration still controls the ratio of running map/reduce tasks in DynamicMR. Compared to YARN, which only maximizes resource utilization, DynamicMR can maximize slot utilization while also dynamically controlling the ratio of running map/reduce tasks via map/reduce slot configuration. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
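The 'soft' slot constraint described above (a slot prefers tasks of its own type and is lent to the other type only when none are pending) can be sketched as a small decision function. This is an illustration of the stated rule only, not code from the DynamicMR patch; SoftSlotPolicy, SlotType, Assignment, and assign are hypothetical names.

```java
// Hypothetical policy class illustrating the soft constraint; not from the DynamicMR patch.
final class SoftSlotPolicy {
    private SoftSlotPolicy() {}

    enum SlotType { MAP, REDUCE }

    enum Assignment { MAP_TASK, REDUCE_TASK, IDLE }

    /**
     * Decides what an idle slot should run.
     * A slot serves its own task type first and is borrowed by the other type
     * only when no task of its own type is pending.
     */
    static Assignment assign(SlotType slot, int pendingMaps, int pendingReduces) {
        boolean mapsWaiting = pendingMaps > 0;
        boolean reducesWaiting = pendingReduces > 0;
        if (slot == SlotType.MAP) {
            if (mapsWaiting) return Assignment.MAP_TASK;
            if (reducesWaiting) return Assignment.REDUCE_TASK; // map slot lent to a reduce task
        } else {
            if (reducesWaiting) return Assignment.REDUCE_TASK;
            if (mapsWaiting) return Assignment.MAP_TASK;       // reduce slot lent to a map task
        }
        return Assignment.IDLE;
    }
}
```

Under the 'hard' constraint, the two borrowing branches would be absent and the slot would stay idle instead. Note that this sketch does not address the locality and small-job concerns raised in the comment.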
[jira] [Commented] (MAPREDUCE-5603) Ability to disable FileInputFormat listLocatedStatus optimization to save client memory
[ https://issues.apache.org/jira/browse/MAPREDUCE-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892249#comment-13892249 ] Chen He commented on MAPREDUCE-5603: +1, patch is good. Ability to disable FileInputFormat listLocatedStatus optimization to save client memory --- Key: MAPREDUCE-5603 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5603 Project: Hadoop Map/Reduce Issue Type: Improvement Components: client, mrv2 Affects Versions: 0.23.10, 2.2.0 Reporter: Jason Lowe Assignee: Jason Lowe Priority: Minor Attachments: MAPREDUCE-5603.patch, MAPREDUCE-5603.patch It would be nice if users had the option to disable the listLocatedStatus optimization in FileInputFormat to save client memory. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (MAPREDUCE-5670) CombineFileRecordReader should report progress when moving to the next file
[ https://issues.apache.org/jira/browse/MAPREDUCE-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-5670: --- Status: Open (was: Patch Available) CombineFileRecordReader should report progress when moving to the next file --- Key: MAPREDUCE-5670 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5670 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.9 Reporter: Jason Lowe Assignee: Chen He Priority: Minor Attachments: MR-5670.patch, MR-5670v2.patch, MR-5670v3.patch If a combine split consists of many empty files (i.e.: no record found by the underlying record reader) then theoretically a task can timeout due to lack of reported progress. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
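The problem described above is that a reader can skip through many empty files without ever signalling that it is alive. A minimal sketch of the remedy, reporting progress each time the reader advances to the next file, is given below. It is not the attached patch: ProgressReportingReader is a hypothetical class, lists of strings stand in for files, and a plain Runnable stands in for whatever progress callback the framework provides.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical reader over in-memory "files"; it only illustrates where progress is reported.
final class ProgressReportingReader {
    private final Iterator<List<String>> files;
    private final Runnable progress; // stand-in for the framework's progress callback
    private Iterator<String> current = Collections.emptyIterator();

    ProgressReportingReader(List<List<String>> files, Runnable progress) {
        this.files = files.iterator();
        this.progress = progress;
    }

    /** Returns the next record, or null once every file has been consumed. */
    String next() {
        while (!current.hasNext()) {
            if (!files.hasNext()) {
                return null;
            }
            current = files.next().iterator();
            // Report on every file switch, so a run of empty files still shows liveness.
            progress.run();
        }
        return current.next();
    }
}
```

With this shape, a split made of thousands of empty files produces one progress report per file even though no record is ever returned.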
[jira] [Updated] (MAPREDUCE-5670) CombineFileRecordReader should report progress when moving to the next file
[ https://issues.apache.org/jira/browse/MAPREDUCE-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-5670: --- Attachment: MR-5670v3.patch CombineFileRecordReader should report progress when moving to the next file --- Key: MAPREDUCE-5670 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5670 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.9 Reporter: Jason Lowe Assignee: Chen He Priority: Minor Attachments: MR-5670.patch, MR-5670v2.patch, MR-5670v3.patch If a combine split consists of many empty files (i.e.: no record found by the underlying record reader) then theoretically a task can timeout due to lack of reported progress. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (MAPREDUCE-5670) CombineFileRecordReader should report progress when moving to the next file
[ https://issues.apache.org/jira/browse/MAPREDUCE-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881338#comment-13881338 ] Chen He commented on MAPREDUCE-5670: Hi [~jlowe], the patch has been updated following your suggestion. CombineFileRecordReader should report progress when moving to the next file --- Key: MAPREDUCE-5670 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5670 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.9 Reporter: Jason Lowe Assignee: Chen He Priority: Minor Fix For: 2.4.0, 0.23.10 Attachments: MR-5670.patch, MR-5670v2.patch, MR-5670v3.patch If a combine split consists of many empty files (i.e.: no record found by the underlying record reader) then theoretically a task can timeout due to lack of reported progress. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (MAPREDUCE-5670) CombineFileRecordReader should report progress when moving to the next file
[ https://issues.apache.org/jira/browse/MAPREDUCE-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-5670: --- Fix Version/s: 0.23.10 2.4.0 Status: Patch Available (was: Open) CombineFileRecordReader should report progress when moving to the next file --- Key: MAPREDUCE-5670 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5670 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.9 Reporter: Jason Lowe Assignee: Chen He Priority: Minor Fix For: 2.4.0, 0.23.10 Attachments: MR-5670.patch, MR-5670v2.patch, MR-5670v3.patch If a combine split consists of many empty files (i.e., no record found by the underlying record reader), then theoretically a task can time out due to lack of reported progress.
[jira] [Updated] (MAPREDUCE-5670) CombineFileRecordReader should report progress when moving to the next file
[ https://issues.apache.org/jira/browse/MAPREDUCE-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-5670: --- Attachment: (was: MR-5670.patch) CombineFileRecordReader should report progress when moving to the next file --- Key: MAPREDUCE-5670 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5670 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.9 Reporter: Jason Lowe Assignee: Chen He Priority: Minor Attachments: MR-5670v3.patch If a combine split consists of many empty files (i.e., no record found by the underlying record reader), then theoretically a task can time out due to lack of reported progress.
[jira] [Updated] (MAPREDUCE-5670) CombineFileRecordReader should report progress when moving to the next file
[ https://issues.apache.org/jira/browse/MAPREDUCE-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-5670: --- Attachment: (was: MR-5670v2.patch) CombineFileRecordReader should report progress when moving to the next file --- Key: MAPREDUCE-5670 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5670 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.9 Reporter: Jason Lowe Assignee: Chen He Priority: Minor Attachments: MR-5670v3.patch If a combine split consists of many empty files (i.e., no record found by the underlying record reader), then theoretically a task can time out due to lack of reported progress.
[jira] [Updated] (MAPREDUCE-5670) CombineFileRecordReader should report progress when moving to the next file
[ https://issues.apache.org/jira/browse/MAPREDUCE-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-5670: --- Status: Open (was: Patch Available) CombineFileRecordReader should report progress when moving to the next file --- Key: MAPREDUCE-5670 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5670 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.9 Reporter: Jason Lowe Assignee: Chen He Priority: Minor Attachments: MR-5670.patch If a combine split consists of many empty files (i.e., no record found by the underlying record reader), then theoretically a task can time out due to lack of reported progress.
[jira] [Updated] (MAPREDUCE-5670) CombineFileRecordReader should report progress when moving to the next file
[ https://issues.apache.org/jira/browse/MAPREDUCE-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-5670: --- Attachment: MR-5670v2.patch CombineFileRecordReader should report progress when moving to the next file --- Key: MAPREDUCE-5670 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5670 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.9 Reporter: Jason Lowe Assignee: Chen He Priority: Minor Attachments: MR-5670.patch, MR-5670v2.patch If a combine split consists of many empty files (i.e., no record found by the underlying record reader), then theoretically a task can time out due to lack of reported progress.