[jira] [Commented] (MAPREDUCE-6066) Speculative attempts should not run on the same node as their original attempt

2015-09-27 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14909840#comment-14909840
 ] 

Chen He commented on MAPREDUCE-6066:


This problem is interesting. I believe academic publications already offer 
many solutions to it. One corner case we need to be careful about: if the AM 
only gets containers from a single NM, then we should allow speculative tasks 
to run on the same node. 

Categorizing nodes becomes very important. What is the reason this task (map 
or reduce) is slow? Once we know that, we can make a more reasonable decision.
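
The corner case above can be sketched roughly as follows. This is only an illustration in Python, not the AM's actual allocation code: `pick_speculative_node` and its parameters are hypothetical names, and node identity is assumed to be a plain host string.

```python
from typing import Iterable, Optional


def pick_speculative_node(original_node: str,
                          candidate_nodes: Iterable[str]) -> Optional[str]:
    """Pick a node for a speculative attempt (hypothetical helper).

    Assumption: candidate_nodes lists the hosts of the containers the
    AM has been offered, in the order it would consider them.
    """
    candidates = list(candidate_nodes)
    if not candidates:
        # Nothing was offered, so no speculative attempt can be placed.
        return None
    # Prefer any node other than the one running the original attempt.
    for node in candidates:
        if node != original_node:
            return node
    # Corner case: every offered container is on the original node,
    # so allow the speculative attempt to share it.
    return original_node
```

With two NMs on offer the sketch avoids the original node; with a single NM it falls back to that node instead of never speculating.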

> Speculative attempts should not run on the same node as their original attempt
> --
>
> Key: MAPREDUCE-6066
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6066
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, scheduler
>Affects Versions: 2.5.0, 2.6.0
>Reporter: Todd Lipcon
> Attachments: conf.xml
>
>
> I'm seeing a behavior on trunk with fair scheduler enabled where a 
> speculative reduce attempt is getting run on the same node as its original 
> attempt. This doesn't make sense -- the main reason for speculative execution 
> is to deal with a slow node, so scheduling a second attempt on the same node 
> would just make the problem worse if anything.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6066) Speculative attempts should not run on the same node as their original attempt

2015-09-27 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-6066:
---
Affects Version/s: 2.6.0






[jira] [Commented] (MAPREDUCE-3182) loadgen ignores -m command line when writing random data

2015-05-08 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535906#comment-14535906
 ] 

Chen He commented on MAPREDUCE-3182:


Thank you for the review [~ajisakaa]. I will update this weekend.

 loadgen ignores -m command line when writing random data
 

 Key: MAPREDUCE-3182
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3182
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation, mrv2, test
Affects Versions: 0.23.0, 2.3.0
Reporter: Jonathan Eagles
Assignee: Chen He
Priority: Minor
  Labels: BB2015-05-TBR
 Attachments: MAPREDUCE-3182.patch


 If no input directories are specified, loadgen goes into a special mode where 
 random data is generated and written. In that mode, setting the number of 
 mappers (-m command line option) is overridden by a calculation. Instead, it 
 should take into consideration the user specified number of mappers and fall 
 back to the calculation. In addition, update the documentation as well to 
 match the new behavior in the code.
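
The requested precedence can be sketched as below. This is a Python illustration of the rule only, not loadgen's code: `resolve_num_maps` is a hypothetical name, and treating a missing or non-positive -m value as "not specified" is an assumption.

```python
from typing import Optional


def resolve_num_maps(user_maps: Optional[int], calculated_maps: int) -> int:
    """Choose the mapper count for random-data mode (hypothetical helper)."""
    # Honor the user's -m value when one was given and is usable.
    if user_maps is not None and user_maps > 0:
        return user_maps
    # Otherwise fall back to the calculated default.
    return calculated_maps
```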





[jira] [Commented] (MAPREDUCE-6082) Excessive logging by org.apache.hadoop.util.Progress when value is NaN

2014-09-11 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130038#comment-14130038
 ] 

Chen He commented on MAPREDUCE-6082:


+1  lgtm

 Excessive logging by org.apache.hadoop.util.Progress when value is NaN
 --

 Key: MAPREDUCE-6082
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6082
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Mit Desai
Assignee: Mit Desai
 Attachments: MAPREDUCE-6082.patch


 MAPREDUCE-5671 fixed the illegal progress values that do not fall into the 
 (0,1) interval when the progress value is given. Whenever an illegal value 
 was encountered, LOG.warn would log that incident.
 As a result, each task's syslog will be full of lines such as: WARN [main] 
 org.apache.hadoop.util.Progress: Illegal progress value found, progress is 
 Float.NaN. Progress will be changed to 0
 Each input record contributes one such log line, leading to most of the 
 tasks' syslogs growing past 1GB.
 We will need to change the log level to debug to avoid such excessive logging.
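
A rough sketch of the proposed change, written in Python rather than against org.apache.hadoop.util.Progress itself. `sanitize_progress` and the logger name are hypothetical, and clamping out-of-range values to 0 or 1 is an assumption beyond the NaN case quoted above.

```python
import logging
import math

LOG = logging.getLogger("progress_sketch")  # hypothetical logger name


def sanitize_progress(value: float) -> float:
    """Return a usable progress value, logging illegal input at DEBUG."""
    if math.isnan(value):
        # Logged at debug, not warn, so per-record calls stay quiet by default.
        LOG.debug("Illegal progress value found, progress is NaN. "
                  "Progress will be changed to 0")
        return 0.0
    if value < 0.0:
        LOG.debug("Illegal progress value %s, changed to 0", value)
        return 0.0
    if value > 1.0:
        LOG.debug("Illegal progress value %s, changed to 1", value)
        return 1.0
    return value
```

Because the messages are emitted at debug level, a task that reports NaN for every input record no longer writes one syslog line per record unless debug logging is switched on.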





[jira] [Commented] (MAPREDUCE-6066) Speculative attempts should not run on the same node as their original attempt

2014-09-03 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119914#comment-14119914
 ] 

Chen He commented on MAPREDUCE-6066:


I will take a look.

 Speculative attempts should not run on the same node as their original attempt
 --

 Key: MAPREDUCE-6066
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6066
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, scheduler
Affects Versions: 3.0.0
Reporter: Todd Lipcon

 I'm seeing a behavior on trunk with fair scheduler enabled where a 
 speculative reduce attempt is getting run on the same node as its original 
 attempt. This doesn't make sense -- the main reason for speculative execution 
 is to deal with a slow node, so scheduling a second attempt on the same node 
 would just make the problem worse if anything.





[jira] [Updated] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings

2014-08-26 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-5885:
---

Status: Patch Available  (was: Open)

 build/test/test.mapred.spill causes release audit warnings
 --

 Key: MAPREDUCE-5885
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5885
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: trunk
Reporter: Jason Lowe
Assignee: Chen He
 Attachments: MAPREDUCE-5885-1.patch, MAPREDUCE-5885.patch, 
 MAPREDUCE-5885.patch, MAPREDUCE-5885.patch


 Multiple unit tests are creating files under 
 hadoop-mapreduce-client-jobclient/build/test/test.mapred.spill which are 
 causing release audit warnings during Jenkins patch precommit builds.  In 
 addition to being in a poor location for test output and not cleaning up 
 after the test, there are multiple tests using this location which will cause 
 conflicts if tests are run in parallel.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (MAPREDUCE-6050) Upgrade JUnit3 TestCase to JUnit 4

2014-08-25 Thread Chen He (JIRA)
Chen He created MAPREDUCE-6050:
--

 Summary: Upgrade JUnit3 TestCase to JUnit 4
 Key: MAPREDUCE-6050
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6050
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Reporter: Chen He
Priority: Trivial


There are still test classes that extend junit.framework.TestCase. Upgrade 
them to JUnit 4.





[jira] [Commented] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings

2014-08-25 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14109257#comment-14109257
 ] 

Chen He commented on MAPREDUCE-5885:


Working on updating the patch. I also created MAPREDUCE-6050 for updating test 
classes from JUnit3 to JUnit4 in the mapreduce project.

 build/test/test.mapred.spill causes release audit warnings
 --

 Key: MAPREDUCE-5885
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5885
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: trunk
Reporter: Jason Lowe
Assignee: Chen He
 Attachments: MAPREDUCE-5885.patch, MAPREDUCE-5885.patch, 
 MAPREDUCE-5885.patch


 Multiple unit tests are creating files under 
 hadoop-mapreduce-client-jobclient/build/test/test.mapred.spill which are 
 causing release audit warnings during Jenkins patch precommit builds.  In 
 addition to being in a poor location for test output and not cleaning up 
 after the test, there are multiple tests using this location which will cause 
 conflicts if tests are run in parallel.





[jira] [Updated] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings

2014-08-25 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-5885:
---

Attachment: MAPREDUCE-5885-1.patch

Patch updated. 

For the temporary directory name issue, I guess they just want to use 
class.getName() + "-mapred" to avoid directory collisions when many tests are 
running in parallel.
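
The naming idea can be illustrated with a small Python sketch (the tests in question are Java, so this is an analogy, not the patch). `scratch_dir_for`, `SpillCaseA` and `SpillCaseB` are hypothetical names, and the base directory is assumed to be a throwaway temp location.

```python
import os
import shutil
import tempfile


def scratch_dir_for(test_class: type, base_dir: str) -> str:
    """Create and return a scratch directory unique to one test class."""
    # Class name + "-mapred": tests running in parallel never share a path.
    path = os.path.join(base_dir, test_class.__name__ + "-mapred")
    os.makedirs(path, exist_ok=True)
    return path


class SpillCaseA:
    pass


class SpillCaseB:
    pass


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    try:
        print(scratch_dir_for(SpillCaseA, base))
        print(scratch_dir_for(SpillCaseB, base))
    finally:
        # Clean up after the run so no test output is left behind.
        shutil.rmtree(base)
```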






[jira] [Commented] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings

2014-08-25 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14109896#comment-14109896
 ] 

Chen He commented on MAPREDUCE-5885:


Sorry, my bad. I guess they assumed there would never be a chance to run the 
tests in parallel?






[jira] [Commented] (MAPREDUCE-4818) Easier identification of tasks that timeout during localization

2014-08-23 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14108266#comment-14108266
 ] 

Chen He commented on MAPREDUCE-4818:


Does the yarn-localization-log introduce extra overhead on the system (memory, 
disks, etc.)? I mean, there are thousands of containers localizing data in a 
large, busy cluster. How about we only record the failed ones?

 Easier identification of tasks that timeout during localization
 ---

 Key: MAPREDUCE-4818
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4818
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mr-am
Affects Versions: 0.23.3, 2.0.3-alpha
Reporter: Jason Lowe
  Labels: usability

 When a task is taking too long to localize and is killed by the AM due to 
 task timeout, the job UI/history is not very helpful.  The attempt simply 
 lists a diagnostic stating it was killed due to timeout, but there are no 
 logs for the attempt since it never actually got started.  There are log 
 messages on the NM that show the container never made it past localization by 
 the time it was killed, but users often do not have access to those logs.





[jira] [Updated] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings

2014-07-08 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-5885:
---

Attachment: MAPREDUCE-5885.patch

retrigger QA






[jira] [Resolved] (MAPREDUCE-5961) Job start time setting to Thu Jan 01 05:29:59 IST 1970

2014-07-07 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He resolved MAPREDUCE-5961.


Resolution: Duplicate

 Job start time setting to Thu Jan 01 05:29:59 IST 1970
 

 Key: MAPREDUCE-5961
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5961
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 2.4.1
Reporter: Nishan Shetty, Huawei
Priority: Minor

 Induce an RM switchover while a job is in progress.
 Observe that the job start time is set to Thu Jan 01 05:29:59 IST 1970, with 
 the error below:
 {code}
 2014-07-05 21:38:12,415 INFO 
 org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager: Moving 
 hdfs://mycluster:8020/home/testos/staging-dir/history/done_intermediate/testos/job_1404572770516_0056_conf.xml
  to 
 hdfs://mycluster:8020/home/testos/staging-dir/history/done/2014/07/05/00/job_1404572770516_0056_conf.xml
 2014-07-05 21:41:12,289 INFO org.apache.hadoop.mapreduce.v2.hs.JobHistory: 
 Starting scan to move intermediate done files
 2014-07-05 21:41:12,294 WARN 
 org.apache.hadoop.mapreduce.v2.jobhistory.FileNameIndexUtils: Unable to parse 
 launch time from job history file 
 job_1404572770516_0057-1404576372149-testos-word+count-1404576499406-85-10-SUCCEEDED-default--1.jhist
  : java.lang.NumberFormatException: For input string: 
 2014-07-05 21:41:12,297 INFO 
 org.apache.hadoop.mapreduce.jobhistory.JobSummary: 
 jobId=job_1404572770516_0057,submitTime=1404576372149,launchTime=-1,firstMapTaskLaunchTime=1404576442635,firstReduceTaskLaunchTime=1404576492243,finishTime=1404576499406,resourcesPerMap=1024,resourcesPerReduce=1024,numMaps=85,numReduces=10,user=testos,queue=default,status=SUCCEEDED,mapSlotSeconds=690,reduceSlotSeconds=39,jobName=word
  count
 2014-07-05 21:41:12,298 INFO 
 org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager: Deleting JobSummary 
 file:
 {code}
 AM LOG
 {code}
 2014-07-05 21:38:19,432 INFO [Thread-74] 
 org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: 
 JobHistoryEventHandler notified that forceJobCompletion is true
 2014-07-05 21:38:19,432 INFO [Thread-74] 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Calling stop for all the 
 services
 2014-07-05 21:38:19,433 INFO [Thread-74] 
 org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Stopping 
 JobHistoryEventHandler. Size of the outstanding queue size is 0
 2014-07-05 21:38:19,556 INFO [eventHandlingThread] 
 org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copying 
 hdfs://mycluster/home/testos/staging-dir/testos/.staging/job_1404572770516_0057/job_1404572770516_0057_2.jhist
  to 
 hdfs://mycluster/home/testos/staging-dir/history/done_intermediate/testos/job_1404572770516_0057-1404576372149-testos-word+count-1404576499406-85-10-SUCCEEDED-default--1.jhist_tmp
 2014-07-05 21:38:19,770 INFO [eventHandlingThread] 
 org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copied to done 
 location: 
 hdfs://mycluster/home/testos/staging-dir/history/done_intermediate/testos/job_1404572770516_0057-1404576372149-testos-word+count-1404576499406-85-10-SUCCEEDED-default--1.jhist_tmp
 2014-07-05 21:38:19,785 INFO [eventHandlingThread] 
 org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copying 
 hdfs://mycluster/home/testos/staging-dir/testos/.staging/job_1404572770516_0057/job_1404572770516_0057_2_conf.xml
  to 
 hdfs://mycluster/home/testos/staging-dir/history/done_intermediate/testos/job_1404572770516_0057_conf.xml_tmp
 2014-07-05 21:38:19,862 INFO [eventHandlingThread] 
 org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copied to done 
 location: 
 hdfs://mycluster/home/testos/staging-dir/history/done_intermediate/testos/job_1404572770516_0057_conf.xml_tmp
 2014-07-05 21:38:19,886 INFO [eventHandlingThread] 
 org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to 
 done: 
 hdfs://mycluster/home/testos/staging-dir/history/done_intermediate/testos/job_1404572770516_0057.summary_tmp
  to 
 hdfs://mycluster/home/testos/staging-dir/history/done_intermediate/testos/job_1404572770516_0057.summary
 2014-07-05 21:38:19,898 INFO [eventHandlingThread] 
 org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to 
 done: 
 hdfs://mycluster/home/testos/staging-dir/history/done_intermediate/testos/job_1404572770516_0057_conf.xml_tmp
  to 
 hdfs://mycluster/home/testos/staging-dir/history/done_intermediate/testos/job_1404572770516_0057_conf.xml
 2014-07-05 21:38:19,910 INFO [eventHandlingThread] 
 org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to 
 done: 
 

[jira] [Commented] (MAPREDUCE-5961) Job start time setting to Thu Jan 01 05:29:59 IST 1970

2014-07-07 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053752#comment-14053752
 ] 

Chen He commented on MAPREDUCE-5961:


This is a duplicate of MAPREDUCE-5939.


[jira] [Updated] (MAPREDUCE-5939) StartTime showing up as the epoch time in JHS UI after upgrade

2014-06-25 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-5939:
---

Attachment: MAPREDUCE-5939-v3.patch

Thank you for the comments, [~jlowe]. I updated the patch following your 
suggestion.

 StartTime showing up as the epoch time in JHS UI after upgrade
 --

 Key: MAPREDUCE-5939
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5939
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Kihwal Lee
Assignee: Chen He
 Attachments: MAPREDUCE-5939-v2.patch, MAPREDUCE-5939-v3.patch, 
 MAPREDUCE-5939.patch


 After upgrading from 0.23.x to 2.5, the start time of old apps are showing up 
 as the epoch time.  It looks like 2.5 expects start time to be encoded at the 
 end of the jhist file name (-[timestamp].jhist). It should have been 
 made backward compatible.
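
A minimal sketch of what backward-compatible parsing could look like, in Python. The field positions (submit time at index 1, start time at index 9 of the '-'-separated name) are inferred from the jhist file names quoted elsewhere in this archive, not taken from Hadoop's FileNameIndexUtils. Falling back to the submit time is an assumption of the sketch, and `parse_start_time` is a hypothetical name.

```python
def parse_start_time(jhist_name: str,
                     submit_index: int = 1,
                     start_index: int = 9) -> int:
    """Return the start time encoded in a jhist file name.

    Assumption: fields are separated by '-' and contain no '-' themselves.
    """
    suffix = ".jhist"
    stem = jhist_name[:-len(suffix)] if jhist_name.endswith(suffix) else jhist_name
    fields = stem.split("-")
    submit_time = int(fields[submit_index])
    if len(fields) <= start_index:
        # Older file name without a trailing timestamp: avoid the epoch
        # default by reusing the submit time.
        return submit_time
    try:
        return int(fields[start_index])
    except ValueError:
        # Covers an empty field, e.g. a name ending in "--1.jhist".
        return submit_time
```

Under these assumptions an old-style name yields its submit time instead of the epoch, while a name that carries a trailing timestamp yields that timestamp.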





[jira] [Updated] (MAPREDUCE-5939) StartTime showing up as the epoch time in JHS UI after upgrade

2014-06-24 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-5939:
---

Target Version/s: 0.23.11, 2.5.0
  Status: Patch Available  (was: Open)

 StartTime showing up as the epoch time in JHS UI after upgrade
 --

 Key: MAPREDUCE-5939
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5939
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Kihwal Lee
Assignee: Chen He
 Attachments: MAPREDUCE-5939.patch


 After upgrading from 0.23.x to 2.5, the start time of old apps are showing up 
 as the epoch time.  It looks like 2.5 expects start time to be encoded at the 
 end of the jhist file name (-[timestamp].jhist). It should have been 
 made backward compatible.





[jira] [Updated] (MAPREDUCE-5939) StartTime showing up as the epoch time in JHS UI after upgrade

2014-06-24 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-5939:
---

Attachment: MAPREDUCE-5939.patch






[jira] [Updated] (MAPREDUCE-5939) StartTime showing up as the epoch time in JHS UI after upgrade

2014-06-24 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-5939:
---

Attachment: MAPREDUCE-5939-v2.patch

 StartTime showing up as the epoch time in JHS UI after upgrade
 --

 Key: MAPREDUCE-5939
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5939
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Kihwal Lee
Assignee: Chen He
 Attachments: MAPREDUCE-5939-v2.patch, MAPREDUCE-5939.patch


 After upgrading from 0.23.x to 2.5, the start time of old apps are showing up 
 as the epoch time.  It looks like 2.5 expects start time to be encoded at the 
 end of the jhist file name (-[timestamp].jhist). It should have been 
 made backward compatible.





[jira] [Assigned] (MAPREDUCE-5939) StartTime showing up as the epoch time in JHS UI after upgrade

2014-06-23 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He reassigned MAPREDUCE-5939:
--

Assignee: Chen He

 StartTime showing up as the epoch time in JHS UI after upgrade
 --

 Key: MAPREDUCE-5939
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5939
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Kihwal Lee
Assignee: Chen He

 After upgrading from 0.23.x to 2.5, the start time of old apps are showing up 
 as the epoch time.  It looks like 2.5 expects start time to be encoded at the 
 end of the jhist file name (-[timestamp].jhist). It should have been 
 made backward compatible.





[jira] [Commented] (MAPREDUCE-3182) loadgen ignores -m command line when writing random data

2014-06-17 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034061#comment-14034061
 ] 

Chen He commented on MAPREDUCE-3182:


Hi [~jeagles], would you mind taking a look at this patch? Thank you very much!

 loadgen ignores -m command line when writing random data
 

 Key: MAPREDUCE-3182
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3182
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, test
Affects Versions: 0.23.0, 2.3.0
Reporter: Jonathan Eagles
Assignee: Chen He
 Attachments: MAPREDUCE-3182.patch


 If no input directories are specified, loadgen goes into a special mode where 
 random data is generated and written. In that mode, setting the number of 
 mappers (-m command line option) is overridden by a calculation. Instead, it 
 should take into consideration the user specified number of mappers and fall 
 back to the calculation. In addition, update the documentation as well to 
 match the new behavior in the code.





[jira] [Commented] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings

2014-06-13 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030608#comment-14030608
 ] 

Chen He commented on MAPREDUCE-5885:


The test failure is because of MAPREDUCE-5868

 build/test/test.mapred.spill causes release audit warnings
 --

 Key: MAPREDUCE-5885
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5885
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: trunk
Reporter: Jason Lowe
Assignee: Chen He
 Attachments: MAPREDUCE-5885.patch, MAPREDUCE-5885.patch


 Multiple unit tests are creating files under 
 hadoop-mapreduce-client-jobclient/build/test/test.mapred.spill which are 
 causing release audit warnings during Jenkins patch precommit builds.  In 
 addition to being in a poor location for test output and not cleaning up 
 after the test, there are multiple tests using this location which will cause 
 conflicts if tests are run in parallel.





[jira] [Updated] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings

2014-06-12 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-5885:
---

Attachment: MAPREDUCE-5885.patch

Attaching the patch again to trigger HadoopQA.




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (MAPREDUCE-5923) org.apache.hadoop.mapred.pipes.TestPipeApplication timeouts intermittently

2014-06-12 Thread Chen He (JIRA)
Chen He created MAPREDUCE-5923:
--

 Summary: org.apache.hadoop.mapred.pipes.TestPipeApplication 
timeouts intermittently
 Key: MAPREDUCE-5923
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5923
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: trunk
Reporter: Chen He
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings

2014-06-12 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14029914#comment-14029914
 ] 

Chen He commented on MAPREDUCE-5885:


The test failure is related to MAPREDUCE-5923.

 build/test/test.mapred.spill causes release audit warnings
 --

 Key: MAPREDUCE-5885
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5885
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: trunk
Reporter: Jason Lowe
Assignee: Chen He
 Attachments: MAPREDUCE-5885.patch


 Multiple unit tests are creating files under 
 hadoop-mapreduce-client-jobclient/build/test/test.mapred.spill which are 
 causing release audit warnings during Jenkins patch precommit builds.  In 
 addition to being in a poor location for test output and not cleaning up 
 after the test, there are multiple tests using this location which will cause 
 conflicts if tests are run in parallel.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings

2014-06-12 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-5885:
---

Attachment: (was: MAPREDUCE-5885.patch)

 build/test/test.mapred.spill causes release audit warnings
 --

 Key: MAPREDUCE-5885
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5885
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: trunk
Reporter: Jason Lowe
Assignee: Chen He
 Attachments: MAPREDUCE-5885.patch


 Multiple unit tests are creating files under 
 hadoop-mapreduce-client-jobclient/build/test/test.mapred.spill which are 
 causing release audit warnings during Jenkins patch precommit builds.  In 
 addition to being in a poor location for test output and not cleaning up 
 after the test, there are multiple tests using this location which will cause 
 conflicts if tests are run in parallel.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings

2014-06-12 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-5885:
---

Attachment: MAPREDUCE-5885.patch

 build/test/test.mapred.spill causes release audit warnings
 --

 Key: MAPREDUCE-5885
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5885
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: trunk
Reporter: Jason Lowe
Assignee: Chen He
 Attachments: MAPREDUCE-5885.patch, MAPREDUCE-5885.patch


 Multiple unit tests are creating files under 
 hadoop-mapreduce-client-jobclient/build/test/test.mapred.spill which are 
 causing release audit warnings during Jenkins patch precommit builds.  In 
 addition to being in a poor location for test output and not cleaning up 
 after the test, there are multiple tests using this location which will cause 
 conflicts if tests are run in parallel.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5758) Reducer local data is not deleted until job completes

2014-05-20 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003852#comment-14003852
 ] 

Chen He commented on MAPREDUCE-5758:


There are several issues we need to consider if we allow the reducer to use the 
container local directory:
1) The MapReduce framework should get the container local dir from YARN. 
2) We need to let the YARN framework know that the MapReduce framework created 
some dirs under the container local dir for reducers. 
Any suggestions, [~vinodkv]?

 Reducer local data is not deleted until job completes
 -

 Key: MAPREDUCE-5758
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5758
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.10, 2.2.0
Reporter: Jason Lowe
Assignee: Chen He

 Ran into an instance where a reducer shuffled a large amount of data and 
 subsequently failed, but the local data is not purged when the task fails but 
 only after the entire job completes.  This wastes disk space unnecessarily 
 since the data is no longer relevant after the task-attempt exits.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5002) AM could potentially allocate a reduce container to a map attempt

2014-05-15 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996836#comment-13996836
 ] 

Chen He commented on MAPREDUCE-5002:


Is this JIRA fixed? If so, could we close it?

 AM could potentially allocate a reduce container to a map attempt
 -

 Key: MAPREDUCE-5002
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5002
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am
Affects Versions: 2.0.3-alpha, 0.23.7
Reporter: Jason Lowe

 As discussed in MAPREDUCE-4982, after MAPREDUCE-4893 it is theoretically 
 possible for the AM to accidentally assign a reducer container to a map 
 attempt if the AM doesn't find a reduce attempt actively looking for the 
 container (e.g.: the RM accidentally allocated too many reducer containers).
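
 The guard the description calls for can be pictured as follows: a container 
 is handed only to an attempt of the matching type, and a surplus container 
 is reported as unassigned so the caller can release it. This is a 
 simplified sketch, not the MR AM allocator code; ContainerAssignerSketch, 
 TaskType, and the string attempt ids are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Optional;

/** Hypothetical model of type-checked container assignment. */
final class ContainerAssignerSketch {
  enum TaskType { MAP, REDUCE }

  private final Deque<String> pendingMaps = new ArrayDeque<>();
  private final Deque<String> pendingReduces = new ArrayDeque<>();

  /** Registers an attempt that is waiting for a container of the given type. */
  void request(TaskType type, String attemptId) {
    queueFor(type).add(attemptId);
  }

  /**
   * Assigns a container of the given type. Returns the attempt that received
   * it, or empty when no attempt of that type is waiting; the container is
   * never handed to an attempt of the other type.
   */
  Optional<String> assign(TaskType containerType) {
    return Optional.ofNullable(queueFor(containerType).poll());
  }

  private Deque<String> queueFor(TaskType type) {
    return type == TaskType.MAP ? pendingMaps : pendingReduces;
  }
}
```

 An empty result is the signal for the caller to release the extra 
 container rather than fall back to a waiting map attempt.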



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-4071) NPE while executing MRAppMaster shutdown hook

2014-05-14 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996832#comment-13996832
 ] 

Chen He commented on MAPREDUCE-4071:


ping

 NPE while executing MRAppMaster shutdown hook
 -

 Key: MAPREDUCE-4071
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4071
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 0.23.3, 2.0.0-alpha, trunk
Reporter: Bhallamudi Venkata Siva Kamesh
 Attachments: MAPREDUCE-4071-1.patch, MAPREDUCE-4071-2.patch, 
 MAPREDUCE-4071-2.patch, MAPREDUCE-4071.patch


 While running the shutdown hook of MRAppMaster, hit NPE
 {noformat}
 Exception in thread Thread-1 java.lang.NullPointerException
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.setSignalled(MRAppMaster.java:668)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1004)
 {noformat}
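
 The trace is consistent with the shutdown hook calling setSignalled on a 
 router whose underlying allocator has not been created yet, although the 
 report does not confirm that cause. Assuming it, a null-safe router could 
 look like the sketch below. AllocatorRouterSketch and its Allocator 
 interface are hypothetical stand-ins, not the MRAppMaster classes or any of 
 the attached patches.

```java
/** Hypothetical router that tolerates a shutdown signal before start(). */
final class AllocatorRouterSketch {
  interface Allocator {
    void setSignalled(boolean signalled);
  }

  private Allocator delegate;     // stays null until start() is called
  private boolean pendingSignal;  // remembers a signal that arrived early

  /** Installs the real allocator and replays a signal received before startup. */
  synchronized void start(Allocator allocator) {
    delegate = allocator;
    if (pendingSignal) {
      delegate.setSignalled(true);
    }
  }

  /** Safe to call from a shutdown hook at any point in the lifecycle. */
  synchronized void setSignalled(boolean signalled) {
    pendingSignal = signalled;
    if (delegate != null) {
      delegate.setSignalled(signalled);
    }
  }
}
```

 Remembering the early signal, instead of silently dropping it, keeps the 
 allocator informed if it starts after the hook has already fired.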



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings

2014-05-14 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-5885:
---

Attachment: MAPREDUCE-5885.patch

Patch submitted.

 build/test/test.mapred.spill causes release audit warnings
 --

 Key: MAPREDUCE-5885
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5885
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: trunk
Reporter: Jason Lowe
Assignee: Chen He
 Attachments: MAPREDUCE-5885.patch


 Multiple unit tests are creating files under 
 hadoop-mapreduce-client-jobclient/build/test/test.mapred.spill which are 
 causing release audit warnings during Jenkins patch precommit builds.  In 
 addition to being in a poor location for test output and not cleaning up 
 after the test, there are multiple tests using this location which will cause 
 conflicts if tests are run in parallel.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings

2014-05-13 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996518#comment-13996518
 ] 

Chen He commented on MAPREDUCE-5885:


As well as TestMapReduce.java. 

 build/test/test.mapred.spill causes release audit warnings
 --

 Key: MAPREDUCE-5885
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5885
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: trunk
Reporter: Jason Lowe
Assignee: Chen He

 Multiple unit tests are creating files under 
 hadoop-mapreduce-client-jobclient/build/test/test.mapred.spill which are 
 causing release audit warnings during Jenkins patch precommit builds.  In 
 addition to being in a poor location for test output and not cleaning up 
 after the test, there are multiple tests using this location which will cause 
 conflicts if tests are run in parallel.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings

2014-05-13 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-5885:
---

Target Version/s: 0.23.11, 2.5.0
  Status: Patch Available  (was: Open)

 build/test/test.mapred.spill causes release audit warnings
 --

 Key: MAPREDUCE-5885
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5885
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: trunk
Reporter: Jason Lowe
Assignee: Chen He
 Attachments: MAPREDUCE-5885.patch


 Multiple unit tests are creating files under 
 hadoop-mapreduce-client-jobclient/build/test/test.mapred.spill which are 
 causing release audit warnings during Jenkins patch precommit builds.  In 
 addition to being in a poor location for test output and not cleaning up 
 after the test, there are multiple tests using this location which will cause 
 conflicts if tests are run in parallel.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5889) Deprecate FileInputFormat.setInputPaths(JobConf, String) and FileInputFormat.addInputPaths(JobConf, String)

2014-05-13 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996815#comment-13996815
 ] 

Chen He commented on MAPREDUCE-5889:


Agreed. +1 for the idea

 Deprecate FileInputFormat.setInputPaths(JobConf, String) and 
 FileInputFormat.addInputPaths(JobConf, String)
 ---

 Key: MAPREDUCE-5889
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5889
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Akira AJISAKA
Priority: Minor
  Labels: newbie

 {{FileInputFormat.setInputPaths(JobConf conf, String commaSeparatedPaths)}} 
 and {{FileInputFormat.addInputPaths(JobConf conf, String 
 commaSeparatedPaths)}} fail to parse commaSeparatedPaths if a comma is 
 included in the file path. (e.g. Path: {{/path/file,with,comma}})
 We should deprecate these methods and document to use {{setInputPaths(JobConf 
 conf, Path... inputPaths)}} and {{addInputPaths(JobConf conf, Path... 
 inputPaths)}} instead.
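
 The parsing problem can be reproduced without Hadoop. The sketch below uses 
 plain strings as stand-ins for paths: fromCommaSeparated models a 
 String-based method that splits on every comma, and fromPaths models a 
 varargs method where each argument is one path. Both method names and the 
 class InputPathsSketch are hypothetical, and the real FileInputFormat 
 parsing is more involved than a bare split.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of comma-separated versus varargs path handling. */
final class InputPathsSketch {
  /** Simplified stand-in for a String-based API: every comma starts a new path. */
  static List<String> fromCommaSeparated(String commaSeparatedPaths) {
    return new ArrayList<>(Arrays.asList(commaSeparatedPaths.split(",")));
  }

  /** Stand-in for a varargs API: each argument is one path, commas included. */
  static List<String> fromPaths(String... inputPaths) {
    return new ArrayList<>(Arrays.asList(inputPaths));
  }
}
```

 With the path from the description, fromCommaSeparated("/path/file,with,comma") 
 yields three entries, none of which is the intended file, while 
 fromPaths("/path/file,with,comma") yields the single path unchanged.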



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings

2014-05-12 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He reassigned MAPREDUCE-5885:
--

Assignee: Chen He

 build/test/test.mapred.spill causes release audit warnings
 --

 Key: MAPREDUCE-5885
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5885
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: trunk
Reporter: Jason Lowe
Assignee: Chen He

 Multiple unit tests are creating files under 
 hadoop-mapreduce-client-jobclient/build/test/test.mapred.spill which are 
 causing release audit warnings during Jenkins patch precommit builds.  In 
 addition to being in a poor location for test output and not cleaning up 
 after the test, there are multiple tests using this location which will cause 
 conflicts if tests are run in parallel.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5883) Total megabyte-seconds in job counters is slightly misleading

2014-05-11 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13994684#comment-13994684
 ] 

Chen He commented on MAPREDUCE-5883:


+1 non-binding.

 Total megabyte-seconds in job counters is slightly misleading
 ---

 Key: MAPREDUCE-5883
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5883
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0, 2.4.0
Reporter: Nathan Roberts
Assignee: Nathan Roberts
Priority: Minor
 Attachments: MAPREDUCE-5883.patch


 The following counters are in milliseconds so megabyte-seconds might be 
 better stated as megabyte-milliseconds
 MB_MILLIS_MAPS.name=   Total megabyte-seconds taken by all map 
 tasks
 MB_MILLIS_REDUCES.name=Total megabyte-seconds taken by all reduce 
 tasks
 VCORES_MILLIS_MAPS.name=   Total vcore-seconds taken by all map tasks
 VCORES_MILLIS_REDUCES.name=Total vcore-seconds taken by all reduce 
 tasks
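
 The size of the mismatch is easy to see with numbers. Assuming each counter 
 is a container's memory in MB multiplied by its run time in milliseconds 
 (an assumption based on the MB_MILLIS names, not something the report 
 spells out), the displayed value is 1000 times larger than true 
 megabyte-seconds. The class and method names below are hypothetical.

```java
/** Hypothetical arithmetic for the memory-time counters. */
final class ResourceCounterSketch {
  /** Memory-time product in megabyte-milliseconds. */
  static long mbMillis(long containerMemoryMb, long durationMillis) {
    return containerMemoryMb * durationMillis;
  }

  /** Converts megabyte-milliseconds to whole megabyte-seconds. */
  static long toMbSeconds(long mbMillis) {
    return mbMillis / 1000L;
  }
}
```

 For a 1024 MB container that runs for 30,000 ms, mbMillis gives 30,720,000 
 and toMbSeconds gives 30,720, so labelling the first number as 
 megabyte-seconds overstates usage by a factor of 1000.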



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-3677) If hadoop.security.authorization is set to true, NM is not starting.

2014-04-25 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981221#comment-13981221
 ] 

Chen He commented on MAPREDUCE-3677:


Since there has been no response for 3 days, I will close this issue. Reopen 
if necessary.

 If hadoop.security.authorization is set to true, NM is not starting.
 --

 Key: MAPREDUCE-3677
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3677
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 0.23.0
Reporter: Ramgopal N
Assignee: Chen He

 I have the Hadoop cluster set up with the root user. I accidentally set 
 hadoop.security.authorization to true. I have not set any permissions in 
 policy.xml. When I try to start the NM as the root user, it throws 
 the following error:
 Exception in thread main java.lang.NoClassDefFoundError: nodemanager
 Caused by: java.lang.ClassNotFoundException: nodemanager
 at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:303)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
 at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:316)
 Could not find the main class: nodemanager.  Program will exit.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-3677) If hadoop.security.authorization is set to true, NM is not starting.

2014-04-25 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He resolved MAPREDUCE-3677.


Resolution: Not a Problem

 If hadoop.security.authorization is set to true, NM is not starting.
 --

 Key: MAPREDUCE-3677
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3677
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 0.23.0
Reporter: Ramgopal N
Assignee: Chen He

 I have the Hadoop cluster set up with the root user. I accidentally set 
 hadoop.security.authorization to true. I have not set any permissions in 
 policy.xml. When I try to start the NM as the root user, it throws 
 the following error:
 Exception in thread main java.lang.NoClassDefFoundError: nodemanager
 Caused by: java.lang.ClassNotFoundException: nodemanager
 at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:303)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
 at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:316)
 Could not find the main class: nodemanager.  Program will exit.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-4718) MapReduce fails If I pass a parameter as a S3 folder

2014-04-22 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-4718:
---

Target Version/s: 1.0.3  (was: 1.0.3, 0.23.3)

 MapReduce fails If I pass a parameter as a S3 folder
 

 Key: MAPREDUCE-4718
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4718
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission
Affects Versions: 1.0.0, 1.0.3
 Environment: Hadoop with default configurations
Reporter: Benjamin Kim

 I'm running a wordcount MR as follows
 hadoop jar WordCount.jar wordcount.WordCountDriver 
 s3n://bucket/wordcount/input s3n://bucket/wordcount/output
  
 s3n://bucket/wordcount/input is an S3 object that contains other input files.
 However, I get the following NPE:
 12/10/02 18:56:23 INFO mapred.JobClient:  map 0% reduce 0%
 12/10/02 18:56:54 INFO mapred.JobClient:  map 50% reduce 0%
 12/10/02 18:56:56 INFO mapred.JobClient: Task Id : 
 attempt_201210021853_0001_m_01_0, Status : FAILED
 java.lang.NullPointerException
 at 
 org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.close(NativeS3FileSystem.java:106)
 at java.io.BufferedInputStream.close(BufferedInputStream.java:451)
 at java.io.FilterInputStream.close(FilterInputStream.java:155)
 at org.apache.hadoop.util.LineReader.close(LineReader.java:83)
 at 
 org.apache.hadoop.mapreduce.lib.input.LineRecordReader.close(LineRecordReader.java:144)
 at 
 org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.close(MapTask.java:497)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:765)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
 at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
 at org.apache.hadoop.mapred.Child.main(Child.java:249)
 MR runs fine if I specify a more specific input path such as 
 s3n://bucket/wordcount/input/file.txt
 MR fails if I pass an S3 folder as a parameter.
 In summary,
 This works
  hadoop jar ./hadoop-examples-1.0.3.jar wordcount 
 /user/hadoop/wordcount/input/ s3n://bucket/wordcount/output/
 This doesn't work
  hadoop jar ./hadoop-examples-1.0.3.jar wordcount 
 s3n://bucket/wordcount/input/ s3n://bucket/wordcount/output/
 (both input paths are directories)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-3476) Optimize YARN API calls

2014-04-22 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He resolved MAPREDUCE-3476.


Resolution: Later

 Optimize YARN API calls
 ---

 Key: MAPREDUCE-3476
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3476
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Ravi Prakash
Assignee: Vinod Kumar Vavilapalli

 Several YARN API calls are taking inordinately long. This might be a 
 performance blocker.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-3476) Optimize YARN API calls

2014-04-22 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977231#comment-13977231
 ] 

Chen He commented on MAPREDUCE-3476:


Closing this issue; please reopen it if necessary. Thank you, [~raviprak] and [~vinodkv].

 Optimize YARN API calls
 ---

 Key: MAPREDUCE-3476
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3476
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Ravi Prakash
Assignee: Vinod Kumar Vavilapalli

 Several YARN API calls are taking inordinately long. This might be a 
 performance blocker.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-4734) The history server should link back to NM logs if aggregation is incomplete / disabled

2014-04-22 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977234#comment-13977234
 ] 

Chen He commented on MAPREDUCE-4734:


Retargeting this to 3.0.

 The history server should link back to NM logs if aggregation is incomplete / 
 disabled
 --

 Key: MAPREDUCE-4734
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4734
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver, mrv2
Affects Versions: 0.23.4
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: MR4734_WIP.txt






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-4734) The history server should link back to NM logs if aggregation is incomplete / disabled

2014-04-22 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-4734:
---

Target Version/s: 3.0.0  (was: 3.0.0, 0.23.11)

 The history server should link back to NM logs if aggregation is incomplete / 
 disabled
 --

 Key: MAPREDUCE-4734
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4734
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver, mrv2
Affects Versions: 0.23.4
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: MR4734_WIP.txt






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-4931) Add user-APIs for classpath precedence control

2014-04-22 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-4931:
---

Target Version/s: 3.0.0  (was: 3.0.0, 0.23.11)

 Add user-APIs for classpath precedence control
 --

 Key: MAPREDUCE-4931
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4931
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: client
Affects Versions: 1.0.0
Reporter: Harsh J
Priority: Minor

 The feature config from MAPREDUCE-1938 of allowing tasks to start with 
 user-classes-first is fairly popular and can use its own API hooks in 
 Job/JobConf classes, making it easier to discover and use it rather than 
 continuing to keep it as an advanced param.
 I propose to add two APIs to Job/JobConf:
 {code}
 void setUserClassesTakesPrecedence(boolean)
 boolean userClassesTakesPrecedence()
 {code}
 Both of which, depending on their branch of commit, set the property 
 {{mapreduce.user.classpath.first}} (1.x) or 
 {{mapreduce.job.user.classpath.first}} (trunk, 2.x and if needed, in 0.23.x).
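
 The two proposed hooks are thin wrappers over a boolean property. The 
 sketch below shows that shape using a plain map in place of the real 
 configuration object; JobConfSketch and its backing map are assumptions for 
 illustration, while the method names and the 2.x property key are taken 
 from the proposal above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for Job/JobConf, backed by a simple property map. */
final class JobConfSketch {
  // Property key used on trunk and 2.x according to the proposal.
  static final String USER_CLASSPATH_FIRST = "mapreduce.job.user.classpath.first";

  private final Map<String, String> props = new HashMap<>();

  /** Stores the flag under the classpath-precedence property. */
  void setUserClassesTakesPrecedence(boolean value) {
    props.put(USER_CLASSPATH_FIRST, Boolean.toString(value));
  }

  /** Reads the flag back; an unset property is treated as false. */
  boolean userClassesTakesPrecedence() {
    return Boolean.parseBoolean(props.get(USER_CLASSPATH_FIRST));
  }

  /** Raw property lookup, returning null when the key is unset. */
  String get(String key) {
    return props.get(key);
  }
}
```

 A caller would then write conf.setUserClassesTakesPrecedence(true) instead 
 of setting the property string by hand, which is the discoverability gain 
 the proposal is after.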



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-4931) Add user-APIs for classpath precedence control

2014-04-22 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977244#comment-13977244
 ] 

Chen He commented on MAPREDUCE-4931:


Retargeting this to 3.0.

 Add user-APIs for classpath precedence control
 --

 Key: MAPREDUCE-4931
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4931
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: client
Affects Versions: 1.0.0
Reporter: Harsh J
Priority: Minor

 The feature config from MAPREDUCE-1938 of allowing tasks to start with 
 user-classes-first is fairly popular and can use its own API hooks in 
 Job/JobConf classes, making it easier to discover and use it rather than 
 continuing to keep it as an advanced param.
 I propose to add two APIs to Job/JobConf:
 {code}
 void setUserClassesTakesPrecedence(boolean)
 boolean userClassesTakesPrecedence()
 {code}
 Both of which, depending on their branch of commit, set the property 
 {{mapreduce.user.classpath.first}} (1.x) or 
 {{mapreduce.job.user.classpath.first}} (trunk, 2.x and if needed, in 0.23.x).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-4718) MapReduce fails If I pass a parameter as a S3 folder

2014-04-21 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975567#comment-13975567
 ] 

Chen He commented on MAPREDUCE-4718:


Hi [~benkimkimben],
Thank you for the reply. Since it is not a problem for 2.x, would you mind 
removing 2.x from the target version? 

 MapReduce fails If I pass a parameter as a S3 folder
 

 Key: MAPREDUCE-4718
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4718
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission
Affects Versions: 1.0.0, 1.0.3
 Environment: Hadoop with default configurations
Reporter: Benjamin Kim

 I'm running a wordcount MR as follows
 hadoop jar WordCount.jar wordcount.WordCountDriver 
 s3n://bucket/wordcount/input s3n://bucket/wordcount/output
  
 s3n://bucket/wordcount/input is an S3 object that contains other input files.
 However, I get the following NPE:
 12/10/02 18:56:23 INFO mapred.JobClient:  map 0% reduce 0%
 12/10/02 18:56:54 INFO mapred.JobClient:  map 50% reduce 0%
 12/10/02 18:56:56 INFO mapred.JobClient: Task Id : 
 attempt_201210021853_0001_m_01_0, Status : FAILED
 java.lang.NullPointerException
 at 
 org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.close(NativeS3FileSystem.java:106)
 at java.io.BufferedInputStream.close(BufferedInputStream.java:451)
 at java.io.FilterInputStream.close(FilterInputStream.java:155)
 at org.apache.hadoop.util.LineReader.close(LineReader.java:83)
 at 
 org.apache.hadoop.mapreduce.lib.input.LineRecordReader.close(LineRecordReader.java:144)
 at 
 org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.close(MapTask.java:497)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:765)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
 at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
 at org.apache.hadoop.mapred.Child.main(Child.java:249)
 MR runs fine if I specify a more specific input path such as 
 s3n://bucket/wordcount/input/file.txt
 MR fails if I pass an S3 folder as a parameter.
 In summary,
 This works
  hadoop jar ./hadoop-examples-1.0.3.jar wordcount 
 /user/hadoop/wordcount/input/ s3n://bucket/wordcount/output/
 This doesn't work
  hadoop jar ./hadoop-examples-1.0.3.jar wordcount 
 s3n://bucket/wordcount/input/ s3n://bucket/wordcount/output/
 (both input paths are directories)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-4718) MapReduce fails If I pass a parameter as a S3 folder

2014-04-21 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975570#comment-13975570
 ] 

Chen He commented on MAPREDUCE-4718:


Or close it if it is not a problem for 1.x either. 

 MapReduce fails If I pass a parameter as a S3 folder
 

 Key: MAPREDUCE-4718
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4718
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission
Affects Versions: 1.0.0, 1.0.3
 Environment: Hadoop with default configurations
Reporter: Benjamin Kim

 I'm running a wordcount MR as follows
 hadoop jar WordCount.jar wordcount.WordCountDriver 
 s3n://bucket/wordcount/input s3n://bucket/wordcount/output
  
 s3n://bucket/wordcount/input is an S3 object that contains other input files.
 However, I get the following NPE:
 12/10/02 18:56:23 INFO mapred.JobClient:  map 0% reduce 0%
 12/10/02 18:56:54 INFO mapred.JobClient:  map 50% reduce 0%
 12/10/02 18:56:56 INFO mapred.JobClient: Task Id : 
 attempt_201210021853_0001_m_01_0, Status : FAILED
 java.lang.NullPointerException
 at 
 org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.close(NativeS3FileSystem.java:106)
 at java.io.BufferedInputStream.close(BufferedInputStream.java:451)
 at java.io.FilterInputStream.close(FilterInputStream.java:155)
 at org.apache.hadoop.util.LineReader.close(LineReader.java:83)
 at 
 org.apache.hadoop.mapreduce.lib.input.LineRecordReader.close(LineRecordReader.java:144)
 at 
 org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.close(MapTask.java:497)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:765)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
 at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
 at org.apache.hadoop.mapred.Child.main(Child.java:249)
 MR runs fine if I specify a more specific input path such as 
 s3n://bucket/wordcount/input/file.txt
 MR fails if I pass an S3 folder as a parameter.
 In summary,
 This works
  hadoop jar ./hadoop-examples-1.0.3.jar wordcount 
 /user/hadoop/wordcount/input/ s3n://bucket/wordcount/output/
 This doesn't work
  hadoop jar ./hadoop-examples-1.0.3.jar wordcount 
 s3n://bucket/wordcount/input/ s3n://bucket/wordcount/output/
 (both input paths are directories)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (MAPREDUCE-3677) If hadoop.security.authorization is set to true, NM is not starting.

2014-04-21 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He reassigned MAPREDUCE-3677:
--

Assignee: Chen He

 If hadoop.security.authorization is set to true, NM is not starting.
 --

 Key: MAPREDUCE-3677
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3677
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 0.23.0
Reporter: Ramgopal N
Assignee: Chen He

 I have a Hadoop cluster set up with the root user. I accidentally set 
 hadoop.security.authorization to true, and I have not set any permissions in 
 policy.xml. When I try to start the NM as root, it throws 
 the following error:
 Exception in thread "main" java.lang.NoClassDefFoundError: nodemanager
 Caused by: java.lang.ClassNotFoundException: nodemanager
 at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:303)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
 at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:316)
 Could not find the main class: nodemanager.  Program will exit.





[jira] [Resolved] (MAPREDUCE-4339) pi example job hangs on when run on hadoop 0.23.0 when capacity scheduler is included in the setting environment.

2014-04-21 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He resolved MAPREDUCE-4339.


Resolution: Cannot Reproduce

 pi example job hangs on when run on hadoop 0.23.0 when capacity scheduler is 
 included in the setting environment.
 -

 Key: MAPREDUCE-4339
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4339
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: examples, job submission, mrv2, scheduler
Affects Versions: 0.23.0
 Environment: Ubuntu Server 11.04, Hadoop 0.23.0, 
Reporter: srikanth ayalasomayajulu
  Labels: hadoop
 Fix For: 0.23.0

   Original Estimate: 48h
  Remaining Estimate: 48h

 Tried to include default capacity scheduler in hadoop and tried to run an 
 example pi program. The job hangs and no more output is getting displayed.
 Starting Job
 2012-06-12 22:10:02,524 INFO  ipc.YarnRPC (YarnRPC.java:create(47)) - 
 Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
 2012-06-12 22:10:02,538 INFO  mapred.ResourceMgrDelegate 
 (ResourceMgrDelegate.java:init(95)) - Connecting to ResourceManager at 
 localhost/127.0.0.1:8030
 2012-06-12 22:10:02,539 INFO  ipc.HadoopYarnRPC 
 (HadoopYarnProtoRPC.java:getProxy(48)) - Creating a HadoopYarnProtoRpc proxy 
 for protocol interface org.apache.hadoop.yarn.api.ClientRMProtocol
 2012-06-12 22:10:02,665 INFO  mapred.ResourceMgrDelegate 
 (ResourceMgrDelegate.java:init(99)) - Connected to ResourceManager at 
 localhost/127.0.0.1:8030
 2012-06-12 22:10:02,727 WARN  conf.Configuration 
 (Configuration.java:handleDeprecation(326)) - fs.default.name is deprecated. 
 Instead, use fs.defaultFS
 2012-06-12 22:10:02,728 WARN  conf.Configuration 
 (Configuration.java:handleDeprecation(343)) - 
 mapred.used.genericoptionsparser is deprecated. Instead, use 
 mapreduce.client.genericoptionsparser.used
 2012-06-12 22:10:02,831 INFO  input.FileInputFormat 
 (FileInputFormat.java:listStatus(245)) - Total input paths to process : 10
 2012-06-12 22:10:02,900 INFO  mapreduce.JobSubmitter 
 (JobSubmitter.java:submitJobInternal(362)) - number of splits:10
 2012-06-12 22:10:03,044 INFO  mapred.YARNRunner 
 (YARNRunner.java:createApplicationSubmissionContext(279)) - AppMaster 
 capability = memory: 2048
 2012-06-12 22:10:03,286 INFO  mapred.YARNRunner 
 (YARNRunner.java:createApplicationSubmissionContext(355)) - Command to launch 
 container for ApplicationMaster is : $JAVA_HOME/bin/java 
 -Dlog4j.configuration=container-log4j.properties 
 -Dyarn.app.mapreduce.container.log.dir=<LOG_DIR> 
 -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
 -Xmx1536m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 
 2><LOG_DIR>/stderr 
 2012-06-12 22:10:03,370 INFO  mapred.ResourceMgrDelegate 
 (ResourceMgrDelegate.java:submitApplication(304)) - Submitted application 
 application_1339507608976_0002 to ResourceManager
 2012-06-12 22:10:03,432 INFO  mapreduce.Job 
 (Job.java:monitorAndPrintJob(1207)) - Running job: job_1339507608976_0002
 2012-06-12 22:10:04,443 INFO  mapreduce.Job 
 (Job.java:monitorAndPrintJob(1227)) -  map 0% reduce 0%





[jira] [Commented] (MAPREDUCE-4339) pi example job hangs on when run on hadoop 0.23.0 when capacity scheduler is included in the setting environment.

2014-04-21 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13975646#comment-13975646
 ] 

Chen He commented on MAPREDUCE-4339:


I will close this issue since it cannot be reproduced. Reopen it if necessary. 

 pi example job hangs on when run on hadoop 0.23.0 when capacity scheduler is 
 included in the setting environment.
 -

 Key: MAPREDUCE-4339
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4339
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: examples, job submission, mrv2, scheduler
Affects Versions: 0.23.0
 Environment: Ubuntu Server 11.04, Hadoop 0.23.0, 
Reporter: srikanth ayalasomayajulu
  Labels: hadoop
 Fix For: 0.23.0

   Original Estimate: 48h
  Remaining Estimate: 48h

 Tried to include default capacity scheduler in hadoop and tried to run an 
 example pi program. The job hangs and no more output is getting displayed.
 Starting Job
 2012-06-12 22:10:02,524 INFO  ipc.YarnRPC (YarnRPC.java:create(47)) - 
 Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
 2012-06-12 22:10:02,538 INFO  mapred.ResourceMgrDelegate 
 (ResourceMgrDelegate.java:init(95)) - Connecting to ResourceManager at 
 localhost/127.0.0.1:8030
 2012-06-12 22:10:02,539 INFO  ipc.HadoopYarnRPC 
 (HadoopYarnProtoRPC.java:getProxy(48)) - Creating a HadoopYarnProtoRpc proxy 
 for protocol interface org.apache.hadoop.yarn.api.ClientRMProtocol
 2012-06-12 22:10:02,665 INFO  mapred.ResourceMgrDelegate 
 (ResourceMgrDelegate.java:init(99)) - Connected to ResourceManager at 
 localhost/127.0.0.1:8030
 2012-06-12 22:10:02,727 WARN  conf.Configuration 
 (Configuration.java:handleDeprecation(326)) - fs.default.name is deprecated. 
 Instead, use fs.defaultFS
 2012-06-12 22:10:02,728 WARN  conf.Configuration 
 (Configuration.java:handleDeprecation(343)) - 
 mapred.used.genericoptionsparser is deprecated. Instead, use 
 mapreduce.client.genericoptionsparser.used
 2012-06-12 22:10:02,831 INFO  input.FileInputFormat 
 (FileInputFormat.java:listStatus(245)) - Total input paths to process : 10
 2012-06-12 22:10:02,900 INFO  mapreduce.JobSubmitter 
 (JobSubmitter.java:submitJobInternal(362)) - number of splits:10
 2012-06-12 22:10:03,044 INFO  mapred.YARNRunner 
 (YARNRunner.java:createApplicationSubmissionContext(279)) - AppMaster 
 capability = memory: 2048
 2012-06-12 22:10:03,286 INFO  mapred.YARNRunner 
 (YARNRunner.java:createApplicationSubmissionContext(355)) - Command to launch 
 container for ApplicationMaster is : $JAVA_HOME/bin/java 
 -Dlog4j.configuration=container-log4j.properties 
 -Dyarn.app.mapreduce.container.log.dir=<LOG_DIR> 
 -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
 -Xmx1536m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 
 2><LOG_DIR>/stderr 
 2012-06-12 22:10:03,370 INFO  mapred.ResourceMgrDelegate 
 (ResourceMgrDelegate.java:submitApplication(304)) - Submitted application 
 application_1339507608976_0002 to ResourceManager
 2012-06-12 22:10:03,432 INFO  mapreduce.Job 
 (Job.java:monitorAndPrintJob(1207)) - Running job: job_1339507608976_0002
 2012-06-12 22:10:04,443 INFO  mapreduce.Job 
 (Job.java:monitorAndPrintJob(1227)) -  map 0% reduce 0%





[jira] [Commented] (MAPREDUCE-3677) If hadoop.security.authorization is set to true, NM is not starting.

2014-04-21 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976262#comment-13976262
 ] 

Chen He commented on MAPREDUCE-3677:


hadoop.security.authorization is for securing RPC access. I am not sure 
why you need to start the nodemanager as root, but I investigated on 
Hadoop 0.23.10.
I changed hadoop/bin/yarn slightly. I am using Java 1.7.0_45, and it reports 
an illegal argument, -jvm. I commented out the following lines in 
hadoop/bin/yarn, leaving only the -server assignment active:

  #if [[ $EUID -eq 0 ]]; then
  #  YARN_OPTS="$YARN_OPTS -jvm server $YARN_NODEMANAGER_OPTS"
  #else
    YARN_OPTS="$YARN_OPTS -server $YARN_NODEMANAGER_OPTS"
  #fi

With this change, it works fine. Feel free to comment. If there is no 
response, I will close this JIRA in 3 days.

 If hadoop.security.authorization is set to true, NM is not starting.
 --

 Key: MAPREDUCE-3677
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3677
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 0.23.0
Reporter: Ramgopal N
Assignee: Chen He

 I have a Hadoop cluster set up with the root user. I accidentally set 
 hadoop.security.authorization to true, and I have not set any permissions in 
 policy.xml. When I try to start the NM as root, it throws 
 the following error:
 Exception in thread "main" java.lang.NoClassDefFoundError: nodemanager
 Caused by: java.lang.ClassNotFoundException: nodemanager
 at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:303)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
 at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:316)
 Could not find the main class: nodemanager.  Program will exit.





[jira] [Commented] (MAPREDUCE-4931) Add user-APIs for classpath precedence control

2014-04-17 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973197#comment-13973197
 ] 

Chen He commented on MAPREDUCE-4931:


Hi [~qwertymaniac]
Is this JIRA still an issue for 2.x? If so, could you retarget it to 2.x?

 Add user-APIs for classpath precedence control
 --

 Key: MAPREDUCE-4931
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4931
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: client
Affects Versions: 1.0.0
Reporter: Harsh J
Priority: Minor

 The feature config from MAPREDUCE-1938 of allowing tasks to start with 
 user-classes-first is fairly popular and can use its own API hooks in 
 Job/JobConf classes, making it easier to discover and use it rather than 
 continuing to keep it as an advanced param.
 I propose to add two APIs to Job/JobConf:
 {code}
 void setUserClassesTakesPrecedence(boolean)
 boolean userClassesTakesPrecedence()
 {code}
 Both of which, depending on their branch of commit, set the property 
 {{mapreduce.user.classpath.first}} (1.x) or 
 {{mapreduce.job.user.classpath.first}} (trunk, 2.x and if needed, in 0.23.x).
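
 The two proposed methods can be sketched as below. This is only an illustration under assumptions: `UserClasspathSketch` is a hypothetical stand-in for Job/JobConf backed by a plain map, not the real Hadoop class, and the default of `false` when the property is unset is assumed rather than taken from the issue. The property name is the trunk/2.x one quoted above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for Job/JobConf; not the real Hadoop class. */
final class UserClasspathSketch {
    // Property name for trunk/2.x as quoted in the issue.
    // On 1.x the issue names "mapreduce.user.classpath.first" instead.
    static final String USER_CLASSPATH_FIRST = "mapreduce.job.user.classpath.first";

    // Plain map standing in for the job configuration (assumption).
    private final Map<String, String> props = new HashMap<>();

    /** Proposed setter: stores the flag under the advanced property. */
    void setUserClassesTakesPrecedence(boolean value) {
        props.put(USER_CLASSPATH_FIRST, Boolean.toString(value));
    }

    /** Proposed getter: reads the flag back; unset is treated as false (assumption). */
    boolean userClassesTakesPrecedence() {
        return Boolean.parseBoolean(props.getOrDefault(USER_CLASSPATH_FIRST, "false"));
    }

    public static void main(String[] args) {
        UserClasspathSketch conf = new UserClasspathSketch();
        System.out.println("before: " + conf.userClassesTakesPrecedence());
        conf.setUserClassesTakesPrecedence(true);
        System.out.println("after:  " + conf.userClassesTakesPrecedence());
    }
}
```

 The point of the design is discoverability: callers use a named method instead of having to know the property string.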





[jira] [Commented] (MAPREDUCE-3089) Augment TestRMContainerAllocator to verify MAPREDUCE-2646

2014-04-17 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973208#comment-13973208
 ] 

Chen He commented on MAPREDUCE-3089:


Hi [~acmurthy]
Since both MAPREDUCE-3078 and MAPREDUCE-2646 are resolved, is this JIRA 
still an issue in 2.x? If so, could you retarget it to 2.x?

 Augment TestRMContainerAllocator to verify MAPREDUCE-2646
 -

 Key: MAPREDUCE-3089
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3089
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mrv2
Affects Versions: 0.23.0
Reporter: Arun C Murthy
Assignee: Vinod Kumar Vavilapalli
 Fix For: 0.24.0








[jira] [Commented] (MAPREDUCE-4718) MapReduce fails If I pass a parameter as a S3 folder

2014-04-17 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973223#comment-13973223
 ] 

Chen He commented on MAPREDUCE-4718:


Hi [~benkimkimben]
This JIRA has had no updates since 11/Oct/12. Is it still a problem? We are 
cleaning up 0.23 JIRAs right now. If it is still a problem in 2.x, please 
retarget it to 2.x. Thanks!

 MapReduce fails If I pass a parameter as a S3 folder
 

 Key: MAPREDUCE-4718
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4718
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission
Affects Versions: 1.0.0, 1.0.3
 Environment: Hadoop with default configurations
Reporter: Benjamin Kim

 I'm running a wordcount MR as follows
 hadoop jar WordCount.jar wordcount.WordCountDriver 
 s3n://bucket/wordcount/input s3n://bucket/wordcount/output
  
 s3n://bucket/wordcount/input is an S3 object that contains other input files.
 However, I get the following NPE:
 12/10/02 18:56:23 INFO mapred.JobClient:  map 0% reduce 0%
 12/10/02 18:56:54 INFO mapred.JobClient:  map 50% reduce 0%
 12/10/02 18:56:56 INFO mapred.JobClient: Task Id : 
 attempt_201210021853_0001_m_01_0, Status : FAILED
 java.lang.NullPointerException
 at 
 org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.close(NativeS3FileSystem.java:106)
 at java.io.BufferedInputStream.close(BufferedInputStream.java:451)
 at java.io.FilterInputStream.close(FilterInputStream.java:155)
 at org.apache.hadoop.util.LineReader.close(LineReader.java:83)
 at 
 org.apache.hadoop.mapreduce.lib.input.LineRecordReader.close(LineRecordReader.java:144)
 at 
 org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.close(MapTask.java:497)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:765)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
 at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
 at org.apache.hadoop.mapred.Child.main(Child.java:249)
 MR runs fine if I specify a more specific input path, such as 
 s3n://bucket/wordcount/input/file.txt
 MR fails if I pass an S3 folder as a parameter
 In summary,
 This works
  hadoop jar ./hadoop-examples-1.0.3.jar wordcount 
 /user/hadoop/wordcount/input/ s3n://bucket/wordcount/output/
 This doesn't work
  hadoop jar ./hadoop-examples-1.0.3.jar wordcount 
 s3n://bucket/wordcount/input/ s3n://bucket/wordcount/output/
 (both input paths are directories)





[jira] [Commented] (MAPREDUCE-4711) Append time elapsed since job-start-time for finished tasks

2014-04-16 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13971540#comment-13971540
 ] 

Chen He commented on MAPREDUCE-4711:


Hi [~raviprak]
Thank you for the patch. Would you mind updating it so that it can be 
applied to trunk? Also, would you mind if we retarget this JIRA to 2.5?

 Append time elapsed since job-start-time for finished tasks
 ---

 Key: MAPREDUCE-4711
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4711
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 0.23.3
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Attachments: MAPREDUCE-4711.branch-0.23.patch


 In 0.20.x/1.x, the analyze job link gave this information
 bq. The last Map task task_sometask finished at (relative to the Job launch 
 time): 5/10 20:23:10 (1hrs, 27mins, 54sec)
 The time it took for the last task to finish needs to be calculated mentally 
 in 0.23. I believe we should print it next to the finish time.
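
 The elapsed-time string requested here could be computed along these lines. This is a minimal sketch, not the attached patch: the class and method names are hypothetical, timestamps are assumed to be epoch milliseconds, and the output only mimics the 0.20.x/1.x format quoted above. Negative intervals are not handled.

```java
import java.time.Duration;

/** Hypothetical helper; names are illustrative, not from the patch. */
final class ElapsedSinceJobStart {
    /**
     * Formats the time between job start and task finish in the
     * "Nhrs, Nmins, Nsec" style quoted in the issue.
     * Both arguments are assumed to be epoch milliseconds.
     */
    static String format(long jobStartMillis, long taskFinishMillis) {
        Duration elapsed = Duration.ofMillis(taskFinishMillis - jobStartMillis);
        long hours = elapsed.toHours();
        long mins = elapsed.toMinutes() % 60;    // minutes left after whole hours
        long secs = elapsed.getSeconds() % 60;   // seconds left after whole minutes
        return hours + "hrs, " + mins + "mins, " + secs + "sec";
    }

    public static void main(String[] args) {
        long jobStart = 0L;
        // 1 hour, 27 minutes and 54 seconds after the job started
        long taskFinish = (1 * 3600 + 27 * 60 + 54) * 1000L;
        System.out.println("(" + format(jobStart, taskFinish) + ")");
    }
}
```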





[jira] [Commented] (MAPREDUCE-3406) Add node information to bin/mapred job -list-attempt-ids and other improvements

2014-04-16 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13971543#comment-13971543
 ] 

Chen He commented on MAPREDUCE-3406:


Hi [~raviprak], as you commented, this is a duplicate JIRA and it has already 
been fixed. I will close this one. 

 Add node information to bin/mapred job -list-attempt-ids and other 
 improvements
 ---

 Key: MAPREDUCE-3406
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3406
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Fix For: 0.24.0


 From [~rramya]
 Providing the NM information where the containers are scheduled in bin/mapred 
 job -list-attempt-ids will be helpful in automation, debugging and to avoid 
 grepping through the AM logs.
 From my own observation, the list-attempt-ids should list the attempt ids and 
 not require the arguments. The arguments if given, can be used to filter the 
 results. From the usage:
 bq. [-list-attempt-ids job-id task-type task-state]. Valid values for 
 task-type are MAP REDUCE JOB_SETUP JOB_CLEANUP TASK_CLEANUP. Valid values 
 for task-state are running, completed
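
 The suggested behaviour (list every attempt with its node when no arguments are given, and treat task-type and task-state as optional filters) can be sketched as follows. Everything here is hypothetical: the `Attempt` class, its fields, the attempt ids and the node names are made up for illustration and do not reflect the actual bin/mapred implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only; not the real CLI code. */
final class AttemptListingSketch {
    /** Hypothetical attempt record; field names are assumptions. */
    static final class Attempt {
        final String id;
        final String taskType;
        final String state;
        final String node;

        Attempt(String id, String taskType, String state, String node) {
            this.id = id;
            this.taskType = taskType;
            this.state = state;
            this.node = node;
        }
    }

    /**
     * Returns "attemptId<TAB>node" lines. A null taskType or state
     * means "no filter", so calling with (null, null) lists everything.
     */
    static List<String> list(List<Attempt> attempts, String taskType, String state) {
        List<String> out = new ArrayList<>();
        for (Attempt a : attempts) {
            boolean typeOk = taskType == null || a.taskType.equalsIgnoreCase(taskType);
            boolean stateOk = state == null || a.state.equalsIgnoreCase(state);
            if (typeOk && stateOk) {
                out.add(a.id + "\t" + a.node);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up ids and node names, for demonstration only.
        List<Attempt> attempts = Arrays.asList(
            new Attempt("attempt_a", "MAP", "completed", "nm-host-1"),
            new Attempt("attempt_b", "REDUCE", "running", "nm-host-2"));
        System.out.println(list(attempts, null, null));   // no filters
        System.out.println(list(attempts, "MAP", null));  // filter by type only
    }
}
```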





[jira] [Resolved] (MAPREDUCE-3406) Add node information to bin/mapred job -list-attempt-ids and other improvements

2014-04-16 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He resolved MAPREDUCE-3406.


  Resolution: Duplicate
Target Version/s: 2.0.0-alpha, 0.23.3, 3.0.0  (was: 0.23.3, 2.0.0-alpha, 
3.0.0)

 Add node information to bin/mapred job -list-attempt-ids and other 
 improvements
 ---

 Key: MAPREDUCE-3406
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3406
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Fix For: 0.24.0


 From [~rramya]
 Providing the NM information where the containers are scheduled in bin/mapred 
 job -list-attempt-ids will be helpful in automation, debugging and to avoid 
 grepping through the AM logs.
 From my own observation, the list-attempt-ids should list the attempt ids and 
 not require the arguments. The arguments if given, can be used to filter the 
 results. From the usage:
 bq. [-list-attempt-ids job-id task-type task-state]. Valid values for 
 task-type are MAP REDUCE JOB_SETUP JOB_CLEANUP TASK_CLEANUP. Valid values 
 for task-state are running, completed





[jira] [Commented] (MAPREDUCE-3476) Optimize YARN API calls

2014-04-16 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13971548#comment-13971548
 ] 

Chen He commented on MAPREDUCE-3476:


Hi [~vinodkv]
Is this still an issue in 2.x? If so, could you retarget it to 2.5?
If not, would you mind closing it?

 Optimize YARN API calls
 ---

 Key: MAPREDUCE-3476
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3476
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Ravi Prakash
Assignee: Vinod Kumar Vavilapalli

 Several YARN API calls are taking inordinately long. This might be a 
 performance blocker.





[jira] [Updated] (MAPREDUCE-3191) docs for map output compression incorrectly reference SequenceFile

2014-04-16 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-3191:
---

Attachment: MAPREDUCE-3191-v2.patch

 docs for map output compression incorrectly reference SequenceFile
 --

 Key: MAPREDUCE-3191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Priority: Trivial
  Labels: noob
 Attachments: MAPREDUCE-3191-v2.patch, MAPREDUCE-3191.patch


 The documentation currently says that map output compression uses 
 SequenceFile compression. This hasn't been true in several years, since we 
 use IFile for intermediate data now.





[jira] [Commented] (MAPREDUCE-3191) docs for map output compression incorrectly reference SequenceFile

2014-04-16 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13971554#comment-13971554
 ] 

Chen He commented on MAPREDUCE-3191:


Hi [~qwertymaniac]
Thank you for the comment. I have updated the patch. 

 docs for map output compression incorrectly reference SequenceFile
 --

 Key: MAPREDUCE-3191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Priority: Trivial
  Labels: noob
 Attachments: MAPREDUCE-3191-v2.patch, MAPREDUCE-3191.patch


 The documentation currently says that map output compression uses 
 SequenceFile compression. This hasn't been true in several years, since we 
 use IFile for intermediate data now.





[jira] [Commented] (MAPREDUCE-3191) docs for map output compression incorrectly reference SequenceFile

2014-04-16 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13971559#comment-13971559
 ] 

Chen He commented on MAPREDUCE-3191:


Hi [~tlipcon]
Would you mind retargeting this issue to 2.5?

 docs for map output compression incorrectly reference SequenceFile
 --

 Key: MAPREDUCE-3191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Priority: Trivial
  Labels: documentation, noob
 Attachments: MAPREDUCE-3191-v2.patch, MAPREDUCE-3191.patch


 The documentation currently says that map output compression uses 
 SequenceFile compression. This hasn't been true in several years, since we 
 use IFile for intermediate data now.





[jira] [Updated] (MAPREDUCE-3191) docs for map output compression incorrectly reference SequenceFile

2014-04-16 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-3191:
---

Labels: documentation noob  (was: noob)

 docs for map output compression incorrectly reference SequenceFile
 --

 Key: MAPREDUCE-3191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Priority: Trivial
  Labels: documentation, noob
 Attachments: MAPREDUCE-3191-v2.patch, MAPREDUCE-3191.patch


 The documentation currently says that map output compression uses 
 SequenceFile compression. This hasn't been true in several years, since we 
 use IFile for intermediate data now.





[jira] [Assigned] (MAPREDUCE-3191) docs for map output compression incorrectly reference SequenceFile

2014-04-16 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He reassigned MAPREDUCE-3191:
--

Assignee: Chen He

 docs for map output compression incorrectly reference SequenceFile
 --

 Key: MAPREDUCE-3191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Assignee: Chen He
Priority: Trivial
  Labels: documentation, noob
 Attachments: MAPREDUCE-3191-v2.patch, MAPREDUCE-3191.patch


 The documentation currently says that map output compression uses 
 SequenceFile compression. This hasn't been true in several years, since we 
 use IFile for intermediate data now.





[jira] [Commented] (MAPREDUCE-4339) pi example job hangs on when run on hadoop 0.23.0 when capacity scheduler is included in the setting environment.

2014-04-16 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13971571#comment-13971571
 ] 

Chen He commented on MAPREDUCE-4339:


Hi [~srikraj8341] 
Is this still an issue for 2.x? If not, would you mind if we close it?

 pi example job hangs on when run on hadoop 0.23.0 when capacity scheduler is 
 included in the setting environment.
 -

 Key: MAPREDUCE-4339
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4339
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: examples, job submission, mrv2, scheduler
Affects Versions: 0.23.0
 Environment: Ubuntu Server 11.04, Hadoop 0.23.0, 
Reporter: srikanth ayalasomayajulu
  Labels: hadoop
 Fix For: 0.23.0

   Original Estimate: 48h
  Remaining Estimate: 48h

 Tried to include default capacity scheduler in hadoop and tried to run an 
 example pi program. The job hangs and no more output is getting displayed.
 Starting Job
 2012-06-12 22:10:02,524 INFO  ipc.YarnRPC (YarnRPC.java:create(47)) - 
 Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
 2012-06-12 22:10:02,538 INFO  mapred.ResourceMgrDelegate 
 (ResourceMgrDelegate.java:init(95)) - Connecting to ResourceManager at 
 localhost/127.0.0.1:8030
 2012-06-12 22:10:02,539 INFO  ipc.HadoopYarnRPC 
 (HadoopYarnProtoRPC.java:getProxy(48)) - Creating a HadoopYarnProtoRpc proxy 
 for protocol interface org.apache.hadoop.yarn.api.ClientRMProtocol
 2012-06-12 22:10:02,665 INFO  mapred.ResourceMgrDelegate 
 (ResourceMgrDelegate.java:init(99)) - Connected to ResourceManager at 
 localhost/127.0.0.1:8030
 2012-06-12 22:10:02,727 WARN  conf.Configuration 
 (Configuration.java:handleDeprecation(326)) - fs.default.name is deprecated. 
 Instead, use fs.defaultFS
 2012-06-12 22:10:02,728 WARN  conf.Configuration 
 (Configuration.java:handleDeprecation(343)) - 
 mapred.used.genericoptionsparser is deprecated. Instead, use 
 mapreduce.client.genericoptionsparser.used
 2012-06-12 22:10:02,831 INFO  input.FileInputFormat 
 (FileInputFormat.java:listStatus(245)) - Total input paths to process : 10
 2012-06-12 22:10:02,900 INFO  mapreduce.JobSubmitter 
 (JobSubmitter.java:submitJobInternal(362)) - number of splits:10
 2012-06-12 22:10:03,044 INFO  mapred.YARNRunner 
 (YARNRunner.java:createApplicationSubmissionContext(279)) - AppMaster 
 capability = memory: 2048
 2012-06-12 22:10:03,286 INFO  mapred.YARNRunner 
 (YARNRunner.java:createApplicationSubmissionContext(355)) - Command to launch 
 container for ApplicationMaster is : $JAVA_HOME/bin/java 
 -Dlog4j.configuration=container-log4j.properties 
 -Dyarn.app.mapreduce.container.log.dir=<LOG_DIR> 
 -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
 -Xmx1536m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 
 2><LOG_DIR>/stderr 
 2012-06-12 22:10:03,370 INFO  mapred.ResourceMgrDelegate 
 (ResourceMgrDelegate.java:submitApplication(304)) - Submitted application 
 application_1339507608976_0002 to ResourceManager
 2012-06-12 22:10:03,432 INFO  mapreduce.Job 
 (Job.java:monitorAndPrintJob(1207)) - Running job: job_1339507608976_0002
 2012-06-12 22:10:04,443 INFO  mapreduce.Job 
 (Job.java:monitorAndPrintJob(1227)) -  map 0% reduce 0%





[jira] [Commented] (MAPREDUCE-4734) The history server should link back to NM logs if aggregation is incomplete / disabled

2014-04-16 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13971573#comment-13971573
 ] 

Chen He commented on MAPREDUCE-4734:


Hi [~sseth]
Thank you for working on this. Would you mind if we retarget it to 2.x?

 The history server should link back to NM logs if aggregation is incomplete / 
 disabled
 --

 Key: MAPREDUCE-4734
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4734
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver, mrv2
Affects Versions: 0.23.4
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: MR4734_WIP.txt






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-3191) docs for map output compression incorrectly reference SequenceFile

2014-04-16 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13971630#comment-13971630
 ] 

Chen He commented on MAPREDUCE-3191:


Thank you for the reminder, [~jeagles]. I checked [~phatak.dev]'s 
activity; the latest was in July 2012. Next time, I will ask the patch owner 
and wait 3 days before taking over an issue.

 docs for map output compression incorrectly reference SequenceFile
 --

 Key: MAPREDUCE-3191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Assignee: Chen He
Priority: Trivial
  Labels: documentation, noob
 Attachments: MAPREDUCE-3191-v2.patch, MAPREDUCE-3191.patch


 The documentation currently says that map output compression uses 
 SequenceFile compression. This hasn't been true in several years, since we 
 use IFile for intermediate data now.





[jira] [Updated] (MAPREDUCE-3191) docs for map output compression incorrectly reference SequenceFile

2014-04-16 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-3191:
---

Target Version/s: 0.23.0, 2.5.0  (was: 0.23.0)

 docs for map output compression incorrectly reference SequenceFile
 --

 Key: MAPREDUCE-3191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Assignee: Chen He
Priority: Trivial
  Labels: documentation, noob
 Attachments: MAPREDUCE-3191-v2.patch, MAPREDUCE-3191.patch


 The documentation currently says that map output compression uses 
 SequenceFile compression. This hasn't been true in several years, since we 
 use IFile for intermediate data now.





[jira] [Updated] (MAPREDUCE-3191) docs for map output compression incorrectly reference SequenceFile

2014-04-16 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-3191:
---

Target Version/s: 0.23.0  (was: 0.23.0, 2.5.0)

 docs for map output compression incorrectly reference SequenceFile
 --

 Key: MAPREDUCE-3191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Assignee: Chen He
Priority: Trivial
  Labels: documentation, noob
 Attachments: MAPREDUCE-3191-v2.patch, MAPREDUCE-3191.patch


 The documentation currently says that map output compression uses 
 SequenceFile compression. This hasn't been true in several years, since we 
 use IFile for intermediate data now.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-3191) docs for map output compression incorrectly reference SequenceFile

2014-04-16 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971850#comment-13971850
 ] 

Chen He commented on MAPREDUCE-3191:


Hi [~phatak.dev]
Feel free to take it at any time. 

 docs for map output compression incorrectly reference SequenceFile
 --

 Key: MAPREDUCE-3191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Assignee: Chen He
Priority: Trivial
  Labels: documentation, noob
 Fix For: 3.0.0, 0.23.11, 2.5.0, 2.4.1

 Attachments: MAPREDUCE-3191-v2.patch, MAPREDUCE-3191.patch


 The documentation currently says that map output compression uses 
 SequenceFile compression. This hasn't been true in several years, since we 
 use IFile for intermediate data now.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-4711) Append time elapsed since job-start-time for finished tasks

2014-04-16 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13972246#comment-13972246
 ] 

Chen He commented on MAPREDUCE-4711:


Hi [~raviprak]
Thank you for the reply. Right now, it is time to clean up 0.23 JIRAs and 
retarget them to 2.x if they still exist in 2.x.  Would you mind retargeting 
this issue to 2.5? 

 Append time elapsed since job-start-time for finished tasks
 ---

 Key: MAPREDUCE-4711
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4711
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 0.23.3
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Attachments: MAPREDUCE-4711.branch-0.23.patch


 In 0.20.x/1.x, the analyze job link gave this information
 bq. The last Map task task_sometask finished at (relative to the Job launch 
 time): 5/10 20:23:10 (1hrs, 27mins, 54sec)
 The time it took for the last task to finish needs to be calculated mentally 
 in 0.23. I believe we should print it next to the finish time.
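
The elapsed time requested here could be computed along the following lines. This is only a minimal sketch: the class name ElapsedTime and its method are hypothetical and are not taken from the job history server code. It assumes both timestamps are in milliseconds and mirrors the "1hrs, 27mins, 54sec" style quoted above.

```java
// Hypothetical helper for illustration; not the job history server's actual code.
final class ElapsedTime {
    private ElapsedTime() {}

    /**
     * Formats (taskFinishMillis - jobStartMillis) as hours, minutes and seconds.
     * Negative differences are clamped to zero.
     */
    static String format(long jobStartMillis, long taskFinishMillis) {
        long totalSec = Math.max(0L, taskFinishMillis - jobStartMillis) / 1000L;
        long hours = totalSec / 3600L;
        long mins = (totalSec % 3600L) / 60L;
        long secs = totalSec % 60L;

        StringBuilder sb = new StringBuilder();
        if (hours > 0) {
            sb.append(hours).append("hrs, ");
        }
        // Show minutes whenever hours are shown, so the fields stay aligned.
        if (hours > 0 || mins > 0) {
            sb.append(mins).append("mins, ");
        }
        sb.append(secs).append("sec");
        return sb.toString();
    }
}
```

The result could then be printed in parentheses next to the absolute finish time, as the 0.20.x/1.x page did.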



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-3182) loadgen ignores -m command line when writing random data

2014-04-14 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968473#comment-13968473
 ] 

Chen He commented on MAPREDUCE-3182:


There are two GenericLoadGenerator classes in the current Hadoop source code. 
  One is under the org.apache.hadoop.mapreduce package. It has two documentation 
problems. First, it does not actually parse the -m command line option but 
still shows this option in the Usage. Second, if the user does not specify an 
input directory, it creates input data using RandomWriter with the default 
settings (10GB per map task and 10 map tasks per node). However, it does not 
show this behavior in the Usage. 

  The other is under the org.apache.hadoop.mapred package; it is an older version 
of GenericLoadGenerator. It has the second documentation problem described in 
the paragraph above. 

 loadgen ignores -m command line when writing random data
 

 Key: MAPREDUCE-3182
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3182
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, test
Affects Versions: 0.23.0, 0.24.0, 2.3.0
Reporter: Jonathan Eagles
Assignee: Chen He

 If no input directories are specified, loadgen goes into a special mode where 
 random data is generated and written. In that mode, setting the number of 
 mappers (-m command line option) is overridden by a calculation. Instead, it 
 should take into consideration the user specified number of mappers and fall 
 back to the calculation. In addition, update the documentation as well to 
 match the new behavior in the code.
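
The requested precedence (an explicit -m value first, the calculation only as a fallback) could look roughly like this. It is a sketch under stated assumptions: MapperCount, UNSET and resolve are hypothetical names, not part of GenericLoadGenerator, and the fallback simply multiplies the node count by a maps-per-node setting.

```java
// Hypothetical helper for illustration; not code from GenericLoadGenerator.
final class MapperCount {
    /** Sentinel meaning "-m was not given on the command line". */
    static final int UNSET = -1;

    private MapperCount() {}

    /**
     * Returns the user-requested number of maps when one was given,
     * otherwise falls back to nodes * mapsPerNode (at least 1).
     */
    static int resolve(int requestedMaps, int nodes, int mapsPerNode) {
        if (requestedMaps > 0) {
            return requestedMaps;          // honor -m
        }
        return Math.max(1, nodes * mapsPerNode);  // calculated fallback
    }
}
```

With this shape, the Usage text only needs to state that -m is optional and what the fallback calculation is.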



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-3182) loadgen ignores -m command line when writing random data

2014-04-14 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-3182:
---

Affects Version/s: 2.3.0

 loadgen ignores -m command line when writing random data
 

 Key: MAPREDUCE-3182
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3182
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, test
Affects Versions: 0.23.0, 0.24.0, 2.3.0
Reporter: Jonathan Eagles
Assignee: Chen He

 If no input directories are specified, loadgen goes into a special mode where 
 random data is generated and written. In that mode, setting the number of 
 mappers (-m command line option) is overridden by a calculation. Instead, it 
 should take into consideration the user specified number of mappers and fall 
 back to the calculation. In addition, update the documentation as well to 
 match the new behavior in the code.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-3182) loadgen ignores -m command line when writing random data

2014-04-14 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-3182:
---

Target Version/s: 2.5.0  (was: 0.23.0, 0.24.0)

 loadgen ignores -m command line when writing random data
 

 Key: MAPREDUCE-3182
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3182
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, test
Affects Versions: 0.23.0, 0.24.0, 2.3.0
Reporter: Jonathan Eagles
Assignee: Chen He

 If no input directories are specified, loadgen goes into a special mode where 
 random data is generated and written. In that mode, setting the number of 
 mappers (-m command line option) is overridden by a calculation. Instead, it 
 should take into consideration the user specified number of mappers and fall 
 back to the calculation. In addition, update the documentation as well to 
 match the new behavior in the code.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-3182) loadgen ignores -m command line when writing random data

2014-04-14 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-3182:
---

Attachment: MAPREDUCE-3182.patch

 loadgen ignores -m command line when writing random data
 

 Key: MAPREDUCE-3182
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3182
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, test
Affects Versions: 0.23.0, 0.24.0, 2.3.0
Reporter: Jonathan Eagles
Assignee: Chen He
 Attachments: MAPREDUCE-3182.patch


 If no input directories are specified, loadgen goes into a special mode where 
 random data is generated and written. In that mode, setting the number of 
 mappers (-m command line option) is overridden by a calculation. Instead, it 
 should take into consideration the user specified number of mappers and fall 
 back to the calculation. In addition, update the documentation as well to 
 match the new behavior in the code.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-3418) If map output is not found, shuffle runs in tight loop

2014-04-12 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-3418:
---

Target Version/s: 2.5.0  (was: 0.23.0, 2.5.0)

 If map output is not found, shuffle runs in tight loop
 --

 Key: MAPREDUCE-3418
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3418
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.0
Reporter: John George

 Sharad Agarwal bumped into this while simulating fetch failures. 
 Removed the map output directory. Shuffle runs in tight loop throwing
 :
 2011-06-01 09:02:20,511 WARN org.apache.hadoop.mapreduce.task.reduce.Fetcher: 
 Invalid map id 
 java.lang.IllegalArgumentException: TaskAttemptId string : TTP/1.1 500 
 Internal Server Error
 Content-Type: text/plain; charset=UTF is not properly formed
 at 
 org.apache.hadoop.mapreduce.TaskAttemptID.forName(TaskAttemptID.java:174)
 at 
 org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:284)
 at 
 org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:251)
 at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:149)
 Fetch failure is not triggered.
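
One way to avoid the tight loop is to check the map id before parsing it and to count malformed responses as failures. The sketch below is hypothetical and self-contained: FetchGuard, Action and decide are illustrative names, not the Fetcher's API, and the pattern assumes ids of the form attempt_200707121733_0003_m_000005_0 for map and reduce attempts only.

```java
import java.util.regex.Pattern;

// Hypothetical guard for illustration; not the actual Fetcher implementation.
final class FetchGuard {
    /** What the caller should do after reading a response header. */
    enum Action { COPY, RETRY, REPORT_FETCH_FAILURE }

    // Assumed shape of a task attempt id: attempt_<cluster>_<job>_<m|r>_<task>_<attempt>.
    private static final Pattern ATTEMPT_ID =
            Pattern.compile("attempt_\\d+_\\d+_[mr]_\\d+_\\d+");

    private FetchGuard() {}

    static boolean isWellFormed(String mapId) {
        return mapId != null && ATTEMPT_ID.matcher(mapId).matches();
    }

    /**
     * Decides the next step for one response. A malformed id (for example an
     * HTTP error body) counts as a failure; once maxFailures is reached the
     * caller should report a fetch failure instead of retrying again.
     */
    static Action decide(String mapId, int failuresSoFar, int maxFailures) {
        if (isWellFormed(mapId)) {
            return Action.COPY;
        }
        return (failuresSoFar + 1 >= maxFailures)
                ? Action.REPORT_FETCH_FAILURE
                : Action.RETRY;
    }
}
```

The caller would increment its failure counter on every RETRY, ideally with a delay between attempts, so the loop is bounded.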



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-3418) If map output is not found, shuffle runs in tight loop

2014-04-12 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-3418:
---

Affects Version/s: 2.3.0

 If map output is not found, shuffle runs in tight loop
 --

 Key: MAPREDUCE-3418
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3418
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.0, 2.3.0
Reporter: John George

 Sharad Agarwal bumped into this while simulating fetch failures. 
 Removed the map output directory. Shuffle runs in tight loop throwing
 :
 2011-06-01 09:02:20,511 WARN org.apache.hadoop.mapreduce.task.reduce.Fetcher: 
 Invalid map id 
 java.lang.IllegalArgumentException: TaskAttemptId string : TTP/1.1 500 
 Internal Server Error
 Content-Type: text/plain; charset=UTF is not properly formed
 at 
 org.apache.hadoop.mapreduce.TaskAttemptID.forName(TaskAttemptID.java:174)
 at 
 org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:284)
 at 
 org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:251)
 at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:149)
 Fetch failure is not triggered.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-3174) app master UI goes away when app finishes - not very user friendly

2014-04-11 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13966493#comment-13966493
 ] 

Chen He commented on MAPREDUCE-3174:


According to [~tlipcon]'s comments, this is only a problem in 0.23. Is this 
correct? If so, do we need to file a similar one for 2.x?

 app master UI goes away when app finishes - not very user friendly
 --

 Key: MAPREDUCE-3174
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3174
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Thomas Graves

 A user can go to the application master UI to see the stats on the app, but 
 as soon as the app finishes that UI goes away and user is left with nothing.  
 A redirect to history server or similar would be much better.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-3174) app master UI goes away when app finishes - not very user friendly

2014-04-11 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13967010#comment-13967010
 ] 

Chen He commented on MAPREDUCE-3174:


Hi [~tgraves], Thank you for the comments. I will retarget this issue to 2.x.

 app master UI goes away when app finishes - not very user friendly
 --

 Key: MAPREDUCE-3174
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3174
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Thomas Graves

 A user can go to the application master UI to see the stats on the app, but 
 as soon as the app finishes that UI goes away and user is left with nothing.  
 A redirect to history server or similar would be much better.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-3174) app master UI goes away when app finishes - not very user friendly

2014-04-11 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-3174:
---

Target Version/s: 2.5.0  (was: 0.23.0)

 app master UI goes away when app finishes - not very user friendly
 --

 Key: MAPREDUCE-3174
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3174
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Thomas Graves

 A user can go to the application master UI to see the stats on the app, but 
 as soon as the app finishes that UI goes away and user is left with nothing.  
 A redirect to history server or similar would be much better.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (MAPREDUCE-3182) loadgen ignores -m command line when writing random data

2014-04-11 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He reassigned MAPREDUCE-3182:
--

Assignee: Chen He

 loadgen ignores -m command line when writing random data
 

 Key: MAPREDUCE-3182
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3182
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, test
Affects Versions: 0.23.0, 0.24.0
Reporter: Jonathan Eagles
Assignee: Chen He

 If no input directories are specified, loadgen goes into a special mode where 
 random data is generated and written. In that mode, setting the number of 
 mappers (-m command line option) is overridden by a calculation. Instead, it 
 should take into consideration the user specified number of mappers and fall 
 back to the calculation. In addition, update the documentation as well to 
 match the new behavior in the code.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-3182) loadgen ignores -m command line when writing random data

2014-04-11 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13967012#comment-13967012
 ] 

Chen He commented on MAPREDUCE-3182:


I will take a look at this issue. 

 loadgen ignores -m command line when writing random data
 

 Key: MAPREDUCE-3182
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3182
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, test
Affects Versions: 0.23.0, 0.24.0
Reporter: Jonathan Eagles

 If no input directories are specified, loadgen goes into a special mode where 
 random data is generated and written. In that mode, setting the number of 
 mappers (-m command line option) is overridden by a calculation. Instead, it 
 should take into consideration the user specified number of mappers and fall 
 back to the calculation. In addition, update the documentation as well to 
 match the new behavior in the code.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-3418) If map output is not found, shuffle runs in tight loop

2014-04-11 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13967014#comment-13967014
 ] 

Chen He commented on MAPREDUCE-3418:


Is this still an issue for Hadoop 2.x? 
If not, I will close it on April 14th, 2014.

 If map output is not found, shuffle runs in tight loop
 --

 Key: MAPREDUCE-3418
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3418
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.0
Reporter: John George

 Sharad Agarwal bumped into this while simulating fetch failures. 
 Removed the map output directory. Shuffle runs in tight loop throwing
 :
 2011-06-01 09:02:20,511 WARN org.apache.hadoop.mapreduce.task.reduce.Fetcher: 
 Invalid map id 
 java.lang.IllegalArgumentException: TaskAttemptId string : TTP/1.1 500 
 Internal Server Error
 Content-Type: text/plain; charset=UTF is not properly formed
 at 
 org.apache.hadoop.mapreduce.TaskAttemptID.forName(TaskAttemptID.java:174)
 at 
 org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:284)
 at 
 org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:251)
 at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:149)
 Fetch failure is not triggered.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-3418) If map output is not found, shuffle runs in tight loop

2014-04-11 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13967039#comment-13967039
 ] 

Chen He commented on MAPREDUCE-3418:


Thank you for the reply, [~vinodkv].
I will retarget this issue towards 2.x. 

 If map output is not found, shuffle runs in tight loop
 --

 Key: MAPREDUCE-3418
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3418
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.0
Reporter: John George

 Sharad Agarwal bumped into this while simulating fetch failures. 
 Removed the map output directory. Shuffle runs in tight loop throwing
 :
 2011-06-01 09:02:20,511 WARN org.apache.hadoop.mapreduce.task.reduce.Fetcher: 
 Invalid map id 
 java.lang.IllegalArgumentException: TaskAttemptId string : TTP/1.1 500 
 Internal Server Error
 Content-Type: text/plain; charset=UTF is not properly formed
 at 
 org.apache.hadoop.mapreduce.TaskAttemptID.forName(TaskAttemptID.java:174)
 at 
 org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:284)
 at 
 org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:251)
 at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:149)
 Fetch failure is not triggered.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-3418) If map output is not found, shuffle runs in tight loop

2014-04-11 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-3418:
---

Target Version/s: 0.23.0, 2.5.0  (was: 0.23.0)

 If map output is not found, shuffle runs in tight loop
 --

 Key: MAPREDUCE-3418
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3418
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.0
Reporter: John George

 Sharad Agarwal bumped into this while simulating fetch failures. 
 Removed the map output directory. Shuffle runs in tight loop throwing
 :
 2011-06-01 09:02:20,511 WARN org.apache.hadoop.mapreduce.task.reduce.Fetcher: 
 Invalid map id 
 java.lang.IllegalArgumentException: TaskAttemptId string : TTP/1.1 500 
 Internal Server Error
 Content-Type: text/plain; charset=UTF is not properly formed
 at 
 org.apache.hadoop.mapreduce.TaskAttemptID.forName(TaskAttemptID.java:174)
 at 
 org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:284)
 at 
 org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:251)
 at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:149)
 Fetch failure is not triggered.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (MAPREDUCE-519) Fix capacity scheduler's documentation

2014-03-24 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He reassigned MAPREDUCE-519:
-

Assignee: Chen He

 Fix capacity scheduler's documentation
 --

 Key: MAPREDUCE-519
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-519
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Vinod Kumar Vavilapalli
Assignee: Chen He

 Parent jira for all documentation related issues in capacity scheduler.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (MAPREDUCE-5758) Reducer local data is not deleted until job completes

2014-03-22 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He reassigned MAPREDUCE-5758:
--

Assignee: Chen He

 Reducer local data is not deleted until job completes
 -

 Key: MAPREDUCE-5758
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5758
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.10, 2.2.0
Reporter: Jason Lowe
Assignee: Chen He

 Ran into an instance where a reducer shuffled a large amount of data and 
 subsequently failed, but the local data is not purged when the task fails but 
 only after the entire job completes.  This wastes disk space unnecessarily 
 since the data is no longer relevant after the task-attempt exits.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5804) TestMRJobsWithProfiler#testProfiler timesout

2014-03-21 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943063#comment-13943063
 ] 

Chen He commented on MAPREDUCE-5804:


+1. I downloaded the patch, applied it to trunk, and ran the test; no timeout was reported.

 TestMRJobsWithProfiler#testProfiler timesout
 

 Key: MAPREDUCE-5804
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5804
 Project: Hadoop Map/Reduce
  Issue Type: Test
Affects Versions: 2.4.0
Reporter: Mit Desai
Assignee: Mit Desai
 Attachments: LOG.txt, MAPREDUCE-5804.patch


 {noformat}
 testProfiler(org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler)  Time 
 elapsed: 154.972 sec   ERROR!
 java.lang.Exception: test timed out after 12 milliseconds
   at java.io.UnixFileSystem.getBooleanAttributes0(Native Method)
   at java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:242)
   at java.io.File.exists(File.java:813)
   at sun.misc.URLClassPath$FileLoader.getResource(URLClassPath.java:1080)
   at sun.misc.URLClassPath.getResource(URLClassPath.java:199)
   at java.net.URLClassLoader$1.run(URLClassLoader.java:358)
   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
   at org.apache.log4j.spi.LoggingEvent.init(LoggingEvent.java:165)
   at org.apache.log4j.Category.forcedLog(Category.java:391)
   at org.apache.log4j.Category.log(Category.java:856)
   at 
 org.apache.commons.logging.impl.Log4JLogger.warn(Log4JLogger.java:208)
   at 
 org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:338)
   at 
 org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:419)
   at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:532)
   at org.apache.hadoop.mapreduce.Job$1.run(Job.java:314)
   at org.apache.hadoop.mapreduce.Job$1.run(Job.java:311)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1570)
   at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:311)
   at org.apache.hadoop.mapreduce.Job.isComplete(Job.java:599)
   at org.apache.hadoop.mapreduce.Job.monitorAndPrintJob(Job.java:1344)
   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1306)
   at 
 org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler.testProfiler(TestMRJobsWithProfiler.java:138)
 Results :
 Tests in error: 
   TestMRJobsWithProfiler.testProfiler:138 »  test timed out after 12 
 millise...
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5688) MRAppMaster causes TestStagingCleanup to fail intermittently with JDK7

2014-02-07 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894825#comment-13894825
 ] 

Chen He commented on MAPREDUCE-5688:


Thank you Mit.
+1 patch is good. 
It would be great if you could submit an updated version and get a +1 from Hadoop QA.


 MRAppMaster causes TestStagingCleanup to fail intermittently with JDK7
 --

 Key: MAPREDUCE-5688
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5688
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0, 2.3.0
Reporter: Mit Desai
Assignee: Mit Desai
  Labels: java7
 Attachments: MAPREDUCE-5688.patch


 Due to random ordering in JDK7, the test 
 TestStagingCleanup#testDeletionofStagingOnKillLastTry is failing
 {noformat}
 Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.231 sec  
 FAILURE!
 test(org.apache.hadoop.mapreduce.v2.app.TestStagingCleanup)  Time elapsed: 
 3882 sec   ERROR!
 java.lang.NullPointerException
   at 
 org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.serviceStop(JobHistoryEventHandler.java:349)
   at 
 org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
   at 
 org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
   at 
 org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
   at 
 org.apache.hadoop.service.CompositeService.stop(CompositeService.java:159)
   at 
 org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:132)
   at 
 org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1399)
   at 
 org.apache.hadoop.mapreduce.v2.app.TestStagingCleanup.testDeletionofStagingOnKillLastTry(TestStagingCleanup.java:239)
   at 
 org.apache.hadoop.mapreduce.v2.app.TestStagingCleanup.test(TestStagingCleanup.java:82)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at junit.framework.TestCase.runTest(TestCase.java:168)
   at junit.framework.TestCase.runBare(TestCase.java:134)
   at junit.framework.TestResult$1.protect(TestResult.java:110)
   at junit.framework.TestResult.runProtected(TestResult.java:128)
   at junit.framework.TestResult.run(TestResult.java:113)
   at junit.framework.TestCase.run(TestCase.java:124)
   at junit.framework.TestSuite.runTest(TestSuite.java:243)
   at junit.framework.TestSuite.run(TestSuite.java:238)
   at 
 org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:242)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:137)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at 
 org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
   at 
 org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
   at 
 org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
   at 
 org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
   at 
 org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (MAPREDUCE-1380) Adaptive Scheduler

2014-02-07 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894966#comment-13894966
 ] 

Chen He commented on MAPREDUCE-1380:


This patch may need to be updated against Hadoop 1.x or 2.x

 Adaptive Scheduler
 --

 Key: MAPREDUCE-1380
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1380
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: Jordà Polo
Priority: Minor
 Attachments: MAPREDUCE-1380_0.1.patch, MAPREDUCE-1380_1.1.patch, 
 MAPREDUCE-1380_1.1.pdf


 The Adaptive Scheduler is a pluggable Hadoop scheduler that automatically 
 adjusts the amount of used resources depending on the performance of jobs and 
 on user-defined high-level business goals.
 Existing Hadoop schedulers are focused on managing large, static clusters in 
 which nodes are added or removed manually. On the other hand, the goal of 
 this scheduler is to improve the integration of Hadoop and the applications 
 that run on top of it with environments that allow a more dynamic 
 provisioning of resources.
 The current implementation is quite straightforward. Users specify a deadline 
 at job submission time, and the scheduler adjusts the resources to meet that 
 deadline (at the moment, the scheduler can be configured to either minimize 
 or maximize the amount of resources). If multiple jobs are run 
 simultaneously, the scheduler prioritizes them by deadline. Note that the 
 current approach to estimate the completion time of jobs is quite simplistic: 
 it is based on the time it takes to finish each task, so it works well with 
 regular jobs, but there is still room for improvement for unpredictable jobs.
 The idea is to further integrate it with cloud-like and virtual environments 
 (such as Amazon EC2, Emotive, etc.) so that if, for instance, a job isn't 
 able to meet its deadline, the scheduler automatically requests more 
 resources.
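
The deadline logic described above (estimate completion from per-task time, size resources to the deadline, order jobs by deadline) can be illustrated with a small sketch. Everything here is an assumption made for illustration: DeadlineSketch, Job and the method names are hypothetical and not taken from the attached patches, tasks are assumed to run in equal-length waves, and only the "minimize resources" mode is shown.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical model for illustration; not the Adaptive Scheduler's code.
final class DeadlineSketch {
    static final class Job {
        final String name;
        final long deadlineMillis;   // absolute deadline
        final int pendingTasks;      // tasks not yet finished
        final long avgTaskMillis;    // observed average task duration

        Job(String name, long deadlineMillis, int pendingTasks, long avgTaskMillis) {
            this.name = name;
            this.deadlineMillis = deadlineMillis;
            this.pendingTasks = pendingTasks;
            this.avgTaskMillis = avgTaskMillis;
        }
    }

    private DeadlineSketch() {}

    /** Remaining time with `slots` parallel tasks: ceil(pending / slots) waves. */
    static long estimateRemainingMillis(Job job, int slots) {
        long waves = (job.pendingTasks + slots - 1) / slots;
        return waves * job.avgTaskMillis;
    }

    /** Smallest slot count that meets the deadline, or -1 if maxSlots is not enough. */
    static int slotsNeeded(Job job, long nowMillis, int maxSlots) {
        for (int slots = 1; slots <= maxSlots; slots++) {
            if (nowMillis + estimateRemainingMillis(job, slots) <= job.deadlineMillis) {
                return slots;
            }
        }
        return -1;
    }

    /** Returns a copy of the jobs ordered by earliest deadline first. */
    static List<Job> byDeadline(List<Job> jobs) {
        List<Job> sorted = new ArrayList<>(jobs);
        sorted.sort(Comparator.comparingLong((Job j) -> j.deadlineMillis));
        return sorted;
    }
}
```

A return value of -1 from slotsNeeded corresponds to the case in which a job cannot meet its deadline with the current cluster, which is where a request for more resources would be made.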



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (MAPREDUCE-3486) All jobs of all queues will be returned, whethor a particular queueName is specified or not

2014-02-05 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-3486:
---

Affects Version/s: (was: 0.24.0)
   (was: 0.23.0)
   1.2.2
   1.3.0
   1.1.3

 All jobs of all queues will be returned, whethor a particular queueName is 
 specified or not
 ---

 Key: MAPREDUCE-3486
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3486
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 1.1.3, 1.3.0, 1.2.2
Reporter: XieXianshan
Assignee: XieXianshan
Priority: Minor
 Attachments: MAPREDUCE-3486.patch


 JobTracker.getJobsFromQueue(queueName) returns all jobs of all queues 
 known to the jobtracker even though I specify a queueName. 
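
The expected behavior amounts to a filter on the queue name. The following is a self-contained sketch only: QueueFilter and JobInfo are hypothetical stand-ins and do not reproduce the JobTracker's data structures or the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in types for illustration; not JobTracker code.
final class QueueFilter {
    static final class JobInfo {
        final String jobId;
        final String queueName;

        JobInfo(String jobId, String queueName) {
            this.jobId = jobId;
            this.queueName = queueName;
        }
    }

    private QueueFilter() {}

    /** Returns only the jobs whose queue matches queueName exactly. */
    static List<JobInfo> jobsFromQueue(List<JobInfo> allJobs, String queueName) {
        List<JobInfo> result = new ArrayList<>();
        for (JobInfo job : allJobs) {
            if (job.queueName.equals(queueName)) {
                result.add(job);
            }
        }
        return result;
    }
}
```

An unknown queue name yields an empty list rather than every job.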



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (MAPREDUCE-3486) All jobs of all queues will be returned, whethor a particular queueName is specified or not

2014-02-05 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892199#comment-13892199
 ] 

Chen He commented on MAPREDUCE-3486:


Changed the Affects Version to 1.x.

 All jobs of all queues will be returned, whether a particular queueName is 
 specified or not
 ---

 Key: MAPREDUCE-3486
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3486
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 1.1.3, 1.3.0, 1.2.2
Reporter: XieXianshan
Assignee: XieXianshan
Priority: Minor
 Attachments: MAPREDUCE-3486.patch


 JobTracker.getJobsFromQueue(queueName) returns the jobs of every queue on 
 the jobtracker, even when I specify a queueName. 





[jira] [Commented] (MAPREDUCE-5643) DynamicMR: A Dynamic Slot Utilization Optimization Framework for Hadoop MRv1

2014-02-05 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892219#comment-13892219
 ] 

Chen He commented on MAPREDUCE-5643:


This is interesting. I would suggest uploading your design documents, including 
those for DHFS, DSTS, and DLMS. I have the following questions about your 
scheduler:

1) If map and reduce slots can be exchanged, it is possible that some small 
jobs cannot finish in time.
2) Is there any load-balancing feature in your scheduler for the map and 
reduce stages?
3) If reduce tasks steal map slots, some local map tasks will become non-local 
tasks because of the shortage of map slots.


 DynamicMR: A Dynamic Slot Utilization Optimization Framework for Hadoop MRv1
 

 Key: MAPREDUCE-5643
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5643
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/fair-share
Affects Versions: 1.2.1
Reporter: tang shanjiang
Assignee: tang shanjiang
  Labels: performance
 Attachments: DynamicMR-0.1.1-patch, README


 Hadoop MRv1 uses a slot-based resource model with a static configuration 
 of map/reduce slots. There is a strict utilization constraint: map tasks can 
 only run on map slots, and reduce tasks can only use reduce slots. Due to the 
 rigid execution order between map and reduce tasks in a MapReduce 
 environment, slots can be severely under-utilized, which significantly 
 degrades performance. 
 In contrast to YARN, which gives up the slot-based resource model and 
 proposes a container-based model that maximizes resource utilization by being 
 unaware of the types of map/reduce tasks, we keep the slot-based model and 
 propose a dynamic slot utilization optimization system called DynamicMR. It 
 improves the performance of Hadoop by maximizing slot utilization as well as 
 slot utilization efficiency while guaranteeing fairness across pools. It 
 consists of three scheduling components: Dynamic Hadoop Fair 
 Scheduler (DHFS), Dynamic Speculative Task Scheduler (DSTS), and Data 
 Locality Maximization Scheduler (DLMS).
 Our tests show that DynamicMR outperforms YARN for MapReduce workloads with 
 multiple jobs, especially when the number of jobs is large. The explanation 
 is that, for a given amount of resources, controlling the ratio of 
 concurrently running map and reduce tasks performs better than leaving it 
 uncontrolled. Without control, too many reduce tasks can easily run at once, 
 turning the network into a serious bottleneck. In YARN, both map and reduce 
 tasks can run on any idle container, and there is no mechanism to control the 
 ratio of resource allocation between map and reduce tasks. This means that 
 when there are pending reduce tasks, an idle container will most likely be 
 taken by them. DynamicMR, in contrast, follows the traditional slot-based 
 model. Instead of the ’hard’ constraint on slot allocation, under which map 
 slots must be allocated to map tasks and reduce slots to reduce tasks, 
 DynamicMR obeys a ’soft’ constraint that allows a map slot to be allocated to 
 a reduce task and vice versa. However, whenever there are pending map tasks, 
 map slots are given to map tasks first, and the rule is similar for reduce 
 tasks. This means that the traditional static map/reduce slot configuration 
 still works in DynamicMR as a way to control the ratio of running map/reduce 
 tasks. Compared with YARN, which only maximizes resource utilization, 
 DynamicMR can maximize slot resource utilization while dynamically 
 controlling the ratio of running map/reduce tasks via the map/reduce slot 
 configuration.
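
 The ’soft’ constraint described above can be sketched roughly as follows.
 This is an illustrative assumption about the rule, not code from the
 DynamicMR patch: TaskType, SoftSlotAllocator, and assign are hypothetical
 names. An idle slot first serves pending tasks of its own type and is lent to
 the other type only when it would otherwise stay idle.

```java
// Hypothetical task/slot type used only for this sketch.
enum TaskType { MAP, REDUCE }

final class SoftSlotAllocator {
    // Decides which kind of task an idle slot of type `slotType` should run.
    // Returns null when there is no pending work at all.
    static TaskType assign(TaskType slotType, int pendingMaps, int pendingReduces) {
        int pendingOwn = (slotType == TaskType.MAP) ? pendingMaps : pendingReduces;
        int pendingOther = (slotType == TaskType.MAP) ? pendingReduces : pendingMaps;

        if (pendingOwn > 0) {
            // Tasks matching the slot type always have priority on that slot.
            return slotType;
        }
        if (pendingOther > 0) {
            // Soft constraint: lend the otherwise idle slot to the other type.
            return (slotType == TaskType.MAP) ? TaskType.REDUCE : TaskType.MAP;
        }
        return null;
    }
}
```

 Because a slot is borrowed only when no task of its own type is pending, the
 static map/reduce slot configuration still bounds the ratio of running map
 and reduce tasks whenever both kinds of work are waiting.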





[jira] [Commented] (MAPREDUCE-5603) Ability to disable FileInputFormat listLocatedStatus optimization to save client memory

2014-02-05 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892249#comment-13892249
 ] 

Chen He commented on MAPREDUCE-5603:


+1, patch is good. 

 Ability to disable FileInputFormat listLocatedStatus optimization to save 
 client memory
 ---

 Key: MAPREDUCE-5603
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5603
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: client, mrv2
Affects Versions: 0.23.10, 2.2.0
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Minor
 Attachments: MAPREDUCE-5603.patch, MAPREDUCE-5603.patch


 It would be nice if users had the option to disable the listLocatedStatus 
 optimization in FileInputFormat to save client memory.





[jira] [Updated] (MAPREDUCE-5670) CombineFileRecordReader should report progress when moving to the next file

2014-01-24 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-5670:
---

Status: Open  (was: Patch Available)

 CombineFileRecordReader should report progress when moving to the next file
 ---

 Key: MAPREDUCE-5670
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5670
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.9
Reporter: Jason Lowe
Assignee: Chen He
Priority: Minor
 Attachments: MR-5670.patch, MR-5670v2.patch, MR-5670v3.patch


 If a combine split consists of many empty files (i.e., no record found by 
 the underlying record reader), then theoretically a task can time out due to 
 lack of reported progress.
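
 The general idea can be sketched without any Hadoop dependency as below. This
 is a simplified illustration, not the attached patch: ProgressSink and
 ChunkedReader are hypothetical stand-ins for the task's progress reporter and
 for a reader that walks through several underlying files. Progress is
 reported every time the reader moves to the next file, so a long run of empty
 files still signals liveness.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical progress callback; real code would use the framework's reporter.
interface ProgressSink {
    void progress();
}

// Hypothetical reader over a sequence of "files", each modelled as a list of records.
final class ChunkedReader {
    private final Iterator<List<String>> files;
    private final ProgressSink sink;
    private Iterator<String> current;

    ChunkedReader(List<List<String>> files, ProgressSink sink) {
        this.files = files.iterator();
        this.sink = sink;
    }

    // Returns the next record, or null once every file has been consumed.
    String next() {
        while (current == null || !current.hasNext()) {
            if (!files.hasNext()) {
                return null;
            }
            current = files.next().iterator();
            // Report on every file switch, even when the new file is empty.
            sink.progress();
        }
        return current.next();
    }
}
```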





[jira] [Updated] (MAPREDUCE-5670) CombineFileRecordReader should report progress when moving to the next file

2014-01-24 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-5670:
---

Attachment: MR-5670v3.patch

 CombineFileRecordReader should report progress when moving to the next file
 ---

 Key: MAPREDUCE-5670
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5670
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.9
Reporter: Jason Lowe
Assignee: Chen He
Priority: Minor
 Attachments: MR-5670.patch, MR-5670v2.patch, MR-5670v3.patch


 If a combine split consists of many empty files (i.e., no record found by 
 the underlying record reader), then theoretically a task can time out due to 
 lack of reported progress.





[jira] [Commented] (MAPREDUCE-5670) CombineFileRecordReader should report progress when moving to the next file

2014-01-24 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881338#comment-13881338
 ] 

Chen He commented on MAPREDUCE-5670:


Hi [~jlowe],

The patch has been updated following your suggestion.

 CombineFileRecordReader should report progress when moving to the next file
 ---

 Key: MAPREDUCE-5670
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5670
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.9
Reporter: Jason Lowe
Assignee: Chen He
Priority: Minor
 Fix For: 2.4.0, 0.23.10

 Attachments: MR-5670.patch, MR-5670v2.patch, MR-5670v3.patch


 If a combine split consists of many empty files (i.e., no record found by 
 the underlying record reader), then theoretically a task can time out due to 
 lack of reported progress.





[jira] [Updated] (MAPREDUCE-5670) CombineFileRecordReader should report progress when moving to the next file

2014-01-24 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-5670:
---

Fix Version/s: 0.23.10
   2.4.0
   Status: Patch Available  (was: Open)

 CombineFileRecordReader should report progress when moving to the next file
 ---

 Key: MAPREDUCE-5670
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5670
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.9
Reporter: Jason Lowe
Assignee: Chen He
Priority: Minor
 Fix For: 2.4.0, 0.23.10

 Attachments: MR-5670.patch, MR-5670v2.patch, MR-5670v3.patch


 If a combine split consists of many empty files (i.e., no record found by 
 the underlying record reader), then theoretically a task can time out due to 
 lack of reported progress.





[jira] [Updated] (MAPREDUCE-5670) CombineFileRecordReader should report progress when moving to the next file

2014-01-24 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-5670:
---

Attachment: (was: MR-5670.patch)

 CombineFileRecordReader should report progress when moving to the next file
 ---

 Key: MAPREDUCE-5670
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5670
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.9
Reporter: Jason Lowe
Assignee: Chen He
Priority: Minor
 Attachments: MR-5670v3.patch


 If a combine split consists of many empty files (i.e., no record found by 
 the underlying record reader), then theoretically a task can time out due to 
 lack of reported progress.





[jira] [Updated] (MAPREDUCE-5670) CombineFileRecordReader should report progress when moving to the next file

2014-01-24 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-5670:
---

Attachment: (was: MR-5670v2.patch)

 CombineFileRecordReader should report progress when moving to the next file
 ---

 Key: MAPREDUCE-5670
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5670
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.9
Reporter: Jason Lowe
Assignee: Chen He
Priority: Minor
 Attachments: MR-5670v3.patch


 If a combine split consists of many empty files (i.e., no record found by 
 the underlying record reader), then theoretically a task can time out due to 
 lack of reported progress.





[jira] [Updated] (MAPREDUCE-5670) CombineFileRecordReader should report progress when moving to the next file

2014-01-22 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-5670:
---

Status: Open  (was: Patch Available)

 CombineFileRecordReader should report progress when moving to the next file
 ---

 Key: MAPREDUCE-5670
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5670
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.9
Reporter: Jason Lowe
Assignee: Chen He
Priority: Minor
 Attachments: MR-5670.patch


 If a combine split consists of many empty files (i.e., no record found by 
 the underlying record reader), then theoretically a task can time out due to 
 lack of reported progress.





[jira] [Updated] (MAPREDUCE-5670) CombineFileRecordReader should report progress when moving to the next file

2014-01-22 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-5670:
---

Attachment: MR-5670v2.patch

 CombineFileRecordReader should report progress when moving to the next file
 ---

 Key: MAPREDUCE-5670
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5670
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.9
Reporter: Jason Lowe
Assignee: Chen He
Priority: Minor
 Attachments: MR-5670.patch, MR-5670v2.patch


 If a combine split consists of many empty files (i.e., no record found by 
 the underlying record reader), then theoretically a task can time out due to 
 lack of reported progress.




