[jira] [Commented] (MAPREDUCE-5519) Change JobClient to use YarnClient to interact with RM

2013-09-19 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772672#comment-13772672
 ] 

Sandy Ryza commented on MAPREDUCE-5519:
---

What does it use instead?

> Change JobClient to use YarnClient to interact with RM
> --
>
> Key: MAPREDUCE-5519
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5519
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Jian He
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (MAPREDUCE-5519) Change JobClient to use YarnClient to interact with RM

2013-09-19 Thread Jian He (JIRA)
Jian He created MAPREDUCE-5519:
--

 Summary: Change JobClient to use YarnClient to interact with RM
 Key: MAPREDUCE-5519
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5519
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Jian He






[jira] [Commented] (MAPREDUCE-5518) Fix typo "can't read paritions file"

2013-09-19 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772609#comment-13772609
 ] 

Tsuyoshi OZAWA commented on MAPREDUCE-5518:
---

+1

> Fix typo "can't read paritions file"
> 
>
> Key: MAPREDUCE-5518
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5518
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: examples
>Affects Versions: 3.0.0
>Reporter: Albert Chu
>Assignee: Albert Chu
>Priority: Trivial
> Attachments: MAPREDUCE-5518.patch
>
>
> Noticed a spelling error when I saw this error message
> {noformat}
> 13/09/19 13:25:08 INFO mapreduce.Job: Task Id : 
> attempt_1379622083112_0002_m_000114_0, Status : FAILED
> Error: java.lang.IllegalArgumentException: can't read paritions file
> {noformat}
> "paritions" should be "partitions"



[jira] [Updated] (MAPREDUCE-5518) Fix typo "can't read paritions file"

2013-09-19 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated MAPREDUCE-5518:
--

Hadoop Flags: Reviewed



[jira] [Commented] (MAPREDUCE-5502) History link in resource manager is broken for KILLED jobs

2013-09-19 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772605#comment-13772605
 ] 

Vrushali C commented on MAPREDUCE-5502:
---

The diff I pasted does not render correctly, and it seems I can't edit it now. 
To summarize, the two proposed changes are:

1) if (status.getState() != JobStatus.State.RUNNING) changes to 
if (status.getState() == JobStatus.State.PREP)

2) if (status.getState() != JobStatus.State.KILLED) extends to 
if ((status.getState() != JobStatus.State.KILLED) && (status.getState() != JobStatus.State.FAILED) && (status.getState() != JobStatus.State.SUCCEEDED))
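
To make the two conditions easier to compare, here is a minimal, self-contained sketch of just the state checks. It is not the YARNRunner code: the State enum is a hypothetical stand-in for JobStatus.State, and the method names are invented for illustration.

```java
public class KillJobSketch {
    // Hypothetical stand-in for JobStatus.State; values mirror those named in the comment.
    enum State { PREP, RUNNING, SUCCEEDED, FAILED, KILLED }

    // Change 1: send the kill straight to the RM only while the job is still in PREP
    // (the current code does this for every state other than RUNNING).
    static boolean killViaRmImmediately(State s) {
        return s == State.PREP;
    }

    // Change 2: the later kill to the RM is sent unless the job has already
    // reached one of the terminal states KILLED, FAILED or SUCCEEDED.
    static boolean killViaRmLater(State s) {
        return s != State.KILLED && s != State.FAILED && s != State.SUCCEEDED;
    }

    public static void main(String[] args) {
        for (State s : State.values()) {
            System.out.println(s + ": immediate=" + killViaRmImmediately(s)
                    + ", later=" + killViaRmLater(s));
        }
    }
}
```

Under these assumptions, a job that has already SUCCEEDED or FAILED is no longer killed through the RM by either check, which is the behavioural difference from the current conditions.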

> History link in resource manager is broken for KILLED jobs
> --
>
> Key: MAPREDUCE-5502
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5502
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.0.5-alpha
>Reporter: Vrushali C
>Assignee: Vrushali C
>  Labels: ui
>
> History link in resource manager is broken for KILLED jobs.
> Seems to happen with jobs with State 'KILLED' and FinalStatus 'KILLED'. If 
> the State is 'FINISHED' and FinalStatus is 'KILLED', then the "History" link 
> is fine.
> It isn't easy to reproduce the problem since the time at which the app is 
> killed determines the state it ends up in, which is hard to guess. these 
> particular jobs seem to get a Diagnostics message of "Application killed by 
> user." where as the other killed jobs get " Kill Job received from client 
> job_1378766187901_0002
> Job received Kill while in RUNNING state. "



[jira] [Commented] (MAPREDUCE-5502) History link in resource manager is broken for KILLED jobs

2013-09-19 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772604#comment-13772604
 ] 

Vrushali C commented on MAPREDUCE-5502:
---


Right. So the fix seems to be in two places in YARNRunner. 

I think the first call to resMgrDelegate.killApplication should occur when 
the job is in the PREP state (as opposed to !RUNNING in the current code). At 
that point there is no RUNNING/FAILED/SUCCEEDED/KILLED status, since the job has 
probably not even started running, so sending the kill to the RM and returning 
makes sense. In this case the application ends up in KILLED/KILLED, which I 
believe is correct. The "Tracking URL: History" on the 
cluster/app/application_number page points to itself, which I think is also 
correct in this case.

The second call to resMgrDelegate.killApplication should occur when the 
JobStatus is in any of the terminal states - KILLED/SUCCEEDED/FAILED. In this 
case, the application ends up in FINISHED/KILLED , FINISHED/SUCCEEDED (I 
haven't yet experimented with FAILED). The tracking URL on the application page 
is also updated correctly.

The changes are:

--- a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/YARNRunner.java
+++ b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/YARNRunner.java
@@ -550,8 +550,12 @@ public JobStatus getJobStatus(JobID jobID) throws IOException,
   public void killJob(JobID arg0) throws IOException, InterruptedException {
     /* check if the status is not running, if not send kill to RM */
     JobStatus status = clientCache.getClient(arg0).getJobStatus(arg0);
-    if (status.getState() != JobStatus.State.RUNNING) {
+    if ( (status.getState() == JobStatus.State.PREP) ) {
       resMgrDelegate.killApplication(TypeConverter.toYarn(arg0).getAppId());
       return;
     }
 
@@ -574,7 +578,11 @@ public void killJob(JobID arg0) throws IOException, InterruptedException {
     } catch(IOException io) {
       LOG.debug("Error when checking for application status", io);
     }
-    if (status.getState() != JobStatus.State.KILLED) {
+    if ( (status.getState() != JobStatus.State.KILLED) &&
+         (status.getState() != JobStatus.State.FAILED) &&
+         (status.getState() != JobStatus.State.SUCCEEDED) ){
       resMgrDelegate.killApplication(TypeConverter.toYarn(arg0).getAppId());
     }

What do you think?





[jira] [Commented] (MAPREDUCE-5518) Fix typo "can't read paritions file"

2013-09-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772591#comment-13772591
 ] 

Hadoop QA commented on MAPREDUCE-5518:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12604172/MAPREDUCE-5518.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-examples.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4019//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4019//console

This message is automatically generated.



[jira] [Commented] (MAPREDUCE-2288) JT Availability

2013-09-19 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772586#comment-13772586
 ] 

Tsuyoshi OZAWA commented on MAPREDUCE-2288:
---

Does this feature mean MRAppMaster's HA?

> JT Availability
> ---
>
> Key: MAPREDUCE-2288
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2288
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: jobtracker
>Reporter: Eli Collins
>
> This is an umbrella jira, like HDFS-1064, for discussing and providing 
> references to jobtracker availability jiras (eg from JT restart on a host or 
> to cross host fail-over).



[jira] [Commented] (MAPREDUCE-5518) Fix typo "can't read paritions file"

2013-09-19 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772584#comment-13772584
 ] 

Tsuyoshi OZAWA commented on MAPREDUCE-5518:
---

LGTM, so I submitted your patch.



[jira] [Updated] (MAPREDUCE-5518) Fix typo "can't read paritions file"

2013-09-19 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated MAPREDUCE-5518:
--

Assignee: Albert Chu
  Status: Patch Available  (was: Open)



[jira] [Commented] (MAPREDUCE-5481) TestUberAM timeout

2013-09-19 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772559#comment-13772559
 ] 

Xuan Gong commented on MAPREDUCE-5481:
--

The issue appears to be in testSleepJob. With testSleepJob commented out, the 
timeout no longer occurs.

> TestUberAM timeout
> --
>
> Key: MAPREDUCE-5481
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5481
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, test
>Affects Versions: 3.0.0
>Reporter: Jason Lowe
>Assignee: Xuan Gong
>Priority: Blocker
>
> TestUberAM has been timing out on trunk for some time now and surefire then 
> fails the build.  I'm not able to reproduce it locally, but the Jenkins 
> builds have been seeing it fairly consistently.  See 
> https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1529/console



[jira] [Updated] (MAPREDUCE-5518) Fix typo "can't read paritions file"

2013-09-19 Thread Albert Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Albert Chu updated MAPREDUCE-5518:
--

Attachment: MAPREDUCE-5518.patch

No tests added; it's a trivial typo fix.

> Fix typo "can't read paritions file"
> 
>
> Key: MAPREDUCE-5518
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5518
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: examples
>Affects Versions: 3.0.0
>Reporter: Albert Chu
>Priority: Trivial
> Attachments: MAPREDUCE-5518.patch
>
>
> Noticed a spelling error when I saw this error message
> {noformat}
> 13/09/19 13:25:08 INFO mapreduce.Job: Task Id : 
> attempt_1379622083112_0002_m_000114_0, Status : FAILED
> Error: java.lang.IllegalArgumentException: can't read paritions file
> {noformat}
> "paritions" should be "partitions"



[jira] [Commented] (MAPREDUCE-5505) Clients should be notified job finished only after job successfully unregistered

2013-09-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772533#comment-13772533
 ] 

Hadoop QA commented on MAPREDUCE-5505:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12604165/MAPREDUCE-5505.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The following test timeouts occurred in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs:

org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4018//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4018//console

This message is automatically generated.

> Clients should be notified job finished only after job successfully 
> unregistered 
> -
>
> Key: MAPREDUCE-5505
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5505
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Zhijie Shen
> Attachments: MAPREDUCE-5505.1.patch, MAPREDUCE-5505.1.patch, 
> MAPREDUCE-5505.3.patch
>
>
> This is to make sure user is notified job finished after job is really done. 
> This does increase client latency but can reduce some races during unregister 
> like YARN-540



[jira] [Created] (MAPREDUCE-5518) Fix typo "can't read paritions file"

2013-09-19 Thread Albert Chu (JIRA)
Albert Chu created MAPREDUCE-5518:
-

 Summary: Fix typo "can't read paritions file"
 Key: MAPREDUCE-5518
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5518
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: examples
Affects Versions: 3.0.0
Reporter: Albert Chu
Priority: Trivial


Noticed a spelling error when I saw this error message

{noformat}
13/09/19 13:25:08 INFO mapreduce.Job: Task Id : 
attempt_1379622083112_0002_m_000114_0, Status : FAILED
Error: java.lang.IllegalArgumentException: can't read paritions file
{noformat}

"paritions" should be "partitions"



[jira] [Resolved] (MAPREDUCE-5515) Application Manager UI does not appear with Https enabled

2013-09-19 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved MAPREDUCE-5515.


  Resolution: Fixed
Hadoop Flags: Reviewed

Committed this together with YARN-1203. Thanks Omkar!

Will post the fix-version once the corresponding tags are available.

> Application Manager UI does not appear with Https enabled
> -
>
> Key: MAPREDUCE-5515
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5515
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Omkar Vinit Joshi
>Assignee: Omkar Vinit Joshi
> Attachments: MAPREDUCE-5515.txt
>
>
> related issue YARN-1203. We need to disable https for MR-AM by default as 
> they will need access to keystore which can not be granted in the cluster.



[jira] [Updated] (MAPREDUCE-5505) Clients should be notified job finished only after job successfully unregistered

2013-09-19 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated MAPREDUCE-5505:
---

Status: Patch Available  (was: Open)



[jira] [Updated] (MAPREDUCE-5505) Clients should be notified job finished only after job successfully unregistered

2013-09-19 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated MAPREDUCE-5505:
---

Attachment: MAPREDUCE-5505.3.patch

Thanks Jian and Bikas for the comments. I've uploaded a new patch, which uses 
an atomic boolean instead. The flag is placed in MRAppMaster and exposed through 
AppContext, as other variables are. Because the flag lives in MRAppMaster, it is 
set in MRAppMaster#shutDownJob() rather than in ClientService#serviceStop(); at 
that point the other services have already been stopped and ClientService is 
about to be stopped. It should therefore be equivalent to what Bikas proposed, 
while simplifying the code.
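
The arrangement described above can be pictured with a small sketch. This is not the patch itself: the interface and class below are hypothetical stand-ins for AppContext and MRAppMaster, and the method names are assumptions chosen for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class UnregisterFlagSketch {
    // Hypothetical slice of AppContext: a read-only view of the flag.
    interface AppContextView {
        boolean hasSuccessfullyUnregistered();
    }

    // Hypothetical stand-in for MRAppMaster, which owns the flag.
    static class AppMaster implements AppContextView {
        private final AtomicBoolean unregistered = new AtomicBoolean(false);

        @Override
        public boolean hasSuccessfullyUnregistered() {
            return unregistered.get();
        }

        // Stand-in for shutDownJob(): by this point the other services are
        // assumed to be stopped, so the flag can be flipped before the
        // client service itself is stopped.
        void shutDownJob() {
            unregistered.set(true);
        }
    }

    // What a client-facing service might report: the job is shown as
    // finished only once the flag says unregistration succeeded.
    static String reportedState(AppContextView ctx, boolean jobDone) {
        return (jobDone && ctx.hasSuccessfullyUnregistered()) ? "FINISHED" : "RUNNING";
    }

    public static void main(String[] args) {
        AppMaster am = new AppMaster();
        System.out.println(reportedState(am, true));
        am.shutDownJob();
        System.out.println(reportedState(am, true));
    }
}
```

An AtomicBoolean suits this because the flag is written by the shutdown path and read by client-request threads, and a single get/set needs no further locking.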



[jira] [Commented] (MAPREDUCE-5514) TestRMContainerAllocator fails on trunk

2013-09-19 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772497#comment-13772497
 ] 

Omkar Vinit Joshi commented on MAPREDUCE-5514:
--

Can you check this? Is it trying to resolve the IP?
Inside SecurityUtil.java:
{code}
boolean useIp = conf.getBoolean(
  CommonConfigurationKeys.HADOOP_SECURITY_TOKEN_SERVICE_USE_IP,
  CommonConfigurationKeys.HADOOP_SECURITY_TOKEN_SERVICE_USE_IP_DEFAULT);
{code}
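
For anyone checking this locally, the lookup above is a boolean read with a fallback default. The sketch below imitates that behaviour with a plain Map so it runs without Hadoop on the classpath. The key string and the default value are my assumptions about what the two constants resolve to, so please verify them against CommonConfigurationKeys before relying on them.

```java
import java.util.HashMap;
import java.util.Map;

public class UseIpLookupSketch {
    // Assumed value of HADOOP_SECURITY_TOKEN_SERVICE_USE_IP (not taken from this thread).
    static final String USE_IP_KEY = "hadoop.security.token.service.use_ip";
    // Assumed value of HADOOP_SECURITY_TOKEN_SERVICE_USE_IP_DEFAULT (not taken from this thread).
    static final boolean USE_IP_DEFAULT = true;

    // Simplified imitation of a boolean config lookup: an unset or
    // unparseable value falls back to the supplied default.
    static boolean getBoolean(Map<String, String> conf, String key, boolean defaultValue) {
        String raw = conf.get(key);
        if (raw == null) {
            return defaultValue;
        }
        String value = raw.trim();
        if ("true".equalsIgnoreCase(value)) {
            return true;
        }
        if ("false".equalsIgnoreCase(value)) {
            return false;
        }
        return defaultValue;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println("unset: " + getBoolean(conf, USE_IP_KEY, USE_IP_DEFAULT));
        conf.put(USE_IP_KEY, "false");
        System.out.println("set to false: " + getBoolean(conf, USE_IP_KEY, USE_IP_DEFAULT));
    }
}
```

If the flag is left unset in the test configuration, the default decides whether token services are keyed by IP or by host name, which is why it is worth checking here.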

> TestRMContainerAllocator fails on trunk
> ---
>
> Key: MAPREDUCE-5514
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5514
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
>Priority: Blocker
> Attachments: 
> org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator-output.txt
>
>




[jira] [Commented] (MAPREDUCE-5481) TestUberAM timeout

2013-09-19 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772495#comment-13772495
 ] 

Xuan Gong commented on MAPREDUCE-5481:
--

[~jlowe] [~masokan] I ran the test locally, too. The client can connect to the 
RM, the NM heartbeats normally, and the container status looks fine. But after 
the mapper tasks finish, the reducer task never starts. Stranger still, this 
happens when the number of reducers is set to 1, but if I increase the number 
of reducers to 2, the reducers start working.

I double-checked 2.1.beta: the test passes there, but it fails on trunk. I 
wonder whether some change made after 2.1.beta broke this test.

Any ideas?



[jira] [Commented] (MAPREDUCE-5505) Clients should be notified job finished only after job successfully unregistered

2013-09-19 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772484#comment-13772484
 ] 

Bikas Saha commented on MAPREDUCE-5505:
---

We could do that, but let's leave it out of the scope of the current jira. For 
this one, let's keep doing what we used to do.

> Clients should be notified job finished only after job successfully 
> unregistered 
> -
>
> Key: MAPREDUCE-5505
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5505
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Zhijie Shen
> Attachments: MAPREDUCE-5505.1.patch, MAPREDUCE-5505.1.patch
>
>
> This is to make sure user is notified job finished after job is really done. 
> This does increase client latency but can reduce some races during unregister 
> like YARN-540



[jira] [Commented] (MAPREDUCE-5515) Application Manager UI does not appear with Https enabled

2013-09-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772483#comment-13772483
 ] 

Hudson commented on MAPREDUCE-5515:
---

SUCCESS: Integrated in Hadoop-trunk-Commit #4446 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4446/])
YARN-1203. Changed YARN web-app proxy to handle http and https URLs from AM 
registration and finish correctly. Contributed by Omkar Vinit Joshi.
MAPREDUCE-5515. Fixed MR AM's webapp to depend on a new config 
mapreduce.ssl.enabled to enable https and disabling it by default as MR AM needs
to set up its own certificates etc and not depend on clusters'. Contributed by 
Omkar Vinit Joshi. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1524864)
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/http/HttpConfig.java
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/client/MRClientService.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMCommunicator.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/AppController.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/JobBlock.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/NavBlock.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/TaskPage.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/WebAppUtil.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/dao/AMAttemptInfo.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRConfig.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JobHistoryServer.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/webapp/HsJobBlock.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/webapp/HsTaskPage.java
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/FinishApplicationMasterRequest.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/RegisterApplicationMasterRequest.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/ProxyUriUtils.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/WebAppProxyServlet.java


> Application Manager UI does not appear with Https enabled
> -
>
> Key: MAPREDUCE-5515
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5515
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Omkar Vinit Joshi
>Assignee: Omkar Vinit Joshi
> Attachments: MAPREDUCE-5515.txt
>
>
> Related issue: YARN-1203. We need to disable HTTPS for MR AMs by default, as 
> they would need access to the keystore, which cannot be granted in the cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators

[jira] [Commented] (MAPREDUCE-5507) MapReduce reducer ramp down is suboptimal with potential job-hanging issues

2013-09-19 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772479#comment-13772479
 ] 

Omkar Vinit Joshi commented on MAPREDUCE-5507:
--

Also, there looks to be a problem with the code below. You can either preempt 
reducers or schedule new ones, but not both at the same time. Any thoughts? 
I am planning to fix this as part of this issue.

{code}
if (recalculateReduceSchedule) {
  preemptReducesIfNeeded();
  scheduleReduces(
  getJob().getTotalMaps(), completedMaps,
  scheduledRequests.maps.size(), scheduledRequests.reduces.size(), 
  assignedRequests.maps.size(), assignedRequests.reduces.size(),
  mapResourceReqt, reduceResourceReqt,
  pendingReduces.size(), 
  maxReduceRampupLimit, reduceSlowStart);
  recalculateReduceSchedule = false;
}
{code}
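
One way to read the concern above is that a single recalculation pass should take only one of the two actions. The following is a minimal, self-contained sketch of that idea and not the actual RMContainerAllocator code; the class name, the `Action` enum, and the `mapsStarved` flag are hypothetical names introduced only for illustration, and how starvation is detected is left open.

```java
// Hypothetical sketch: choose exactly one action per recalculation pass.
// None of these names come from the Hadoop code base.
final class ReduceScheduleSketch {
  enum Action { PREEMPT_REDUCES, SCHEDULE_REDUCES, NONE }

  /**
   * Decide what to do in one pass.
   * Assumption: the caller has already determined whether pending maps
   * are starved for containers; that test is outside this sketch.
   */
  static Action decide(boolean mapsStarved, int pendingReduces) {
    if (mapsStarved) {
      return Action.PREEMPT_REDUCES;   // free resources for maps, do not ramp up
    }
    if (pendingReduces > 0) {
      return Action.SCHEDULE_REDUCES;  // no starvation, so ramping up is allowed
    }
    return Action.NONE;
  }

  public static void main(String[] args) {
    System.out.println(decide(true, 10));
    System.out.println(decide(false, 10));
  }
}
```

The point of the sketch is only that preemption and scheduling sit in mutually exclusive branches, so one pass never requests new reducers while also killing running ones.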

> MapReduce reducer ramp down is suboptimal with potential job-hanging issues
> ---
>
> Key: MAPREDUCE-5507
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5507
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Omkar Vinit Joshi
>Assignee: Omkar Vinit Joshi
>
> Today if we are setting "yarn.app.mapreduce.am.job.reduce.rampup.limit" and 
> "mapreduce.job.reduce.slowstart.completedmaps" then reducers are launched 
> more aggressively. However the calculation to either Ramp up or Ramp down 
> reducer is not done in most optimal way. 
> * If MR AM at any point sees situation something like 
> ** scheduledMaps : 30
> ** scheduledReducers : 10
> ** assignedMaps : 0
> ** assignedReducers : 11
> ** finishedMaps : 120
> ** headroom : 756 ( when your map /reduce task needs only 512mb)
> * then today it simply hangs because it thinks that there is sufficient room 
> to launch one more mapper and therefore there is no need to ramp down. 
> However, if this continues forever then this is not the correct way / optimal 
> way.
> * Ideally, when the MR AM sees that assignedMaps has dropped to 0 
> and there are running reducers around then it should wait for certain time ( 
> upper limited by average map task completion time ... for heuristic 
> sake)..but after that if still it doesn't get new container for map task then 
> it should preempt the reducer one by one with some interval and should ramp 
> up slowly...
> ** Preemption of reducers can be done in little smarter way
> *** preempt reducer on a node manager for which there is any pending map 
> request.
> *** otherwise preempt any other reducer. MR AM will contribute to getting new 
> mapper by releasing such a reducer / container because it will reduce its 
> cluster consumption and thereby may become candidate for an allocation.
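
The preemption order suggested in the description above (prefer a reducer on a node that has a pending map request, otherwise take any reducer) could be sketched roughly as follows. This is only an illustration: `RunningReducer`, `pick`, and matching by host name are hypothetical and not taken from the MR AM code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the "smarter" reducer selection described above.
final class ReducerPickSketch {
  /** A running reducer and the node it runs on (hypothetical type). */
  static final class RunningReducer {
    final String id;
    final String host;
    RunningReducer(String id, String host) {
      this.id = id;
      this.host = host;
    }
  }

  /**
   * Returns the reducer to preempt next, or null if none are running.
   * First choice: a reducer on a host with a pending map request.
   * Fallback: the first running reducer in the list.
   */
  static RunningReducer pick(List<RunningReducer> running,
                             Set<String> hostsWithPendingMaps) {
    for (RunningReducer r : running) {
      if (hostsWithPendingMaps.contains(r.host)) {
        return r;
      }
    }
    return running.isEmpty() ? null : running.get(0);
  }

  public static void main(String[] args) {
    List<RunningReducer> running = Arrays.asList(
        new RunningReducer("r1", "nodeA"),
        new RunningReducer("r2", "nodeB"));
    Set<String> pending = new HashSet<>(Arrays.asList("nodeB"));
    System.out.println(pick(running, pending).id);
  }
}
```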



[jira] [Commented] (MAPREDUCE-5507) MapReduce reducer ramp down is suboptimal with potential job-hanging issues

2013-09-19 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772473#comment-13772473
 ] 

Omkar Vinit Joshi commented on MAPREDUCE-5507:
--

A potential problem I see here is that the reducer preemption logic depends 
mainly on the headroom (available resources) returned by the RM. After 
discussing offline with [~vinodkv] and [~sseth], there are certain important 
points we need to take care of:
* If we ever hit the situation where 
assignedMaps=0, assignedReducers>0, scheduledMaps>0, scheduledRed>=0, then 
** we should wait for some time.
*** We are proposing the time to be min[ (some percentage of average map reduce 
task completion time), (some configurable number * AM-RM heartbeat interval) ].
** If we don't get any new container for a map task during the above interval, 
then we will do the following:
*** First, remove all the scheduled reducer requests, as done today in 
RMContainerAllocator#preemptReducesIfNeeded().
*** Then remove as many reducers as required to allocate a single map task.
** We should keep repeating the above steps after each such interval if we 
still don't get any new map task. Also, we should avoid ramping up later and 
cap the reducer count at the number of currently running reducers, as there is 
no point in requesting reducers and later canceling those requests or killing 
running reducers (we are already using up the capacity of the running user).
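
To make the proposal above a bit more concrete, here is a rough sketch of the two calculations it mentions: the wait interval and the number of reducers to release for one map. All names and parameters are hypothetical. The second method deliberately does not consult the headroom, on the assumption (mine, not settled in this thread) that the reported headroom may not be usable in this situation.

```java
// Hypothetical sketch of the proposed ramp-down calculations.
// Names and parameters are illustrative only.
final class RampDownSketch {
  /**
   * Wait time before preempting:
   * min(fraction * average task completion time,
   *     heartbeatMultiplier * AM-RM heartbeat interval).
   */
  static long waitMillis(long avgTaskCompletionMs, double fraction,
                         int heartbeatMultiplier, long heartbeatIntervalMs) {
    long byTaskTime = (long) (avgTaskCompletionMs * fraction);
    long byHeartbeat = heartbeatMultiplier * heartbeatIntervalMs;
    return Math.min(byTaskTime, byHeartbeat);
  }

  /**
   * Number of reducers to release so that one map container fits,
   * i.e. ceil(mapMb / reduceMb). Assumes both sizes are positive.
   */
  static int reducersToPreempt(int mapMb, int reduceMb) {
    return (mapMb + reduceMb - 1) / reduceMb;
  }

  public static void main(String[] args) {
    // 50% of a 60 s average task time versus 10 heartbeats of 1 s each.
    System.out.println(waitMillis(60_000, 0.5, 10, 1_000));
    System.out.println(reducersToPreempt(1024, 512));
  }
}
```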



> MapReduce reducer ramp down is suboptimal with potential job-hanging issues
> ---
>
> Key: MAPREDUCE-5507
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5507
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Omkar Vinit Joshi
>Assignee: Omkar Vinit Joshi
>
> Today if we are setting "yarn.app.mapreduce.am.job.reduce.rampup.limit" and 
> "mapreduce.job.reduce.slowstart.completedmaps" then reducers are launched 
> more aggressively. However the calculation to either Ramp up or Ramp down 
> reducer is not done in most optimal way. 
> * If MR AM at any point sees situation something like 
> ** scheduledMaps : 30
> ** scheduledReducers : 10
> ** assignedMaps : 0
> ** assignedReducers : 11
> ** finishedMaps : 120
> ** headroom : 756 ( when your map /reduce task needs only 512mb)
> * then today it simply hangs because it thinks that there is sufficient room 
> to launch one more mapper and therefore there is no need to ramp down. 
> However, if this continues forever then this is not the correct way / optimal 
> way.
> * Ideally, when the MR AM sees that assignedMaps has dropped to 0 
> and there are running reducers around then it should wait for certain time ( 
> upper limited by average map task completion time ... for heuristic 
> sake)..but after that if still it doesn't get new container for map task then 
> it should preempt the reducer one by one with some interval and should ramp 
> up slowly...
> ** Preemption of reducers can be done in little smarter way
> *** preempt reducer on a node manager for which there is any pending map 
> request.
> *** otherwise preempt any other reducer. MR AM will contribute to getting new 
> mapper by releasing such a reducer / container because it will reduce its 
> cluster consumption and thereby may become candidate for an allocation.



[jira] [Updated] (MAPREDUCE-5515) Application Manager UI does not appear with Https enabled

2013-09-19 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-5515:
---

Attachment: MAPREDUCE-5515.txt

Here's the MR patch that Omkar worked on via YARN-1203.
 - It adds a new config, mapreduce.ssl.enabled, that can be set explicitly for 
MR AMs if MR users want to enable HTTPS on the AM webapp.

{code}
+<property>
+  <name>mapreduce.ssl.enabled</name>
+  <value>false</value>
+  <description>
+   If enabled, MapReduce application master's http server will be
+   started with SSL enabled. Map reduce AM by default doesn't support SSL.
+   If MapReduce jobs want SSL support, it is the user's responsibility to
+   create and manage certificates, keystores and trust-stores with appropriate
+   permissions. This is only for MapReduce application master and is not used
+   by job history server. To enable encrypted shuffle this property is not
+   required, instead refer to (mapreduce.shuffle.ssl.enabled) property.
+  </description>
+</property>
{code}

Already reviewed and committing this together with YARN-1203.
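
For illustration only, this is roughly how a boolean switch like this could select the web scheme. The sketch uses `java.util.Properties` as a stand-in for Hadoop's configuration object, and the class and method names are hypothetical; it is not the code in the patch.

```java
import java.util.Properties;

// Hypothetical sketch: Properties stands in for the real configuration
// object, and the default of "false" mirrors the value in the patch.
final class AmWebSchemeSketch {
  static final String MR_SSL_ENABLED = "mapreduce.ssl.enabled";

  /** Returns the URL scheme prefix the AM webapp would use. */
  static String scheme(Properties conf) {
    boolean ssl =
        Boolean.parseBoolean(conf.getProperty(MR_SSL_ENABLED, "false"));
    return ssl ? "https://" : "http://";
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    System.out.println(scheme(conf));           // property unset: default applies
    conf.setProperty(MR_SSL_ENABLED, "true");
    System.out.println(scheme(conf));
  }
}
```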

> Application Manager UI does not appear with Https enabled
> -
>
> Key: MAPREDUCE-5515
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5515
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Omkar Vinit Joshi
>Assignee: Omkar Vinit Joshi
> Attachments: MAPREDUCE-5515.txt
>
>
> Related issue: YARN-1203. We need to disable HTTPS for MR AMs by default, as 
> they would need access to the keystore, which cannot be granted in the cluster.



[jira] [Commented] (MAPREDUCE-5505) Clients should be notified job finished only after job successfully unregistered

2013-09-19 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772437#comment-13772437
 ] 

Jian He commented on MAPREDUCE-5505:


bq. We probably don't need "appContext.isLastAMRetry()" this check here,
In fact, in the case of REBOOT, we can always return RUNNING. Either the AM is 
restarted and the JobClient continues to run, or the AM has failed because 
unregistration failed or this was the last retry, in which case the RM can 
tell the JobClient that the app FAILED.
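
A minimal sketch of that mapping, assuming hypothetical `InternalState` and `ClientState` enums (these are not Hadoop's actual types): REBOOT is reported to the client as RUNNING without any last-retry check.

```java
// Hypothetical sketch: map the AM's internal job state to what the client
// sees. The enums below are illustrative, not Hadoop's own state types.
final class ClientStateSketch {
  enum InternalState { RUNNING, SUCCEEDED, FAILED, KILLED, REBOOT }
  enum ClientState { RUNNING, SUCCEEDED, FAILED, KILLED }

  static ClientState toClientState(InternalState state) {
    switch (state) {
      case REBOOT:
        // Always RUNNING: either a new AM attempt takes over, or the RM
        // later reports the application as failed.
        return ClientState.RUNNING;
      case SUCCEEDED:
        return ClientState.SUCCEEDED;
      case FAILED:
        return ClientState.FAILED;
      case KILLED:
        return ClientState.KILLED;
      default:
        return ClientState.RUNNING;
    }
  }

  public static void main(String[] args) {
    System.out.println(toClientState(InternalState.REBOOT));
  }
}
```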

> Clients should be notified job finished only after job successfully 
> unregistered 
> -
>
> Key: MAPREDUCE-5505
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5505
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Zhijie Shen
> Attachments: MAPREDUCE-5505.1.patch, MAPREDUCE-5505.1.patch
>
>
> This is to make sure user is notified job finished after job is really done. 
> This does increase client latency but can reduce some races during unregister 
> like YARN-540



[jira] [Commented] (MAPREDUCE-5488) Job recovery fails after killing all the running containers for the app

2013-09-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772419#comment-13772419
 ] 

Hudson commented on MAPREDUCE-5488:
---

SUCCESS: Integrated in Hadoop-trunk-Commit #4445 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4445/])
MAPREDUCE-5488. Changed MR client to keep trying to reach the application when 
it sees that one attempt's AM is down. Contributed by Jian He. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1524856)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/dev-support/findbugs-exclude.xml
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/ClientServiceDelegate.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestClientServiceDelegate.java


> Job recovery fails after killing all the running containers for the app
> ---
>
> Key: MAPREDUCE-5488
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5488
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.1.0-beta
>Reporter: Arpit Gupta
>Assignee: Jian He
> Attachments: MAPREDUCE-5488.1.patch, MAPREDUCE-5488.2.patch, 
> MAPREDUCE-5488.3.patch, MAPREDUCE-5488.patch, MAPREDUCE-5488.patch, 
> MAPREDUCE-5488.patch
>
>
> Here is the client stack trace
> {code}
> RUNNING: /usr/lib/hadoop/bin/hadoop jar 
> /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.1.0.2.0.5.0-66.jar 
> wordcount "-Dmapreduce.reduce.input.limit=-1" 
> /user/user/test_yarn_ha/medium_wordcount_input 
> /user/hrt_qa/test_yarn_ha/test_mapred_ha_single_job_applicationmaster-1-time
> 13/08/30 08:45:39 INFO client.RMProxy: Connecting to ResourceManager at 
> hostname/68.142.247.148:8032
> 13/08/30 08:45:40 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 19 
> for user on ha-hdfs:ha-2-secure
> 13/08/30 08:45:40 INFO security.TokenCache: Got dt for hdfs://ha-2-secure; 
> Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:ha-2-secure, Ident: 
> (HDFS_DELEGATION_TOKEN token 19 for user)
> 13/08/30 08:45:40 INFO input.FileInputFormat: Total input paths to process : 
> 20
> 13/08/30 08:45:40 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
> 13/08/30 08:45:40 INFO lzo.LzoCodec: Successfully loaded & initialized 
> native-lzo library [hadoop-lzo rev cf4e7cbf8ed0f0622504d008101c2729dc0c9ff3]
> 13/08/30 08:45:40 INFO mapreduce.JobSubmitter: number of splits:180
> 13/08/30 08:45:40 WARN conf.Configuration: user.name is deprecated. Instead, 
> use mapreduce.job.user.name
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.jar is deprecated. Instead, 
> use mapreduce.job.jar
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.output.value.class is 
> deprecated. Instead, use mapreduce.job.output.value.class
> 13/08/30 08:45:40 WARN conf.Configuration: mapreduce.combine.class is 
> deprecated. Instead, use mapreduce.job.combine.class
> 13/08/30 08:45:40 WARN conf.Configuration: mapreduce.map.class is deprecated. 
> Instead, use mapreduce.job.map.class
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.job.name is deprecated. 
> Instead, use mapreduce.job.name
> 13/08/30 08:45:40 WARN conf.Configuration: mapreduce.reduce.class is 
> deprecated. Instead, use mapreduce.job.reduce.class
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.input.dir is deprecated. 
> Instead, use mapreduce.input.fileinputformat.inputdir
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.output.dir is deprecated. 
> Instead, use mapreduce.output.fileoutputformat.outputdir
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.map.tasks is deprecated. 
> Instead, use mapreduce.job.maps
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.output.key.class is 
> deprecated. Instead, use mapreduce.job.output.key.class
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.working.dir is deprecated. 
> Instead, use mapreduce.job.working.dir
> 13/08/30 08:45:41 INFO mapreduce.JobSubmitter: Submitting tokens for job: 
> job_1377851032086_0003
> 13/08/30 08:45:41 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, 
> Service: ha-hdfs:ha-2-secure, Ident: (HDFS_DELEGATION_TOKEN token 19 for user)
> 13/08/30 08:45:42 INFO impl.YarnClientImpl: Submitted application 
> application_1377851032086_0003 to ResourceManager at 
> 

[jira] [Updated] (MAPREDUCE-5488) Job recovery fails after killing all the running containers for the app

2013-09-19 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-5488:
---

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Committed this to trunk, branch-2 and branch-2.1. Thanks Jian!

Will set the fix-versions once 2.1.2/2.2 are created in JIRA.

> Job recovery fails after killing all the running containers for the app
> ---
>
> Key: MAPREDUCE-5488
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5488
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.1.0-beta
>Reporter: Arpit Gupta
>Assignee: Jian He
> Attachments: MAPREDUCE-5488.1.patch, MAPREDUCE-5488.2.patch, 
> MAPREDUCE-5488.3.patch, MAPREDUCE-5488.patch, MAPREDUCE-5488.patch, 
> MAPREDUCE-5488.patch
>
>
> Here is the client stack trace
> {code}
> RUNNING: /usr/lib/hadoop/bin/hadoop jar 
> /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.1.0.2.0.5.0-66.jar 
> wordcount "-Dmapreduce.reduce.input.limit=-1" 
> /user/user/test_yarn_ha/medium_wordcount_input 
> /user/hrt_qa/test_yarn_ha/test_mapred_ha_single_job_applicationmaster-1-time
> 13/08/30 08:45:39 INFO client.RMProxy: Connecting to ResourceManager at 
> hostname/68.142.247.148:8032
> 13/08/30 08:45:40 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 19 
> for user on ha-hdfs:ha-2-secure
> 13/08/30 08:45:40 INFO security.TokenCache: Got dt for hdfs://ha-2-secure; 
> Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:ha-2-secure, Ident: 
> (HDFS_DELEGATION_TOKEN token 19 for user)
> 13/08/30 08:45:40 INFO input.FileInputFormat: Total input paths to process : 
> 20
> 13/08/30 08:45:40 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
> 13/08/30 08:45:40 INFO lzo.LzoCodec: Successfully loaded & initialized 
> native-lzo library [hadoop-lzo rev cf4e7cbf8ed0f0622504d008101c2729dc0c9ff3]
> 13/08/30 08:45:40 INFO mapreduce.JobSubmitter: number of splits:180
> 13/08/30 08:45:40 WARN conf.Configuration: user.name is deprecated. Instead, 
> use mapreduce.job.user.name
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.jar is deprecated. Instead, 
> use mapreduce.job.jar
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.output.value.class is 
> deprecated. Instead, use mapreduce.job.output.value.class
> 13/08/30 08:45:40 WARN conf.Configuration: mapreduce.combine.class is 
> deprecated. Instead, use mapreduce.job.combine.class
> 13/08/30 08:45:40 WARN conf.Configuration: mapreduce.map.class is deprecated. 
> Instead, use mapreduce.job.map.class
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.job.name is deprecated. 
> Instead, use mapreduce.job.name
> 13/08/30 08:45:40 WARN conf.Configuration: mapreduce.reduce.class is 
> deprecated. Instead, use mapreduce.job.reduce.class
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.input.dir is deprecated. 
> Instead, use mapreduce.input.fileinputformat.inputdir
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.output.dir is deprecated. 
> Instead, use mapreduce.output.fileoutputformat.outputdir
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.map.tasks is deprecated. 
> Instead, use mapreduce.job.maps
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.output.key.class is 
> deprecated. Instead, use mapreduce.job.output.key.class
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.working.dir is deprecated. 
> Instead, use mapreduce.job.working.dir
> 13/08/30 08:45:41 INFO mapreduce.JobSubmitter: Submitting tokens for job: 
> job_1377851032086_0003
> 13/08/30 08:45:41 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, 
> Service: ha-hdfs:ha-2-secure, Ident: (HDFS_DELEGATION_TOKEN token 19 for user)
> 13/08/30 08:45:42 INFO impl.YarnClientImpl: Submitted application 
> application_1377851032086_0003 to ResourceManager at 
> hostname/68.142.247.148:8032
> 13/08/30 08:45:42 INFO mapreduce.Job: The url to track the job: 
> http://hostname:8088/proxy/application_1377851032086_0003/
> 13/08/30 08:45:42 INFO mapreduce.Job: Running job: job_1377851032086_0003
> 13/08/30 08:45:48 INFO mapreduce.Job: Job job_1377851032086_0003 running in 
> uber mode : false
> 13/08/30 08:45:48 INFO mapreduce.Job:  map 0% reduce 0%
> stop applicationmaster
> beaver.component.hadoop|INFO|Kill container 
> container_1377851032086_0003_01_01 on host hostname
> RUNNING: ssh -o StrictHostKeyChecking=no hostname "sudo su - -c \"ps aux | 
> grep container_1377851032086_0003_01_01 | awk '{print \\\$2}' | xargs 
> kill -9\" root"
> Warning: Permanently added 'hostname,68.142.247.155' (RSA) to the list of 
> known hosts.
> kill 8978: No such process
> waiting for down time 10 seconds for service applicationmaster
> 13/08/30 08:45:55 INFO ipc.Client: Retrying co

[jira] [Commented] (MAPREDUCE-5488) Job recovery fails after killing all the running containers for the app

2013-09-19 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772367#comment-13772367
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-5488:


+1, this looks good. Checking this in.

> Job recovery fails after killing all the running containers for the app
> ---
>
> Key: MAPREDUCE-5488
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5488
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.1.0-beta
>Reporter: Arpit Gupta
>Assignee: Jian He
> Attachments: MAPREDUCE-5488.1.patch, MAPREDUCE-5488.2.patch, 
> MAPREDUCE-5488.3.patch, MAPREDUCE-5488.patch, MAPREDUCE-5488.patch, 
> MAPREDUCE-5488.patch
>
>
> Here is the client stack trace
> {code}
> RUNNING: /usr/lib/hadoop/bin/hadoop jar 
> /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.1.0.2.0.5.0-66.jar 
> wordcount "-Dmapreduce.reduce.input.limit=-1" 
> /user/user/test_yarn_ha/medium_wordcount_input 
> /user/hrt_qa/test_yarn_ha/test_mapred_ha_single_job_applicationmaster-1-time
> 13/08/30 08:45:39 INFO client.RMProxy: Connecting to ResourceManager at 
> hostname/68.142.247.148:8032
> 13/08/30 08:45:40 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 19 
> for user on ha-hdfs:ha-2-secure
> 13/08/30 08:45:40 INFO security.TokenCache: Got dt for hdfs://ha-2-secure; 
> Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:ha-2-secure, Ident: 
> (HDFS_DELEGATION_TOKEN token 19 for user)
> 13/08/30 08:45:40 INFO input.FileInputFormat: Total input paths to process : 
> 20
> 13/08/30 08:45:40 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
> 13/08/30 08:45:40 INFO lzo.LzoCodec: Successfully loaded & initialized 
> native-lzo library [hadoop-lzo rev cf4e7cbf8ed0f0622504d008101c2729dc0c9ff3]
> 13/08/30 08:45:40 INFO mapreduce.JobSubmitter: number of splits:180
> 13/08/30 08:45:40 WARN conf.Configuration: user.name is deprecated. Instead, 
> use mapreduce.job.user.name
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.jar is deprecated. Instead, 
> use mapreduce.job.jar
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.output.value.class is 
> deprecated. Instead, use mapreduce.job.output.value.class
> 13/08/30 08:45:40 WARN conf.Configuration: mapreduce.combine.class is 
> deprecated. Instead, use mapreduce.job.combine.class
> 13/08/30 08:45:40 WARN conf.Configuration: mapreduce.map.class is deprecated. 
> Instead, use mapreduce.job.map.class
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.job.name is deprecated. 
> Instead, use mapreduce.job.name
> 13/08/30 08:45:40 WARN conf.Configuration: mapreduce.reduce.class is 
> deprecated. Instead, use mapreduce.job.reduce.class
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.input.dir is deprecated. 
> Instead, use mapreduce.input.fileinputformat.inputdir
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.output.dir is deprecated. 
> Instead, use mapreduce.output.fileoutputformat.outputdir
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.map.tasks is deprecated. 
> Instead, use mapreduce.job.maps
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.output.key.class is 
> deprecated. Instead, use mapreduce.job.output.key.class
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.working.dir is deprecated. 
> Instead, use mapreduce.job.working.dir
> 13/08/30 08:45:41 INFO mapreduce.JobSubmitter: Submitting tokens for job: 
> job_1377851032086_0003
> 13/08/30 08:45:41 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, 
> Service: ha-hdfs:ha-2-secure, Ident: (HDFS_DELEGATION_TOKEN token 19 for user)
> 13/08/30 08:45:42 INFO impl.YarnClientImpl: Submitted application 
> application_1377851032086_0003 to ResourceManager at 
> hostname/68.142.247.148:8032
> 13/08/30 08:45:42 INFO mapreduce.Job: The url to track the job: 
> http://hostname:8088/proxy/application_1377851032086_0003/
> 13/08/30 08:45:42 INFO mapreduce.Job: Running job: job_1377851032086_0003
> 13/08/30 08:45:48 INFO mapreduce.Job: Job job_1377851032086_0003 running in 
> uber mode : false
> 13/08/30 08:45:48 INFO mapreduce.Job:  map 0% reduce 0%
> stop applicationmaster
> beaver.component.hadoop|INFO|Kill container 
> container_1377851032086_0003_01_01 on host hostname
> RUNNING: ssh -o StrictHostKeyChecking=no hostname "sudo su - -c \"ps aux | 
> grep container_1377851032086_0003_01_01 | awk '{print \\\$2}' | xargs 
> kill -9\" root"
> Warning: Permanently added 'hostname,68.142.247.155' (RSA) to the list of 
> known hosts.
> kill 8978: No such process
> waiting for down time 10 seconds for service applicationmaster
> 13/08/30 08:45:55 INFO ipc.Client: Retrying connect to server: 
> hostname/68.142.247.155:52713. Already tried 0 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(ma

[jira] [Updated] (MAPREDUCE-5507) MapReduce reducer ramp down is suboptimal with potential job-hanging issues

2013-09-19 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-5507:
---

Summary: MapReduce reducer ramp down is suboptimal with potential 
job-hanging issues  (was: MapReduce reducer preemption gets hanged)

> MapReduce reducer ramp down is suboptimal with potential job-hanging issues
> ---
>
> Key: MAPREDUCE-5507
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5507
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Omkar Vinit Joshi
>Assignee: Omkar Vinit Joshi
>
> Today if we are setting "yarn.app.mapreduce.am.job.reduce.rampup.limit" and 
> "mapreduce.job.reduce.slowstart.completedmaps" then reducers are launched 
> more aggressively. However the calculation to either Ramp up or Ramp down 
> reducer is not done in most optimal way. 
> * If MR AM at any point sees situation something like 
> ** scheduledMaps : 30
> ** scheduledReducers : 10
> ** assignedMaps : 0
> ** assignedReducers : 11
> ** finishedMaps : 120
> ** headroom : 756 ( when your map /reduce task needs only 512mb)
> * then today it simply hangs because it thinks that there is sufficient room 
> to launch one more mapper and therefore there is no need to ramp down. 
> However, if this continues forever then this is not the correct way / optimal 
> way.
> * Ideally, when the MR AM sees that assignedMaps has dropped to 0 
> and there are running reducers around then it should wait for certain time ( 
> upper limited by average map task completion time ... for heuristic 
> sake)..but after that if still it doesn't get new container for map task then 
> it should preempt the reducer one by one with some interval and should ramp 
> up slowly...
> ** Preemption of reducers can be done in little smarter way
> *** preempt reducer on a node manager for which there is any pending map 
> request.
> *** otherwise preempt any other reducer. MR AM will contribute to getting new 
> mapper by releasing such a reducer / container because it will reduce its 
> cluster consumption and thereby may become candidate for an allocation.



[jira] [Commented] (MAPREDUCE-5504) mapred queue -info inconsistent with types

2013-09-19 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772294#comment-13772294
 ] 

Thomas Graves commented on MAPREDUCE-5504:
--

+1, looks good. Thanks Kousuke!

> mapred queue -info inconsistent with types
> --
>
> Key: MAPREDUCE-5504
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5504
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.23.9
>Reporter: Thomas Graves
>Assignee: Kousuke Saruta
> Attachments: MAPREDUCE-5504.patch
>
>
> $ mapred queue -info default
> ==
> Queue Name : default
> Queue State : running
> Scheduling Info : Capacity: 4.0, MaximumCapacity: 0.67, CurrentCapacity: 
> 0.9309831
> The capacity is displayed in % as 4, however maximum capacity is displayed as 
> an absolute number 0.67 instead of 67%.
> We should make these consistent with the type we are displaying
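
As a rough illustration of what "consistent" could look like, the sketch below converts both values from fractions to percentages before printing. It assumes both inputs arrive as fractions in the range 0 to 1; the class and method names are hypothetical and this is not the TypeConverter code.

```java
import java.util.Locale;

// Hypothetical sketch: report both capacities in the same unit (percent).
// Assumption: both inputs are fractions between 0 and 1.
final class QueueCapacitySketch {
  static double toPercent(double fraction) {
    return fraction * 100.0;
  }

  static String schedulingInfo(double capacityFraction, double maxFraction) {
    return String.format(Locale.ROOT,
        "Capacity: %.1f, MaximumCapacity: %.1f",
        toPercent(capacityFraction), toPercent(maxFraction));
  }

  public static void main(String[] args) {
    System.out.println(schedulingInfo(0.04, 0.67));
  }
}
```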



[jira] [Commented] (MAPREDUCE-5504) mapred queue -info inconsistent with types

2013-09-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772315#comment-13772315
 ] 

Hudson commented on MAPREDUCE-5504:
---

SUCCESS: Integrated in Hadoop-trunk-Commit # (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit//])
MAPREDUCE-5504. mapred queue -info inconsistent with types (Kousuke Saruta via 
tgraves) (tgraves: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1524841)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/TypeConverter.java


> mapred queue -info inconsistent with types
> --
>
> Key: MAPREDUCE-5504
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5504
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.23.9
>Reporter: Thomas Graves
>Assignee: Kousuke Saruta
> Fix For: 3.0.0, 2.3.0, 0.23.10
>
> Attachments: MAPREDUCE-5504.patch
>
>
> $ mapred queue -info default
> ==
> Queue Name : default
> Queue State : running
> Scheduling Info : Capacity: 4.0, MaximumCapacity: 0.67, CurrentCapacity: 
> 0.9309831
> The capacity is displayed in % as 4, however maximum capacity is displayed as 
> an absolute number 0.67 instead of 67%.
> We should make these consistent with the type we are displaying



[jira] [Updated] (MAPREDUCE-5517) enabling uber mode with 0 reducer still requires mapreduce.reduce.memory.mb to be less than yarn.app.mapreduce.am.resource.mb

2013-09-19 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated MAPREDUCE-5517:
---

Status: Open  (was: Patch Available)

> enabling uber mode with 0 reducer still requires mapreduce.reduce.memory.mb 
> to be less than yarn.app.mapreduce.am.resource.mb
> -
>
> Key: MAPREDUCE-5517
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5517
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.5-alpha
>Reporter: Siqi Li
>Priority: Minor
> Attachments: MAPREDUCE_5517_v1.patch.txt, MAPREDUCE_5517_v2.patch.txt
>
>
> Since there is no reducer, the memory allocated to the reducer is irrelevant 
> to enabling uber mode for a job
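
A minimal sketch of the check the description argues for, assuming the uber decision compares the larger of the map and reduce memory settings against the AM memory. `UberMemoryCheck`, `fitsInAm`, and the `<=` comparison are assumptions for illustration; this is not the code in the attached patches.

```java
/** Hypothetical stand-in for the memory part of the uber-mode decision. */
final class UberMemoryCheck {

    /**
     * Returns true when the task memory fits in the AM container.
     * Reduce memory is only considered when the job has at least one reducer.
     */
    static boolean fitsInAm(long mapMemoryMb, long reduceMemoryMb,
                            int numReduceTasks, long amMemoryMb) {
        long requiredMb = mapMemoryMb;
        if (numReduceTasks > 0) {
            requiredMb = Math.max(requiredMb, reduceMemoryMb);
        }
        return requiredMb <= amMemoryMb;
    }

    public static void main(String[] args) {
        // Map-only job: the large reduce setting no longer blocks uber mode.
        System.out.println(fitsInAm(1024, 4096, 0, 1536)); // true
        // With a reducer, the reduce setting still has to fit.
        System.out.println(fitsInAm(1024, 4096, 1, 1536)); // false
    }
}
```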



[jira] [Updated] (MAPREDUCE-5517) enabling uber mode with 0 reducer still requires mapreduce.reduce.memory.mb to be less than yarn.app.mapreduce.am.resource.mb

2013-09-19 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated MAPREDUCE-5517:
---

Status: Patch Available  (was: Open)

> enabling uber mode with 0 reducer still requires mapreduce.reduce.memory.mb 
> to be less than yarn.app.mapreduce.am.resource.mb
> -
>
> Key: MAPREDUCE-5517
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5517
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.5-alpha
>Reporter: Siqi Li
>Priority: Minor
> Attachments: MAPREDUCE_5517_v1.patch.txt, MAPREDUCE_5517_v2.patch.txt
>
>
> Since there is no reducer, the memory allocated to the reducer is irrelevant 
> to enabling uber mode for a job



[jira] [Updated] (MAPREDUCE-5517) enabling uber mode with 0 reducer still requires mapreduce.reduce.memory.mb to be less than yarn.app.mapreduce.am.resource.mb

2013-09-19 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated MAPREDUCE-5517:
---

Attachment: MAPREDUCE_5517_v2.patch.txt

> enabling uber mode with 0 reducer still requires mapreduce.reduce.memory.mb 
> to be less than yarn.app.mapreduce.am.resource.mb
> -
>
> Key: MAPREDUCE-5517
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5517
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.5-alpha
>Reporter: Siqi Li
>Priority: Minor
> Attachments: MAPREDUCE_5517_v1.patch.txt, MAPREDUCE_5517_v2.patch.txt
>
>
> Since there is no reducer, the memory allocated to the reducer is irrelevant 
> to enabling uber mode for a job



[jira] [Commented] (MAPREDUCE-5514) TestRMContainerAllocator fails on trunk

2013-09-19 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772214#comment-13772214
 ] 

Xuan Gong commented on MAPREDUCE-5514:
--

I got the following error message:
{code}
2013-09-19 12:51:42,379 FATAL [AsyncDispatcher event handler] event.AsyncDispatcher (AsyncDispatcher.java:dispatch(141)) - Error in dispatcher thread
java.lang.IllegalArgumentException: java.net.UnknownHostException: amNM
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)
    at org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerToken(BuilderUtils.java:247)
    at org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager.createContainerToken(RMContainerTokenSecretManager.java:195)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainer(FifoScheduler.java:590)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignOffSwitchContainers(FifoScheduler.java:554)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainersOnNode(FifoScheduler.java:482)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainers(FifoScheduler.java:411)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.nodeUpdate(FifoScheduler.java:650)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:679)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:95)
    at org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator$MyResourceManager$1.handle(TestRMContainerAllocator.java:451)
    at org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator$MyResourceManager$1.handle(TestRMContainerAllocator.java:1)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134)
    at org.apache.hadoop.yarn.event.DrainDispatcher$1.run(DrainDispatcher.java:65)
    at java.lang.Thread.run(Thread.java:680)
Caused by: java.net.UnknownHostException: amNM
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:419)
    ... 14 more
{code}
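
For readers unfamiliar with this failure shape, the sketch below reproduces it with the JDK alone. It assumes the token-service builder rejects a socket address whose host name has no resolved IP; `TokenServiceSketch` is a simplified, hypothetical stand-in and not the actual SecurityUtil code.

```java
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

/** Simplified stand-in; the real SecurityUtil logic may differ. */
final class TokenServiceSketch {

    /** Builds an "ip:port" service string, failing if the host has no IP address. */
    static String buildTokenService(InetSocketAddress addr) {
        if (addr.isUnresolved()) {
            // Same wrapping as in the trace: IllegalArgumentException caused by UnknownHostException.
            throw new IllegalArgumentException(new UnknownHostException(addr.getHostName()));
        }
        return addr.getAddress().getHostAddress() + ":" + addr.getPort();
    }

    public static void main(String[] args) {
        // createUnresolved skips DNS, so this behaves the same on every machine.
        InetSocketAddress amNM = InetSocketAddress.createUnresolved("amNM", 1234);
        try {
            buildTokenService(amNM);
        } catch (IllegalArgumentException e) {
            System.out.println(e); // java.lang.IllegalArgumentException: java.net.UnknownHostException: amNM
        }
    }
}
```

If this is indeed the mechanism, a test node name such as "amNM" fails wherever it cannot be resolved to an address.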

> TestRMContainerAllocator fails on trunk
> ---
>
> Key: MAPREDUCE-5514
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5514
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
>Priority: Blocker
> Attachments: 
> org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator-output.txt
>
>




[jira] [Updated] (MAPREDUCE-5505) Clients should be notified job finished only after job successfully unregistered

2013-09-19 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated MAPREDUCE-5505:
--

Status: Open  (was: Patch Available)

> Clients should be notified job finished only after job successfully 
> unregistered 
> -
>
> Key: MAPREDUCE-5505
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5505
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Zhijie Shen
> Attachments: MAPREDUCE-5505.1.patch, MAPREDUCE-5505.1.patch
>
>
> This is to make sure the user is notified that the job has finished only 
> after the job is really done. This does increase client latency, but it can 
> reduce some races during unregister, such as YARN-540



[jira] [Commented] (MAPREDUCE-5505) Clients should be notified job finished only after job successfully unregistered

2013-09-19 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772301#comment-13772301
 ] 

Bikas Saha commented on MAPREDUCE-5505:
---

Are we sure that the previous state is always RUNNING before FAILED?
{code}
+  case FAILED:
+if (isUnregistered) {
+  return JobState.FAILED;
+} else {
+  return JobState.RUNNING;
{code}

Instead of isUnregistered, let us create an AtomicBoolean called 
safeToReportTerminationToUser. Rather than living in JobImpl, this boolean can 
be made visible via the AppContext object so that everyone has access to it.

When should the boolean be set to true? We could do it in RMCommunicator after 
unregister succeeds (as in this patch), or in MRClientService.serviceStop(). 
Since MRClientService is the last service to stop(), we can be sure by then 
that everything has finished cleanly. We can then move the sleep(5sec) from 
MRAppMaster to MRClientService.serviceStop(), after the boolean is set.
We should leave a comment explaining this in MRAppMaster.shutdown(), before the 
call to clientService.stop(), so that it's easy for someone else to track this 
logic.

Please do run single-node tests to verify the behavior for real, including 
with RM restart.
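
The flag proposed above could be sketched as follows. All types here are hypothetical stand-ins written for illustration (the nested `AppContext` and `ClientService` are not the real Hadoop classes), and the linger sleep is left out.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Illustrative stand-ins only; none of these types are the real Hadoop classes. */
final class TerminationFlagSketch {

    /** Hypothetical read-only view that an AppContext could expose. */
    interface AppContext {
        boolean safeToReportTerminationToUser();
    }

    /** Owns the flag; AtomicBoolean makes the write visible across threads. */
    static final class RunningAppContext implements AppContext {
        private final AtomicBoolean safeToReport = new AtomicBoolean(false);

        @Override
        public boolean safeToReportTerminationToUser() {
            return safeToReport.get();
        }

        void markSafeToReportTermination() {
            safeToReport.set(true);
        }
    }

    /** Stand-in for the last service to stop. */
    static final class ClientService {
        private final RunningAppContext context;

        ClientService(RunningAppContext context) {
            this.context = context;
        }

        void serviceStop() {
            // Set the flag first; the linger sleep from the comment would follow here.
            context.markSafeToReportTermination();
        }
    }

    public static void main(String[] args) {
        RunningAppContext context = new RunningAppContext();
        ClientService clientService = new ClientService(context);
        System.out.println(context.safeToReportTerminationToUser()); // false
        clientService.serviceStop();
        System.out.println(context.safeToReportTerminationToUser()); // true
    }
}
```

Keeping the setter off the read-only interface means that only the component that owns shutdown can flip the flag, while any reader of the context can consult it.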

> Clients should be notified job finished only after job successfully 
> unregistered 
> -
>
> Key: MAPREDUCE-5505
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5505
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Zhijie Shen
> Attachments: MAPREDUCE-5505.1.patch, MAPREDUCE-5505.1.patch
>
>
> This is to make sure the user is notified that the job has finished only 
> after the job is really done. This does increase client latency, but it can 
> reduce some races during unregister, such as YARN-540



[jira] [Updated] (MAPREDUCE-5504) mapred queue -info inconsistent with types

2013-09-19 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated MAPREDUCE-5504:
-

   Resolution: Fixed
Fix Version/s: 0.23.10
   2.3.0
   3.0.0
   Status: Resolved  (was: Patch Available)

> mapred queue -info inconsistent with types
> --
>
> Key: MAPREDUCE-5504
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5504
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.23.9
>Reporter: Thomas Graves
>Assignee: Kousuke Saruta
> Fix For: 3.0.0, 2.3.0, 0.23.10
>
> Attachments: MAPREDUCE-5504.patch
>
>
> $ mapred queue -info default
> ==
> Queue Name : default
> Queue State : running
> Scheduling Info : Capacity: 4.0, MaximumCapacity: 0.67, CurrentCapacity: 
> 0.9309831
> The capacity is displayed in % as 4, however maximum capacity is displayed as 
> an absolute number 0.67 instead of 67%.
> We should make these consistent with the type we are displaying



[jira] [Updated] (MAPREDUCE-5488) Job recovery fails after killing all the running containers for the app

2013-09-19 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated MAPREDUCE-5488:
---

Status: Open  (was: Patch Available)

> Job recovery fails after killing all the running containers for the app
> ---
>
> Key: MAPREDUCE-5488
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5488
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.1.0-beta
>Reporter: Arpit Gupta
>Assignee: Jian He
> Attachments: MAPREDUCE-5488.1.patch, MAPREDUCE-5488.2.patch, 
> MAPREDUCE-5488.3.patch, MAPREDUCE-5488.patch, MAPREDUCE-5488.patch, 
> MAPREDUCE-5488.patch
>
>
> Here is the client stack trace
> {code}
> RUNNING: /usr/lib/hadoop/bin/hadoop jar 
> /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.1.0.2.0.5.0-66.jar 
> wordcount "-Dmapreduce.reduce.input.limit=-1" 
> /user/user/test_yarn_ha/medium_wordcount_input 
> /user/hrt_qa/test_yarn_ha/test_mapred_ha_single_job_applicationmaster-1-time
> 13/08/30 08:45:39 INFO client.RMProxy: Connecting to ResourceManager at 
> hostname/68.142.247.148:8032
> 13/08/30 08:45:40 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 19 
> for user on ha-hdfs:ha-2-secure
> 13/08/30 08:45:40 INFO security.TokenCache: Got dt for hdfs://ha-2-secure; 
> Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:ha-2-secure, Ident: 
> (HDFS_DELEGATION_TOKEN token 19 for user)
> 13/08/30 08:45:40 INFO input.FileInputFormat: Total input paths to process : 
> 20
> 13/08/30 08:45:40 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
> 13/08/30 08:45:40 INFO lzo.LzoCodec: Successfully loaded & initialized 
> native-lzo library [hadoop-lzo rev cf4e7cbf8ed0f0622504d008101c2729dc0c9ff3]
> 13/08/30 08:45:40 INFO mapreduce.JobSubmitter: number of splits:180
> 13/08/30 08:45:40 WARN conf.Configuration: user.name is deprecated. Instead, 
> use mapreduce.job.user.name
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.jar is deprecated. Instead, 
> use mapreduce.job.jar
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.output.value.class is 
> deprecated. Instead, use mapreduce.job.output.value.class
> 13/08/30 08:45:40 WARN conf.Configuration: mapreduce.combine.class is 
> deprecated. Instead, use mapreduce.job.combine.class
> 13/08/30 08:45:40 WARN conf.Configuration: mapreduce.map.class is deprecated. 
> Instead, use mapreduce.job.map.class
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.job.name is deprecated. 
> Instead, use mapreduce.job.name
> 13/08/30 08:45:40 WARN conf.Configuration: mapreduce.reduce.class is 
> deprecated. Instead, use mapreduce.job.reduce.class
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.input.dir is deprecated. 
> Instead, use mapreduce.input.fileinputformat.inputdir
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.output.dir is deprecated. 
> Instead, use mapreduce.output.fileoutputformat.outputdir
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.map.tasks is deprecated. 
> Instead, use mapreduce.job.maps
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.output.key.class is 
> deprecated. Instead, use mapreduce.job.output.key.class
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.working.dir is deprecated. 
> Instead, use mapreduce.job.working.dir
> 13/08/30 08:45:41 INFO mapreduce.JobSubmitter: Submitting tokens for job: 
> job_1377851032086_0003
> 13/08/30 08:45:41 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, 
> Service: ha-hdfs:ha-2-secure, Ident: (HDFS_DELEGATION_TOKEN token 19 for user)
> 13/08/30 08:45:42 INFO impl.YarnClientImpl: Submitted application 
> application_1377851032086_0003 to ResourceManager at 
> hostname/68.142.247.148:8032
> 13/08/30 08:45:42 INFO mapreduce.Job: The url to track the job: 
> http://hostname:8088/proxy/application_1377851032086_0003/
> 13/08/30 08:45:42 INFO mapreduce.Job: Running job: job_1377851032086_0003
> 13/08/30 08:45:48 INFO mapreduce.Job: Job job_1377851032086_0003 running in 
> uber mode : false
> 13/08/30 08:45:48 INFO mapreduce.Job:  map 0% reduce 0%
> stop applicationmaster
> beaver.component.hadoop|INFO|Kill container 
> container_1377851032086_0003_01_01 on host hostname
> RUNNING: ssh -o StrictHostKeyChecking=no hostname "sudo su - -c \"ps aux | 
> grep container_1377851032086_0003_01_01 | awk '{print \\\$2}' | xargs 
> kill -9\" root"
> Warning: Permanently added 'hostname,68.142.247.155' (RSA) to the list of 
> known hosts.
> kill 8978: No such process
> waiting for down time 10 seconds for service applicationmaster
> 13/08/30 08:45:55 INFO ipc.Client: Retrying connect to server: 
> hostname/68.142.247.155:52713. Already tried 0 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1 SECONDS)
> 13/08/30 08:45:56 INFO ipc.Client: Retrying connect to server: 

[jira] [Updated] (MAPREDUCE-5488) Job recovery fails after killing all the running containers for the app

2013-09-19 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated MAPREDUCE-5488:
---

Attachment: MAPREDUCE-5488.3.patch

Updated the findbugs exclude file.

> Job recovery fails after killing all the running containers for the app
> ---
>
> Key: MAPREDUCE-5488
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5488
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.1.0-beta
>Reporter: Arpit Gupta
>Assignee: Jian He
> Attachments: MAPREDUCE-5488.1.patch, MAPREDUCE-5488.2.patch, 
> MAPREDUCE-5488.3.patch, MAPREDUCE-5488.patch, MAPREDUCE-5488.patch, 
> MAPREDUCE-5488.patch
>
>

[jira] [Updated] (MAPREDUCE-5481) TestUberAM timeout

2013-09-19 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-5481:
---

Priority: Blocker  (was: Major)

> TestUberAM timeout
> --
>
> Key: MAPREDUCE-5481
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5481
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, test
>Affects Versions: 3.0.0
>Reporter: Jason Lowe
>Priority: Blocker
>
> TestUberAM has been timing out on trunk for some time now and surefire then 
> fails the build.  I'm not able to reproduce it locally, but the Jenkins 
> builds have been seeing it fairly consistently.  See 
> https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1529/console



[jira] [Commented] (MAPREDUCE-5505) Clients should be notified job finished only after job successfully unregistered

2013-09-19 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772291#comment-13772291
 ] 

Jian He commented on MAPREDUCE-5505:


{code}
case REBOOT:
if (isUnregistered && appContext.isLastAMRetry()) {
  return JobState.ERROR;
{code}
We probably don't need the "appContext.isLastAMRetry()" check here: if this is 
the last retry, the app will fail on the RM side. After this AM exits, 
JobClient is able to query the RM for the final status, in which case JobClient 
will be told FAILED. It is good for all transitions to follow the same logic.

If that's the case, we can create a common function to handle the logic: for 
every final state, return the final state if unregistered; otherwise, return 
the previous state.
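
A minimal sketch of such a common function, assuming a simplified `JobState` enum and a caller that knows the previous state. `ExternalStateSketch` and both method names are hypothetical, not names from the patch.

```java
/** Hypothetical sketch; the enum and method names are not from the patch. */
final class ExternalStateSketch {

    enum JobState { RUNNING, SUCCEEDED, FAILED, KILLED, ERROR }

    /** Shared rule: expose a final state only once the AM has unregistered. */
    static JobState finalOrPrevious(JobState finalState, JobState previousState,
                                    boolean isUnregistered) {
        return isUnregistered ? finalState : previousState;
    }

    /** Maps the internal state to what a client is told. */
    static JobState externalState(JobState internal, JobState previous,
                                  boolean isUnregistered) {
        switch (internal) {
            case SUCCEEDED:
            case FAILED:
            case KILLED:
            case ERROR:
                // Every final state follows the same logic.
                return finalOrPrevious(internal, previous, isUnregistered);
            default:
                return internal;
        }
    }

    public static void main(String[] args) {
        System.out.println(externalState(JobState.FAILED, JobState.RUNNING, false)); // RUNNING
        System.out.println(externalState(JobState.FAILED, JobState.RUNNING, true));  // FAILED
    }
}
```

Passing the previous state explicitly avoids assuming that it is always RUNNING.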

> Clients should be notified job finished only after job successfully 
> unregistered 
> -
>
> Key: MAPREDUCE-5505
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5505
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Zhijie Shen
> Attachments: MAPREDUCE-5505.1.patch, MAPREDUCE-5505.1.patch
>
>
> This is to make sure the user is notified that the job has finished only 
> after the job is really done. This does increase client latency, but it can 
> reduce some races during unregister, such as YARN-540



[jira] [Updated] (MAPREDUCE-5488) Job recovery fails after killing all the running containers for the app

2013-09-19 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated MAPREDUCE-5488:
---

Status: Patch Available  (was: Open)

> Job recovery fails after killing all the running containers for the app
> ---
>
> Key: MAPREDUCE-5488
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5488
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.1.0-beta
>Reporter: Arpit Gupta
>Assignee: Jian He
> Attachments: MAPREDUCE-5488.1.patch, MAPREDUCE-5488.2.patch, 
> MAPREDUCE-5488.3.patch, MAPREDUCE-5488.patch, MAPREDUCE-5488.patch, 
> MAPREDUCE-5488.patch
>
>

[jira] [Updated] (MAPREDUCE-5503) TestMRJobClient.testJobClient is failing

2013-09-19 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-5503:
---

Priority: Blocker  (was: Major)

> TestMRJobClient.testJobClient is failing
> 
>
> Key: MAPREDUCE-5503
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5503
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 3.0.0
>Reporter: Jason Lowe
>Priority: Blocker
>
> TestMRJobClient.testJobClient is failing on trunk and causing precommit 
> builds to complain:
> {noformat}
> testJobClient(org.apache.hadoop.mapreduce.TestMRJobClient)  Time elapsed: 26.361 sec  <<< FAILURE!
> junit.framework.AssertionFailedError: expected:<1> but was:<0>
>   at junit.framework.Assert.fail(Assert.java:50)
>   at junit.framework.Assert.failNotEquals(Assert.java:287)
>   at junit.framework.Assert.assertEquals(Assert.java:67)
>   at junit.framework.Assert.assertEquals(Assert.java:199)
>   at junit.framework.Assert.assertEquals(Assert.java:205)
>   at org.apache.hadoop.mapreduce.TestMRJobClient.testJobList(TestMRJobClient.java:474)
>   at org.apache.hadoop.mapreduce.TestMRJobClient.testJobClient(TestMRJobClient.java:112)
> {noformat}



[jira] [Commented] (MAPREDUCE-5517) enabling uber mode with 0 reducer still requires mapreduce.reduce.memory.mb to be less than yarn.app.mapreduce.am.resource.mb

2013-09-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772242#comment-13772242
 ] 

Hadoop QA commented on MAPREDUCE-5517:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12604100/MAPREDUCE_5517_v2.patch.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The following test timeouts occurred in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app:

org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4017//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4017//console

This message is automatically generated.

> enabling uber mode with 0 reducer still requires mapreduce.reduce.memory.mb 
> to be less than yarn.app.mapreduce.am.resource.mb
> -
>
> Key: MAPREDUCE-5517
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5517
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.5-alpha
>Reporter: Siqi Li
>Priority: Minor
> Attachments: MAPREDUCE_5517_v1.patch.txt, MAPREDUCE_5517_v2.patch.txt
>
>
> Since there is no reducer, the memory allocated to reducer is irrelevant to 
> enable uber mode of a job

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5505) Clients should be notified job finished only after job successfully unregistered

2013-09-19 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772238#comment-13772238
 ] 

Jian He commented on MAPREDUCE-5505:


bq. markUnregistered will not be called, and JobClient will still see RUNNING.
Correct. JobClient will see RUNNING until the AM exits, at which point JobClient 
will keep waiting until the next AM comes up (MAPREDUCE-5488 made this change).
The decision here is that if the unregister call fails, the MR job is deemed 
failed and will be restarted by the RM.

Should isUnregistered be an atomic boolean?

Test case: also assert that the job state is RUNNING before markUnregistered is called.
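
On the atomic boolean question, here is a minimal sketch of how such a flag could gate what clients see. Only isUnregistered and markUnregistered come from the discussion; the class, the enum, and stateForClient are hypothetical names, and this is not the code from the attached patch.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for the AM-side state; not the actual patch. */
class UnregisterGate {
  /** Simplified, made-up set of states for illustration only. */
  enum ReportedState { RUNNING, SUCCEEDED }

  // AtomicBoolean so a write from the unregister path is visible to
  // client-facing threads without extra locking.
  private final AtomicBoolean isUnregistered = new AtomicBoolean(false);

  void markUnregistered() {
    isUnregistered.set(true);
  }

  /** Clients keep seeing RUNNING until the unregister has succeeded. */
  ReportedState stateForClient(ReportedState internalState) {
    if (internalState != ReportedState.RUNNING && !isUnregistered.get()) {
      return ReportedState.RUNNING;
    }
    return internalState;
  }

  public static void main(String[] args) {
    UnregisterGate gate = new UnregisterGate();
    System.out.println(gate.stateForClient(ReportedState.SUCCEEDED));
    gate.markUnregistered();
    System.out.println(gate.stateForClient(ReportedState.SUCCEEDED));
  }
}
```

The suggested test assertion corresponds to checking stateForClient before and after markUnregistered.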

> Clients should be notified job finished only after job successfully 
> unregistered 
> -
>
> Key: MAPREDUCE-5505
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5505
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Zhijie Shen
> Attachments: MAPREDUCE-5505.1.patch, MAPREDUCE-5505.1.patch
>
>
> This is to make sure the user is notified that the job has finished only after 
> the job is really done. This does increase client latency, but it can reduce 
> some races during unregister, such as YARN-540.



[jira] [Commented] (MAPREDUCE-5488) Job recovery fails after killing all the running containers for the app

2013-09-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772205#comment-13772205
 ] 

Hadoop QA commented on MAPREDUCE-5488:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12604075/MAPREDUCE-5488.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient:

  org.apache.hadoop.mapreduce.TestMRJobClient

  The following test timeouts occurred in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient:

org.apache.hadoop.mapreduce.v2.TestUberAM

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4015//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4015//console

This message is automatically generated.

> Job recovery fails after killing all the running containers for the app
> ---
>
> Key: MAPREDUCE-5488
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5488
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.1.0-beta
>Reporter: Arpit Gupta
>Assignee: Jian He
> Attachments: MAPREDUCE-5488.1.patch, MAPREDUCE-5488.2.patch, 
> MAPREDUCE-5488.3.patch, MAPREDUCE-5488.patch, MAPREDUCE-5488.patch, 
> MAPREDUCE-5488.patch
>
>
> Here is the client stack trace
> {code}
> RUNNING: /usr/lib/hadoop/bin/hadoop jar 
> /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.1.0.2.0.5.0-66.jar 
> wordcount "-Dmapreduce.reduce.input.limit=-1" 
> /user/user/test_yarn_ha/medium_wordcount_input 
> /user/hrt_qa/test_yarn_ha/test_mapred_ha_single_job_applicationmaster-1-time
> 13/08/30 08:45:39 INFO client.RMProxy: Connecting to ResourceManager at 
> hostname/68.142.247.148:8032
> 13/08/30 08:45:40 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 19 
> for user on ha-hdfs:ha-2-secure
> 13/08/30 08:45:40 INFO security.TokenCache: Got dt for hdfs://ha-2-secure; 
> Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:ha-2-secure, Ident: 
> (HDFS_DELEGATION_TOKEN token 19 for user)
> 13/08/30 08:45:40 INFO input.FileInputFormat: Total input paths to process : 
> 20
> 13/08/30 08:45:40 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
> 13/08/30 08:45:40 INFO lzo.LzoCodec: Successfully loaded & initialized 
> native-lzo library [hadoop-lzo rev cf4e7cbf8ed0f0622504d008101c2729dc0c9ff3]
> 13/08/30 08:45:40 INFO mapreduce.JobSubmitter: number of splits:180
> 13/08/30 08:45:40 WARN conf.Configuration: user.name is deprecated. Instead, 
> use mapreduce.job.user.name
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.jar is deprecated. Instead, 
> use mapreduce.job.jar
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.output.value.class is 
> deprecated. Instead, use mapreduce.job.output.value.class
> 13/08/30 08:45:40 WARN conf.Configuration: mapreduce.combine.class is 
> deprecated. Instead, use mapreduce.job.combine.class
> 13/08/30 08:45:40 WARN conf.Configuration: mapreduce.map.class is deprecated. 
> Instead, use mapreduce.job.map.class
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.job.name is deprecated. 
> Instead, use mapreduce.job.name
> 13/08/30 08:45:40 WARN conf.Configuration: mapreduce.reduce.class is 
> deprecated. Instead, use mapreduce.job.reduce.class
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.input.dir is deprecated. 
> Instead, use mapreduce.input.fileinputformat.inputdir
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.output.dir is deprecated. 
> Instead, use mapreduce.output.fileoutputformat.outputdir
> 13/08/30 08:45:40 WARN conf.Configuration: mapred.map.tasks is depre

[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-09-19 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772173#comment-13772173
 ] 

Chris Nauroth commented on MAPREDUCE-5508:
--

Thanks for the new patch, Xi.  This mostly looks good to me, and I'm glad to 
hear that it still fixes the memory leak.  Here are a few comments:

# Can we remove the unused {{PathDeletionContext}} constructor?  It would 
require a small change in {{TestCleanupQueue}}.
# Swallowing the {{InterruptedException}} is problematic if any upstream code 
depends on seeing the thread's interrupted status, so let's restore the 
interrupted status in the catch block by calling 
{{Thread.currentThread().interrupt()}}.
# If there is an {{InterruptedException}}, then we currently would pass a null 
{{tempDirFs}} to the {{CleanupQueue}}, where we'd once again risk leaking 
memory.  I suggest that if there is an {{InterruptedException}}, then we skip 
adding to the {{CleanupQueue}} and log a warning.  This is consistent with the 
error-handling strategy in the rest of the method.  (It logs warnings.)
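
Points 2 and 3 could look roughly like the following. This is only a sketch: FsLookup, the list-based queue, and the warnings list are hypothetical stand-ins for the real FileSystem lookup, {{CleanupQueue}}, and logger, and the actual patch may be structured differently.

```java
import java.util.ArrayList;
import java.util.List;

class CleanupSketch {
  /** Hypothetical stand-in for the lookup that can be interrupted. */
  interface FsLookup {
    Object get() throws InterruptedException;
  }

  // Stand-ins for CleanupQueue and the logger (assumptions for illustration).
  final List<Object> cleanupQueue = new ArrayList<>();
  final List<String> warnings = new ArrayList<>();

  void cleanup(FsLookup lookup) {
    Object tempDirFs;
    try {
      tempDirFs = lookup.get();
    } catch (InterruptedException ie) {
      // Restore the interrupted status so upstream code can still see it.
      Thread.currentThread().interrupt();
      // Log a warning and skip the queue instead of passing a null tempDirFs.
      warnings.add("Interrupted while getting tempDirFs; skipping cleanup queue");
      return;
    }
    cleanupQueue.add(tempDirFs);
  }

  public static void main(String[] args) {
    CleanupSketch sketch = new CleanupSketch();
    sketch.cleanup(() -> { throw new InterruptedException("simulated"); });
    System.out.println("queued=" + sketch.cleanupQueue.size()
        + " warnings=" + sketch.warnings.size()
        + " interrupted=" + Thread.interrupted());
  }
}
```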


> JobTracker memory leak caused by unreleased FileSystem objects in 
> JobInProgress#cleanupJob
> --
>
> Key: MAPREDUCE-5508
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Affects Versions: 1-win, 1.2.1
>Reporter: Xi Fang
>Assignee: Xi Fang
>Priority: Critical
> Attachments: MAPREDUCE-5508.1.patch, MAPREDUCE-5508.patch
>
>
> MAPREDUCE-5351 fixed a memory leak problem but introduced another FileSystem 
> object (see "tempDirFs") that is not properly released.
> {code} JobInProgress#cleanupJob()
>   void cleanupJob() {
>     ...
>     tempDirFs = jobTempDirPath.getFileSystem(conf);
>     CleanupQueue.getInstance().addToQueue(
>         new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId));
>     ...
>     if (tempDirFs != fs) {
>       try {
>         fs.close();
>       } catch (IOException ie) {
>         ...
>   }
> {code}



[jira] [Updated] (MAPREDUCE-5507) MapReduce reducer preemption gets hanged

2013-09-19 Thread Omkar Vinit Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omkar Vinit Joshi updated MAPREDUCE-5507:
-

Description: 
Today, if we set "yarn.app.mapreduce.am.job.reduce.rampup.limit" and 
"mapreduce.job.reduce.slowstart.completedmaps", reducers are launched more 
aggressively. However, the calculation that decides whether to ramp reducers up 
or down is not optimal. 
* If the MR AM at any point sees a situation like 
** scheduledMaps : 30
** scheduledReducers : 10
** assignedMaps : 0
** assignedReducers : 11
** finishedMaps : 120
** headroom : 756 (when each map/reduce task needs only 512 MB)
* then today it simply hangs, because it thinks there is sufficient room to 
launch one more mapper and therefore no need to ramp down. However, if this 
state persists forever, that decision is neither correct nor optimal.
* Ideally, when the MR AM sees that assignedMaps has dropped to 0 while 
reducers are still running, it should wait for a certain time (bounded above 
by the average map task completion time, as a heuristic). If after that it 
still doesn't get a new container for a map task, it should preempt reducers 
one by one at some interval and ramp up again slowly.
** Preemption of reducers can be done in a slightly smarter way:
*** preempt a reducer on a node manager for which there is a pending map 
request.
*** otherwise preempt any other reducer. By releasing such a reducer / 
container, the MR AM helps itself get a new mapper, because it reduces its 
cluster consumption and may thereby become a candidate for an allocation.

  was:
Today if we are setting "yarn.app.mapreduce.am.job.reduce.rampup.limit" and 
"mapreduce.job.reduce.slowstart.completedmaps" then reducer are launched more 
aggressively. However the calculation to either Ramp up or Ramp down reducer is 
not down in most optimal way. 
* If MR AM at any point sees situation something like 
** scheduledMaps : 30
** scheduledReducers : 10
** assignedMaps : 0
** assignedReducers : 11
** finishedMaps : 120
** headroom : 756 ( when your map /reduce task needs only 512mb)
* then today it simply hangs because it thinks that there is sufficient room to 
launch one more mapper and therefore there is no need to ramp down. However, if 
this continues forever then this is not the correct way / optimal way.
* Ideally for MR AM when it sees that assignedMaps drops have dropped to 0 and 
there are running reducers around should wait for certain time ( upper limited 
by average map task completion time ... for heuristic sake)..but after that if 
still it doesn't get new container for map task then should preempt the reducer 
one by one with some interval and should ramp up slowly...
** Preemption of reducer can be done in little smarter way
*** preempt reducer on a node manager for which there is any pending map 
request.
*** otherwise preempt any other reducer. MR AM will contribute to getting new 
mapper by releasing such a reducer / container because it will reduce its 
cluster consumption and thereby may become candidate for an allocation.
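
The headroom example and the proposed wait-then-preempt rule from the description can be written out as a small sketch. The method names and the choice to use the average map completion time directly as the wait bound are assumptions for illustration; this is not the allocator's actual code.

```java
class ReducerPreemptionSketch {
  /** How many tasks of taskMb fit into the reported headroom. */
  static int tasksThatFit(int headroomMb, int taskMb) {
    return headroomMb / taskMb;
  }

  /**
   * Proposed rule: maps are pending, none are assigned, reducers are still
   * running, and we have already waited as long as an average map takes.
   */
  static boolean shouldPreemptReducer(int scheduledMaps, int assignedMaps,
      int assignedReducers, long waitedMs, long avgMapCompletionMs) {
    boolean mapsStarved =
        scheduledMaps > 0 && assignedMaps == 0 && assignedReducers > 0;
    return mapsStarved && waitedMs >= avgMapCompletionMs;
  }

  public static void main(String[] args) {
    // Numbers from the description: headroom 756, tasks need 512.
    System.out.println("mappers that fit: " + tasksThatFit(756, 512));
    System.out.println("preempt now: "
        + shouldPreemptReducer(30, 0, 11, 60_000L, 60_000L));
  }
}
```

With the numbers from the description, one mapper appears to fit in the headroom, which is why the AM sees no need to ramp down even though no map container is ever assigned.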


> MapReduce reducer preemption gets hanged
> 
>
> Key: MAPREDUCE-5507
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5507
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Omkar Vinit Joshi
>Assignee: Omkar Vinit Joshi
>
> Today if we are setting "yarn.app.mapreduce.am.job.reduce.rampup.limit" and 
> "mapreduce.job.reduce.slowstart.completedmaps" then reducers are launched 
> more aggressively. However the calculation to either Ramp up or Ramp down 
> reducer is not done in most optimal way. 
> * If MR AM at any point sees situation something like 
> ** scheduledMaps : 30
> ** scheduledReducers : 10
> ** assignedMaps : 0
> ** assignedReducers : 11
> ** finishedMaps : 120
> ** headroom : 756 ( when your map /reduce task needs only 512mb)
> * then today it simply hangs because it thinks that there is sufficient room 
> to launch one more mapper and therefore there is no need to ramp down. 
> However, if this continues forever then this is not the correct way / optimal 
> way.
> * Ideally for MR AM when it sees that assignedMaps drops have dropped to 0 
> and there are running reducers around then it should wait for certain time ( 
> upper limited by average map task completion time ... for heuristic 
> sake)..but after that if still it doesn't get new container for map task then 
> it should preempt the reducer one by one with some interval and should ramp 
> up slowly...
> ** Preemption of reducers can be done in little smarter way
> *** preempt reducer on a node manager for which there is any pending map 
> request.
> *** otherwise preempt any other reducer. MR AM will contribute to getting new 
> mapper by r

[jira] [Assigned] (MAPREDUCE-5481) TestUberAM timeout

2013-09-19 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong reassigned MAPREDUCE-5481:


Assignee: Xuan Gong

> TestUberAM timeout
> --
>
> Key: MAPREDUCE-5481
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5481
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, test
>Affects Versions: 3.0.0
>Reporter: Jason Lowe
>Assignee: Xuan Gong
>Priority: Blocker
>
> TestUberAM has been timing out on trunk for some time now and surefire then 
> fails the build.  I'm not able to reproduce it locally, but the Jenkins 
> builds have been seeing it fairly consistently.  See 
> https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1529/console



[jira] [Assigned] (MAPREDUCE-5514) TestRMContainerAllocator fails on trunk

2013-09-19 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen reassigned MAPREDUCE-5514:
--

Assignee: Zhijie Shen

> TestRMContainerAllocator fails on trunk
> ---
>
> Key: MAPREDUCE-5514
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5514
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
>Priority: Blocker
> Attachments: 
> org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator-output.txt
>
>




[jira] [Updated] (MAPREDUCE-5517) enabling uber mode with 0 reducer still requires mapreduce.reduce.memory.mb to be less than yarn.app.mapreduce.am.resource.mb

2013-09-19 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated MAPREDUCE-5517:
---

Description: Since there is no reducer, the memory allocated to reducers is 
irrelevant when deciding whether to enable uber mode for a job
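
A minimal sketch of the check this description asks for, assuming the uber decision compares task memory against the AM's memory (yarn.app.mapreduce.am.resource.mb). The class and method names are hypothetical and the comparison is simplified; it is not the attached patch.

```java
class UberMemoryCheck {
  /**
   * Memory a task needs for the uber decision. With zero reducers, only
   * mapreduce.map.memory.mb matters; mapreduce.reduce.memory.mb is ignored.
   */
  static long requiredMemoryMb(long mapMb, long reduceMb, int numReduces) {
    return numReduces == 0 ? mapMb : Math.max(mapMb, reduceMb);
  }

  /** Assumed rule: the job is small enough if it does not exceed the AM memory. */
  static boolean fitsInAm(long mapMb, long reduceMb, int numReduces, long amMb) {
    return requiredMemoryMb(mapMb, reduceMb, numReduces) <= amMb;
  }

  public static void main(String[] args) {
    // Map-only job: a large reduce setting should not block uber mode.
    System.out.println(fitsInAm(1024, 4096, 0, 1536));
    // Same settings with one reducer: the reduce memory now counts.
    System.out.println(fitsInAm(1024, 4096, 1, 1536));
  }
}
```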

> enabling uber mode with 0 reducer still requires mapreduce.reduce.memory.mb 
> to be less than yarn.app.mapreduce.am.resource.mb
> -
>
> Key: MAPREDUCE-5517
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5517
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.5-alpha
>Reporter: Siqi Li
>Priority: Minor
> Attachments: MAPREDUCE_5517_v1.patch.txt
>
>
> Since there is no reducer, the memory allocated to reducer is irrelevant to 
> enable uber mode of a job



[jira] [Created] (MAPREDUCE-5517) enabling uber mode with 0 reducer still requires mapreduce.reduce.memory.mb to be less than yarn.app.mapreduce.am.resource.mb

2013-09-19 Thread Siqi Li (JIRA)
Siqi Li created MAPREDUCE-5517:
--

 Summary: enabling uber mode with 0 reducer still requires 
mapreduce.reduce.memory.mb to be less than yarn.app.mapreduce.am.resource.mb
 Key: MAPREDUCE-5517
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5517
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.0.5-alpha
Reporter: Siqi Li
Priority: Minor
 Attachments: MAPREDUCE_5517_v1.patch.txt





[jira] [Updated] (MAPREDUCE-5517) enabling uber mode with 0 reducer still requires mapreduce.reduce.memory.mb to be less than yarn.app.mapreduce.am.resource.mb

2013-09-19 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated MAPREDUCE-5517:
---

Status: Patch Available  (was: Open)

> enabling uber mode with 0 reducer still requires mapreduce.reduce.memory.mb 
> to be less than yarn.app.mapreduce.am.resource.mb
> -
>
> Key: MAPREDUCE-5517
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5517
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.5-alpha
>Reporter: Siqi Li
>Priority: Minor
> Attachments: MAPREDUCE_5517_v1.patch.txt
>
>
> Since there is no reducer, the memory allocated to reducer is irrelevant to 
> enable uber mode of a job



[jira] [Updated] (MAPREDUCE-5517) enabling uber mode with 0 reducer still requires mapreduce.reduce.memory.mb to be less than yarn.app.mapreduce.am.resource.mb

2013-09-19 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated MAPREDUCE-5517:
---

Attachment: MAPREDUCE_5517_v1.patch.txt

Patch available

> enabling uber mode with 0 reducer still requires mapreduce.reduce.memory.mb 
> to be less than yarn.app.mapreduce.am.resource.mb
> -
>
> Key: MAPREDUCE-5517
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5517
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.5-alpha
>Reporter: Siqi Li
>Priority: Minor
> Attachments: MAPREDUCE_5517_v1.patch.txt
>
>
> Since there is no reducer, the memory allocated to reducer is irrelevant to 
> enable uber mode of a job



[jira] [Commented] (MAPREDUCE-5517) enabling uber mode with 0 reducer still requires mapreduce.reduce.memory.mb to be less than yarn.app.mapreduce.am.resource.mb

2013-09-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772156#comment-13772156
 ] 

Hadoop QA commented on MAPREDUCE-5517:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12604080/MAPREDUCE_5517_v1.patch.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The following test timeouts occurred in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app:

org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4016//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4016//console

This message is automatically generated.

> enabling uber mode with 0 reducer still requires mapreduce.reduce.memory.mb 
> to be less than yarn.app.mapreduce.am.resource.mb
> -
>
> Key: MAPREDUCE-5517
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5517
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.5-alpha
>Reporter: Siqi Li
>Priority: Minor
> Attachments: MAPREDUCE_5517_v1.patch.txt
>
>
> Since there is no reducer, the memory allocated to reducer is irrelevant to 
> enable uber mode of a job



[jira] [Commented] (MAPREDUCE-5502) History link in resource manager is broken for KILLED jobs

2013-09-19 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772117#comment-13772117
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-5502:


bq. I think a better fix would be to change YARNRunner.killJob to avoid sending 
a kill to the RM if the reported job state is terminal rather than just 
checking for KILLED.
+1 for this. That is what I was pushing for before YARN was Apache YARN. We can 
definitely print a message on the CLI that apps may get stuck after this, and 
suggest that users run "yarn application -kill" in those corner cases.

> History link in resource manager is broken for KILLED jobs
> --
>
> Key: MAPREDUCE-5502
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5502
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.0.5-alpha
>Reporter: Vrushali C
>Assignee: Vrushali C
>  Labels: ui
>
> History link in resource manager is broken for KILLED jobs.
> Seems to happen with jobs with State 'KILLED' and FinalStatus 'KILLED'. If 
> the State is 'FINISHED' and FinalStatus is 'KILLED', then the "History" link 
> is fine.
> It isn't easy to reproduce the problem since the time at which the app is 
> killed determines the state it ends up in, which is hard to guess. These 
> particular jobs seem to get a Diagnostics message of "Application killed by 
> user.", whereas the other killed jobs get "Kill Job received from client 
> job_1378766187901_0002
> Job received Kill while in RUNNING state."



[jira] [Commented] (MAPREDUCE-5516) TestMRJobClient fails on trunk

2013-09-19 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772084#comment-13772084
 ] 

Jian He commented on MAPREDUCE-5516:


right.. thanks, close as a dup

> TestMRJobClient fails on trunk
> --
>
> Key: MAPREDUCE-5516
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5516
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Jian He
>




[jira] [Resolved] (MAPREDUCE-5516) TestMRJobClient fails on trunk

2013-09-19 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He resolved MAPREDUCE-5516.


Resolution: Duplicate

> TestMRJobClient fails on trunk
> --
>
> Key: MAPREDUCE-5516
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5516
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Jian He
>




[jira] [Updated] (MAPREDUCE-5514) TestRMContainerAllocator fails on trunk

2013-09-19 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-5514:
---

Priority: Blocker  (was: Major)

> TestRMContainerAllocator fails on trunk
> ---
>
> Key: MAPREDUCE-5514
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5514
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Zhijie Shen
>Priority: Blocker
> Attachments: 
> org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator-output.txt
>
>




[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

2013-09-19 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772073#comment-13772073
 ] 

Xi Fang commented on MAPREDUCE-5508:


I set both staging and system dirs to hdfs on my test cluster. I ran 35,000 job 
submissions and manually checked the number of DistributedFileSystem objects. 
No memory leak related to DistributedFileSystem was found.

> JobTracker memory leak caused by unreleased FileSystem objects in 
> JobInProgress#cleanupJob
> --
>
> Key: MAPREDUCE-5508
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Affects Versions: 1-win, 1.2.1
>Reporter: Xi Fang
>Assignee: Xi Fang
>Priority: Critical
> Attachments: MAPREDUCE-5508.1.patch, MAPREDUCE-5508.patch
>
>
> MAPREDUCE-5351 fixed a memory leak problem but introduced another FileSystem 
> object (see "tempDirFs") that is not properly released.
> {code} JobInProgress#cleanupJob()
>   void cleanupJob() {
>     ...
>     tempDirFs = jobTempDirPath.getFileSystem(conf);
>     CleanupQueue.getInstance().addToQueue(
>         new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId));
>     ...
>     if (tempDirFs != fs) {
>       try {
>         fs.close();
>       } catch (IOException ie) {
>         ...
>   }
> {code}



[jira] [Commented] (MAPREDUCE-5516) TestMRJobClient fails on trunk

2013-09-19 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771943#comment-13771943
 ] 

Jason Lowe commented on MAPREDUCE-5516:
---

Dup of MAPREDUCE-5503?

> TestMRJobClient fails on trunk
> --
>
> Key: MAPREDUCE-5516
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5516
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Jian He
>




[jira] [Commented] (MAPREDUCE-5502) History link in resource manager is broken for KILLED jobs

2013-09-19 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771891#comment-13771891
 ] 

Jason Lowe commented on MAPREDUCE-5502:
---

bq. From our investigation, it appears that the client's kill sends a KILL to 
the App Master as well as to the RM for this App.

Were you actually seeing the AM shutdown due to the SIGTERM it would receive as 
part of the YARN kill, or did you see "Kill job received from client " in 
the AM logs as well?  I see in YARNRunner.killJob that it can send a kill to 
the AM and later to the RM if for 10 seconds the AM doesn't end up in the 
KILLED state.  That, too, seems to be a bug, since it really should be checking 
not for state != KILLED but rather for state not in a terminal state, i.e.: 
FAILED, KILLED, SUCCEEDED.  Otherwise there's a race where the AM can enter a 
terminal state on its own but the code later tries to kill it via YARN anyway.

bq. Similar to the patch in MAPREDUCE-5497, in YarnRunner's killJob function, 
we added a sleep for a few seconds before the (2nd) call to 
resMgrDelegate.killApplication where status.getState() != JobStatus.State.KILLED

In general I'm not a fan of sleeps as a "fix" since they're just masking a race 
window rather than resolving the underlying condition.  Sleeps also slow down 
the process in general, and it would be better to solve it without them if 
possible.  Also MAPREDUCE-5497 didn't add a sleep, rather it moved an existing 
sleep to later in the AM shutdown process.  That sleep is simply there for the 
AM to linger around for clients to fetch the final job status rather than 
redirect to the history server.  I'm not sure it's necessary anymore, actually.

I think a better fix would be to change YARNRunner.killJob to avoid sending a 
kill to the RM if the reported job state is terminal rather than just checking 
for KILLED.
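
To make the suggestion concrete, here is a small sketch of the terminal-state check. The State enum is a local stand-in defined only for this example, and shouldKillViaRm is a hypothetical helper, not an existing method on YARNRunner.

```java
import java.util.EnumSet;
import java.util.Set;

class KillDecision {
  /** Local stand-in for the job states mentioned above (assumption). */
  enum State { PREP, RUNNING, SUCCEEDED, FAILED, KILLED }

  // Terminal states as listed in the comment: FAILED, KILLED, SUCCEEDED.
  static final Set<State> TERMINAL =
      EnumSet.of(State.FAILED, State.KILLED, State.SUCCEEDED);

  /** Only fall back to killing through the RM if the job has not already ended. */
  static boolean shouldKillViaRm(State reported) {
    return !TERMINAL.contains(reported);
  }

  public static void main(String[] args) {
    for (State s : State.values()) {
      System.out.println(s + " -> kill via RM: " + shouldKillViaRm(s));
    }
  }
}
```

Checking only for state != KILLED would still send the RM kill for a job that reached SUCCEEDED or FAILED on its own, which is the race described above.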

> History link in resource manager is broken for KILLED jobs
> --
>
> Key: MAPREDUCE-5502
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5502
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.0.5-alpha
>Reporter: Vrushali C
>Assignee: Vrushali C
>  Labels: ui
>
> History link in resource manager is broken for KILLED jobs.
> Seems to happen with jobs with State 'KILLED' and FinalStatus 'KILLED'. If 
> the State is 'FINISHED' and FinalStatus is 'KILLED', then the "History" link 
> is fine.
> It isn't easy to reproduce the problem since the time at which the app is 
> killed determines the state it ends up in, which is hard to guess. These 
> particular jobs seem to get a Diagnostics message of "Application killed by 
> user.", whereas the other killed jobs get "Kill Job received from client 
> job_1378766187901_0002
> Job received Kill while in RUNNING state."



[jira] [Commented] (MAPREDUCE-5487) In task processes, JobConf is unnecessarily loaded again in Limits

2013-09-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771878#comment-13771878
 ] 

Hudson commented on MAPREDUCE-5487:
---

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1527 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1527/])
MAPREDUCE-5487. In task processes, JobConf is unnecessarily loaded again in 
Limits (Sandy Ryza) (sandy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1524408)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/YarnChild.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Counters.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/counters/Limits.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/TestCounters.java


> In task processes, JobConf is unnecessarily loaded again in Limits
> --
>
> Key: MAPREDUCE-5487
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5487
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: performance, task
>Affects Versions: 2.1.0-beta
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Fix For: 2.3.0
>
> Attachments: MAPREDUCE-5487-1.patch, MAPREDUCE-5487.patch
>
>
> Limits statically loads a JobConf, which incurs costs of reading files from 
> disk and parsing XML.  The contents of this JobConf are identical to the one 
> loaded by YarnChild (before adding job.xml as a resource).  Allowing Limits 
> to initialize with the JobConf loaded in YarnChild would reduce task startup 
> time.
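
The idea in the description above can be sketched roughly as follows. This is 
not the committed patch: LimitsSketch, its init method, the loadFromDisk 
helper, and the "example.counters.max" key with its default of 120 are all 
hypothetical, and a plain Map stands in for JobConf.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a Limits-style class that needs configuration.
final class LimitsSketch {
    // Counts how often the expensive fallback load ran (for illustration).
    static int expensiveLoads = 0;

    // Configuration shared by all limit lookups; null until initialised.
    private static Map<String, String> conf;

    private LimitsSketch() {}

    // Stand-in for constructing a fresh JobConf, i.e. reading files from
    // disk and parsing XML.
    private static Map<String, String> loadFromDisk() {
        expensiveLoads++;
        return new HashMap<>();
    }

    // Lets the caller (YarnChild, in the issue's terms) hand over the
    // configuration it has already loaded. Only the first call has an effect.
    static synchronized void init(Map<String, String> provided) {
        if (conf == null) {
            conf = (provided != null) ? provided : loadFromDisk();
        }
    }

    // Falls back to the expensive load only if init was never called.
    static synchronized int maxCounters() {
        init(null);
        return Integer.parseInt(conf.getOrDefault("example.counters.max", "120"));
    }
}
```

If the task process calls init with its already-loaded configuration before 
any limit is read, the fallback load never runs, which is where the startup 
saving described above would come from.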



[jira] [Commented] (MAPREDUCE-5487) In task processes, JobConf is unnecessarily loaded again in Limits

2013-09-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771868#comment-13771868
 ] 

Hudson commented on MAPREDUCE-5487:
---

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1553 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1553/])
MAPREDUCE-5487. In task processes, JobConf is unnecessarily loaded again in 
Limits (Sandy Ryza) (sandy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1524408)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/YarnChild.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Counters.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/counters/Limits.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/TestCounters.java





[jira] [Commented] (MAPREDUCE-5487) In task processes, JobConf is unnecessarily loaded again in Limits

2013-09-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771791#comment-13771791
 ] 

Hudson commented on MAPREDUCE-5487:
---

SUCCESS: Integrated in Hadoop-Yarn-trunk #337 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/337/])
MAPREDUCE-5487. In task processes, JobConf is unnecessarily loaded again in 
Limits (Sandy Ryza) (sandy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1524408)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/YarnChild.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Counters.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/counters/Limits.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/TestCounters.java


