[jira] [Commented] (MAPREDUCE-5796) Use current version of the archive name in DistributedCacheDeploy document

2014-07-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051105#comment-14051105
 ] 

Hadoop QA commented on MAPREDUCE-5796:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12634594/MAPREDUCE-5796.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation patch that doesn't require tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4707//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4707//console

This message is automatically generated.

 Use current version of the archive name in DistributedCacheDeploy document
 --

 Key: MAPREDUCE-5796
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5796
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.3.0
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
  Labels: newbie
 Attachments: MAPREDUCE-5796.patch


 The archive name is {{hadoop-mapreduce-2.1.1.tar.gz}} in 
 DistributedCacheDeploy document but Hadoop 2.1.1 is not released. It should 
 be fixed to {{hadoop-mapreduce-$\{project.version\}.tar.gz}} to show the 
 current version.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5868) TestPipeApplication causing nightly build to fail

2014-07-03 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051265#comment-14051265
 ] 

Akira AJISAKA commented on MAPREDUCE-5868:
--

I modified Application.java to get error message from the forked process.
{code}
process = runClient(cmd, env);
+String error = IOUtils.toString(process.getErrorStream());
+LOG.info(error);
{code}
The error message is as follows:
{code}
bash: 
/Users/aajisaka/git/hadoop-common/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/userlogs/job_001_0002/attempt_001_0002_r_04_5/stdout:
 No such file or directory
{code}
It means the process failed immediately because stdout file was missing.
I'll attach a patch to create an empty stdout file.

 TestPipeApplication causing nightly build to fail
 -

 Key: MAPREDUCE-5868
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5868
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: trunk
Reporter: Jason Lowe
Assignee: Akira AJISAKA
 Attachments: TestPipeApplication.stack, jstack.log, 
 mapreduce-5868-v1.txt


 TestPipeApplication appears to be timing out which causes the nightly build 
 to fail.





[jira] [Updated] (MAPREDUCE-5868) TestPipeApplication causing nightly build to fail

2014-07-03 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated MAPREDUCE-5868:
-

Attachment: MAPREDUCE-5868.2.patch

 TestPipeApplication causing nightly build to fail
 -

 Key: MAPREDUCE-5868
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5868
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: trunk
Reporter: Jason Lowe
Assignee: Akira AJISAKA
 Attachments: MAPREDUCE-5868.2.patch, TestPipeApplication.stack, 
 jstack.log, mapreduce-5868-v1.txt


 TestPipeApplication appears to be timing out which causes the nightly build 
 to fail.





[jira] [Commented] (MAPREDUCE-5868) TestPipeApplication causing nightly build to fail

2014-07-03 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051291#comment-14051291
 ] 

Akira AJISAKA commented on MAPREDUCE-5868:
--

bq. It means the process failed immediately because stdout file was missing.
More precisely, the command (bash -c cache.sh 1 stdout 2 stderr) failed 
because the parent directory of stdout does not exist. If the parent directory 
does not exist, the process fails to create stdout.
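
As an illustration of the point above (redirecting output to a file fails when its parent directory is missing), here is a minimal sketch that creates the directory and an empty log file before the child process is started. The class and method names are hypothetical and this is not the attached patch; it only shows one way the precondition could be established.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class StdoutFileSketch {
    /**
     * Hypothetical helper: makes sure the log file exists, creating any
     * missing parent directories first. Returns the same path.
     */
    static Path ensureLogFile(Path logFile) throws IOException {
        Path parent = logFile.getParent();
        if (parent != null) {
            // Does nothing if the directory is already present.
            Files.createDirectories(parent);
        }
        if (Files.notExists(logFile)) {
            // Create an empty file so a later redirect has a target.
            Files.createFile(logFile);
        }
        return logFile;
    }
}
```

Calling the helper twice on the same path is harmless, so a test setup could invoke it unconditionally before forking the process.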

 TestPipeApplication causing nightly build to fail
 -

 Key: MAPREDUCE-5868
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5868
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: trunk
Reporter: Jason Lowe
Assignee: Akira AJISAKA
 Attachments: MAPREDUCE-5868.2.patch, TestPipeApplication.stack, 
 jstack.log, mapreduce-5868-v1.txt


 TestPipeApplication appears to be timing out which causes the nightly build 
 to fail.





[jira] [Updated] (MAPREDUCE-5549) distcp app should fail if m/r job fails

2014-07-03 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated MAPREDUCE-5549:
--

Attachment: MAPREDUCE-5549-002.patch

patch rebased against trunk; no other changes

 distcp app should fail if m/r job fails
 ---

 Key: MAPREDUCE-5549
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5549
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distcp, mrv2
Affects Versions: 3.0.0
Reporter: David Rosenstrauch
 Attachments: MAPREDUCE-5549-001.patch, MAPREDUCE-5549-002.patch


 I run distcpv2 in a scripted manner.  The script checks if the distcp step 
 fails and, if so, aborts the rest of the script.  However, I ran into an 
 issue today where the distcp job failed, but my calling script went on its 
 merry way.
 Digging into the code a bit more (at 
 https://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCp.java),
  I think I see the issue:  the distcp app is not returning an error exit code 
 to the shell when the distcp job fails.  This is a big problem, IMO, as it 
 prevents distcp from being successfully used in a scripted environment.  IMO, 
 the code should change like so:
 Before:
 {code:title=org.apache.hadoop.tools.DistCp.java}
 //...
   public int run(String[] argv) {
     //...
     try {
       execute();
     } catch (InvalidInputException e) {
       LOG.error("Invalid input: ", e);
       return DistCpConstants.INVALID_ARGUMENT;
     } catch (DuplicateFileException e) {
       LOG.error("Duplicate files in input path: ", e);
       return DistCpConstants.DUPLICATE_INPUT;
     } catch (Exception e) {
       LOG.error("Exception encountered ", e);
       return DistCpConstants.UNKNOWN_ERROR;
     }
     return DistCpConstants.SUCCESS;
   }
 //...
 {code}
 After:
 {code:title=org.apache.hadoop.tools.DistCp.java}
 //...
   public int run(String[] argv) {
     //...
     Job job = null;
     try {
       job = execute();
     } catch (InvalidInputException e) {
       LOG.error("Invalid input: ", e);
       return DistCpConstants.INVALID_ARGUMENT;
     } catch (DuplicateFileException e) {
       LOG.error("Duplicate files in input path: ", e);
       return DistCpConstants.DUPLICATE_INPUT;
     } catch (Exception e) {
       LOG.error("Exception encountered ", e);
       return DistCpConstants.UNKNOWN_ERROR;
     }
     if (job.isSuccessful()) {
       return DistCpConstants.SUCCESS;
     } else {
       return DistCpConstants.UNKNOWN_ERROR;
     }
   }
 //...
 {code}
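
 For illustration only, the idea of the proposed change (a job that runs but does not succeed must yield a non-zero exit code) can be sketched without any Hadoop classes. Everything below is an assumption made for the example: JobHandle is a hypothetical stand-in for the Job returned by execute(), and the two constants are placeholder values, not the real DistCpConstants.

```java
import java.util.concurrent.Callable;

final class DistCpExitCodeSketch {
    // Placeholder values; the real DistCpConstants values are not shown in this thread.
    static final int SUCCESS = 0;
    static final int UNKNOWN_ERROR = 1;

    /** Hypothetical stand-in for the job object returned by execute(). */
    interface JobHandle {
        boolean isSuccessful() throws Exception;
    }

    /** Maps "threw", "returned nothing" and "finished unsuccessfully" to an error code. */
    static int run(Callable<JobHandle> execute) {
        try {
            JobHandle job = execute.call();
            if (job != null && job.isSuccessful()) {
                return SUCCESS;
            }
            return UNKNOWN_ERROR;
        } catch (Exception e) {
            return UNKNOWN_ERROR;
        }
    }
}
```

 A calling script can then rely on the exit status alone, because only a job that reports success produces the success code.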





[jira] [Updated] (MAPREDUCE-5549) distcp app should fail if m/r job fails

2014-07-03 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated MAPREDUCE-5549:
--

Status: Open  (was: Patch Available)

 distcp app should fail if m/r job fails
 ---

 Key: MAPREDUCE-5549
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5549
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distcp, mrv2
Affects Versions: 3.0.0
Reporter: David Rosenstrauch
 Attachments: MAPREDUCE-5549-001.patch, MAPREDUCE-5549-002.patch




[jira] [Updated] (MAPREDUCE-5868) TestPipeApplication causing nightly build to fail

2014-07-03 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated MAPREDUCE-5868:
-

Target Version/s: 2.5.0
  Status: Patch Available  (was: Open)

 TestPipeApplication causing nightly build to fail
 -

 Key: MAPREDUCE-5868
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5868
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: trunk
Reporter: Jason Lowe
Assignee: Akira AJISAKA
 Attachments: MAPREDUCE-5868.2.patch, TestPipeApplication.stack, 
 jstack.log, mapreduce-5868-v1.txt


 TestPipeApplication appears to be timing out which causes the nightly build 
 to fail.





[jira] [Commented] (MAPREDUCE-5868) TestPipeApplication causing nightly build to fail

2014-07-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051312#comment-14051312
 ] 

Hadoop QA commented on MAPREDUCE-5868:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12653800/MAPREDUCE-5868.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4708//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4708//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-core.html
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4708//console

This message is automatically generated.

 TestPipeApplication causing nightly build to fail
 -

 Key: MAPREDUCE-5868
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5868
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: trunk
Reporter: Jason Lowe
Assignee: Akira AJISAKA
 Attachments: MAPREDUCE-5868.2.patch, TestPipeApplication.stack, 
 jstack.log, mapreduce-5868-v1.txt


 TestPipeApplication appears to be timing out which causes the nightly build 
 to fail.





[jira] [Commented] (MAPREDUCE-5940) Avoid negative elapsed time in JHS/MRAM web UI and services

2014-07-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051345#comment-14051345
 ] 

Junping Du commented on MAPREDUCE-5940:
---

bq. Silently making the elapsed time as 0 when it is negative may lead to 
hiding the bugs related to elapsed time. Adding a warning/info message before 
making it as 0 would help to diagnose/find out the issues if any.
Agree. I have similar comments above.

 Avoid negative elapsed time in JHS/MRAM web UI and services
 ---

 Key: MAPREDUCE-5940
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5940
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, mr-am, webapps
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Attachments: MAPREDUCE-5940.1.patch


 Recently we observed a rare bug in which the elapsed time of a reducer is 
 negative on the JHS web UI and via the REST APIs. While the real cause of this 
 bug seems to be unsynchronized clocks on different hosts, the web frontend 
 should have masked the negative values. However, in the current code, 
 *org.apache.hadoop.mapreduce.v2.app.webapp.dao.** only checks whether the 
 elapsed time is -1 or not.





[jira] [Commented] (MAPREDUCE-5900) Container preemption interpreted as task failures and eventually job failures

2014-07-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051337#comment-14051337
 ] 

Hudson commented on MAPREDUCE-5900:
---

SUCCESS: Integrated in Hadoop-Yarn-trunk #602 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/602/])
MAPREDUCE-5900. Changed to the interpret container preemption exit code as a 
task attempt killing event. Contributed by Mayank Bansal. (zjshen: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607512)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskAttempt.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/rm/TestRMContainerAllocator.java


 Container preemption interpreted as task failures and eventually job failures 
 --

 Key: MAPREDUCE-5900
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5900
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: applicationmaster, mr-am, mrv2
Affects Versions: trunk, 2.4.1
Reporter: Mayank Bansal
Assignee: Mayank Bansal
 Fix For: 2.5.0

 Attachments: MAPREDUCE-5900-1.patch, 
 MAPREDUCE-5900-branch-241-2.patch, MAPREDUCE-5900-trunk-1.patch, 
 MAPREDUCE-5900-trunk-2.patch, MAPREDUCE-5900-trunk-3.patch


 A preemption exit code has been added and needs to be incorporated:
 MR needs to recognize the special exit code value of -102 and interpret it as 
 a container being killed instead of a container failure.
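
 The distinction described above can be sketched as a small classification step. This is only an illustration under assumptions: the class, enum, and method names are hypothetical, and treating exit status 0 as success is an assumption of the sketch; only the value -102 comes from this issue.

```java
final class PreemptionExitSketch {
    /** Exit code value that this issue says marks a preempted container. */
    static final int PREEMPTED = -102;

    /** Hypothetical outcomes for a task attempt. */
    enum AttemptOutcome { SUCCEEDED, KILLED, FAILED }

    /** Treats preemption as a kill so that it is not counted as a task failure. */
    static AttemptOutcome classify(int exitStatus) {
        if (exitStatus == 0) {
            return AttemptOutcome.SUCCEEDED;
        }
        if (exitStatus == PREEMPTED) {
            return AttemptOutcome.KILLED;
        }
        return AttemptOutcome.FAILED;
    }
}
```

 Keeping the preempted case out of the FAILED branch is the point of the change: killed attempts do not count toward the limit that eventually fails the job.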





[jira] [Updated] (MAPREDUCE-5938) Shuffle port in nodemanager is binding to all IPs

2014-07-03 Thread Ashutosh Jindal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Jindal updated MAPREDUCE-5938:
---

Attachment: mapreduce-5938.patch

 Shuffle port in nodemanager is binding to all IPs 
 --

 Key: MAPREDUCE-5938
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5938
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Ashutosh Jindal
Assignee: Ashutosh Jindal
 Attachments: issue.jpg, mapreduce-5938.patch


 The nodemanager port mapreduce.shuffle.port is listening on all IP addresses.
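
 To make the difference concrete, the following sketch contrasts a wildcard socket address (all interfaces) with an address tied to one interface, using only java.net. It is not the shuffle handler code and not the attached patch; the class and method names are hypothetical.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;

final class BindAddressSketch {
    /** Port only: the resulting address is the wildcard, i.e. every local IP. */
    static InetSocketAddress wildcard(int port) {
        return new InetSocketAddress(port);
    }

    /** Explicit address and port: a server bound to it listens on that IP only. */
    static InetSocketAddress specific(InetAddress address, int port) {
        return new InetSocketAddress(address, port);
    }
}
```

 A server that is given the second kind of address at bind time accepts connections only on the chosen interface.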





[jira] [Updated] (MAPREDUCE-5549) distcp app should fail if m/r job fails

2014-07-03 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated MAPREDUCE-5549:
--

Status: Patch Available  (was: Open)

 distcp app should fail if m/r job fails
 ---

 Key: MAPREDUCE-5549
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5549
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distcp, mrv2
Affects Versions: 3.0.0
Reporter: David Rosenstrauch
 Attachments: MAPREDUCE-5549-001.patch, MAPREDUCE-5549-002.patch




[jira] [Commented] (MAPREDUCE-5549) distcp app should fail if m/r job fails

2014-07-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051373#comment-14051373
 ] 

Hadoop QA commented on MAPREDUCE-5549:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12653802/MAPREDUCE-5549-002.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-tools/hadoop-distcp.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4709//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4709//console

This message is automatically generated.

 distcp app should fail if m/r job fails
 ---

 Key: MAPREDUCE-5549
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5549
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distcp, mrv2
Affects Versions: 3.0.0
Reporter: David Rosenstrauch
 Attachments: MAPREDUCE-5549-001.patch, MAPREDUCE-5549-002.patch




[jira] [Updated] (MAPREDUCE-5868) TestPipeApplication causing nightly build to fail

2014-07-03 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated MAPREDUCE-5868:
-

Attachment: MAPREDUCE-5868.3.patch

Fixed the Findbugs warnings.

 TestPipeApplication causing nightly build to fail
 -

 Key: MAPREDUCE-5868
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5868
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: trunk
Reporter: Jason Lowe
Assignee: Akira AJISAKA
 Attachments: MAPREDUCE-5868.2.patch, MAPREDUCE-5868.3.patch, 
 TestPipeApplication.stack, jstack.log, mapreduce-5868-v1.txt


 TestPipeApplication appears to be timing out which causes the nightly build 
 to fail.





[jira] [Commented] (MAPREDUCE-5868) TestPipeApplication causing nightly build to fail

2014-07-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051459#comment-14051459
 ] 

Hadoop QA commented on MAPREDUCE-5868:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12653825/MAPREDUCE-5868.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4710//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4710//console

This message is automatically generated.

 TestPipeApplication causing nightly build to fail
 -

 Key: MAPREDUCE-5868
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5868
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: trunk
Reporter: Jason Lowe
Assignee: Akira AJISAKA
 Attachments: MAPREDUCE-5868.2.patch, MAPREDUCE-5868.3.patch, 
 TestPipeApplication.stack, jstack.log, mapreduce-5868-v1.txt


 TestPipeApplication appears to be timing out which causes the nightly build 
 to fail.





[jira] [Commented] (MAPREDUCE-5900) Container preemption interpreted as task failures and eventually job failures

2014-07-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051470#comment-14051470
 ] 

Hudson commented on MAPREDUCE-5900:
---

FAILURE: Integrated in Hadoop-Hdfs-trunk #1793 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1793/])
MAPREDUCE-5900. Changed to the interpret container preemption exit code as a 
task attempt killing event. Contributed by Mayank Bansal. (zjshen: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607512)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskAttempt.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/rm/TestRMContainerAllocator.java


 Container preemption interpreted as task failures and eventually job failures 
 --

 Key: MAPREDUCE-5900
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5900
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: applicationmaster, mr-am, mrv2
Affects Versions: trunk, 2.4.1
Reporter: Mayank Bansal
Assignee: Mayank Bansal
 Fix For: 2.5.0

 Attachments: MAPREDUCE-5900-1.patch, 
 MAPREDUCE-5900-branch-241-2.patch, MAPREDUCE-5900-trunk-1.patch, 
 MAPREDUCE-5900-trunk-2.patch, MAPREDUCE-5900-trunk-3.patch


 A preemption exit code has been added and needs to be incorporated:
 MR needs to recognize the special exit code value of -102 and interpret it as 
 a container being killed instead of a container failure.





[jira] [Commented] (MAPREDUCE-5900) Container preemption interpreted as task failures and eventually job failures

2014-07-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051565#comment-14051565
 ] 

Hudson commented on MAPREDUCE-5900:
---

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1820 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1820/])
MAPREDUCE-5900. Changed to the interpret container preemption exit code as a 
task attempt killing event. Contributed by Mayank Bansal. (zjshen: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607512)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskAttempt.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/rm/TestRMContainerAllocator.java


 Container preemption interpreted as task failures and eventually job failures 
 --

 Key: MAPREDUCE-5900
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5900
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: applicationmaster, mr-am, mrv2
Affects Versions: trunk, 2.4.1
Reporter: Mayank Bansal
Assignee: Mayank Bansal
 Fix For: 2.5.0

 Attachments: MAPREDUCE-5900-1.patch, 
 MAPREDUCE-5900-branch-241-2.patch, MAPREDUCE-5900-trunk-1.patch, 
 MAPREDUCE-5900-trunk-2.patch, MAPREDUCE-5900-trunk-3.patch


 We have added a preemption exit code that needs to be incorporated.
 MR needs to recognize the special exit code value of -102 and interpret it as 
 a container being killed instead of a container failure.
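
 The rule in the description above can be sketched in a few lines of Java. This
 is only an illustration under stated assumptions, not the committed patch: the
 class name, the Outcome enum, and the classify method are hypothetical, and the
 only value taken from the issue is the exit code -102. Treating exit code 0 as
 a clean completion is also an assumption of the sketch.

```java
class PreemptionExitCodeSketch {
    /** Exit code that the issue says marks a preempted container. */
    static final int PREEMPTED_EXIT_CODE = -102;

    /** Hypothetical outcomes; the real MR AM uses its own event types. */
    enum Outcome { COMPLETED, KILLED, FAILED }

    /** Maps a container exit status to an outcome for the task attempt. */
    static Outcome classify(int exitStatus) {
        if (exitStatus == PREEMPTED_EXIT_CODE) {
            // Preemption is a kill, so it should not count as a failure.
            return Outcome.KILLED;
        }
        // Assumption for this sketch: 0 is a clean exit, anything else a failure.
        return exitStatus == 0 ? Outcome.COMPLETED : Outcome.FAILED;
    }

    public static void main(String[] args) {
        for (int code : new int[] {0, 1, -102}) {
            System.out.println(code + " -> " + classify(code));
        }
    }
}
```

 The point of the separate KILLED outcome is that a preempted attempt would not
 be counted toward the failure limit that eventually fails the job.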





[jira] [Commented] (MAPREDUCE-5940) Avoid negative elapsed time in JHS/MRAM web UI and services

2014-07-03 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051610#comment-14051610
 ] 

Zhijie Shen commented on MAPREDUCE-5940:


Thanks for the review, Junping and Devaraj.

bq. If System.currentTimeMillis() < started, then we can return -1 or 0 instead

IMHO, Times#elapsed is meant to compute the delta between two timestamps: started 
and finished. Given System.currentTimeMillis() < started <= finished, it should 
still be a valid case. To make sure the elapsed time is always 
non-negative, we need to check started <= finished, and return -1 if not.

bq. (and log a warn that clock not getting synchronized)
bq. Adding a warning/info message before making it as 0 would help to 
diagnose/find out the issues if any.
bq. Also adding a test in TestTimes.java could be a good idea.

Sounds like a good idea. Will address it in the new patch.

In addition, I will add a code comment to explicitly declare the behavior of 
Times#elapsed.
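
The behavior discussed in this comment can be sketched as follows. This is a minimal illustration, not the attached patch: the class and method here are hypothetical stand-ins for Times#elapsed, the convention that a timestamp of 0 or less means "not set" is an assumption, and the message printed to stderr stands in for whatever logger the real code uses.

```java
class ElapsedSketch {
    /**
     * Returns finished - started when both timestamps are set and
     * started <= finished; returns -1 otherwise. Timestamps are in
     * milliseconds, and values <= 0 are treated as "not set" (assumption).
     */
    static long elapsed(long started, long finished) {
        if (started <= 0 || finished <= 0) {
            return -1;
        }
        if (started > finished) {
            // Likely unsynchronized clocks; warn and mask the negative delta.
            System.err.println("WARN: finished time " + finished
                + " is ahead of started time " + started);
            return -1;
        }
        return finished - started;
    }

    public static void main(String[] args) {
        System.out.println(elapsed(1000L, 1500L)); // ordered timestamps
        System.out.println(elapsed(1500L, 1000L)); // skewed clocks
    }
}
```

Returning -1 rather than 0 keeps a skewed measurement distinguishable from a genuinely instantaneous one, where started equals finished.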

 Avoid negative elapsed time in JHS/MRAM web UI and services
 ---

 Key: MAPREDUCE-5940
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5940
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, mr-am, webapps
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Attachments: MAPREDUCE-5940.1.patch, MAPREDUCE-5940.2.patch


 Recently we observed a rare bug where the elapsed time of a reducer is shown 
 as negative on the JHS web UI and via the REST APIs. While the real cause of this 
 bug seems to be unsynchronized clocks on different hosts, the web frontend 
 should have masked the negative values. However, in the current code, 
 *org.apache.hadoop.mapreduce.v2.app.webapp.dao.** only checks whether the 
 elapsed time is -1 or not.





[jira] [Updated] (MAPREDUCE-5940) Avoid negative elapsed time in JHS/MRAM web UI and services

2014-07-03 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated MAPREDUCE-5940:
---

Attachment: MAPREDUCE-5940.2.patch

 Avoid negative elapsed time in JHS/MRAM web UI and services
 ---

 Key: MAPREDUCE-5940
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5940
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, mr-am, webapps
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Attachments: MAPREDUCE-5940.1.patch, MAPREDUCE-5940.2.patch


 Recently we observed a rare bug where the elapsed time of a reducer is shown 
 as negative on the JHS web UI and via the REST APIs. While the real cause of this 
 bug seems to be unsynchronized clocks on different hosts, the web frontend 
 should have masked the negative values. However, in the current code, 
 *org.apache.hadoop.mapreduce.v2.app.webapp.dao.** only checks whether the 
 elapsed time is -1 or not.





[jira] [Updated] (MAPREDUCE-5617) map task is not re-launched when the task is failed while reducers are running with full cluster capacity - which will lead to job hang

2014-07-03 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated MAPREDUCE-5617:
---

Attachment: Yarn-1408.7.1.patch

 map task is not re-launched when the task is failed while reducers are 
 running with full cluster capacity - which will lead to job hang
 ---

 Key: MAPREDUCE-5617
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5617
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.2.0
 Environment: SuSe Linux
Reporter: Sunil G
Priority: Critical
 Attachments: Yarn-1408.7.1.patch


 In a cluster with 16GB capacity, a job was started with 100 maps and 10 
 reducers. 
 After the reducers had started executing, one NM went down, which 
 caused 2 maps to fail. At that time, the remaining 8GB was used by 6 
 reducers and the AM, so there was no place to launch the failed maps. [The NM never 
 came up again, and the cluster size became 8GB.]
 Even if we kill one of the reducers, the map cannot be launched, because the 
 priority of the failed map is lower than that of the reducer. So only the remaining 
 reducers will get allocated from the RM side.
 This causes a hang on the reducer side. 
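
 The starvation described above can be illustrated with a toy allocator. This
 is a sketch of the scenario as the reporter describes it, not of the actual RM
 or MR AM scheduling code: the Request class, the numeric rank values, and the
 allocate method are all hypothetical, and the number of pending requests in the
 usage example is an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class StarvationSketch {
    /** Hypothetical pending container request; a larger rank is served first. */
    static final class Request {
        final String kind;
        final int rank;

        Request(String kind, int rank) {
            this.kind = kind;
            this.rank = rank;
        }
    }

    /** Hands each freed slot to the highest-ranked pending request. */
    static List<String> allocate(List<Request> pending, int freedSlots) {
        PriorityQueue<Request> queue = new PriorityQueue<>(
            Comparator.comparingInt((Request r) -> r.rank).reversed());
        queue.addAll(pending);
        List<String> granted = new ArrayList<>();
        for (int i = 0; i < freedSlots && !queue.isEmpty(); i++) {
            granted.add(queue.poll().kind);
        }
        return granted;
    }

    public static void main(String[] args) {
        List<Request> pending = new ArrayList<>();
        pending.add(new Request("failed-map", 1)); // ranked below reducers, per the report
        pending.add(new Request("failed-map", 1));
        for (int i = 0; i < 5; i++) {
            pending.add(new Request("reducer", 2)); // pending reducers (assumed count)
        }
        // Killing one reducer frees one slot, which goes to another reducer.
        System.out.println(allocate(pending, 1));
    }
}
```

 As long as any reducer request is pending, every freed slot goes to a reducer
 in this model, so the failed maps never run and the reducers wait on map output
 that never arrives.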





[jira] [Updated] (MAPREDUCE-5617) map task is not re-launched when the task is failed while reducers are running with full cluster capacity - which will lead to job hang

2014-07-03 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated MAPREDUCE-5617:
---

Attachment: Yarn.7.2.patch

 map task is not re-launched when the task is failed while reducers are 
 running with full cluster capacity - which will lead to job hang
 ---

 Key: MAPREDUCE-5617
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5617
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.2.0
 Environment: SuSe Linux
Reporter: Sunil G
Priority: Critical
 Attachments: Yarn.7.1.patch, Yarn.7.2.patch


 In a cluster with 16GB capacity, a job was started with 100 maps and 10 
 reducers. 
 After the reducers had started executing, one NM went down, which 
 caused 2 maps to fail. At that time, the remaining 8GB was used by 6 
 reducers and the AM, so there was no place to launch the failed maps. [The NM never 
 came up again, and the cluster size became 8GB.]
 Even if we kill one of the reducers, the map cannot be launched, because the 
 priority of the failed map is lower than that of the reducer. So only the remaining 
 reducers will get allocated from the RM side.
 This causes a hang on the reducer side. 





[jira] [Updated] (MAPREDUCE-5617) map task is not re-launched when the task is failed while reducers are running with full cluster capacity - which will lead to job hang

2014-07-03 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated MAPREDUCE-5617:
---

Attachment: Yarn.7.1.patch

 map task is not re-launched when the task is failed while reducers are 
 running with full cluster capacity - which will lead to job hang
 ---

 Key: MAPREDUCE-5617
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5617
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.2.0
 Environment: SuSe Linux
Reporter: Sunil G
Priority: Critical
 Attachments: Yarn.7.1.patch, Yarn.7.2.patch


 In a cluster with 16GB capacity, a job was started with 100 maps and 10 
 reducers. 
 After the reducers had started executing, one NM went down, which 
 caused 2 maps to fail. At that time, the remaining 8GB was used by 6 
 reducers and the AM, so there was no place to launch the failed maps. [The NM never 
 came up again, and the cluster size became 8GB.]
 Even if we kill one of the reducers, the map cannot be launched, because the 
 priority of the failed map is lower than that of the reducer. So only the remaining 
 reducers will get allocated from the RM side.
 This causes a hang on the reducer side. 





[jira] [Updated] (MAPREDUCE-5617) map task is not re-launched when the task is failed while reducers are running with full cluster capacity - which will lead to job hang

2014-07-03 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated MAPREDUCE-5617:
---

Attachment: (was: Yarn-1408.7.1.patch)

 map task is not re-launched when the task is failed while reducers are 
 running with full cluster capacity - which will lead to job hang
 ---

 Key: MAPREDUCE-5617
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5617
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.2.0
 Environment: SuSe Linux
Reporter: Sunil G
Priority: Critical
 Attachments: Yarn.7.1.patch, Yarn.7.2.patch


 In a cluster with 16GB capacity, a job was started with 100 maps and 10 
 reducers. 
 After the reducers had started executing, one NM went down, which 
 caused 2 maps to fail. At that time, the remaining 8GB was used by 6 
 reducers and the AM, so there was no place to launch the failed maps. [The NM never 
 came up again, and the cluster size became 8GB.]
 Even if we kill one of the reducers, the map cannot be launched, because the 
 priority of the failed map is lower than that of the reducer. So only the remaining 
 reducers will get allocated from the RM side.
 This causes a hang on the reducer side. 





[jira] [Updated] (MAPREDUCE-5617) map task is not re-launched when the task is failed while reducers are running with full cluster capacity - which will lead to job hang

2014-07-03 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated MAPREDUCE-5617:
---

Attachment: (was: Yarn.7.1.patch)

 map task is not re-launched when the task is failed while reducers are 
 running with full cluster capacity - which will lead to job hang
 ---

 Key: MAPREDUCE-5617
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5617
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.2.0
 Environment: SuSe Linux
Reporter: Sunil G
Priority: Critical

 In a cluster with 16GB capacity, a job was started with 100 maps and 10 
 reducers. 
 After the reducers had started executing, one NM went down, which 
 caused 2 maps to fail. At that time, the remaining 8GB was used by 6 
 reducers and the AM, so there was no place to launch the failed maps. [The NM never 
 came up again, and the cluster size became 8GB.]
 Even if we kill one of the reducers, the map cannot be launched, because the 
 priority of the failed map is lower than that of the reducer. So only the remaining 
 reducers will get allocated from the RM side.
 This causes a hang on the reducer side. 





[jira] [Updated] (MAPREDUCE-5617) map task is not re-launched when the task is failed while reducers are running with full cluster capacity - which will lead to job hang

2014-07-03 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated MAPREDUCE-5617:
---

Attachment: (was: Yarn.7.2.patch)

 map task is not re-launched when the task is failed while reducers are 
 running with full cluster capacity - which will lead to job hang
 ---

 Key: MAPREDUCE-5617
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5617
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.2.0
 Environment: SuSe Linux
Reporter: Sunil G
Priority: Critical

 In a cluster with 16GB capacity, a job was started with 100 maps and 10 
 reducers. 
 After the reducers had started executing, one NM went down, which 
 caused 2 maps to fail. At that time, the remaining 8GB was used by 6 
 reducers and the AM, so there was no place to launch the failed maps. [The NM never 
 came up again, and the cluster size became 8GB.]
 Even if we kill one of the reducers, the map cannot be launched, because the 
 priority of the failed map is lower than that of the reducer. So only the remaining 
 reducers will get allocated from the RM side.
 This causes a hang on the reducer side. 





[jira] [Commented] (MAPREDUCE-5940) Avoid negative elapsed time in JHS/MRAM web UI and services

2014-07-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052116#comment-14052116
 ] 

Junping Du commented on MAPREDUCE-5940:
---

Kicked off the Jenkins test.
The patch looks good to me. [~devaraj.k], are you OK with the new patch? If so, I 
will commit it once Jenkins gives a +1.

 Avoid negative elapsed time in JHS/MRAM web UI and services
 ---

 Key: MAPREDUCE-5940
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5940
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, mr-am, webapps
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Attachments: MAPREDUCE-5940.1.patch, MAPREDUCE-5940.2.patch


 Recently we observed a rare bug where the elapsed time of a reducer is shown 
 as negative on the JHS web UI and via the REST APIs. While the real cause of this 
 bug seems to be unsynchronized clocks on different hosts, the web frontend 
 should have masked the negative values. However, in the current code, 
 *org.apache.hadoop.mapreduce.v2.app.webapp.dao.** only checks whether the 
 elapsed time is -1 or not.





[jira] [Commented] (MAPREDUCE-5940) Avoid negative elapsed time in JHS/MRAM web UI and services

2014-07-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052135#comment-14052135
 ] 

Hadoop QA commented on MAPREDUCE-5940:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12653860/MAPREDUCE-5940.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4712//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4712//console

This message is automatically generated.

 Avoid negative elapsed time in JHS/MRAM web UI and services
 ---

 Key: MAPREDUCE-5940
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5940
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, mr-am, webapps
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Attachments: MAPREDUCE-5940.1.patch, MAPREDUCE-5940.2.patch


 Recently we observed a rare bug where the elapsed time of a reducer is shown 
 as negative on the JHS web UI and via the REST APIs. While the real cause of this 
 bug seems to be unsynchronized clocks on different hosts, the web frontend 
 should have masked the negative values. However, in the current code, 
 *org.apache.hadoop.mapreduce.v2.app.webapp.dao.** only checks whether the 
 elapsed time is -1 or not.





[jira] [Commented] (MAPREDUCE-5940) Avoid negative elapsed time in JHS/MRAM web UI and services

2014-07-03 Thread Devaraj K (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052166#comment-14052166
 ] 

Devaraj K commented on MAPREDUCE-5940:
--

+1, Latest patch looks good to me.


 Avoid negative elapsed time in JHS/MRAM web UI and services
 ---

 Key: MAPREDUCE-5940
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5940
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, mr-am, webapps
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Attachments: MAPREDUCE-5940.1.patch, MAPREDUCE-5940.2.patch


 Recently we observed a rare bug where the elapsed time of a reducer is shown 
 as negative on the JHS web UI and via the REST APIs. While the real cause of this 
 bug seems to be unsynchronized clocks on different hosts, the web frontend 
 should have masked the negative values. However, in the current code, 
 *org.apache.hadoop.mapreduce.v2.app.webapp.dao.** only checks whether the 
 elapsed time is -1 or not.





[jira] [Commented] (MAPREDUCE-5868) TestPipeApplication causing nightly build to fail

2014-07-03 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052176#comment-14052176
 ] 

Akira AJISAKA commented on MAPREDUCE-5868:
--

The patch makes the existing test pass, so new tests are not needed.
In addition, Jenkins didn't run TestPipeApplication since the test is in the 
hadoop-mapreduce-client-jobclient project. I've run the test locally.

 TestPipeApplication causing nightly build to fail
 -

 Key: MAPREDUCE-5868
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5868
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: trunk
Reporter: Jason Lowe
Assignee: Akira AJISAKA
 Attachments: MAPREDUCE-5868.2.patch, MAPREDUCE-5868.3.patch, 
 TestPipeApplication.stack, jstack.log, mapreduce-5868-v1.txt


 TestPipeApplication appears to be timing out which causes the nightly build 
 to fail.


