[jira] [Commented] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes

2012-03-27 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239428#comment-13239428
 ] 

Hudson commented on MAPREDUCE-3353:
---

Integrated in Hadoop-Hdfs-trunk #997 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/997/])
MAPREDUCE-3353. Fixed commit msg to point to right jira. (Revision 1305457)

 Result = FAILURE
acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1305457
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt


 Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes
 -

 Key: MAPREDUCE-3353
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3353
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mrv2, resourcemanager
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Bikas Saha
 Fix For: 0.23.3

 Attachments: MAPREDUCE-3353-branch-0.23.patch, 
 MAPREDUCE-3353-branch-0.23.patch, MAPREDUCE-3353-branch-0.23.patch, 
 MAPREDUCE-3353-branch-0.23.patch, MAPREDUCE-3353-branch-0.23.patch, 
 MAPREDUCE-3353-branch-0.23.patch, MAPREDUCE-3353-branch-0.23.patch


 When a node gets lost or turns faulty, the AM needs to know about that event 
 so that it can take action, e.g. re-executing map tasks whose intermediate 
 output lives on that faulty node.
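
 As a rough illustration of the reaction described above, here is a minimal Java sketch. It is not code from the patch: the class LostNodeRescheduler, the method mapsToRerun, and the idea that the AM keeps a map from each completed map task to the node that holds its output are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of the MAPREDUCE-3353 patch.
final class LostNodeRescheduler {
    private LostNodeRescheduler() {}

    /**
     * Returns the ids of completed map tasks whose intermediate output
     * was stored on a node that has been reported lost or unhealthy.
     * The result is sorted so that callers see a stable order.
     */
    static List<String> mapsToRerun(Map<String, String> completedMapToNode,
                                    Set<String> unusableNodes) {
        List<String> rerun = new ArrayList<>();
        for (Map.Entry<String, String> entry : completedMapToNode.entrySet()) {
            // entry.getValue() is the node that holds this task's output.
            if (unusableNodes.contains(entry.getValue())) {
                rerun.add(entry.getKey());
            }
        }
        Collections.sort(rerun);
        return rerun;
    }
}
```

 Running map tasks on the bad node would presumably be handled through the usual container-failure path, so the sketch only looks at tasks that already finished.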

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes

2012-03-27 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239475#comment-13239475
 ] 

Hudson commented on MAPREDUCE-3353:
---

Integrated in Hadoop-Mapreduce-trunk #1032 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1032/])
MAPREDUCE-3353. Fixed commit msg to point to right jira. (Revision 1305457)

 Result = SUCCESS
acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1305457
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt






[jira] [Commented] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes

2012-03-26 Thread Robert Joseph Evans (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238393#comment-13238393
 ] 

Robert Joseph Evans commented on MAPREDUCE-3353:


Arun,

It looks like you put in MAPREDUCE-3533 instead of MAPREDUCE-3353 in 
CHANGES.txt. Could you please fix it?





[jira] [Commented] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes

2012-03-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238601#comment-13238601
 ] 

Hudson commented on MAPREDUCE-3353:
---

Integrated in Hadoop-Common-trunk-Commit #1930 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1930/])
MAPREDUCE-3353. Fixed commit msg to point to right jira. (Revision 1305457)

 Result = SUCCESS
acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1305457
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt






[jira] [Commented] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes

2012-03-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238603#comment-13238603
 ] 

Hudson commented on MAPREDUCE-3353:
---

Integrated in Hadoop-Hdfs-0.23-Commit #719 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Commit/719/])
MAPREDUCE-3353. Fixed commit msg to point to right jira. (Revision 1305458)

 Result = SUCCESS
acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1305458
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt






[jira] [Commented] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes

2012-03-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238617#comment-13238617
 ] 

Hudson commented on MAPREDUCE-3353:
---

Integrated in Hadoop-Hdfs-trunk-Commit #2005 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2005/])
MAPREDUCE-3353. Fixed commit msg to point to right jira. (Revision 1305457)

 Result = SUCCESS
acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1305457
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt






[jira] [Commented] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes

2012-03-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238624#comment-13238624
 ] 

Hudson commented on MAPREDUCE-3353:
---

Integrated in Hadoop-Common-0.23-Commit #729 (See 
[https://builds.apache.org/job/Hadoop-Common-0.23-Commit/729/])
MAPREDUCE-3353. Fixed commit msg to point to right jira. (Revision 1305458)

 Result = SUCCESS
acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1305458
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt






[jira] [Commented] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes

2012-03-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238683#comment-13238683
 ] 

Hudson commented on MAPREDUCE-3353:
---

Integrated in Hadoop-Mapreduce-0.23-Commit #738 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/738/])
MAPREDUCE-3353. Fixed commit msg to point to right jira. (Revision 1305458)

 Result = ABORTED
acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1305458
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt






[jira] [Commented] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes

2012-03-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238703#comment-13238703
 ] 

Hudson commented on MAPREDUCE-3353:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #1941 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1941/])
MAPREDUCE-3353. Fixed commit msg to point to right jira. (Revision 1305457)

 Result = ABORTED
acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1305457
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt






[jira] [Commented] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes

2012-03-20 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233248#comment-13233248
 ] 

Hadoop QA commented on MAPREDUCE-3353:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12519012/MAPREDUCE-3353-branch-0.23.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 17 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The applied patch generated 507 javac compiler warnings (more 
than the trunk's current 505 warnings).

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2074//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2074//console

This message is automatically generated.

 Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes
 -

 Key: MAPREDUCE-3353
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3353
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mrv2, resourcemanager
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Bikas Saha
 Fix For: 0.23.2

 Attachments: MAPREDUCE-3353-branch-0.23.patch, 
 MAPREDUCE-3353-branch-0.23.patch, MAPREDUCE-3353-branch-0.23.patch, 
 MAPREDUCE-3353-branch-0.23.patch, MAPREDUCE-3353-branch-0.23.patch, 
 MAPREDUCE-3353-branch-0.23.patch, MAPREDUCE-3353-branch-0.23.patch


 When a node gets lost or turns faulty, the AM needs to know about that event 
 so that it can take action, e.g. re-executing map tasks whose intermediate 
 output lives on that faulty node.





[jira] [Commented] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes

2012-03-20 Thread Arun C Murthy (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233361#comment-13233361
 ] 

Arun C Murthy commented on MAPREDUCE-3353:
--

Bikas, can you pls look at the javac warnings? Thanks.






[jira] [Commented] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes

2012-03-20 Thread Bikas Saha (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233568#comment-13233568
 ] 

Bikas Saha commented on MAPREDUCE-3353:
---

They have already been clarified in the first patch submission. Pasting from an 
earlier comment.

The javac warnings are caused by event handlers being called in 
NodesListManager.java and are similar to pre-existing warnings.
===
[WARNING] /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/NodesListManager.java:[142,19] [unchecked] unchecked call to handle(T) as a member of the raw type org.apache.hadoop.yarn.event.EventHandler
[WARNING] /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/NodesListManager.java:[155,21] [unchecked] unchecked call to handle(T) as a member of the raw type org.apache.hadoop.yarn.event.EventHandler
[WARNING] /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmcontainer/RMContainerImpl.java:[244,35] [unchecked] unchecked call to handle(T) as a member of the raw type org.apache.hadoop.yarn.event.EventHandler
===
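
For readers unfamiliar with this class of warning, the following self-contained sketch reproduces it outside Hadoop. The EventHandler interface below is a hypothetical stand-in with the same handle(T) shape as the one named in the warnings, and RawHandlerDemo, NodeEvent, recorder, dispatchRaw and dispatchTyped are made-up names for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in types; not the Hadoop classes themselves.
final class RawHandlerDemo {
    interface EventHandler<T> {
        void handle(T event);
    }

    static final class NodeEvent {
        final String nodeId;
        NodeEvent(String nodeId) { this.nodeId = nodeId; }
    }

    // Records the node ids of every event that reached a handler.
    static final List<String> seen = new ArrayList<>();

    static EventHandler<NodeEvent> recorder() {
        return event -> seen.add(event.nodeId);
    }

    // Calling handle(...) through the raw type is what javac reports as
    // "unchecked call to handle(T) as a member of the raw type".
    // The annotation silences the warning here; it does not make the call safe.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static void dispatchRaw(EventHandler handler, Object event) {
        handler.handle(event);
    }

    // With the type parameter supplied, the compiler checks the argument
    // and no warning is produced.
    static void dispatchTyped(EventHandler<NodeEvent> handler, NodeEvent event) {
        handler.handle(event);
    }

    public static void main(String[] args) {
        dispatchTyped(recorder(), new NodeEvent("node-1"));
        dispatchRaw(recorder(), new NodeEvent("node-2"));
        System.out.println(seen);
    }
}
```

The raw call compiles and works as long as the event really has the expected type. Passing anything else through dispatchRaw fails only at run time with a ClassCastException, which is the risk the warning points at.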





[jira] [Commented] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes

2012-03-19 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233157#comment-13233157
 ] 

Hadoop QA commented on MAPREDUCE-3353:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12519006/MAPREDUCE-3353-branch-0.23.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 20 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The applied patch generated 507 javac compiler warnings (more 
than the trunk's current 505 warnings).

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2071//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2071//console

This message is automatically generated.





[jira] [Commented] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes

2012-03-19 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233178#comment-13233178
 ] 

Hadoop QA commented on MAPREDUCE-3353:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12519012/MAPREDUCE-3353-branch-0.23.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 17 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The applied patch generated 507 javac compiler warnings (more 
than the trunk's current 505 warnings).

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2072//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2072//console

This message is automatically generated.





[jira] [Commented] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes

2012-03-13 Thread Arun C Murthy (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228739#comment-13228739
 ] 

Arun C Murthy commented on MAPREDUCE-3353:
--

Looks good, some minor nits:

# RegisterApplicationMasterResponse.(get,set)UnusableNodes is not used right 
now, so let's add it later when we need it.
# AMResponse.getUpdatedNodes could use javadocs.
# BuilderUtils.createNodeReport
# RMContextImpl.applications should be changed to ConcurrentSkipListMap to be 
safe. We should open a separate jira to fix the signature of RMContext.get*, 
which return ConcurrentMap.
# RMAppImpl.pullRMNodeUpdates needs a writeLock since it clears the pending updates.
# RMAppImpl should ignore NodeUpdate in COMPLETED state (thus we can remove the 
'if' condition in RMAppNodeUpdateTransition).
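
The writeLock point can be illustrated with a small sketch. This is not the RMAppImpl code: NodeUpdateBuffer, add, pending and pullUpdates are hypothetical names, and node ids are plain strings here. The idea it shows is that a method which both reads and clears shared state is a writer and has to take the write lock, even though callers think of it as a "get".

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical buffer of node updates waiting to be pulled by a consumer.
final class NodeUpdateBuffer {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // LinkedHashSet drops duplicate node ids and keeps arrival order.
    private final Set<String> updatedNodes = new LinkedHashSet<>();

    void add(String nodeId) {
        lock.writeLock().lock();
        try {
            updatedNodes.add(nodeId);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Pure read: the read lock is enough.
    int pending() {
        lock.readLock().lock();
        try {
            return updatedNodes.size();
        } finally {
            lock.readLock().unlock();
        }
    }

    // Copies the pending updates and clears them. Because it mutates the
    // set, it must hold the write lock; under a read lock two callers
    // could both copy and clear at the same time.
    List<String> pullUpdates() {
        lock.writeLock().lock();
        try {
            List<String> drained = new ArrayList<>(updatedNodes);
            updatedNodes.clear();
            return drained;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Taking the copy and the clear under one write lock also means an update added concurrently is either returned by this pull or left for the next one, never lost.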


 Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes
 -

 Key: MAPREDUCE-3353
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3353
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mrv2, resourcemanager
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Bikas Saha
 Fix For: 0.23.2

 Attachments: MAPREDUCE-3353-branch-0.23.patch, 
 MAPREDUCE-3353-branch-0.23.patch, MAPREDUCE-3353-branch-0.23.patch, 
 MAPREDUCE-3353-branch-0.23.patch


 When a node gets lost or turns faulty, the AM needs to know about that event 
 so that it can take action, e.g. re-executing map tasks whose intermediate 
 output lives on that faulty node.





[jira] [Commented] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes

2012-03-05 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222848#comment-13222848
 ] 

Hadoop QA commented on MAPREDUCE-3353:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12517166/MAPREDUCE-3353-branch-0.23.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 29 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The applied patch generated 503 javac compiler warnings (more 
than the trunk's current 501 warnings).

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2008//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2008//console

This message is automatically generated.

 Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes
 -

 Key: MAPREDUCE-3353
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3353
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mrv2, resourcemanager
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Bikas Saha
 Fix For: 0.23.2

 Attachments: MAPREDUCE-3353-branch-0.23.patch, 
 MAPREDUCE-3353-branch-0.23.patch, MAPREDUCE-3353-branch-0.23.patch, 
 MAPREDUCE-3353-branch-0.23.patch


 When a node gets lost or turns faulty, the AM needs to know about that event so 
 that it can take some action, e.g. re-executing map tasks whose 
 intermediate output lives on that faulty node.





[jira] [Commented] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes

2012-03-05 Thread Bikas Saha (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222947#comment-13222947
 ] 

Bikas Saha commented on MAPREDUCE-3353:
---

The test failure is unrelated to the patch and also happens on trunk; see MAPREDUCE-3976.


 Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes
 -

 Key: MAPREDUCE-3353
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3353
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mrv2, resourcemanager
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Bikas Saha
 Fix For: 0.23.2

 Attachments: MAPREDUCE-3353-branch-0.23.patch, 
 MAPREDUCE-3353-branch-0.23.patch, MAPREDUCE-3353-branch-0.23.patch, 
 MAPREDUCE-3353-branch-0.23.patch


 When a node gets lost or turns faulty, the AM needs to know about that event so 
 that it can take some action, e.g. re-executing map tasks whose 
 intermediate output lives on that faulty node.





[jira] [Commented] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes

2012-03-04 Thread Amol Kekre (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222111#comment-13222111
 ] 

Amol Kekre commented on MAPREDUCE-3353:
---

Not sure why this jira is marked Critical. This only has an impact if a node 
goes bad during the AM's life span, right? If so, given 3 attempts by MR, how 
important is this jira (Major?)

 Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes
 -

 Key: MAPREDUCE-3353
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3353
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mrv2, resourcemanager
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Bikas Saha
Priority: Critical
 Fix For: 0.23.2

 Attachments: MAPREDUCE-3353-branch-0.23.patch, 
 MAPREDUCE-3353-branch-0.23.patch, MAPREDUCE-3353-branch-0.23.patch


 When a node gets lost or turns faulty, the AM needs to know about that event so 
 that it can take some action, e.g. re-executing map tasks whose 
 intermediate output lives on that faulty node.





[jira] [Commented] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes

2012-02-29 Thread Amol Kekre (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219337#comment-13219337
 ] 

Amol Kekre commented on MAPREDUCE-3353:
---

Any updates?

 Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes
 -

 Key: MAPREDUCE-3353
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3353
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mrv2, resourcemanager
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Bikas Saha
Priority: Critical
 Fix For: 0.23.2


 When a node gets lost or turns faulty, the AM needs to know about that event so 
 that it can take some action, e.g. re-executing map tasks whose 
 intermediate output lives on that faulty node.





[jira] [Commented] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes

2012-02-29 Thread Bikas Saha (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219400#comment-13219400
 ] 

Bikas Saha commented on MAPREDUCE-3353:
---

The changes turned out to be more than initially expected. I have the code done 
and will start on the tests.

 Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes
 -

 Key: MAPREDUCE-3353
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3353
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mrv2, resourcemanager
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Bikas Saha
Priority: Critical
 Fix For: 0.23.2


 When a node gets lost or turns faulty, the AM needs to know about that event so 
 that it can take some action, e.g. re-executing map tasks whose 
 intermediate output lives on that faulty node.





[jira] [Commented] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes

2012-02-23 Thread Bikas Saha (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13215306#comment-13215306
 ] 

Bikas Saha commented on MAPREDUCE-3353:
---

A potential solution would be the following:
1) Have the scheduler interface return the set of bad nodes on which it has 
stopped scheduling. This keeps the decision of which node is bad in the 
scheduler. The scheduler is the ultimate authority on what runs on a node and 
should tell its clients about the nodes that it is not considering for 
scheduling.
2) Item 1) above could be done as another interface API or piggybacked on the 
scheduler.allocate() API.
3) The response could contain all the known bad nodes or deltas relative to 
the previous response. Deltas are cheaper to send but are susceptible to 
message loss and retransmission. Also, deltas would have to be divided into 
new bad nodes and new good nodes.
4) The AM might want to know the type of bad node, say lost or unhealthy. 
The bad-node information could be enhanced by querying the RMNode object for 
the actual reason/health.

As an enhancement, we could add a new RMNodeManager entity that manages all 
the RMNodes. The above functionality could move from the scheduler into 
RMNodeManager (though it would need to be in sync with the scheduler). After 
that, getting detailed information may not need direct access to the RMNode 
object. Potentially, other interactions with RMNode could be forwarded through 
the RMNodeManager. But this would be a fairly significant refactoring that's 
best left to a separate future work item.
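
The piggybacking option in points 2) to 4) can be sketched roughly as follows. 
This is an illustrative Python sketch, not Hadoop's Java API: AllocateResponse, 
NodeState, new_bad_nodes, new_good_nodes and handle_response are hypothetical 
names chosen for the example. It shows an allocate response that carries the 
delta lists plus a reason per bad node, and an AM that picks the map tasks to 
re-execute.

```python
from dataclasses import dataclass, field
from enum import Enum


class NodeState(Enum):
    # Hypothetical labels for why the scheduler stopped using a node.
    LOST = "lost"
    UNHEALTHY = "unhealthy"


@dataclass
class AllocateResponse:
    # Containers granted in this heartbeat (opaque ids in this sketch).
    containers: list = field(default_factory=list)
    # Nodes that turned bad since the previous response, mapped to the reason.
    new_bad_nodes: dict = field(default_factory=dict)
    # Nodes that were reported bad earlier and are usable again.
    new_good_nodes: set = field(default_factory=set)


def handle_response(response, outputs_by_node):
    """Return the map task ids whose intermediate output sat on a bad node."""
    to_rerun = []
    for node in sorted(response.new_bad_nodes):
        to_rerun.extend(outputs_by_node.get(node, []))
    return to_rerun
```

Carrying the reason alongside each node lets the AM decide per state, for 
example treating a lost node differently from an unhealthy one, without a 
separate query.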

 Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes
 -

 Key: MAPREDUCE-3353
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3353
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mrv2, resourcemanager
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Bikas Saha
Priority: Critical
 Fix For: 0.23.2


 When a node gets lost or turns faulty, the AM needs to know about that event so 
 that it can take some action, e.g. re-executing map tasks whose 
 intermediate output lives on that faulty node.





[jira] [Commented] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes

2012-02-23 Thread Bikas Saha (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13215328#comment-13215328
 ] 

Bikas Saha commented on MAPREDUCE-3353:
---

Not doing deltas on the RM-AM channel does not seem viable because of the 
high-frequency message traffic. Sending information about 100 bad nodes at 100 
bytes per node to 1000 AMs every second is about 10 MB/s of traffic.
Sending deltas means tracking the last and current states on the RM on a per-AM 
attempt basis. That would not be good to do in the scheduler because it's not 
the responsibility of the scheduler. So this needs to be done on each RMAttempt 
object. The RMAttempt object gets the current list of bad nodes and compares it 
with its last known list of bad nodes. Additions and deletions are sent to the 
AM as new bad and good nodes.
Alternatively, each RMNode could send an event to each RMAppAttempt for 
healthy-to-unhealthy and unhealthy-to-healthy transitions. These events could 
be accumulated and copied to the AM via the allocate response.
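
The per-attempt delta tracking and the traffic estimate above can be sketched 
as follows. This is a minimal Python illustration under stated assumptions, not 
the patch itself: AttemptNodeTracker, delta and full_list_traffic are 
hypothetical names, and nodes are represented as plain strings.

```python
class AttemptNodeTracker:
    """Per-AM-attempt record of the bad nodes last reported to that AM."""

    def __init__(self):
        self.last_reported = set()

    def delta(self, current_bad):
        """Compare the current bad-node list with the last one sent."""
        current = set(current_bad)
        new_bad = current - self.last_reported    # additions
        new_good = self.last_reported - current   # deletions
        self.last_reported = current
        return new_bad, new_good


def full_list_traffic(bad_nodes, bytes_per_node, num_ams, heartbeats_per_sec=1):
    """Bytes per second if every AM receives the full list on each heartbeat."""
    return bad_nodes * bytes_per_node * num_ams * heartbeats_per_sec


# 100 bad nodes * 100 bytes * 1000 AMs * 1 heartbeat/s = 10,000,000 bytes/s
print(full_list_traffic(100, 100, 1000))
```

With deltas, a steady state in which no node changes health produces two empty 
sets per heartbeat, so the per-AM cost no longer scales with the number of bad 
nodes.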

 Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes
 -

 Key: MAPREDUCE-3353
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3353
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mrv2, resourcemanager
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Bikas Saha
Priority: Critical
 Fix For: 0.23.2


 When a node gets lost or turns faulty, the AM needs to know about that event so 
 that it can take some action, e.g. re-executing map tasks whose 
 intermediate output lives on that faulty node.





[jira] [Commented] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes

2012-02-21 Thread Amol Kekre (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13212963#comment-13212963
 ] 

Amol Kekre commented on MAPREDUCE-3353:
---

Vinod,
Should this be in a .23.1 RC or can we move it to .23.2?

 Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes
 -

 Key: MAPREDUCE-3353
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3353
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mrv2, resourcemanager
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Priority: Critical
 Fix For: 0.23.1


 When a node gets lost or turns faulty, the AM needs to know about that event so 
 that it can take some action, e.g. re-executing map tasks whose 
 intermediate output lives on that faulty node.
