[jira] [Commented] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts

2014-05-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996771#comment-13996771
 ] 

Hadoop QA commented on MAPREDUCE-5652:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12644621/MAPREDUCE-5652-v10.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4601//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4601//console

This message is automatically generated.

 NM Recovery. ShuffleHandler should handle NM restarts
 -

 Key: MAPREDUCE-5652
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5652
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Karthik Kambatla
Assignee: Jason Lowe
  Labels: shuffle
 Attachments: MAPREDUCE-5652-v10.patch, MAPREDUCE-5652-v2.patch, 
 MAPREDUCE-5652-v3.patch, MAPREDUCE-5652-v4.patch, MAPREDUCE-5652-v5.patch, 
 MAPREDUCE-5652-v6.patch, MAPREDUCE-5652-v7.patch, MAPREDUCE-5652-v8.patch, 
 MAPREDUCE-5652-v9-and-YARN-1987.patch, MAPREDUCE-5652.patch


 ShuffleHandler should work across NM restarts and not require re-running 
 map-tasks. On NM restart, the map outputs are cleaned up, which requires 
 re-execution of map tasks; this should be avoided.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts

2014-05-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999002#comment-13999002
 ] 

Hudson commented on MAPREDUCE-5652:
---

SUCCESS: Integrated in Hadoop-trunk-Commit #5605 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5605/])
MAPREDUCE-5652. NM Recovery. ShuffleHandler should handle NM restarts. (Jason 
Lowe via kasha) (kasha: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1594329)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/LICENSE.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/dev-support/findbugs-exclude.xml
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/JobID.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/pom.xml
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/proto
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/proto/ShuffleHandlerRecovery.proto
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/test/java/org/apache/hadoop/mapred/TestShuffleHandler.java
* /hadoop/common/trunk/hadoop-mapreduce-project/pom.xml


 NM Recovery. ShuffleHandler should handle NM restarts
 -

 Key: MAPREDUCE-5652
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5652
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Karthik Kambatla
Assignee: Jason Lowe
  Labels: shuffle
 Fix For: 2.5.0

 Attachments: MAPREDUCE-5652-v10.patch, MAPREDUCE-5652-v2.patch, 
 MAPREDUCE-5652-v3.patch, MAPREDUCE-5652-v4.patch, MAPREDUCE-5652-v5.patch, 
 MAPREDUCE-5652-v6.patch, MAPREDUCE-5652-v7.patch, MAPREDUCE-5652-v8.patch, 
 MAPREDUCE-5652-v9-and-YARN-1987.patch, MAPREDUCE-5652.patch


 ShuffleHandler should work across NM restarts and not require re-running 
 map-tasks. On NM restart, the map outputs are cleaned up, which requires 
 re-execution of map tasks; this should be avoided.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts

2014-05-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998170#comment-13998170
 ] 

Hudson commented on MAPREDUCE-5652:
---

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1779 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1779/])
MAPREDUCE-5652. NM Recovery. ShuffleHandler should handle NM restarts. (Jason 
Lowe via kasha) (kasha: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1594329)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/LICENSE.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/dev-support/findbugs-exclude.xml
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/JobID.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/pom.xml
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/proto
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/proto/ShuffleHandlerRecovery.proto
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/test/java/org/apache/hadoop/mapred/TestShuffleHandler.java
* /hadoop/common/trunk/hadoop-mapreduce-project/pom.xml


[jira] [Commented] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts

2014-05-15 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996497#comment-13996497
 ] 

Karthik Kambatla commented on MAPREDUCE-5652:
-

+1, pending Jenkins.


[jira] [Commented] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts

2014-05-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998208#comment-13998208
 ] 

Hudson commented on MAPREDUCE-5652:
---

FAILURE: Integrated in Hadoop-Hdfs-trunk #1753 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1753/])
MAPREDUCE-5652. NM Recovery. ShuffleHandler should handle NM restarts. (Jason 
Lowe via kasha) (kasha: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1594329)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/LICENSE.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/dev-support/findbugs-exclude.xml
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/JobID.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/pom.xml
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/proto
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/proto/ShuffleHandlerRecovery.proto
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/test/java/org/apache/hadoop/mapred/TestShuffleHandler.java
* /hadoop/common/trunk/hadoop-mapreduce-project/pom.xml


[jira] [Commented] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts

2014-05-11 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1399#comment-1399
 ] 

Karthik Kambatla commented on MAPREDUCE-5652:
-

[~jlowe] - now that YARN-1987 is committed, would you mind updating the patch 
against the latest trunk? We should be able to get this in soon. 


[jira] [Commented] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts

2014-05-02 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987668#comment-13987668
 ] 

Jason Lowe commented on MAPREDUCE-5652:
---

bq. Just to confirm, protobuf should be backward compatible, e.g., the store 
state serialized with version 2.4 should be readable by NM/MR compiled with 
version 2.5.

Yes, the protobuf incompatibility between 2.4 and 2.5 is an issue with the 
interfaces to the protobuf code and not an incompatibility with the data layout 
of protobuf messages.

bq. On an unrelated note, based on how NM's AuxServices' serviceStart handles 
errors from each AuxService's serviceStart, if one AuxService throws an 
exception, the remaining AuxServices' serviceStart calls will be skipped.

I might be reading the code incorrectly, but it looks like 
AuxServices#serviceStart doesn't try to handle exceptions coming from 
individual aux services at all.  If one of their startups throws then it will 
be converted into a RuntimeException (by AbstractService#start) which will 
bubble up out of AuxServices and likely all the way up such that the NM startup 
will fail.
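
The propagation described above can be illustrated with a minimal, self-contained sketch. The classes below (SketchService, SketchAuxServices, StartFailureDemo) are hypothetical stand-ins, not the real YARN AbstractService or AuxServices code. The sketch assumes only what the comment states: a failed start is rethrown as a RuntimeException, and the parent does not catch it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a service base class; not the real YARN AbstractService.
abstract class SketchService {
    final String name;
    boolean started = false;

    SketchService(String name) { this.name = name; }

    protected abstract void serviceStart() throws Exception;

    // Any failure in serviceStart surfaces to the caller as a RuntimeException.
    final void start() {
        try {
            serviceStart();
            started = true;
        } catch (Exception e) {
            throw new RuntimeException("Failed to start " + name, e);
        }
    }
}

// Hypothetical parent that starts its children in order and catches nothing.
class SketchAuxServices extends SketchService {
    final List<SketchService> children = new ArrayList<>();

    SketchAuxServices() { super("aux-services"); }

    @Override
    protected void serviceStart() {
        for (SketchService child : children) {
            child.start(); // an exception here aborts the loop and propagates upward
        }
    }
}

class StartFailureDemo {
    // Builds a parent holding a failing child followed by a healthy one.
    static SketchAuxServices build() {
        SketchAuxServices aux = new SketchAuxServices();
        aux.children.add(new SketchService("failing") {
            @Override protected void serviceStart() throws Exception {
                throw new java.io.IOException("simulated startup failure");
            }
        });
        aux.children.add(new SketchService("healthy") {
            @Override protected void serviceStart() { }
        });
        return aux;
    }

    public static void main(String[] args) {
        SketchAuxServices aux = build();
        try {
            aux.start();
        } catch (RuntimeException e) {
            System.out.println("startup aborted: " + e.getMessage());
        }
        System.out.println("healthy child started: " + aux.children.get(1).started);
    }
}
```

In this sketch the healthy child is never started and the parent's own start fails, which matches both observations in the thread: later aux services are skipped, and the failure is not contained.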

As you pointed out before, a better way to handle aux services would be to run 
them outside of the NM (maybe even within containers).  It'd also be nice to 
make them more dynamic, such that application submissions can provide an aux 
service they require.  They could be started on demand, ref-counted, and 
stopped accordingly based on usage, which would be a smoother answer to rolling 
upgrades and sharing multiple versions of the aux services within the cluster.  
This is of course a non-trivial amount of work for another JIRA. ;-)
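
To make the ref-counting idea concrete, here is a rough sketch under stated assumptions. RefCountedServices and the service names ("shuffle-v1", "shuffle-v2") are hypothetical and do not exist in YARN; a real implementation would also have to load, start, and stop the actual service instances.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry: a service starts on first acquire and stops on last release.
class RefCountedServices {
    private final Map<String, Integer> refCounts = new HashMap<>();

    // Returns true if this call is the one that would start the service.
    synchronized boolean acquire(String serviceName) {
        int count = refCounts.merge(serviceName, 1, Integer::sum);
        return count == 1;
    }

    // Returns true if this call is the one that would stop the service.
    synchronized boolean release(String serviceName) {
        Integer count = refCounts.get(serviceName);
        if (count == null) {
            throw new IllegalStateException(serviceName + " is not running");
        }
        if (count == 1) {
            refCounts.remove(serviceName);
            return true;
        }
        refCounts.put(serviceName, count - 1);
        return false;
    }

    synchronized boolean isRunning(String serviceName) {
        return refCounts.containsKey(serviceName);
    }
}

class RefCountDemo {
    public static void main(String[] args) {
        RefCountedServices services = new RefCountedServices();
        // Two applications need version 1, one needs version 2 (names are made up).
        System.out.println("start v1: " + services.acquire("shuffle-v1"));
        System.out.println("start v1 again: " + services.acquire("shuffle-v1"));
        System.out.println("start v2: " + services.acquire("shuffle-v2"));
        System.out.println("stop v1: " + services.release("shuffle-v1"));
        System.out.println("stop v1 again: " + services.release("shuffle-v1"));
        System.out.println("v2 still running: " + services.isRunning("shuffle-v2"));
    }
}
```

Keying the counts by a versioned name is one way two versions of the same aux service could coexist during a rolling upgrade.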


[jira] [Commented] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts

2014-05-01 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987115#comment-13987115
 ] 

Ming Ma commented on MAPREDUCE-5652:


Sounds good, we can use a new JIRA to cover the best-effort work.

The patch looks good. Just to confirm, protobuf should be backward compatible, 
e.g., the store state serialized with version 2.4 should be readable by NM/MR 
compiled with version 2.5.

On an unrelated note, based on how NM's AuxServices' serviceStart handles errors 
from each AuxService's serviceStart, if one AuxService throws an exception, the 
remaining AuxServices' serviceStart calls will be skipped. That isn't important 
given we only have one AuxService. Perhaps there should be a policy around that 
as well: should the NM skip a failed AuxService? In general, we might need to 
improve AuxService handling if other AuxServices are added.


[jira] [Commented] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts

2014-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987248#comment-13987248
 ] 

Hadoop QA commented on MAPREDUCE-5652:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12642848/MAPREDUCE-5652-v9-and-YARN-1987.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4573//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4573//console

This message is automatically generated.


[jira] [Commented] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts

2014-04-30 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985932#comment-13985932
 ] 

Ming Ma commented on MAPREDUCE-5652:


1. Regarding a generic interface for restore/recover, I agree there is not much 
benefit in generalizing things for the sake of it. One scenario could be 
something like ShuffleHandler, where some ShuffleHandlers support recovery and 
some don't. The NM can ask a specific ShuffleHandler whether it supports 
recovery; the NM manages the underlying store and passes the store object to 
ShuffleHandler, and ShuffleHandler manages the serialization and 
deserialization, etc. If the NM decides to change the underlying store, 
ShuffleHandler doesn't need to change. But at this point, it seems unnecessary.
2. If ShuffleHandler gets a DBException during recoverState as part of 
serviceStart, should ShuffleHandler ignore the exception and continue as if the 
store doesn't exist? The argument for ignoring it is that it is soft state and 
ShuffleHandler can still run without it. Or maybe this can be configurable.
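
Both points can be sketched together in a few lines. Everything below is hypothetical and only illustrates the shape of the idea: RecoverableService, SketchStateStore, SketchStoreException (an unchecked stand-in for the DBException mentioned above), and the ignoreStoreErrors flag are invented names, not part of the patch or of YARN.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical unchecked stand-in for a store failure such as a DBException.
class SketchStoreException extends RuntimeException {
    SketchStoreException(String message) { super(message); }
}

// Hypothetical store owned by the NM; the service never sees the backing database.
interface SketchStateStore {
    Map<String, byte[]> loadAll();
}

// Hypothetical opt-in interface: only services implementing it are handed a store.
interface RecoverableService {
    void recoverState(SketchStateStore store);
}

class SketchShuffleService implements RecoverableService {
    final Map<String, byte[]> jobState = new HashMap<>();

    @Override
    public void recoverState(SketchStateStore store) {
        // The service owns (de)serialization of the values; raw bytes are kept here.
        jobState.putAll(store.loadAll());
    }
}

class RecoveryStarter {
    // Returns true if state was recovered. With ignoreStoreErrors set, a store
    // failure is treated like a missing store (soft state); otherwise it propagates.
    static boolean start(Object service, SketchStateStore store, boolean ignoreStoreErrors) {
        if (!(service instanceof RecoverableService)) {
            return false; // service does not support recovery
        }
        try {
            ((RecoverableService) service).recoverState(store);
            return true;
        } catch (SketchStoreException e) {
            if (ignoreStoreErrors) {
                return false;
            }
            throw e;
        }
    }
}

class RecoveryDemo {
    public static void main(String[] args) {
        SketchStateStore broken = () -> {
            throw new SketchStoreException("simulated corrupt store");
        };
        boolean recovered = RecoveryStarter.start(new SketchShuffleService(), broken, true);
        System.out.println("recovered: " + recovered);
    }
}
```

Because the service only sees the store interface, the NM could swap the backing store without touching the service, which is the decoupling point 1 describes.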


[jira] [Commented] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts

2014-04-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981063#comment-13981063
 ] 

Hadoop QA commented on MAPREDUCE-5652:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12641927/MAPREDUCE-5652-v8.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4558//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4558//console

This message is automatically generated.


[jira] [Commented] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts

2014-04-25 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981076#comment-13981076
 ] 

Karthik Kambatla commented on MAPREDUCE-5652:
-

+1 to the wrapper approach. We should probably go ahead and make it a util 
class, and ping yarn-dev@ so other features using LevelDB are aware of this.


[jira] [Commented] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts

2014-04-25 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981114#comment-13981114
 ] 

Jason Lowe commented on MAPREDUCE-5652:
---

As for the utility class, where would it go?  I'm assuming yarn-server-common, 
otherwise we end up with clients picking it up.  We'd have to add the leveldb 
dependency there which is unfortunate for those servers that don't really need 
it (i.e.: resourcemanager, webproxy).


[jira] [Commented] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts

2014-04-25 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981357#comment-13981357
 ] 

Karthik Kambatla commented on MAPREDUCE-5652:
-

yarn-server-common seems the best place.


[jira] [Commented] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts

2014-04-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979920#comment-13979920
 ] 

Hadoop QA commented on MAPREDUCE-5652:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12641736/MAPREDUCE-5652-v7.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4554//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4554//console

This message is automatically generated.


[jira] [Commented] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts

2014-04-23 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977938#comment-13977938
 ] 

Ming Ma commented on MAPREDUCE-5652:


Nice work. Jason, I would like to clarify how the following scenarios are 
handled. Perhaps they are covered at the YARN layer as part of 
https://issues.apache.org/jira/browse/YARN-1336.

1. NM crash scenario. There is a corner case: after the RM notifies the NM of 
the completion of a specific application, but right before AuxServices get the 
chance to process the event, the NM crashes. The app entry won't be removed from 
the recovery store after the NM is restarted, as APPLICATION_STOP won't be 
delivered to the NM for that application after the restart.

2. NM graceful shutdown. It seems ContainerManagerImpl's serviceStop will 
generate a ContainerManagerEventType.FINISH_APPS event. That means AuxServices 
could clean up and remove the app from the recovery store as part of NM shutdown.


[jira] [Commented] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts

2014-04-23 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978561#comment-13978561
 ] 

Jason Lowe commented on MAPREDUCE-5652:
---

Thanks for the feedback, Ming!

For the NM crash scenario above we need YARN-1336 so applications are persisted 
outside of an individual aux service.  Once that's present, it remembers when 
applications are finishing and persists that before responding to the NM 
heartbeat telling it of the apps that have just finished.  Upon recovery it 
will recover the application and re-send the finish event which will send the 
app stop event to the aux services.

For the graceful shutdown scenario we need YARN-1336 and/or YARN-1362.  Either 
YARN-1336 will never send app stop events upon NM shutdown if recovery is 
enabled or we need to be able to distinguish between a graceful NM shutdown and 
an NM kill/crash to know whether to send the app stop event to aux services.
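
The crash-scenario ordering described above (persist that an application is finishing before acknowledging it, then re-send the stop event on recovery) can be illustrated with a minimal sketch. Everything here is hypothetical: FinishedAppTracker, AppStopListener, and the in-memory Set standing in for a persistent store are illustrative names, not the YARN-1336 implementation.

```java
import java.util.ArrayList;
import java.util.Set;

/** Hypothetical stand-in for an aux service such as the ShuffleHandler. */
interface AppStopListener {
  void applicationStopped(String appId);
}

/** Hypothetical NM-side tracker; the Set stands in for a persistent store. */
class FinishedAppTracker {
  private final Set<String> persistedFinishing;
  private final AppStopListener listener;

  FinishedAppTracker(Set<String> persistedFinishing, AppStopListener listener) {
    this.persistedFinishing = persistedFinishing;
    this.listener = listener;
  }

  /**
   * Persist the finishing app first, then deliver the stop event.
   * crashBeforeDelivery simulates the NM dying between the two steps.
   */
  void onAppFinished(String appId, boolean crashBeforeDelivery) {
    persistedFinishing.add(appId);
    if (crashBeforeDelivery) {
      return; // simulated crash: the stop event is never delivered
    }
    deliver(appId);
  }

  /** On restart, re-send the stop event for every app still recorded as finishing. */
  void recover() {
    // Copy first so that deliver() can remove entries while we iterate.
    for (String appId : new ArrayList<>(persistedFinishing)) {
      deliver(appId);
    }
  }

  private void deliver(String appId) {
    listener.applicationStopped(appId);
    persistedFinishing.remove(appId);
  }
}
```

Because the record is written before anything else happens, a crash in the window Ming describes leaves an entry behind that recover() can replay, so the aux service still sees the stop event.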



[jira] [Commented] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts

2014-04-23 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978830#comment-13978830
 ] 

Ming Ma commented on MAPREDUCE-5652:


Thanks, Jason. It is good to know it will be taken care of at the YARN layer. I 
will post some more comments at YARN-1336.

1. Does LevelDB's delete method throw an exception? JNI has some exception 
handling and the caller needs to retrieve the exceptions, etc.
2. It seems like recover/restore are common to NM/RM restart. Is there any 
abstract interface defined for that?
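
One way to make the answer to the first question matter less to callers is to wrap whatever the store throws in a checked exception at the boundary. The sketch below assumes, without confirmation from this thread, that a failed delete surfaces as an unchecked exception. KeyValueStore, StoreException, and JobStateRemover are hypothetical names, not the LevelDB or ShuffleHandler API.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical unchecked exception, standing in for whatever the store throws. */
class StoreException extends RuntimeException {
  StoreException(String message) {
    super(message);
  }
}

/** Hypothetical minimal key-value store interface (not the real LevelDB API). */
interface KeyValueStore {
  void delete(byte[] key);
}

/** Converts unchecked store failures into IOException for callers. */
class JobStateRemover {
  private final KeyValueStore store;

  JobStateRemover(KeyValueStore store) {
    this.store = store;
  }

  void removeJob(String jobId) throws IOException {
    try {
      store.delete(jobId.getBytes(StandardCharsets.UTF_8));
    } catch (StoreException e) {
      // Keep the original failure as the cause so it is not lost.
      throw new IOException("Unable to remove " + jobId + " from state store", e);
    }
  }
}
```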



[jira] [Commented] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts

2014-04-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977065#comment-13977065
 ] 

Hadoop QA commented on MAPREDUCE-5652:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12641270/MAPREDUCE-5652-v6.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4543//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4543//console

This message is automatically generated.



[jira] [Commented] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts

2014-04-14 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13968971#comment-13968971
 ] 

Karthik Kambatla commented on MAPREDUCE-5652:
-

bq. However that's outside of the scope (and project) of this JIRA. I'll try to 
address that in YARN-1354 or possibly a separate YARN JIRA.
Thanks for the clarification. Let us track this in YARN-1354. 

bq. serviceStart() indirectly calls initStateStore to open the database, and 
serviceStop() closes the database.
I see. Makes sense to leave it in serviceStart(). Do you think renaming 
initStateStore to initAndOpenStore or startStore is reasonable? 

bq. Maybe may need the equivalent of YARN-888 for hadoop-mapreduce-project poms 
to only declare them in the leaf modules and still have them be picked up 
properly.
MAPREDUCE-5362. Let us try to get that in first. Mind taking a look? I'll also 
look at it shortly.
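
The serviceStart()/serviceStop() pairing quoted above can be sketched as follows. This is a simplified illustration, not the ShuffleHandler code: StateStoreService and its counters are hypothetical, and startStore is used only because it is one of the names suggested in this comment. The point is that if stop closes the store, start must be the place that reopens it for a restart to work.

```java
/** Hypothetical service that opens its store on start and closes it on stop. */
class StateStoreService {
  private boolean storeOpen;
  private int openCount;

  /** Opens the store; safe to call again after serviceStop(). */
  void serviceStart() {
    startStore();
  }

  /** Closes the store if it is open; calling it twice is harmless. */
  void serviceStop() {
    storeOpen = false;
  }

  private void startStore() {
    if (storeOpen) {
      throw new IllegalStateException("store already open");
    }
    storeOpen = true;
    openCount++;
  }

  boolean isStoreOpen() {
    return storeOpen;
  }

  int getOpenCount() {
    return openCount;
  }
}
```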



[jira] [Commented] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts

2014-04-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967206#comment-13967206
 ] 

Hadoop QA commented on MAPREDUCE-5652:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12639878/MAPREDUCE-5652-v5.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4505//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4505//console

This message is automatically generated.



[jira] [Commented] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts

2014-04-10 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13965478#comment-13965478
 ] 

Karthik Kambatla commented on MAPREDUCE-5652:
-

Point 6 above - should we make those two maps non-static?



[jira] [Commented] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts

2014-04-10 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13965476#comment-13965476
 ] 

Karthik Kambatla commented on MAPREDUCE-5652:
-

Approach looks good. Comments:
# How do we handle applications that finish while the NM is down? 
# Code related to initStateStore should ideally go into serviceInit(), 
primarily to future-proof against us supporting (re)starting stopped services.
# Use the constant {{JOB}} here?
{code}
  iter.seek(bytes(job));
{code}
# ShuffleHandler#recordJobShuffleInfo: should addJobToken() come after the 
attempt to write to the store? That way we fail early if we can't write to the 
store for any reason. Where we call this method, we catch and ignore all 
exceptions.
# ShuffleHandler#close() should probably take care of clearing the static maps. 
Alternatively, we could just make those maps non-static.
# ShuffleHandler#forgetJob() - should we make those two maps
# Do we need to change hadoop-mapreduce-project/pom.xml, given we already add 
the dependencies in the shuffle module?
# Nice test.
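
Several of the review points above (a named constant for the seek prefix, writing to the store before updating in-memory state, and per-instance rather than static maps) can be illustrated together. This is a sketch under stated assumptions: ShuffleStateSketch, JOB_PREFIX, and the TreeMap standing in for a sorted key-value store are hypothetical and do not reproduce the patch.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class ShuffleStateSketch {
  /** Hypothetical constant for the key prefix, instead of a string literal. */
  static final String JOB_PREFIX = "job";

  private final TreeMap<String, String> store; // stand-in for a sorted store
  private final Map<String, String> userByJob = new HashMap<>(); // per instance
  private boolean storeWritable = true;

  ShuffleStateSketch(TreeMap<String, String> store) {
    this.store = store;
  }

  void setStoreWritable(boolean writable) {
    this.storeWritable = writable;
  }

  /** Write to the store first so a failure leaves in-memory state untouched. */
  void recordJob(String jobId, String user) throws IOException {
    if (!storeWritable) {
      throw new IOException("Unable to record " + jobId + " in state store");
    }
    store.put(jobId, user);
    userByJob.put(jobId, user);
  }

  /** Seek to the prefix and read entries until the keys stop matching it. */
  void recover() {
    for (Map.Entry<String, String> e : store.tailMap(JOB_PREFIX, true).entrySet()) {
      if (!e.getKey().startsWith(JOB_PREFIX)) {
        break; // sorted order: no later key can match the prefix
      }
      userByJob.put(e.getKey(), e.getValue());
    }
  }

  String getUser(String jobId) {
    return userByJob.get(jobId);
  }

  int trackedJobs() {
    return userByJob.size();
  }
}
```

With the write ordered first, an exception from recordJob() cannot leave a job registered in memory that recovery would not know about after a restart.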



[jira] [Commented] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts

2014-04-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963376#comment-13963376
 ] 

Hadoop QA commented on MAPREDUCE-5652:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12639250/MAPREDUCE-5652-v4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4492//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4492//console

This message is automatically generated.
