[jira] [Commented] (MAPREDUCE-5542) Killing a job just as it finishes can generate an NPE in client

2014-10-15 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14173361#comment-14173361
 ] 

Rohith commented on MAPREDUCE-5542:
---

bq. the client would then loop until the full timeout before returning.
I see; agreed, this can be improved.

> Killing a job just as it finishes can generate an NPE in client
> ---
>
> Key: MAPREDUCE-5542
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5542
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client, mrv2
>Affects Versions: 2.1.0-beta, 0.23.9
>Reporter: Jason Lowe
>Assignee: Rohith
> Attachments: MAPREDUCE-5542.1.patch, MAPREDUCE-5542.2.patch, 
> MAPREDUCE-5542.3.patch, MAPREDUCE-5542.4.patch
>
>
> If a client tries to kill a job just as the job is finishing then the client 
> can crash with an NPE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-4818) Easier identification of tasks that timeout during localization

2014-10-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14173230#comment-14173230
 ] 

Hadoop QA commented on MAPREDUCE-4818:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12675155/MAPREDUCE-4818.v5.patch
  against trunk revision 466f087.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 3 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-tools/hadoop-sls hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  org.apache.hadoop.yarn.sls.TestSLSRunner
  
org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart
  
org.apache.hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4967//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4967//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4967//console

This message is automatically generated.

> Easier identification of tasks that timeout during localization
> ---
>
> Key: MAPREDUCE-4818
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4818
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Affects Versions: 0.23.3, 2.0.3-alpha
>Reporter: Jason Lowe
>Assignee: Siqi Li
>  Labels: usability
> Attachments: MAPREDUCE-4818.v1.patch, MAPREDUCE-4818.v2.patch, 
> MAPREDUCE-4818.v3.patch, MAPREDUCE-4818.v4.patch, MAPREDUCE-4818.v5.patch
>
>
> When a task is taking too long to localize and is killed by the AM due to 
> task timeout, the job UI/history is not very helpful.  The attempt simply 
> lists a diagnostic stating it was killed due to timeout, but there are no 
> logs for the attempt since it never actually got started.  There are log 
> messages on the NM that show the container never made it past localization by 
> the time it was killed, but users often do not have access to those logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6129) Job failed due to exceeding the counter limit in MRAppMaster

2014-10-15 Thread Min Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Min Zhou updated MAPREDUCE-6129:

Attachment: MAPREDUCE-6129.diff

> Job failed due to exceeding the counter limit in MRAppMaster
> ---
>
> Key: MAPREDUCE-6129
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6129
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster
>Affects Versions: 3.0.0, 2.3.0, 2.5.0, 2.4.1, 2.5.1
>Reporter: Min Zhou
> Attachments: MAPREDUCE-6129.diff
>
>
> Many jobs on our cluster use more than 120 counters, and those jobs 
> fail with an exception like the one below
> {noformat}
> 2014-10-15 22:55:43,742 WARN [Socket Reader #1 for port 45673] 
> org.apache.hadoop.ipc.Server: Unable to read call parameters for client 
> 10.180.216.12on connection protocol 
> org.apache.hadoop.mapred.TaskUmbilicalProtocol for rpcKind RPC_WRITABLE
> org.apache.hadoop.mapreduce.counters.LimitExceededException: Too many 
> counters: 121 max=120
>   at 
> org.apache.hadoop.mapreduce.counters.Limits.checkCounters(Limits.java:103)
>   at 
> org.apache.hadoop.mapreduce.counters.Limits.incrCounters(Limits.java:110)
>   at 
> org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.readFields(AbstractCounterGroup.java:175)
>   at org.apache.hadoop.mapred.Counters$Group.readFields(Counters.java:324)
>   at 
> org.apache.hadoop.mapreduce.counters.AbstractCounters.readFields(AbstractCounters.java:314)
>   at org.apache.hadoop.mapred.TaskStatus.readFields(TaskStatus.java:489)
>   at 
> org.apache.hadoop.mapred.ReduceTaskStatus.readFields(ReduceTaskStatus.java:140)
>   at 
> org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:285)
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Invocation.readFields(WritableRpcEngine.java:157)
>   at 
> org.apache.hadoop.ipc.Server$Connection.processRpcRequest(Server.java:1802)
>   at 
> org.apache.hadoop.ipc.Server$Connection.processOneRpc(Server.java:1734)
>   at 
> org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1494)
>   at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:732)
>   at 
> org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:606)
>   at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:577)
> {noformat}
> The class org.apache.hadoop.mapreduce.counters.Limits loads the 
> mapred-site.xml on the NodeManager node into a JobConf if it hasn't been 
> initialized. If mapred-site.xml does not exist on the NodeManager node, or 
> mapreduce.job.counters.max is not defined in that file, 
> org.apache.hadoop.mapreduce.counters.Limits simply uses the default value 
> of 120.
> Instead, we should read the user job's configuration file rather than the 
> config files on the NodeManager when checking counter limits.
> I will submit a patch later.
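For illustration, a hedged sketch of the situation the reporter describes; the property name comes from the description above, while the job setup itself is hypothetical:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class ManyCountersJob {
  public static void main(String[] args) throws Exception {
    // Hypothetical job-side attempt to raise the counter ceiling. Per the
    // description above, in the affected versions this is not reliably
    // honored: Limits may initialize from the NodeManager's mapred-site.xml
    // (or fall back to the default of 120) instead of the job's own conf.
    Configuration conf = new Configuration();
    conf.setInt("mapreduce.job.counters.max", 200);
    Job job = Job.getInstance(conf, "many-counters-job");
    System.out.println("limit in job conf: "
        + job.getConfiguration().getInt("mapreduce.job.counters.max", 120));
  }
}
{code}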



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6129) Job failed due to exceeding the counter limit in MRAppMaster

2014-10-15 Thread Min Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Min Zhou updated MAPREDUCE-6129:

Affects Version/s: 3.0.0
   2.3.0
   2.5.0
   2.4.1
   2.5.1

> Job failed due to exceeding the counter limit in MRAppMaster
> ---
>
> Key: MAPREDUCE-6129
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6129
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster
>Affects Versions: 3.0.0, 2.3.0, 2.5.0, 2.4.1, 2.5.1
>Reporter: Min Zhou
>
> Many jobs on our cluster use more than 120 counters, and those jobs 
> fail with an exception like the one below
> {noformat}
> 2014-10-15 22:55:43,742 WARN [Socket Reader #1 for port 45673] 
> org.apache.hadoop.ipc.Server: Unable to read call parameters for client 
> 10.180.216.12on connection protocol 
> org.apache.hadoop.mapred.TaskUmbilicalProtocol for rpcKind RPC_WRITABLE
> org.apache.hadoop.mapreduce.counters.LimitExceededException: Too many 
> counters: 121 max=120
>   at 
> org.apache.hadoop.mapreduce.counters.Limits.checkCounters(Limits.java:103)
>   at 
> org.apache.hadoop.mapreduce.counters.Limits.incrCounters(Limits.java:110)
>   at 
> org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.readFields(AbstractCounterGroup.java:175)
>   at org.apache.hadoop.mapred.Counters$Group.readFields(Counters.java:324)
>   at 
> org.apache.hadoop.mapreduce.counters.AbstractCounters.readFields(AbstractCounters.java:314)
>   at org.apache.hadoop.mapred.TaskStatus.readFields(TaskStatus.java:489)
>   at 
> org.apache.hadoop.mapred.ReduceTaskStatus.readFields(ReduceTaskStatus.java:140)
>   at 
> org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:285)
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Invocation.readFields(WritableRpcEngine.java:157)
>   at 
> org.apache.hadoop.ipc.Server$Connection.processRpcRequest(Server.java:1802)
>   at 
> org.apache.hadoop.ipc.Server$Connection.processOneRpc(Server.java:1734)
>   at 
> org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1494)
>   at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:732)
>   at 
> org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:606)
>   at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:577)
> {noformat}
> The class org.apache.hadoop.mapreduce.counters.Limits loads the 
> mapred-site.xml on the NodeManager node into a JobConf if it hasn't been 
> initialized. If mapred-site.xml does not exist on the NodeManager node, or 
> mapreduce.job.counters.max is not defined in that file, 
> org.apache.hadoop.mapreduce.counters.Limits simply uses the default value 
> of 120.
> Instead, we should read the user job's configuration file rather than the 
> config files on the NodeManager when checking counter limits.
> I will submit a patch later.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MAPREDUCE-6129) Job failed due to exceeding the counter limit in MRAppMaster

2014-10-15 Thread Min Zhou (JIRA)
Min Zhou created MAPREDUCE-6129:
---

 Summary: Job failed due to exceeding the counter limit in MRAppMaster
 Key: MAPREDUCE-6129
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6129
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Reporter: Min Zhou


Many jobs on our cluster use more than 120 counters, and those jobs fail 
with an exception like the one below
{noformat}
2014-10-15 22:55:43,742 WARN [Socket Reader #1 for port 45673] 
org.apache.hadoop.ipc.Server: Unable to read call parameters for client 
10.180.216.12on connection protocol 
org.apache.hadoop.mapred.TaskUmbilicalProtocol for rpcKind RPC_WRITABLE
org.apache.hadoop.mapreduce.counters.LimitExceededException: Too many counters: 
121 max=120
at 
org.apache.hadoop.mapreduce.counters.Limits.checkCounters(Limits.java:103)
at 
org.apache.hadoop.mapreduce.counters.Limits.incrCounters(Limits.java:110)
at 
org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.readFields(AbstractCounterGroup.java:175)
at org.apache.hadoop.mapred.Counters$Group.readFields(Counters.java:324)
at 
org.apache.hadoop.mapreduce.counters.AbstractCounters.readFields(AbstractCounters.java:314)
at org.apache.hadoop.mapred.TaskStatus.readFields(TaskStatus.java:489)
at 
org.apache.hadoop.mapred.ReduceTaskStatus.readFields(ReduceTaskStatus.java:140)
at 
org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:285)
at 
org.apache.hadoop.ipc.WritableRpcEngine$Invocation.readFields(WritableRpcEngine.java:157)
at 
org.apache.hadoop.ipc.Server$Connection.processRpcRequest(Server.java:1802)
at 
org.apache.hadoop.ipc.Server$Connection.processOneRpc(Server.java:1734)
at 
org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1494)
at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:732)
at 
org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:606)
at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:577)

{noformat}

The class org.apache.hadoop.mapreduce.counters.Limits loads the mapred-site.xml 
on the NodeManager node into a JobConf if it hasn't been initialized. 
If mapred-site.xml does not exist on the NodeManager node, or 
mapreduce.job.counters.max is not defined in that file, 
org.apache.hadoop.mapreduce.counters.Limits simply uses the default value of 
120.

Instead, we should read the user job's configuration file rather than the 
config files on the NodeManager when checking counter limits.

I will submit a patch later.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-4818) Easier identification of tasks that timeout during localization

2014-10-15 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated MAPREDUCE-4818:
---
Attachment: MAPREDUCE-4818.v5.patch

> Easier identification of tasks that timeout during localization
> ---
>
> Key: MAPREDUCE-4818
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4818
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Affects Versions: 0.23.3, 2.0.3-alpha
>Reporter: Jason Lowe
>Assignee: Siqi Li
>  Labels: usability
> Attachments: MAPREDUCE-4818.v1.patch, MAPREDUCE-4818.v2.patch, 
> MAPREDUCE-4818.v3.patch, MAPREDUCE-4818.v4.patch, MAPREDUCE-4818.v5.patch
>
>
> When a task is taking too long to localize and is killed by the AM due to 
> task timeout, the job UI/history is not very helpful.  The attempt simply 
> lists a diagnostic stating it was killed due to timeout, but there are no 
> logs for the attempt since it never actually got started.  There are log 
> messages on the NM that show the container never made it past localization by 
> the time it was killed, but users often do not have access to those logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-2841) Task level native optimization

2014-10-15 Thread Nathan Roberts (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172967#comment-14172967
 ] 

Nathan Roberts commented on MAPREDUCE-2841:
---

{quote}
Let's let this bake in trunk for a little while and consider a backport to 
branch-2 down the road if there is demand. Marking the issue as resolved for 
now.
{quote}
Nice work! Not sure how much baking really happens on trunk ;) Looking forward 
to this getting onto branch-2.

> Task level native optimization
> --
>
> Key: MAPREDUCE-2841
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2841
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: task
> Environment: x86-64 Linux/Unix
>Reporter: Binglin Chang
>Assignee: Sean Zhong
> Fix For: 3.0.0
>
> Attachments: DESIGN.html, MAPREDUCE-2841.v1.patch, 
> MAPREDUCE-2841.v2.patch, MR-2841benchmarks.pdf, dualpivot-0.patch, 
> dualpivotv20-0.patch, fb-shuffle.patch, 
> hadoop-3.0-mapreduce-2841-2014-7-17.patch, micro-benchmark.txt, 
> mr-2841-merge-2.txt, mr-2841-merge-3.patch, mr-2841-merge-4.patch, 
> mr-2841-merge.txt
>
>
> I've recently been working on native optimization for MapTask based on JNI. 
> The basic idea is to add a NativeMapOutputCollector to handle the k/v pairs 
> emitted by the mapper, so that sort, spill, and IFile serialization can all 
> be done in native code. Preliminary tests (on Xeon E5410, jdk6u24) showed 
> promising results:
> 1. Sort is about 3x-10x as fast as Java (only binary string comparison is 
> supported)
> 2. IFile serialization speed is about 3x that of Java, about 500MB/s; if 
> hardware CRC32C is used, things can get much faster (1G/
> 3. Merge code is not completed yet, so the test uses enough io.sort.mb to 
> prevent mid-spill
> This leads to a total speedup of 2x~3x for the whole MapTask if 
> IdentityMapper (a mapper that does nothing) is used.
> There are limitations, of course: currently only Text and BytesWritable are 
> supported, and I have not thought through many things yet, such as how to 
> support map-side combine. I had some discussion with somebody familiar with 
> Hive, and it seems these limitations won't be much of a problem for Hive to 
> benefit from those optimizations, at least. Advice or discussion about 
> improving compatibility is most welcome :)
> Currently NativeMapOutputCollector has a static method called canEnable(), 
> which checks whether the key/value types, comparator type, and combiner are 
> all compatible; MapTask can then choose to enable NativeMapOutputCollector.
> This is only a preliminary test; more work needs to be done. I expect better 
> final results, and I believe similar optimizations can be adopted for the 
> reduce task and shuffle too.
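For illustration, a hedged sketch of the canEnable() gate described above; canEnable() is named in the description, but the surrounding selection code is hypothetical rather than the actual patch:

{code:java}
// Illustrative only: MapTask-style selection of the output collector.
// canEnable() is the static compatibility check the description mentions
// (key/value types, comparator type, combiner); MapOutputBuffer stands in
// for the existing Java collector.
MapOutputCollector collector;
if (NativeMapOutputCollector.canEnable(job)) {
  collector = new NativeMapOutputCollector(job);  // native sort/spill/IFile
} else {
  collector = new MapOutputBuffer();              // default Java path
}
{code}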



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6083) Map/Reduce dangerously adds Guava @Beta class to CryptoUtils

2014-10-15 Thread Christopher Tubbs (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172896#comment-14172896
 ] 

Christopher Tubbs commented on MAPREDUCE-6083:
--

Would this be more likely to be accepted for 2.6.0 if it were provided as a 
copied/re-implemented version of LimitInputStream instead of a dependency 
version change?
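For illustration, a minimal sketch of what that copied/re-implemented option could look like; the class name and details below are hypothetical, not the attached patch:

{code:java}
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// A bounded stream with the core behavior of Guava's removed
// LimitInputStream: read at most 'limit' bytes, then report EOF.
class BoundedLimitInputStream extends FilterInputStream {
  private long remaining;

  BoundedLimitInputStream(InputStream in, long limit) {
    super(in);
    this.remaining = limit;
  }

  @Override
  public int read() throws IOException {
    if (remaining <= 0) {
      return -1;                 // limit reached: signal EOF
    }
    int b = in.read();
    if (b != -1) {
      remaining--;
    }
    return b;
  }

  @Override
  public int read(byte[] buf, int off, int len) throws IOException {
    if (remaining <= 0) {
      return -1;
    }
    int n = in.read(buf, off, (int) Math.min(len, remaining));
    if (n != -1) {
      remaining -= n;
    }
    return n;
  }
}
{code}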

> Map/Reduce dangerously adds Guava @Beta class to CryptoUtils
> 
>
> Key: MAPREDUCE-6083
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6083
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Christopher Tubbs
>  Labels: beta, deprecated, guava
> Attachments: 
> 0001-MAPREDUCE-6083-Avoid-client-use-of-deprecated-LimitI.patch
>
>
> See HDFS-7040 for more background/details.
> In recent 2.6.0-SNAPSHOTs, the use of LimitInputStream was added to 
> CryptoUtils. This is part of Hadoop's API components, which severely 
> impacts users running newer versions of Guava, where the @Beta and 
> @Deprecated class LimitInputStream has been removed (in version 15 and 
> later), beyond the impact already experienced in 2.4.0 as identified in 
> HDFS-7040.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6117) Hadoop ignores yarn.nodemanager.hostname for RPC listeners

2014-10-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172821#comment-14172821
 ] 

Hadoop QA commented on MAPREDUCE-6117:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12675078/MapReduce-534.patch
  against trunk revision f19771a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4966//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4966//console

This message is automatically generated.

> Hadoop ignores yarn.nodemanager.hostname for RPC listeners
> --
>
> Key: MAPREDUCE-6117
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6117
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client, task
>Affects Versions: 2.2.1, 2.4.1, 2.5.1
> Environment: Any mapreduce example with standard cluster.  In our 
> case each node has four networks.  It is important that all internode 
> communication be done on a specific network.
>Reporter: Waldyn Benbenek
>Assignee: Waldyn Benbenek
> Fix For: 2.5.1
>
> Attachments: MapReduce-534.patch
>
>   Original Estimate: 48h
>  Time Spent: 384h
>  Remaining Estimate: 0h
>
> The RPC listeners for an application are using the hostname of the node as 
> the binding address of the listener. They ignore yarn.nodemanager.hostname 
> for this. In our setup we want all communication between nodes to be done 
> via the network addresses we specify in yarn.nodemanager.hostname on each 
> node.
> TaskAttemptListenerImpl.java and MRClientService.java are two places I have 
> found where the default address is used rather than NM_HOST. The NodeManager 
> hostname should be used for all communication between nodes, including the 
> RPC listeners.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6117) Hadoop ignores yarn.nodemanager.hostname for RPC listeners

2014-10-15 Thread Waldyn Benbenek (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Waldyn Benbenek updated MAPREDUCE-6117:
---
Release Note: 
This patch has few new tests, for the following reasons:
TestTaskAttemptListenerImpl does not test or even perform the service start 
where the change is made, because that would require starting a new process.
TestMRClientService already checks the NM_HOST, which the change does affect. 
The change pulls the NM_HOST from the environment.  This needs to be passed to 
a spawned process, which none of the tests do.  
In general, it would be better if NM_HOST were more pervasive, that is, if the 
property were passed to all the parts of the application, in particular the 
parts that deal with RPC.  Since that is not the case, I have chosen to pull it 
from the environment, where one can depend upon its being set. 

I have tested it in clusters with multiple networks, both where the NM host is 
configured and where it is not.  It works as designed.  That is, if the 
NM host is configured on the node, the TaskAttemptListener and the Client 
Service listen on the given NM host; otherwise they listen on the node's 
"hostname".

  was:
This patch has no new tests, for the following reasons:
TestTaskAttemptListenerImpl does not test or even perform the service start 
where the change is made, because that would require starting a new process.
TestMRClientService already checks the NM_HOST, which the change does affect. 
The change pulls the NM_HOST from the environment.  This needs to be passed to 
a spawned process, which none of the tests do.  
In general, it would be better if NM_HOST were more pervasive, that is, if the 
property were passed to all the parts of the application, in particular the 
parts that deal with RPC.  Since that is not the case, I have chosen to pull it 
from the environment, where one can depend upon its being set. 

I have tested it in clusters with multiple networks, both where the NM host is 
configured and where it is not.  It works as designed.  That is, if the 
NM host is configured on the node, the TaskAttemptListener and the Client 
Service listen on the given NM host; otherwise they listen on the node's 
"hostname".

  Status: Patch Available  (was: Open)

> Hadoop ignores yarn.nodemanager.hostname for RPC listeners
> --
>
> Key: MAPREDUCE-6117
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6117
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client, task
>Affects Versions: 2.5.1, 2.4.1, 2.2.1
> Environment: Any mapreduce example with standard cluster.  In our 
> case each node has four networks.  It is important that all internode 
> communication be done on a specific network.
>Reporter: Waldyn Benbenek
>Assignee: Waldyn Benbenek
> Fix For: 2.5.1
>
> Attachments: MapReduce-534.patch
>
>   Original Estimate: 48h
>  Time Spent: 384h
>  Remaining Estimate: 0h
>
> The RPC listeners for an application are using the hostname of the node as 
> the binding address of the listener. They ignore yarn.nodemanager.hostname 
> for this. In our setup we want all communication between nodes to be done 
> via the network addresses we specify in yarn.nodemanager.hostname on each 
> node.
> TaskAttemptListenerImpl.java and MRClientService.java are two places I have 
> found where the default address is used rather than NM_HOST. The NodeManager 
> hostname should be used for all communication between nodes, including the 
> RPC listeners.
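For illustration, a hedged sketch of the approach the release note describes (reading NM_HOST from the container environment); the helper below is hypothetical:

{code:java}
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import org.apache.hadoop.yarn.api.ApplicationConstants.Environment;

final class ListenerBindAddress {
  // Prefer the NodeManager host that YARN exports into the container
  // environment; fall back to the node's own hostname when it is absent.
  static InetSocketAddress choose(int port) throws UnknownHostException {
    String nmHost = System.getenv(Environment.NM_HOST.name());
    String bindHost = (nmHost != null)
        ? nmHost
        : InetAddress.getLocalHost().getHostName();
    return new InetSocketAddress(bindHost, port);
  }
}
{code}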



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6117) Hadoop ignores yarn.nodemanager.hostname for RPC listeners

2014-10-15 Thread Waldyn Benbenek (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Waldyn Benbenek updated MAPREDUCE-6117:
---
Attachment: MapReduce-534.patch

Same patch with test update

> Hadoop ignores yarn.nodemanager.hostname for RPC listeners
> --
>
> Key: MAPREDUCE-6117
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6117
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client, task
>Affects Versions: 2.2.1, 2.4.1, 2.5.1
> Environment: Any mapreduce example with standard cluster.  In our 
> case each node has four networks.  It is important that all internode 
> communication be done on a specific network.
>Reporter: Waldyn Benbenek
>Assignee: Waldyn Benbenek
> Fix For: 2.5.1
>
> Attachments: MapReduce-534.patch
>
>   Original Estimate: 48h
>  Time Spent: 384h
>  Remaining Estimate: 0h
>
> The RPC listeners for an application are using the hostname of the node as 
> the binding address of the listener. They ignore yarn.nodemanager.hostname 
> for this. In our setup we want all communication between nodes to be done 
> via the network addresses we specify in yarn.nodemanager.hostname on each 
> node.
> TaskAttemptListenerImpl.java and MRClientService.java are two places I have 
> found where the default address is used rather than NM_HOST. The NodeManager 
> hostname should be used for all communication between nodes, including the 
> RPC listeners.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5542) Killing a job just as it finishes can generate an NPE in client

2014-10-15 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172724#comment-14172724
 ] 

Jason Lowe commented on MAPREDUCE-5542:
---

Thanks for updating the patch.  Looks better, but I noticed something I should 
have caught before.  After sending the kill directive via the MR client, the 
code loops as long as the job state isn't KILLED.  However, if the job is 
finishing just as we send the kill, the job may finish in a non-killed state.  
I think the client would then loop until the full timeout before returning.  
Instead of checking for not-KILLED, we should check for a non-terminal state 
(e.g. != KILLED, SUCCEEDED, or FAILED).  We can make an EnumSet of the 
terminal job states and check whether the status is not in that set.
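For illustration, a minimal sketch of the suggested check; 'job', 'timedOut()', and 'pollIntervalMs' are hypothetical stand-ins for the client's existing state:

{code:java}
import java.util.EnumSet;
import org.apache.hadoop.mapreduce.JobStatus;

// Terminal states: stop waiting on any of these, not just KILLED.
final EnumSet<JobStatus.State> TERMINAL =
    EnumSet.of(JobStatus.State.KILLED,
               JobStatus.State.SUCCEEDED,
               JobStatus.State.FAILED);

// Poll until the job reaches a terminal state or the kill timeout expires.
while (!TERMINAL.contains(job.getJobState()) && !timedOut()) {
  Thread.sleep(pollIntervalMs);
}
{code}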

> Killing a job just as it finishes can generate an NPE in client
> ---
>
> Key: MAPREDUCE-5542
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5542
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client, mrv2
>Affects Versions: 2.1.0-beta, 0.23.9
>Reporter: Jason Lowe
>Assignee: Rohith
> Attachments: MAPREDUCE-5542.1.patch, MAPREDUCE-5542.2.patch, 
> MAPREDUCE-5542.3.patch, MAPREDUCE-5542.4.patch
>
>
> If a client tries to kill a job just as the job is finishing then the client 
> can crash with an NPE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5970) Provide a boolean switch to enable MR-AM profiling

2014-10-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172691#comment-14172691
 ] 

Hudson commented on MAPREDUCE-5970:
---

FAILURE: Integrated in Hadoop-trunk-Commit #6266 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6266/])
MAPREDUCE-5970. Provide a boolean switch to enable MR-AM profiling. Contributed 
by Gera Shegalov (jlowe: rev f19771a24c2f90982cf6dec35889836a6146c968)
* hadoop-mapreduce-project/CHANGES.txt
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/YARNRunner.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestYARNRunner.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobConf.java


> Provide a boolean switch to enable MR-AM profiling
> --
>
> Key: MAPREDUCE-5970
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5970
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster, client
>Affects Versions: 2.4.1
>Reporter: Gera Shegalov
>Assignee: Gera Shegalov
>Priority: Minor
> Fix For: 2.6.0
>
> Attachments: MAPREDUCE-5970.v01.patch, MAPREDUCE-5970.v02.patch, 
> MAPREDUCE-5970.v03.patch
>
>
> MR task profiling can be enabled with a simple switch 
> {{mapreduce.task.profile=true}}. We can analogously have 
> {{yarn.app.mapreduce.am.profile}} for the MR-AM.
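For reference, a minimal usage sketch built only from the two property names given in the description; the exact wiring in the patch may differ:

{code:java}
import org.apache.hadoop.conf.Configuration;

// Enable task-level profiling (existing switch) and, analogously, the
// AM-level switch this issue adds.
Configuration conf = new Configuration();
conf.setBoolean("mapreduce.task.profile", true);
conf.setBoolean("yarn.app.mapreduce.am.profile", true);
{code}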



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-5970) Provide a boolean switch to enable MR-AM profiling

2014-10-15 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5970:
--
   Resolution: Fixed
Fix Version/s: 2.6.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Thanks, Gera!  I committed this to trunk, branch-2, and branch-2.6.

> Provide a boolean switch to enable MR-AM profiling
> --
>
> Key: MAPREDUCE-5970
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5970
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster, client
>Affects Versions: 2.4.1
>Reporter: Gera Shegalov
>Assignee: Gera Shegalov
>Priority: Minor
> Fix For: 2.6.0
>
> Attachments: MAPREDUCE-5970.v01.patch, MAPREDUCE-5970.v02.patch, 
> MAPREDUCE-5970.v03.patch
>
>
> MR task profiling can be enabled with a simple switch 
> {{mapreduce.task.profile=true}}. We can analogously have 
> {{yarn.app.mapreduce.am.profile}} for the MR-AM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5970) Provide a boolean switch to enable MR-AM profiling

2014-10-15 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172662#comment-14172662
 ] 

Jason Lowe commented on MAPREDUCE-5970:
---

+1 lgtm.  Committing this.

> Provide a boolean switch to enable MR-AM profiling
> --
>
> Key: MAPREDUCE-5970
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5970
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster, client
>Affects Versions: 2.4.1
>Reporter: Gera Shegalov
>Assignee: Gera Shegalov
>Priority: Minor
> Attachments: MAPREDUCE-5970.v01.patch, MAPREDUCE-5970.v02.patch, 
> MAPREDUCE-5970.v03.patch
>
>
> MR task profiling can be enabled with a simple switch 
> {{mapreduce.task.profile=true}}. We can analogously have 
> {{yarn.app.mapreduce.am.profile}} for the MR-AM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5873) Shuffle bandwidth computation includes time spent waiting for maps

2014-10-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172512#comment-14172512
 ] 

Hudson commented on MAPREDUCE-5873:
---

SUCCESS: Integrated in Hadoop-trunk-Commit #6264 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6264/])
MAPREDUCE-5873. Shuffle bandwidth computation includes time spent waiting for 
maps. Contributed by Siqi Li (jlowe: rev 
b9edad64034a9c8a121ec2b37792c190ba561e26)
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestShuffleScheduler.java
* hadoop-mapreduce-project/CHANGES.txt
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/LocalFetcher.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/ShuffleSchedulerImpl.java


> Shuffle bandwidth computation includes time spent waiting for maps
> --
>
> Key: MAPREDUCE-5873
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5873
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Siqi Li
>Assignee: Siqi Li
> Fix For: 2.6.0
>
> Attachments: MAPREDUCE-5873.v1.patch, MAPREDUCE-5873.v2.patch, 
> MAPREDUCE-5873.v3.patch, MAPREDUCE-5873.v4.patch, MAPREDUCE-5873.v5.patch, 
> MAPREDUCE-5873.v6.patch, MAPREDUCE-5873.v9.patch
>
>
> Currently the ShuffleScheduler in the ReduceTask JVM status display shows 
> bandwidth. Its definition, however, is confusing because it includes time 
> when no copying is happening, due to pauses between waves of map outputs 
> becoming available.
> The current bandwidth is defined as (bytes copied so far) / (total time in 
> the copy phase so far).
> It would be more useful to:
> 1) measure the bandwidth of a single copy call, and
> 2) display the aggregated bandwidth as long as at least one fetcher is in a 
> copy call.
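For illustration, a hedged sketch of measurement (1); 'fetcher.copyMapOutput(...)' is a hypothetical stand-in for the fetcher's single copy call, not a method name taken from the patch:

{code:java}
// Clock only the time spent inside a single copy call, so pauses between
// waves of map outputs don't deflate the reported bandwidth.
long startNanos = System.nanoTime();
long bytesCopied = fetcher.copyMapOutput(host, mapOutput);
double seconds = (System.nanoTime() - startNanos) / 1e9;
double mbPerSec = (bytesCopied / (1024.0 * 1024.0)) / seconds;
{code}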



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-5873) Shuffle bandwidth computation includes time spent waiting for maps

2014-10-15 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5873:
--
   Resolution: Fixed
Fix Version/s: 2.6.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Thanks, Siqi!  I committed this to trunk, branch-2, and branch-2.6.

> Shuffle bandwidth computation includes time spent waiting for maps
> --
>
> Key: MAPREDUCE-5873
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5873
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Siqi Li
>Assignee: Siqi Li
> Fix For: 2.6.0
>
> Attachments: MAPREDUCE-5873.v1.patch, MAPREDUCE-5873.v2.patch, 
> MAPREDUCE-5873.v3.patch, MAPREDUCE-5873.v4.patch, MAPREDUCE-5873.v5.patch, 
> MAPREDUCE-5873.v6.patch, MAPREDUCE-5873.v9.patch
>
>
> Currently the ShuffleScheduler in the ReduceTask JVM status display shows 
> bandwidth. Its definition, however, is confusing because it includes time 
> when no copying is happening, due to pauses between waves of map outputs 
> becoming available.
> The current bandwidth is defined as (bytes copied so far) / (total time in 
> the copy phase so far).
> It would be more useful to:
> 1) measure the bandwidth of a single copy call, and
> 2) display the aggregated bandwidth as long as at least one fetcher is in a 
> copy call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-5873) Shuffle bandwidth computation includes time spent waiting for maps

2014-10-15 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5873:
--
Summary: Shuffle bandwidth computation includes time spent waiting for maps 
 (was: Measure bw of a single copy call and display the correct aggregated bw)

> Shuffle bandwidth computation includes time spent waiting for maps
> --
>
> Key: MAPREDUCE-5873
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5873
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Siqi Li
>Assignee: Siqi Li
> Attachments: MAPREDUCE-5873.v1.patch, MAPREDUCE-5873.v2.patch, 
> MAPREDUCE-5873.v3.patch, MAPREDUCE-5873.v4.patch, MAPREDUCE-5873.v5.patch, 
> MAPREDUCE-5873.v6.patch, MAPREDUCE-5873.v9.patch
>
>
> Currently the ShuffleScheduler in the ReduceTask JVM status display shows 
> bandwidth. Its definition, however, is confusing because it includes time 
> when no copying is happening, due to pauses between waves of map outputs 
> becoming available.
> The current bandwidth is defined as (bytes copied so far) / (total time in 
> the copy phase so far).
> It would be more useful to:
> 1) measure the bandwidth of a single copy call, and
> 2) display the aggregated bandwidth as long as at least one fetcher is in a 
> copy call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5873) Measure bw of a single copy call and display the correct aggregated bw

2014-10-15 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172488#comment-14172488
 ] 

Jason Lowe commented on MAPREDUCE-5873:
---

+1 lgtm.  Committing this.

> Measure bw of a single copy call and display the correct aggregated bw
> --
>
> Key: MAPREDUCE-5873
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5873
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Siqi Li
>Assignee: Siqi Li
> Attachments: MAPREDUCE-5873.v1.patch, MAPREDUCE-5873.v2.patch, 
> MAPREDUCE-5873.v3.patch, MAPREDUCE-5873.v4.patch, MAPREDUCE-5873.v5.patch, 
> MAPREDUCE-5873.v6.patch, MAPREDUCE-5873.v9.patch
>
>
> Currently the ShuffleScheduler in the ReduceTask JVM status display shows 
> bandwidth. Its definition, however, is confusing because it includes time 
> when no copying is happening, due to pauses between waves of map outputs 
> becoming available.
> The current bandwidth is defined as (bytes copied so far) / (total time in 
> the copy phase so far).
> It would be more useful to:
> 1) measure the bandwidth of a single copy call, and
> 2) display the aggregated bandwidth as long as at least one fetcher is in a 
> copy call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6117) Hadoop ignores yarn.nodemanager.hostname for RPC listeners

2014-10-15 Thread Waldyn Benbenek (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Waldyn Benbenek updated MAPREDUCE-6117:
---
Attachment: (was: MapReduce-325.patch)

> Hadoop ignores yarn.nodemanager.hostname for RPC listeners
> --
>
> Key: MAPREDUCE-6117
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6117
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client, task
>Affects Versions: 2.2.1, 2.4.1, 2.5.1
> Environment: Any mapreduce example with standard cluster.  In our 
> case each node has four networks.  It is important that all internode 
> communication be done on a specific network.
>Reporter: Waldyn Benbenek
>Assignee: Waldyn Benbenek
> Fix For: 2.5.1
>
>   Original Estimate: 48h
>  Time Spent: 384h
>  Remaining Estimate: 0h
>
> The RPC listeners for an application are using the hostname of the node as 
> the binding address of the listener. They ignore yarn.nodemanager.hostname 
> for this. In our setup we want all communication between nodes to be done 
> via the network addresses we specify in yarn.nodemanager.hostname on each 
> node.
> TaskAttemptListenerImpl.java and MRClientService.java are two places I have 
> found where the default address is used rather than NM_HOST. The NodeManager 
> hostname should be used for all communication between nodes, including the 
> RPC listeners.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6117) Hadoop ignores yarn.nodemanager.hostname for RPC listeners

2014-10-15 Thread Waldyn Benbenek (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Waldyn Benbenek updated MAPREDUCE-6117:
---
Status: Open  (was: Patch Available)

Replacing patch with one including test change

> Hadoop ignores yarn.nodemanager.hostname for RPC listeners
> --
>
> Key: MAPREDUCE-6117
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6117
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client, task
>Affects Versions: 2.5.1, 2.4.1, 2.2.1
> Environment: Any mapreduce example with standard cluster.  In our 
> case each node has four networks.  It is important that all internode 
> communication be done on a specific network.
>Reporter: Waldyn Benbenek
>Assignee: Waldyn Benbenek
> Fix For: 2.5.1
>
> Attachments: MapReduce-325.patch
>
>   Original Estimate: 48h
>  Time Spent: 384h
>  Remaining Estimate: 0h
>
> The RPC listeners for an application are using the hostname of the node as 
> the binding address of the listener. They ignore yarn.nodemanager.hostname 
> for this. In our setup we want all communication between nodes to be done 
> via the network addresses we specify in yarn.nodemanager.hostname on each 
> node.
> TaskAttemptListenerImpl.java and MRClientService.java are two places I have 
> found where the default address is used rather than NM_HOST. The NodeManager 
> hostname should be used for all communication between nodes, including the 
> RPC listeners.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5269) Preemption of Reducer (and Shuffle) via checkpointing

2014-10-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172257#comment-14172257
 ] 

Hadoop QA commented on MAPREDUCE-5269:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12674970/MAPREDUCE-5269.4.patch
  against trunk revision 128ace1.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4965//console

This message is automatically generated.

> Preemption of Reducer (and Shuffle) via checkpointing
> -
>
> Key: MAPREDUCE-5269
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5269
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2
>Reporter: Carlo Curino
>Assignee: Carlo Curino
> Attachments: MAPREDUCE-5269.2.patch, MAPREDUCE-5269.3.patch, 
> MAPREDUCE-5269.4.patch, MAPREDUCE-5269.patch
>
>
> This patch tracks the changes in the task runtime (shuffle, reducer context, 
> etc.) that are required to implement checkpoint-based preemption of reducer 
> tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-5269) Preemption of Reducer (and Shuffle) via checkpointing

2014-10-15 Thread Augusto Souza (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Augusto Souza updated MAPREDUCE-5269:
-
Attachment: MAPREDUCE-5269.4.patch

Fixing MAPREDUCE-5269.3 by removing prefixes in files

> Preemption of Reducer (and Shuffle) via checkpointing
> -
>
> Key: MAPREDUCE-5269
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5269
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2
>Reporter: Carlo Curino
>Assignee: Carlo Curino
> Attachments: MAPREDUCE-5269.2.patch, MAPREDUCE-5269.3.patch, 
> MAPREDUCE-5269.4.patch, MAPREDUCE-5269.patch
>
>
> This patch tracks the changes in the task runtime (shuffle, reducer context, 
> etc.) that are required to implement checkpoint-based preemption of reducer 
> tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6128) Automatic addition of bundled jars to distributed cache

2014-10-15 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated MAPREDUCE-6128:
-
Attachment: MAPREDUCE-6128.v01.patch

v01 to illustrate the idea.

> Automatic addition of bundled jars to distributed cache 
> 
>
> Key: MAPREDUCE-6128
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6128
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 2.5.1
>Reporter: Gera Shegalov
>Assignee: Gera Shegalov
> Attachments: MAPREDUCE-6128.v01.patch
>
>
> On the client side, the JDK adds Class-Path elements from the job jar 
> manifest to the classpath. In theory there could be many bundled jars in 
> many directories, such that adding them manually to task classpaths via 
> libjars or similar means is cumbersome. If this property is enabled, the 
> same jars are added to the task classpaths automatically.
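For illustration, a minimal sketch of the client-side manifest behavior described above; the jar path is a placeholder:

{code:java}
import java.util.jar.Attributes;
import java.util.jar.JarFile;

public class PrintBundledClassPath {
  public static void main(String[] args) throws Exception {
    // Print the Class-Path manifest entries the JDK honors on the client;
    // these are the bundled jars the proposal would mirror onto task
    // classpaths automatically.
    try (JarFile jar = new JarFile(args.length > 0 ? args[0] : "job.jar")) {
      String cp = jar.getManifest().getMainAttributes()
          .getValue(Attributes.Name.CLASS_PATH);   // space-separated list
      System.out.println("Bundled Class-Path: " + cp);
    }
  }
}
{code}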



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MAPREDUCE-6128) Automatic addition of bundled jars to distributed cache

2014-10-15 Thread Gera Shegalov (JIRA)
Gera Shegalov created MAPREDUCE-6128:


 Summary: Automatic addition of bundled jars to distributed cache 
 Key: MAPREDUCE-6128
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6128
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: client
Affects Versions: 2.5.1
Reporter: Gera Shegalov
Assignee: Gera Shegalov


On the client side, the JDK adds Class-Path elements from the job jar manifest 
to the classpath. In theory there could be many bundled jars in many 
directories, such that adding them manually to task classpaths via libjars or 
similar means is cumbersome. If this property is enabled, the same jars are 
added to the task classpaths automatically.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)