[jira] [Updated] (MAPREDUCE-5867) Possible NPE in KillAMPreemptionPolicy related to ProportionalCapacityPreemptionPolicy

2014-05-14 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-5867:
---

Issue Type: Sub-task  (was: Bug)
Parent: MAPREDUCE-4584

 Possible NPE in KillAMPreemptionPolicy related to 
 ProportionalCapacityPreemptionPolicy
 --

 Key: MAPREDUCE-5867
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5867
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.3.0
Reporter: Sunil G
Assignee: Sunil G
 Attachments: MapReduce-5867-updated.patch, MapReduce-5867.2.patch, 
 MapReduce-5867.3.patch, Yarn-1980.1.patch


 I configured KillAMPreemptionPolicy for my ApplicationMaster and tried to 
 check preemption of queues.
 In one scenario I saw the below NPE in my AM:
 2014-04-24 15:11:08,860 ERROR [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN 
 CONTACTING RM. 
 java.lang.NullPointerException
   at 
 org.apache.hadoop.mapreduce.v2.app.rm.preemption.KillAMPreemptionPolicy.preempt(KillAMPreemptionPolicy.java:57)
   at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:662)
   at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:246)
   at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:267)
   at java.lang.Thread.run(Thread.java:662)
 I was using 2.2.0 and merged MAPREDUCE-5189 to see how AM preemption works.
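The NPE at {{KillAMPreemptionPolicy.preempt}} is consistent with a preemption message whose contract fields are null. Below is a dependency-free sketch of the defensive check such a fix implies; the {{Contract}} and {{PreemptionMessage}} classes here are hypothetical stand-ins, not the real YARN API.

```java
public class NullSafePreemptSketch {
    // Hypothetical stand-ins for the RM's preemption message parts; in the
    // real API either the strict or the negotiable contract may be absent.
    static class Contract {
        final int containers;
        Contract(int containers) { this.containers = containers; }
    }
    static class PreemptionMessage {
        Contract strictContract;   // may be null
        Contract contract;         // may be null
    }

    // Guarded version of the logic that NPE'd: skip absent contracts
    // instead of dereferencing them unconditionally.
    static int containersToPreempt(PreemptionMessage msg) {
        int total = 0;
        if (msg.strictContract != null) {
            total += msg.strictContract.containers;
        }
        if (msg.contract != null) {
            total += msg.contract.containers;
        }
        return total;
    }

    public static void main(String[] args) {
        PreemptionMessage msg = new PreemptionMessage();
        msg.contract = new Contract(2);   // strictContract left null
        System.out.println(containersToPreempt(msg)); // prints 2, no NPE
    }
}
```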



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5889) Deprecate FileInputFormat.setInputPaths(Job, String) and FileInputFormat.addInputPaths(Job, String)

2014-05-14 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated MAPREDUCE-5889:
-

Target Version/s: 2.5.0
Release Note: Deprecate 
o.a.h.mapreduce.lib.input.FileInputFormat.setInputPaths(Job, String) and 
o.a.h.mapreduce.lib.input.FileInputFormat.addInputPaths(Job, String). Use 
setInputPaths(Job, Path...) and addInputPaths(Job, Path...) instead.
  Status: Patch Available  (was: Open)

 Deprecate FileInputFormat.setInputPaths(Job, String) and 
 FileInputFormat.addInputPaths(Job, String)
 ---

 Key: MAPREDUCE-5889
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5889
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
  Labels: newbie
 Attachments: MAPREDUCE-5889.patch


 {{FileInputFormat.setInputPaths(Job job, String commaSeparatedPaths)}} and 
 {{FileInputFormat.addInputPaths(Job job, String commaSeparatedPaths)}} fail 
 to parse commaSeparatedPaths if a comma is included in the file path. (e.g. 
 Path: {{/path/file,with,comma}})
 We should deprecate these methods and document to use {{setInputPaths(Job 
 job, Path... inputPaths)}} and {{addInputPaths(Job job, Path... inputPaths)}} 
 instead.
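A dependency-free illustration of the failure mode follows. The splitter below mimics what the String-based overloads effectively do with commas; it is plain Java, not the actual FileInputFormat code.

```java
import java.util.Arrays;
import java.util.List;

public class CommaPathSplitDemo {
    // Mimics the String-based overloads: the argument is treated as a
    // comma-separated list, so a single path containing commas is broken
    // into several bogus paths.
    static List<String> parseCommaSeparatedPaths(String commaSeparatedPaths) {
        return Arrays.asList(commaSeparatedPaths.split(","));
    }

    public static void main(String[] args) {
        List<String> parsed = parseCommaSeparatedPaths("/path/file,with,comma");
        System.out.println(parsed.size()); // prints 3 - three paths, not one
        // The Path... varargs overloads avoid this: each path is its own
        // argument, so embedded commas are never interpreted as separators.
    }
}
```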





[jira] [Updated] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts

2014-05-14 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-5652:


   Resolution: Fixed
Fix Version/s: 2.5.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Thanks for the contribution and patience with multiple reviews, Jason.

Just committed this to trunk and branch-2.

 NM Recovery. ShuffleHandler should handle NM restarts
 -

 Key: MAPREDUCE-5652
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5652
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Karthik Kambatla
Assignee: Jason Lowe
  Labels: shuffle
 Fix For: 2.5.0

 Attachments: MAPREDUCE-5652-v10.patch, MAPREDUCE-5652-v2.patch, 
 MAPREDUCE-5652-v3.patch, MAPREDUCE-5652-v4.patch, MAPREDUCE-5652-v5.patch, 
 MAPREDUCE-5652-v6.patch, MAPREDUCE-5652-v7.patch, MAPREDUCE-5652-v8.patch, 
 MAPREDUCE-5652-v9-and-YARN-1987.patch, MAPREDUCE-5652.patch


 ShuffleHandler should work across NM restarts and not require re-running 
 map tasks. Currently, on NM restart the map outputs are cleaned up, which 
 forces re-execution of map tasks; this should be avoided.





[jira] [Commented] (MAPREDUCE-4071) NPE while executing MRAppMaster shutdown hook

2014-05-14 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996832#comment-13996832
 ] 

Chen He commented on MAPREDUCE-4071:


ping

 NPE while executing MRAppMaster shutdown hook
 -

 Key: MAPREDUCE-4071
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4071
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 0.23.3, 2.0.0-alpha, trunk
Reporter: Bhallamudi Venkata Siva Kamesh
 Attachments: MAPREDUCE-4071-1.patch, MAPREDUCE-4071-2.patch, 
 MAPREDUCE-4071-2.patch, MAPREDUCE-4071.patch


 While running the shutdown hook of MRAppMaster, an NPE is hit:
 {noformat}
 Exception in thread "Thread-1" java.lang.NullPointerException
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.setSignalled(MRAppMaster.java:668)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1004)
 {noformat}





[jira] [Assigned] (MAPREDUCE-5889) Deprecate FileInputFormat.setInputPaths(JobConf, String) and FileInputFormat.addInputPaths(JobConf, String)

2014-05-14 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA reassigned MAPREDUCE-5889:


Assignee: Akira AJISAKA

 Deprecate FileInputFormat.setInputPaths(JobConf, String) and 
 FileInputFormat.addInputPaths(JobConf, String)
 ---

 Key: MAPREDUCE-5889
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5889
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
  Labels: newbie

 {{FileInputFormat.setInputPaths(JobConf conf, String commaSeparatedPaths)}} 
 and {{FileInputFormat.addInputPaths(JobConf conf, String 
 commaSeparatedPaths)}} fail to parse commaSeparatedPaths if a comma is 
 included in the file path. (e.g. Path: {{/path/file,with,comma}})
 We should deprecate these methods and document to use {{setInputPaths(JobConf 
 conf, Path... inputPaths)}} and {{addInputPaths(JobConf conf, Path... 
 inputPaths)}} instead.





[jira] [Commented] (MAPREDUCE-5887) Move split creation from submission client to MRAppMaster

2014-05-14 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997431#comment-13997431
 ] 

Steve Loughran commented on MAPREDUCE-5887:
---

This is good, especially when the client is something like a laptop trying to 
submit to in-cloud deployments.

One test to try there is what happens when the blocksize is reported as very, 
very small (you can configure this in swiftfs). In the client this will cause 
the submitting process to OOM and fail. Presumably the same outcome in the AM 
is the simplest to implement - we just need to make sure that YARN recognises 
this as a failure and only retries a couple of times.

 Move split creation from submission client to MRAppMaster
 -

 Key: MAPREDUCE-5887
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5887
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: applicationmaster, client
Affects Versions: 2.4.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: MAPREDUCE-5887.v01.patch


 This JIRA is filed to improve scalability of job submission, specifically 
 when there is significant latency between the submission client and the 
 cluster nodes (RM and NN), e.g. in a multi-datacenter environment.





[jira] [Commented] (MAPREDUCE-5888) Failed job leaves hung AM after it unregisters

2014-05-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996732#comment-13996732
 ] 

Hadoop QA commented on MAPREDUCE-5888:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12644640/MAPREDUCE-5888.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4599//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4599//console

This message is automatically generated.

 Failed job leaves hung AM after it unregisters 
 ---

 Key: MAPREDUCE-5888
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5888
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am
Affects Versions: 2.2.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: MAPREDUCE-5888.patch


 When a job fails the AM hangs during shutdown.  A non-daemon thread pool 
 executor thread prevents the JVM teardown from completing, and the AM lingers 
 on the cluster for the AM expiry interval in the FINISHING state until 
 eventually the RM expires it and kills the container.  If application limits 
 on the queue are relatively low (e.g.: small queue or small cluster) this can 
 cause unnecessary delays in resource scheduling on the cluster.
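The hang pattern described above is general to the JVM: it exits only once all non-daemon threads have terminated, and executors built with default thread factories create non-daemon workers. The sketch below illustrates the usual remedy (a daemon thread factory); it is a self-contained example of the failure mode, not the attached patch.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class DaemonExecutorSketch {
    // Default-factory executors keep non-daemon worker threads alive; if
    // nobody calls shutdown(), those threads keep the JVM running after
    // main() returns - the same shape as the lingering AM. Marking the
    // workers as daemon threads (or shutting the pool down explicitly
    // during teardown) lets JVM teardown complete.
    static ExecutorService newDaemonSingleThreadExecutor(final String name) {
        return Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, name);
            t.setDaemon(true); // a daemon thread does not block JVM exit
            return t;
        });
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = newDaemonSingleThreadExecutor("demo-worker");
        Future<Integer> result = pool.submit(() -> 40 + 2);
        System.out.println(result.get()); // prints 42
        // No shutdown() call is needed for the JVM to exit promptly.
    }
}
```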





[jira] [Updated] (MAPREDUCE-5889) Deprecate FileInputFormat.setInputPaths(Job, String) and FileInputFormat.addInputPaths(Job, String)

2014-05-14 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated MAPREDUCE-5889:
-

Description: 
{{FileInputFormat.setInputPaths(Job job, String commaSeparatedPaths)}} and 
{{FileInputFormat.addInputPaths(Job job, String commaSeparatedPaths)}} fail to 
parse commaSeparatedPaths if a comma is included in the file path. (e.g. Path: 
{{/path/file,with,comma}})
We should deprecate these methods and document to use {{setInputPaths(Job job, 
Path... inputPaths)}} and {{addInputPaths(Job job, Path... inputPaths)}} 
instead.

  was:
{{FileInputFormat.setInputPaths(JobConf conf, String commaSeparatedPaths)}} and 
{{FileInputFormat.addInputPaths(JobConf conf, String commaSeparatedPaths)}} 
fail to parse commaSeparatedPaths if a comma is included in the file path. 
(e.g. Path: {{/path/file,with,comma}})
We should deprecate these methods and document to use {{setInputPaths(JobConf 
conf, Path... inputPaths)}} and {{addInputPaths(JobConf conf, Path... 
inputPaths)}} instead.

Summary: Deprecate FileInputFormat.setInputPaths(Job, String) and 
FileInputFormat.addInputPaths(Job, String)  (was: Deprecate 
FileInputFormat.setInputPaths(JobConf, String) and 
FileInputFormat.addInputPaths(JobConf, String))

 Deprecate FileInputFormat.setInputPaths(Job, String) and 
 FileInputFormat.addInputPaths(Job, String)
 ---

 Key: MAPREDUCE-5889
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5889
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
  Labels: newbie

 {{FileInputFormat.setInputPaths(Job job, String commaSeparatedPaths)}} and 
 {{FileInputFormat.addInputPaths(Job job, String commaSeparatedPaths)}} fail 
 to parse commaSeparatedPaths if a comma is included in the file path. (e.g. 
 Path: {{/path/file,with,comma}})
 We should deprecate these methods and document to use {{setInputPaths(Job 
 job, Path... inputPaths)}} and {{addInputPaths(Job job, Path... inputPaths)}} 
 instead.





[jira] [Commented] (MAPREDUCE-5886) Allow wordcount example job to accept multiple input paths.

2014-05-14 Thread Gera Shegalov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997088#comment-13997088
 ] 

Gera Shegalov commented on MAPREDUCE-5886:
--

Chris, thanks for the JDK pointer, I am aware of the behavior.

 Allow wordcount example job to accept multiple input paths.
 ---

 Key: MAPREDUCE-5886
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5886
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: examples
Affects Versions: 3.0.0, 2.4.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
Priority: Minor
 Attachments: MAPREDUCE-5886.1.patch, MAPREDUCE-5886.2.patch, 
 MAPREDUCE-5886.3.patch


 It would be convenient if the wordcount example MapReduce job could accept 
 multiple input paths and run the word count on all of them.
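The driver-side shape of this change is simply iterating over all path arguments instead of assuming exactly one. A dependency-free sketch of the same multi-input aggregation, with in-memory strings standing in for input paths:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class MultiInputWordCountSketch {
    // Counts words across several inputs; each string stands in for the
    // contents of one input path. The point is that the count runs over
    // the union of all inputs, not just the first.
    static Map<String, Long> countWords(List<String> inputs) {
        return inputs.stream()
            .flatMap(s -> Arrays.stream(s.trim().split("\\s+")))
            .filter(w -> !w.isEmpty())
            .collect(Collectors.groupingBy(w -> w, Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<String, Long> counts = countWords(Arrays.asList("a b a", "b c"));
        System.out.println(counts.get("a")); // prints 2
        System.out.println(counts.get("c")); // prints 1
    }
}
```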





[jira] [Updated] (MAPREDUCE-5889) Deprecate FileInputFormat.setInputPaths(Job, String) and FileInputFormat.addInputPaths(Job, String)

2014-05-14 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated MAPREDUCE-5889:
-

Attachment: MAPREDUCE-5889.patch

Attaching a patch to deprecate these methods and to fix javac warnings.

 Deprecate FileInputFormat.setInputPaths(Job, String) and 
 FileInputFormat.addInputPaths(Job, String)
 ---

 Key: MAPREDUCE-5889
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5889
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
  Labels: newbie
 Attachments: MAPREDUCE-5889.patch


 {{FileInputFormat.setInputPaths(Job job, String commaSeparatedPaths)}} and 
 {{FileInputFormat.addInputPaths(Job job, String commaSeparatedPaths)}} fail 
 to parse commaSeparatedPaths if a comma is included in the file path. (e.g. 
 Path: {{/path/file,with,comma}})
 We should deprecate these methods and document to use {{setInputPaths(Job 
 job, Path... inputPaths)}} and {{addInputPaths(Job job, Path... inputPaths)}} 
 instead.





[jira] [Updated] (MAPREDUCE-5309) 2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server

2014-05-14 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5309:
--

 Description: 
When the 2.0.4 JobHistoryParser tries to parse a job history file generated by 
Hadoop 2.0.3, the JobHistoryParser throws an error:

java.lang.ClassCastException: org.apache.avro.generic.GenericData$Array cannot 
be cast to org.apache.hadoop.mapreduce.jobhistory.JhCounters
at 
org.apache.hadoop.mapreduce.jobhistory.TaskAttemptUnsuccessfulCompletion.put(TaskAttemptUnsuccessfulCompletion.java:58)
at org.apache.avro.generic.GenericData.setField(GenericData.java:463)
at 
org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
at 
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
at 
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
at 
org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
at 
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
at 
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
at 
org.apache.hadoop.mapreduce.jobhistory.EventReader.getNextEvent(EventReader.java:93)
at 
org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:111)
at 
org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:156)
at 
org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:142)
at 
com.twitter.somepackage.Test20JobHistoryParsing.testFileAvro(Test20JobHistoryParsing.java:23)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
at 
org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
at 
org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at 
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
at 
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
at 
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
at 
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)


Test code and the job history file are attached.

Test code:

{noformat}
package com.twitter.somepackage;

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser;
import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.JobInfo;
import org.junit.Test;
import org.apache.hadoop.yarn.YarnException;

public class Test20JobHistoryParsing {

  @Test
  public void testFileAvro() throws IOException {
    Path local_path2 = new Path("/tmp/job_2_0_3-KILLED.jhist");
    JobHistoryParser parser2 = new JobHistoryParser(
        FileSystem.getLocal(new Configuration()), local_path2);
    try {
      JobInfo ji2 = parser2.parse();
      System.out.println("job info: " + ji2.getJobname() + " "
          + ji2.getFinishedMaps() + " "
          + ji2.getTotalMaps() + " "
          + ji2.getJobId());
    } catch (IOException e) {
      throw new YarnException("Could not load history file "
          + local_path2.getName(), e);
    }
  }
}
{noformat}

This seems to stem from the fix in 
https://issues.apache.org/jira/browse/MAPREDUCE-4693
that added counters to the history server for failed tasks.

This breaks backward compatibility with the JobHistoryServer.



  was:

When the 2.0.4 JobHistoryParser tries to parse a job history file generated by 
hadoop 2.0.3, the 

[jira] [Updated] (MAPREDUCE-5814) fat jar with *-default.xml may fail when mapreduce.job.classloader=true.

2014-05-14 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5814:
--

   Resolution: Fixed
Fix Version/s: 2.5.0
   3.0.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Thanks, Gera!  I committed this to trunk and branch-2.

 fat jar with *-default.xml may fail when mapreduce.job.classloader=true.
 

 Key: MAPREDUCE-5814
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5814
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.3.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Fix For: 3.0.0, 2.5.0

 Attachments: MAPREDUCE-5814.v01.patch, MAPREDUCE-5814.v02.patch, 
 MAPREDUCE-5814.v03.patch


 We faced a failure when a job.jar compiled against 0.20+ hadoop artifacts had 
 to run with {{mapreduce.job.classloader=true}} because it needed a more 
 recent guava as a dependency. The job failed because the cluster's 
 {{*-default.xml}} files were overshadowed by the ones in the fat jar. We 
 propose to treat these default config files like the system packages 
 {{org.apache.hadoop.}} to avoid counterintuitive behavior as if we had 
 {{mapreduce.job.user.classpath.first}} set.
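The proposal amounts to a parent-first rule for the default config files, like the one already applied to Hadoop's own packages. A dependency-free sketch of such a predicate (hypothetical, not the actual classloader code):

```java
public class SystemResourceSketch {
    // Hypothetical predicate mirroring the proposal: the cluster's
    // *-default.xml files and Hadoop's own classes are always resolved by
    // the parent classloader, so a fat jar cannot shadow them even when
    // the isolated job classloader (mapreduce.job.classloader=true) is on.
    static boolean isSystemResource(String name) {
        return name.endsWith("-default.xml")
            || name.startsWith("org/apache/hadoop/");
    }

    public static void main(String[] args) {
        System.out.println(isSystemResource("core-default.xml"));        // prints true
        System.out.println(isSystemResource("com/acme/MyMapper.class")); // prints false
    }
}
```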





[jira] [Updated] (MAPREDUCE-5888) Failed job leaves hung AM after it unregisters

2014-05-14 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5888:
--

Attachment: MAPREDUCE-5888.patch

Quick patch to fix the issue.  Manually tested with a failed job: without the 
change the MRAppMaster hung after unregistering, and with the patch it does 
not hang.

 Failed job leaves hung AM after it unregisters 
 ---

 Key: MAPREDUCE-5888
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5888
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am
Affects Versions: 2.2.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: MAPREDUCE-5888.patch


 When a job fails the AM hangs during shutdown.  A non-daemon thread pool 
 executor thread prevents the JVM teardown from completing, and the AM lingers 
 on the cluster for the AM expiry interval in the FINISHING state until 
 eventually the RM expires it and kills the container.  If application limits 
 on the queue are relatively low (e.g.: small queue or small cluster) this can 
 cause unnecessary delays in resource scheduling on the cluster.





[jira] [Commented] (MAPREDUCE-5888) Failed job leaves hung AM after it unregisters

2014-05-14 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996622#comment-13996622
 ] 

Jonathan Eagles commented on MAPREDUCE-5888:


pending Hadoop QA +1

 Failed job leaves hung AM after it unregisters 
 ---

 Key: MAPREDUCE-5888
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5888
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am
Affects Versions: 2.2.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: MAPREDUCE-5888.patch


 When a job fails the AM hangs during shutdown.  A non-daemon thread pool 
 executor thread prevents the JVM teardown from completing, and the AM lingers 
 on the cluster for the AM expiry interval in the FINISHING state until 
 eventually the RM expires it and kills the container.  If application limits 
 on the queue are relatively low (e.g.: small queue or small cluster) this can 
 cause unnecessary delays in resource scheduling on the cluster.





[jira] [Commented] (MAPREDUCE-5637) Convert Hadoop Streaming document to APT

2014-05-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13992835#comment-13992835
 ] 

Hudson commented on MAPREDUCE-5637:
---

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1777 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1777/])
MAPREDUCE-5637. Convert Hadoop Streaming document to APT (Akira AJISAKA via 
jeagles) (jeagles: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1592789)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/HadoopStreaming.apt.vm
* /hadoop/common/trunk/hadoop-project/src/site/site.xml


 Convert Hadoop Streaming document to APT
 

 Key: MAPREDUCE-5637
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5637
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: documentation
Affects Versions: 2.2.0
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
 Fix For: 3.0.0, 2.5.0

 Attachments: MAPREDUCE-5637.2.patch, MAPREDUCE-5637.patch


 Convert Hadoop Streaming document from forrest to APT.





[jira] [Updated] (MAPREDUCE-5867) Possible NPE in KillAMPreemptionPolicy related to ProportionalCapacityPreemptionPolicy

2014-05-14 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-5867:
-

Status: Open  (was: Patch Available)

Thanks Sunil for the updated patch with the test case. Please find some 
comments for the added test.

* make all the instance/static variables private in the test class
* can you cover the test with these cases:
** with strictContract as null
** with strictContract as not null
** with contract as null
** with contract as not null
* It may not be necessary to have that many containers for preempting
* remove the System.out.println (SOP) in setup()


 Possible NPE in KillAMPreemptionPolicy related to 
 ProportionalCapacityPreemptionPolicy
 --

 Key: MAPREDUCE-5867
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5867
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.3.0
Reporter: Sunil G
Assignee: Sunil G
 Attachments: MapReduce-5867.2.patch, MapReduce-5867.3.patch, 
 Yarn-1980.1.patch


 I configured KillAMPreemptionPolicy for my ApplicationMaster and tried to 
 check preemption of queues.
 In one scenario I saw the below NPE in my AM:
 2014-04-24 15:11:08,860 ERROR [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN 
 CONTACTING RM. 
 java.lang.NullPointerException
   at 
 org.apache.hadoop.mapreduce.v2.app.rm.preemption.KillAMPreemptionPolicy.preempt(KillAMPreemptionPolicy.java:57)
   at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:662)
   at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:246)
   at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:267)
   at java.lang.Thread.run(Thread.java:662)
 I was using 2.2.0 and merged MAPREDUCE-5189 to see how AM preemption works.





[jira] [Updated] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings

2014-05-14 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-5885:
---

Attachment: MAPREDUCE-5885.patch

patch submitted.

 build/test/test.mapred.spill causes release audit warnings
 --

 Key: MAPREDUCE-5885
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5885
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: trunk
Reporter: Jason Lowe
Assignee: Chen He
 Attachments: MAPREDUCE-5885.patch


 Multiple unit tests are creating files under 
 hadoop-mapreduce-client-jobclient/build/test/test.mapred.spill which are 
 causing release audit warnings during Jenkins patch precommit builds.  In 
 addition to being in a poor location for test output and not cleaning up 
 after the test, there are multiple tests using this location which will cause 
 conflicts if tests are run in parallel.





[jira] [Commented] (MAPREDUCE-5465) Container killed before hprof dumps profile.out

2014-05-14 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993652#comment-13993652
 ] 

Jason Lowe commented on MAPREDUCE-5465:
---

The release audit warnings are unrelated, filed MAPREDUCE-5885.  The 
TestPipeApplication timeout is also unrelated, see MAPREDUCE-5868.

Thanks for updating the patch, Ming!  Sorry for the long delay in getting back 
to this.  I've been thinking about the performance implications of this change. 
 I'm wondering if we should treat the finishing states as if they're the 
corresponding completed states from external entities (i.e.: task/job).  We 
would send T_ATTEMPT_SUCCEEDED or T_ATTEMPT_FAILED and set task finish times to 
the time the attempt said it succeeded or failed rather than the time the 
container completed.  Similarly we would map the internal finishing states to 
their respective external SUCCEEDED/FAILED state rather than RUNNING.  From the 
task/job perspective they're not particularly interested in when the attempt 
exits, rather they only care about when the task says its output is available. 
 This would allow the task and job to react to success/failure transitions in 
the same timeframe that it does today, so there should be a minimal performance 
impact.  The only impact would be if the container needs to complete to free up 
enough space for the next task's container to be allocated, and in most cases 
the task will complete quickly enough that the AM will receive the new container 
in the same heartbeat that it used to before this change.  Actually this may 
end up being slightly faster than what it does today, since today it connects 
to the NM and sends the kill command before it considers the task completed.  
This proposal would have the task complete as soon as the task indicated via 
the umbilical.

Other comments on the latest patch:
- Rather than have the finishing states call the cleanup container transition 
and have that transition have to special-case being called by finishing states, 
it'd be cleaner to factor out the common code from the cleanup container 
transition that they're trying to leverage and call that instead.  Transitions 
doing state or event checks usually means something's a bit off, since the 
transition should already know what event triggered it and what state(s) it 
applies to.
- Similarly, the timeout transitions should have dedicated transition code that 
not only warns in the AM log but also sets an attempt diagnostic message.   It 
can re-use some/all of the cleanup container transition so it's not replicating 
code.  With the diagnostic it will be much more likely the user will be aware 
of the timeout issue and fix their task code.  Tasks that timeout during 
finishing can still succeed, so users probably won't even know something went 
wrong unless they bother to examine the AM log and happen to notice it.
- This change looks like some accidental reformatting:
{noformat}
--- 
a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/LocalContainerLauncher.java
+++ 
b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/LocalContainerLauncher.java
@@ -222,7 +222,7 @@ public void run() {
   // remember the current attempt
   futures.put(event.getTaskAttemptID(), future);
 
-} else if (event.getType() == EventType.CONTAINER_REMOTE_CLEANUP) {
+  } else if (event.getType() == EventType.CONTAINER_REMOTE_CLEANUP) {
 
   // cancel (and interrupt) the current running task associated with 
the
   // event
{noformat}
- Nit: a sendContainerCompleted utility method to send the CONTAINER_COMPLETED 
event would be nice
- Nit: code should be formatted to 80 columns, comments for the state 
transitions in particular.
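The refactoring suggested above could look roughly like the following; all class, method, and event names here are illustrative stand-ins for the MR-AM code, not the actual TaskAttemptImpl transitions.

```java
public class TransitionRefactor {
    // Stand-in for the events sent to the ContainerLauncher.
    static final StringBuilder sentEvents = new StringBuilder();

    // Common container-cleanup work factored into one shared helper, so no
    // transition needs to special-case who invoked it.
    static void sendContainerCleanup(String attemptId) {
        sentEvents.append("CONTAINER_REMOTE_CLEANUP:").append(attemptId).append(';');
    }

    // Normal cleanup transition: just delegates to the shared helper.
    static void onCleanupEvent(String attemptId) {
        sendContainerCleanup(attemptId);
    }

    // Timeout transition: reuses the same helper but also produces an attempt
    // diagnostic, so the user can see the task timed out while finishing
    // without digging through the AM log.
    static String onFinishingTimeout(String attemptId) {
        sendContainerCleanup(attemptId);
        return "Task attempt " + attemptId
            + " exceeded the finishing-state timeout; killing container";
    }
}
```

The point is that both transitions call the factored-out helper directly, instead of the timeout path re-invoking the cleanup transition and that transition checking what state or event triggered it.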

 Container killed before hprof dumps profile.out
 ---

 Key: MAPREDUCE-5465
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5465
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mr-am, mrv2
Affects Versions: trunk, 2.0.3-alpha
Reporter: Radim Kolar
Assignee: Ming Ma
 Attachments: MAPREDUCE-5465-2.patch, MAPREDUCE-5465-3.patch, 
 MAPREDUCE-5465-4.patch, MAPREDUCE-5465-5.patch, MAPREDUCE-5465-6.patch, 
 MAPREDUCE-5465.patch


 If profiling is enabled for a mapper or reducer, then hprof dumps 
 profile.out at process exit. It is dumped after the task has signaled to the AM 
 that its work is finished.
 The AM kills a container with finished work without waiting for hprof to finish 
 its dumps. If hprof is dumping larger output (such as with depth=4 while depth=3 
 works), it cannot finish the dump in time before being killed, making the entire 
 dump unusable because the cpu and heap stats are 

[jira] [Commented] (MAPREDUCE-5888) Failed job leaves hung AM after it unregisters

2014-05-14 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996590#comment-13996590
 ] 

Tsuyoshi OZAWA commented on MAPREDUCE-5888:
---

+1 (non-binding).

 Failed job leaves hung AM after it unregisters 
 ---

 Key: MAPREDUCE-5888
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5888
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am
Affects Versions: 2.2.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: MAPREDUCE-5888.patch


 When a job fails, the AM hangs during shutdown.  A non-daemon thread pool 
 executor thread prevents the JVM teardown from completing, and the AM lingers 
 on the cluster for the AM expiry interval in the FINISHING state until 
 eventually the RM expires it and kills the container.  If application limits 
 on the queue are relatively low (e.g.: small queue or small cluster) this can 
 cause unnecessary delays in resource scheduling on the cluster.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5018) Support raw binary data with Hadoop streaming

2014-05-14 Thread Steven Willis (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Willis updated MAPREDUCE-5018:
-

Attachment: MAPREDUCE-5018.patch

New patch with tests

 Support raw binary data with Hadoop streaming
 -

 Key: MAPREDUCE-5018
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5018
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: contrib/streaming
Reporter: Jay Hacker
Priority: Minor
 Attachments: MAPREDUCE-5018-branch-1.1.patch, MAPREDUCE-5018.patch, 
 MAPREDUCE-5018.patch, justbytes.jar, mapstream


 People often have a need to run older programs over many files, and turn to 
 Hadoop streaming as a reliable, performant batch system.  There are good 
 reasons for this:
 1. Hadoop is convenient: they may already be using it for mapreduce jobs, and 
 it is easy to spin up a cluster in the cloud.
 2. It is reliable: HDFS replicates data and the scheduler retries failed jobs.
 3. It is reasonably performant: it moves the code to the data, maintaining 
 locality, and scales with the number of nodes.
 Historically Hadoop is of course oriented toward processing key/value pairs, 
 and so needs to interpret the data passing through it.  Unfortunately, this 
 makes it difficult to use Hadoop streaming with programs that don't deal in 
 key/value pairs, or with binary data in general.  For example, something as 
 simple as running md5sum to verify the integrity of files will not give the 
 correct result, due to Hadoop's interpretation of the data.  
 There have been several attempts at binary serialization schemes for Hadoop 
 streaming, such as TypedBytes (HADOOP-1722); however, these are still aimed 
 at efficiently encoding key/value pairs, and not passing data through 
 unmodified.  Even the RawBytes serialization scheme adds length fields to 
 the data, rendering it not-so-raw.
 I often have a need to run a Unix filter on files stored in HDFS; currently, 
 the only way I can do this on the raw data is to copy the data out and run 
 the filter on one machine, which is inconvenient, slow, and unreliable.  It 
 would be very convenient to run the filter as a map-only job, allowing me to 
 build on existing (well-tested!) building blocks in the Unix tradition 
 instead of reimplementing them as mapreduce programs.
 However, most existing tools don't know about file splits, and so want to 
 process whole files; and of course many expect raw binary input and output.  
 The solution is to run a map-only job with an InputFormat and OutputFormat 
 that just pass raw bytes and don't split.  It turns out to be a little more 
 complicated with streaming; I have attached a patch with the simplest 
 solution I could come up with.  I call the format JustBytes (as RawBytes 
 was already taken), and it should be usable with most recent versions of 
 Hadoop.
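Stripped of the Hadoop InputFormat/RecordReader plumbing, the core of the JustBytes idea is to treat the entire (unsplit) file as a single raw byte[] record, adding no length fields or other framing. The sketch below shows only that core with plain java.io; the class and method names are illustrative, not those of the attached patch.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class WholeFileBytes {
    // Read every byte of the stream as one record, completely unmodified:
    // no keys, no delimiters, no length prefixes.
    static byte[] readAll(InputStream in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[64 * 1024];
        int n;
        try {
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

In the real format this would sit inside a RecordReader whose InputFormat reports files as non-splittable, and the matching OutputFormat would write the mapper's bytes back out with the same absence of framing.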





[jira] [Commented] (MAPREDUCE-5874) Creating MapReduce REST API section

2014-05-14 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995939#comment-13995939
 ] 

Akira AJISAKA commented on MAPREDUCE-5874:
--

+1 (non-binding)

 Creating MapReduce REST API section
 ---

 Key: MAPREDUCE-5874
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5874
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.4.0
Reporter: Ravi Prakash
Assignee: Tsuyoshi OZAWA
 Attachments: MAPREDUCE-5874.2.patch, MAPREDUCE-5874.3.patch, 
 YARN-1999.1.patch


 Now that we have the YARN HistoryServer, perhaps we should move 
 HistoryServerRest.apt.vm and MapRedAppMasterRest.apt.vm into the MapReduce 
 section where it really belongs?





[jira] [Commented] (MAPREDUCE-5886) Allow wordcount example job to accept multiple input paths.

2014-05-14 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996830#comment-13996830
 ] 

Chris Nauroth commented on MAPREDUCE-5886:
--

Hi, [~jira.shegalov] and [~ajisakaa].  Thanks for looking at this and 
contributing some new ideas.

Regarding {{FileInputFormat#addInputPaths}}, in addition to the issue raised by 
Akira for supporting comma in a file name, there is another reason why I didn't 
use that method.  On Windows Command Prompt, the comma acts as an argument 
separator, much like space.  This would have the potential to create confusion 
for users on Windows.

The basic concept of the new API looks good to me.  We might instead consider 
passing varargs and no range indices.  Word count could chop the input args 
down to the correct range using {{Arrays#copyOfRange}} or {{List#subList}}.
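For illustration, the varargs trimming could look like the sketch below; a hypothetical helper, not the actual example-driver code.

```java
import java.util.Arrays;

public class ArgSplit {
    // With a convention of "input paths first, output path last", word count
    // can chop argv down to the input range with Arrays.copyOfRange.
    static String[] inputPaths(String[] args) {
        return Arrays.copyOfRange(args, 0, args.length - 1);
    }

    static String outputPath(String[] args) {
        return args[args.length - 1];
    }
}
```

The trimmed array could then be passed as varargs to the proposed API without the caller supplying explicit range indices.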

Would you mind moving all of the API work to another jira?  MAPREDUCE-5889 
probably would work for that.  For this issue, I was hoping to put in a quick 
trivial patch in just word count to enable this.  IOW, I'd like to pursue a 
binding +1 on patch v1 and commit it.

Thanks again!

 Allow wordcount example job to accept multiple input paths.
 ---

 Key: MAPREDUCE-5886
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5886
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: examples
Affects Versions: 3.0.0, 2.4.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
Priority: Minor
 Attachments: MAPREDUCE-5886.1.patch, MAPREDUCE-5886.2.patch, 
 MAPREDUCE-5886.3.patch


 It would be convenient if the wordcount example MapReduce job could accept 
 multiple input paths and run the word count on all of them.





[jira] [Commented] (MAPREDUCE-5886) Allow wordcount example job to accept multiple input paths.

2014-05-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995682#comment-13995682
 ] 

Hadoop QA commented on MAPREDUCE-5886:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12644305/MAPREDUCE-5886.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-examples.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4597//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4597//console

This message is automatically generated.

 Allow wordcount example job to accept multiple input paths.
 ---

 Key: MAPREDUCE-5886
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5886
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: examples
Affects Versions: 3.0.0, 2.4.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
Priority: Minor
 Attachments: MAPREDUCE-5886.1.patch, MAPREDUCE-5886.2.patch


 It would be convenient if the wordcount example MapReduce job could accept 
 multiple input paths and run the word count on all of them.





[jira] [Commented] (MAPREDUCE-5774) Job overview in History UI should list reducer phases in chronological order

2014-05-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998207#comment-13998207
 ] 

Hudson commented on MAPREDUCE-5774:
---

FAILURE: Integrated in Hadoop-Hdfs-trunk #1753 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1753/])
MAPREDUCE-5774. Job overview in History UI should list reducer phases in 
chronological order. (Gera Shegalov via kasha) (kasha: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1593890)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/webapp/HsJobBlock.java


 Job overview in History UI should list reducer phases in chronological order
 

 Key: MAPREDUCE-5774
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5774
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Affects Versions: 2.3.0
Reporter: Gera Shegalov
Assignee: Gera Shegalov
Priority: Trivial
 Fix For: 2.5.0

 Attachments: MAPREDUCE-5774.v01.patch


 Current order:
 Average Map Time   9sec
 Average Reduce Time0sec
 Average Shuffle Time   22sec
 Average Merge Time 0sec
 Proposed order:
 Average Map Time   9sec
 Average Shuffle Time   22sec
 Average Merge Time 0sec
 Average Reduce Time0sec





[jira] [Commented] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts

2014-05-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998208#comment-13998208
 ] 

Hudson commented on MAPREDUCE-5652:
---

FAILURE: Integrated in Hadoop-Hdfs-trunk #1753 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1753/])
MAPREDUCE-5652. NM Recovery. ShuffleHandler should handle NM restarts. (Jason 
Lowe via kasha) (kasha: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1594329)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/LICENSE.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/dev-support/findbugs-exclude.xml
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/JobID.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/pom.xml
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/proto
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/proto/ShuffleHandlerRecovery.proto
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/test/java/org/apache/hadoop/mapred/TestShuffleHandler.java
* /hadoop/common/trunk/hadoop-mapreduce-project/pom.xml


 NM Recovery. ShuffleHandler should handle NM restarts
 -

 Key: MAPREDUCE-5652
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5652
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Karthik Kambatla
Assignee: Jason Lowe
  Labels: shuffle
 Fix For: 2.5.0

 Attachments: MAPREDUCE-5652-v10.patch, MAPREDUCE-5652-v2.patch, 
 MAPREDUCE-5652-v3.patch, MAPREDUCE-5652-v4.patch, MAPREDUCE-5652-v5.patch, 
 MAPREDUCE-5652-v6.patch, MAPREDUCE-5652-v7.patch, MAPREDUCE-5652-v8.patch, 
 MAPREDUCE-5652-v9-and-YARN-1987.patch, MAPREDUCE-5652.patch


 ShuffleHandler should work across NM restarts and not require re-running 
 map-tasks. On NM restart, the map outputs are currently cleaned up, requiring 
 re-execution of the map tasks; this should be avoided.


