[jira] [Updated] (MAPREDUCE-5867) Possible NPE in KillAMPreemptionPolicy related to ProportionalCapacityPreemptionPolicy

2014-05-13 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-5867:
---

Issue Type: Sub-task  (was: Bug)
Parent: MAPREDUCE-4584

> Possible NPE in KillAMPreemptionPolicy related to 
> ProportionalCapacityPreemptionPolicy
> --
>
> Key: MAPREDUCE-5867
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5867
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: MapReduce-5867-updated.patch, MapReduce-5867.2.patch, 
> MapReduce-5867.3.patch, Yarn-1980.1.patch
>
>
> I configured KillAMPreemptionPolicy for my Application Master and tried to 
> check preemption of queues.
> In one scenario I saw the NPE below in my AM:
> 2014-04-24 15:11:08,860 ERROR [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN 
> CONTACTING RM. 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.mapreduce.v2.app.rm.preemption.KillAMPreemptionPolicy.preempt(KillAMPreemptionPolicy.java:57)
>   at 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:662)
>   at 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:246)
>   at 
> org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:267)
>   at java.lang.Thread.run(Thread.java:662)
> I was using 2.2.0 and merged MAPREDUCE-5189 to see how AM preemption works.
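The stack trace above points into KillAMPreemptionPolicy.preempt(). As a minimal sketch of the kind of defensive null check that would avoid this NPE, assuming (consistent with ProportionalCapacityPreemptionPolicy's behavior) that the RM can send a PreemptionMessage whose general contract is null: the class and field names below are simplified stand-ins for the YARN types, not the actual patch.

```java
import java.util.Collections;
import java.util.Set;

// Simplified stand-ins for the YARN preemption types (names illustrative).
public class PreemptSketch {
    static class PreemptionContract {
        Set<String> containers = Collections.emptySet();
    }
    static class PreemptionMessage {
        PreemptionContract strict;   // populated for strict preemption
        PreemptionContract general;  // may be null depending on the RM policy
    }

    // Count containers named in a preemption message, guarding both
    // contracts: either can be absent, so dereferencing without a null
    // check reproduces the NPE reported above.
    static int preemptedContainers(PreemptionMessage msg) {
        int count = 0;
        if (msg.strict != null) {
            count += msg.strict.containers.size();
        }
        if (msg.general != null) {
            count += msg.general.containers.size();
        }
        return count;
    }

    public static void main(String[] args) {
        PreemptionMessage msg = new PreemptionMessage();
        msg.strict = new PreemptionContract();
        // general is left null, as in the reported scenario: no NPE.
        System.out.println(preemptedContainers(msg)); // 0
    }
}
```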



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5848) MapReduce counts forcibly preempted containers as FAILED

2014-05-13 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-5848:
---

Issue Type: Sub-task  (was: Bug)
Parent: MAPREDUCE-4584

> MapReduce counts forcibly preempted containers as FAILED
> 
>
> Key: MAPREDUCE-5848
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5848
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Affects Versions: 2.1.0-beta
>Reporter: Carlo Curino
>Assignee: Subramaniam Krishnan
> Attachments: MR-5848.patch, MR-5848.patch, YARN-1958.patch
>
>
> The MapReduce AM counts a forcibly preempted container as FAILED, while I 
> think it should be counted as KILLED (i.e., not counted against the maximum 
> number of failures). 
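A hedged sketch of the distinction being proposed: classify a container's outcome by its exit status so preemption is reported as KILLED rather than FAILED. The PREEMPTED constant mirrors YARN's ContainerExitStatus.PREEMPTED, but its value here is illustrative, and this is not the actual AM change.

```java
public class ExitStatusSketch {
    // Stand-in for ContainerExitStatus.PREEMPTED (value illustrative).
    static final int PREEMPTED = -102;

    enum Outcome { FAILED, KILLED }

    // Forcible preemption is a cluster scheduling decision, not a task
    // bug, so it should not count against the attempt-failure limit.
    static Outcome classify(int exitStatus) {
        return exitStatus == PREEMPTED ? Outcome.KILLED : Outcome.FAILED;
    }

    public static void main(String[] args) {
        System.out.println(classify(PREEMPTED)); // KILLED
        System.out.println(classify(1));         // FAILED
    }
}
```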





[jira] [Updated] (MAPREDUCE-5176) Preemptable annotations (to support preemption in MR)

2014-05-13 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-5176:
---

Issue Type: Sub-task  (was: Improvement)
Parent: MAPREDUCE-4584

> Preemptable annotations (to support preemption in MR)
> -
>
> Key: MAPREDUCE-5176
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5176
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: mrv2
>Reporter: Carlo Curino
>Assignee: Carlo Curino
> Fix For: 2.1.0-beta
>
> Attachments: MAPREDUCE-5176.1.patch, MAPREDUCE-5176.2.patch, 
> MAPREDUCE-5176.3.patch, MAPREDUCE-5176.patch
>
>
> Proposing a patch that introduces a new annotation, @Checkpointable, that 
> declares to the framework a property of user-supplied classes (e.g., Reducer, 
> OutputCommitter). The intended semantics is that a tagged class is safe to 
> preempt between invocations. 
> (this is in spirit similar to the Output Contracts of [Nephele/PACT | 
> https://stratosphere.eu/sites/default/files/papers/ComparingMapReduceAndPACTs_11.pdf])
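As a rough illustration of the idea (not the patch itself), a marker annotation with runtime retention lets the framework check reflectively whether a user class has opted in to preemption; all names below are assumptions.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class CheckpointableSketch {
    // Hypothetical marker: a tagged class is safe to preempt between
    // invocations (the real annotation lives in the attached patches).
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Checkpointable { }

    @Checkpointable
    static class SafeReducer { }      // opted in

    static class LegacyReducer { }    // not tagged: must not be preempted

    // The framework can consult the tag reflectively before preempting.
    static boolean isPreemptable(Class<?> userClass) {
        return userClass.isAnnotationPresent(Checkpointable.class);
    }

    public static void main(String[] args) {
        System.out.println(isPreemptable(SafeReducer.class));   // true
        System.out.println(isPreemptable(LegacyReducer.class)); // false
    }
}
```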





[jira] [Updated] (MAPREDUCE-5888) Failed job leaves hung AM after it unregisters

2014-05-13 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated MAPREDUCE-5888:
---

   Resolution: Fixed
Fix Version/s: 2.5.0
   3.0.0
   Status: Resolved  (was: Patch Available)

> Failed job leaves hung AM after it unregisters 
> ---
>
> Key: MAPREDUCE-5888
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5888
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 2.2.0
>Reporter: Jason Lowe
>Assignee: Jason Lowe
> Fix For: 3.0.0, 2.5.0
>
> Attachments: MAPREDUCE-5888.patch
>
>
> When a job fails the AM hangs during shutdown.  A non-daemon thread pool 
> executor thread prevents the JVM teardown from completing, and the AM lingers 
> on the cluster for the AM expiry interval in the FINISHING state until 
> eventually the RM expires it and kills the container.  If application limits 
> on the queue are relatively low (e.g.: small queue or small cluster) this can 
> cause unnecessary delays in resource scheduling on the cluster.
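The usual fix pattern for this class of hang (sketched below with assumed names, not the actual MAPREDUCE-5888 patch) is to create pool threads as daemons and/or shut the executor down explicitly, so JVM teardown is not blocked by lingering pool threads.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;

public class DaemonPoolSketch {
    // A pool whose threads cannot keep the JVM alive after main exits.
    static ExecutorService daemonPool(int size) {
        ThreadFactory daemonFactory = r -> {
            Thread t = new Thread(r);
            t.setDaemon(true); // JVM teardown no longer waits on this thread
            return t;
        };
        return Executors.newFixedThreadPool(size, daemonFactory);
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = daemonPool(2);
        pool.submit(() -> System.out.println("work done"));
        // Explicit shutdown is the other half of the fix: a non-daemon
        // pool that is never shut down lingers, which is exactly the
        // FINISHING-state hang described above.
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```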





[jira] [Commented] (MAPREDUCE-5889) Deprecate FileInputFormat.setInputPaths(JobConf, String) and FileInputFormat.addInputPaths(JobConf, String)

2014-05-13 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996740#comment-13996740
 ] 

Chris Nauroth commented on MAPREDUCE-5889:
--

I think that makes sense.  +1 for the proposal.  Thanks, Akira.

> Deprecate FileInputFormat.setInputPaths(JobConf, String) and 
> FileInputFormat.addInputPaths(JobConf, String)
> ---
>
> Key: MAPREDUCE-5889
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5889
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Akira AJISAKA
>Priority: Minor
>  Labels: newbie
>
> {{FileInputFormat.setInputPaths(JobConf conf, String commaSeparatedPaths)}} 
> and {{FileInputFormat.addInputPaths(JobConf conf, String 
> commaSeparatedPaths)}} fail to parse commaSeparatedPaths if a comma is 
> included in the file path. (e.g. Path: {{/path/file,with,comma}})
> We should deprecate these methods and document to use {{setInputPaths(JobConf 
> conf, Path... inputPaths)}} and {{addInputPaths(JobConf conf, Path... 
> inputPaths)}} instead.
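To make the failure mode concrete, here is a plain-Java illustration (not the Hadoop code itself) of why a comma-separated String cannot carry a path that itself contains commas, which is the hazard the Path... overloads avoid.

```java
import java.util.Arrays;

public class CommaPathSketch {
    // Mirrors the naive comma split the String overloads effectively
    // perform on their commaSeparatedPaths argument.
    static String[] splitCommaSeparated(String commaSeparatedPaths) {
        return commaSeparatedPaths.split(",");
    }

    public static void main(String[] args) {
        // One real path, but the comma split sees three pieces:
        String[] parts = splitCommaSeparated("/path/file,with,comma");
        System.out.println(Arrays.toString(parts)); // [/path/file, with, comma]
        // The Path... overloads sidestep this entirely: each path is its
        // own argument, so no delimiter parsing is needed.
    }
}
```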





[jira] [Commented] (MAPREDUCE-5016) GridMix Error: Found no satisfactory file in path

2014-05-13 Thread chaitali gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996872#comment-13996872
 ] 

chaitali gupta commented on MAPREDUCE-5016:
---

I also ran into a similar issue earlier. The reason for this error is that you 
are generating only 20m of input. When Gridmix runs traces, it looks for input 
data files larger than 128mb, so if you generate 20m of data the file-filter 
search finds no input larger than 128mb. Hence the error. 

The fix is to add a config parameter to the gridmix command: 
"-Dgridmix.min.file.size= ". Hope this helps. 

> GridMix Error:  Found no satisfactory file in path 
> ---
>
> Key: MAPREDUCE-5016
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5016
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/gridmix
>Affects Versions: 1.1.0, 1.1.1, 2.0.3-alpha
> Environment: Ubuntu 12.04
>Reporter: Light 
>
> Hello,
> Every time I launch gridmix with the command:
> PAPATH=/home/light/Bureau/test_gridmix
> bin/hadoop -classpath $JAR_CLASSPATH org.apache.hadoop.mapred.gridmix.Gridmix 
>  -Dgridmix.min.file.size=10m 
> -Dgridmix.output.directory=/home/light/Bureau/test -generate 20m $PAPATH 
> /home/light/Bureau/test_rumen_output/job-trace.json
> I get this: Found no satisfactory file in /home/light/Bureau/test_gridmix
> This happens even if I use an HDFS path.
> At first I had exactly the same problem as 
> [MAPREDUCE-2015|https://issues.apache.org/jira/browse/MAPREDUCE-2015] (File 
> already exists), and I ended up with this same error once my path problem was 
> solved. 
> What is bugging me:
> First: GRIDMIX_GENDATA (job_local_0001) reports success, yet my folder only 
> contains a _SUCCESS file of size 0.
> I added some waiting at this point in GridMix, and just before the check 
> there is no file in the output folder.
> Second: Whatever size is asked for, it finishes in 1s, so I think the problem 
> may be here: as far as I can tell, no file is generated.
> I tried with every hadoop version and none of them works.
> Here is the output:
> 13/02/20 14:42:47 INFO gridmix.SubmitterUserResolver:  Current user resolver 
> is SubmitterUserResolver 
> 13/02/20 14:42:47 WARN gridmix.Gridmix: Resource null ignored
> 13/02/20 14:42:47 WARN util.NativeCodeLoader: Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> 13/02/20 14:42:47 INFO gridmix.Gridmix:  Submission policy is STRESS
> 13/02/20 14:42:47 INFO gridmix.Gridmix: Generating 20,0m of test data...
> 13/02/20 14:42:47 INFO gridmix.Statistics: Not tracking job GRIDMIX_GENDATA 
> as seq id is less than zero: -1
> 13/02/20 14:42:52 INFO gridmix.JobMonitor: GRIDMIX_GENDATA (job_local_0001) 
> success
> 13/02/20 14:42:57 INFO gridmix.Gridmix: Changing the permissions for 
> inputPath /home/light/Bureau/test_gridmix
> 13/02/20 14:42:57 INFO gridmix.Gridmix: Done.
> 13/02/20 14:44:12 ERROR gridmix.Gridmix: Startup failed
> java.io.IOException: Found no satisfactory file in 
> /home/light/Bureau/test_gridmix
>   at org.apache.hadoop.mapred.gridmix.FilePool.refresh(FilePool.java:105)
>   at 
> org.apache.hadoop.mapred.gridmix.JobSubmitter.refreshFilePool(JobSubmitter.java:159)
>   at org.apache.hadoop.mapred.gridmix.Gridmix.start(Gridmix.java:291)
>   at org.apache.hadoop.mapred.gridmix.Gridmix.runJob(Gridmix.java:264)
>   at org.apache.hadoop.mapred.gridmix.Gridmix.access$000(Gridmix.java:55)
>   at org.apache.hadoop.mapred.gridmix.Gridmix$1.run(Gridmix.java:217)
>   at org.apache.hadoop.mapred.gridmix.Gridmix$1.run(Gridmix.java:215)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:416)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>   at org.apache.hadoop.mapred.gridmix.Gridmix.run(Gridmix.java:215)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>   at org.apache.hadoop.mapred.gridmix.Gridmix.main(Gridmix.java:395)
> 13/02/20 14:44:12 INFO gridmix.Gridmix: Exiting...
> Thanks in advance for any responses
>  





[jira] [Commented] (MAPREDUCE-5851) Enable regular expression in the DistCp input

2014-05-13 Thread Avinash Kujur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996217#comment-13996217
 ] 

Avinash Kujur commented on MAPREDUCE-5851:
--

I can see many parameters in the DistCp class. In which parameter do we need 
to enable regular expressions?

private static final String usage = NAME
  + " [OPTIONS] <srcurl>* <desturl>" +
  "\n\nOPTIONS:" +
  "\n-p[rbugp]  Preserve status" +
  "\n   r: replication number" +
  "\n   b: block size" +
  "\n   u: user" +
  "\n   g: group" +
  "\n   p: permission" +
  "\n   -p alone is equivalent to -prbugp" +
  "\n-i Ignore failures" +
  "\n-log <logdir>  Write logs to <logdir>" +
  "\n-m <num_maps>  Maximum number of simultaneous copies" +
  "\n-overwrite Overwrite destination" +
  "\n-update    Overwrite if src size different from dst size" +
  "\n-f <urilist_uri>  Use list at <urilist_uri> as src list" +
  "\n-filelimit <n>  Limit the total number of files to be <= n" +
  "\n-sizelimit <n>  Limit the total size to be <= n bytes" +
  "\n-delete    Delete the files existing in the dst but not in src" +
  "\n-mapredSslConf <f>  Filename of SSL configuration for mapper task" +

  "\n\nNOTE 1: if -overwrite or -update are set, each source URI is " +
  "\n  interpreted as an isomorphic update to an existing directory." +
  "\nFor example:" +
  "\nhadoop " + NAME + " -p -update \"hdfs://A:8020/user/foo/bar\" " +
  "\"hdfs://B:8020/user/foo/baz\"\n" +
  "\n would update all descendants of 'baz' also in 'bar'; it would " +
  "\n *not* update /user/foo/baz/bar" +

  "\n\nNOTE 2: The parameter <n> in -filelimit and -sizelimit can be " +
  "\n specified with symbolic representation.  For examples," +
  "\n   1230k = 1230 * 1024 = 1259520" +
  "\n   891g = 891 * 1024^3 = 956703965184" +

  "\n";

> Enable regular expression in the DistCp input
> -
>
> Key: MAPREDUCE-5851
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5851
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: distcp
>Reporter: Yan Qi
>Assignee: Yan Qi
>Priority: Minor
>  Labels: distcp, expression, regular
>
> DistCp doesn't support regular expression as the input. If the files to copy 
> are in the different locations, it is quite verbose to put a long list of 
> inputs in the command. 





[jira] [Commented] (MAPREDUCE-5886) Allow wordcount example job to accept multiple input paths.

2014-05-13 Thread Gera Shegalov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996980#comment-13996980
 ] 

Gera Shegalov commented on MAPREDUCE-5886:
--

I considered both {{Arrays#copyOfRange}} and {{List#subList}} but discarded 
them due to the creation of throwaway objects. Thanks for the discussion, 
[~ajisakaa] and [~cnauroth]. We can move the FIF changes to another JIRA.

> Allow wordcount example job to accept multiple input paths.
> ---
>
> Key: MAPREDUCE-5886
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5886
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: examples
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
>Priority: Minor
> Attachments: MAPREDUCE-5886.1.patch, MAPREDUCE-5886.2.patch, 
> MAPREDUCE-5886.3.patch
>
>
> It would be convenient if the wordcount example MapReduce job could accept 
> multiple input paths and run the word count on all of them.





[jira] [Updated] (MAPREDUCE-207) Computing Input Splits on the MR Cluster

2014-05-13 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated MAPREDUCE-207:


Attachment: MAPREDUCE-207.v03.patch

v03 to handle local jobs correctly

> Computing Input Splits on the MR Cluster
> 
>
> Key: MAPREDUCE-207
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-207
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: applicationmaster, mrv2
>Reporter: Philip Zeyliger
>Assignee: Arun C Murthy
> Attachments: MAPREDUCE-207.patch, MAPREDUCE-207.v02.patch, 
> MAPREDUCE-207.v03.patch
>
>
> Instead of computing the input splits as part of job submission, Hadoop could 
> have a separate "job task type" that computes the input splits, therefore 
> allowing that computation to happen on the cluster.





[jira] [Commented] (MAPREDUCE-5886) Allow wordcount example job to accept multiple input paths.

2014-05-13 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997045#comment-13997045
 ] 

Chris Nauroth commented on MAPREDUCE-5886:
--

bq. I considered both {{Arrays#copyOfRange}} and {{List#subList}} but discarded 
this due to creation of throwaway objects.

It's not too bad for {{ArrayList#subList}}.  It retains the original array and 
wraps it with different offset indices:

http://hg.openjdk.java.net/jdk6/jdk6/jdk/file/tip/src/share/classes/java/util/ArrayList.java#l891

You pay a flat cost for the extra indices and object overhead, but it's not a 
full array reallocation.
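A small demonstration of the point above: {{List#subList}} returns a view over the backing list (wrapped offset indices), not a copied array, so the per-call cost is one small wrapper object rather than a reallocation.

```java
import java.util.Arrays;
import java.util.List;

public class SubListSketch {
    public static void main(String[] args) {
        // A hypothetical args list: output path first, then input paths.
        List<String> paths = Arrays.asList("out", "in1", "in2", "in3");

        // Skip the output path. subList copies no elements; it wraps the
        // same backing list with adjusted start/end indices.
        List<String> inputs = paths.subList(1, paths.size());

        System.out.println(inputs); // [in1, in2, in3]
    }
}
```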

> Allow wordcount example job to accept multiple input paths.
> ---
>
> Key: MAPREDUCE-5886
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5886
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: examples
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
>Priority: Minor
> Attachments: MAPREDUCE-5886.1.patch, MAPREDUCE-5886.2.patch, 
> MAPREDUCE-5886.3.patch
>
>
> It would be convenient if the wordcount example MapReduce job could accept 
> multiple input paths and run the word count on all of them.





[jira] [Commented] (MAPREDUCE-5889) Deprecate FileInputFormat.setInputPaths(JobConf, String) and FileInputFormat.addInputPaths(JobConf, String)

2014-05-13 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996815#comment-13996815
 ] 

Chen He commented on MAPREDUCE-5889:


Agreed. +1 for the idea

> Deprecate FileInputFormat.setInputPaths(JobConf, String) and 
> FileInputFormat.addInputPaths(JobConf, String)
> ---
>
> Key: MAPREDUCE-5889
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5889
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Akira AJISAKA
>Priority: Minor
>  Labels: newbie
>
> {{FileInputFormat.setInputPaths(JobConf conf, String commaSeparatedPaths)}} 
> and {{FileInputFormat.addInputPaths(JobConf conf, String 
> commaSeparatedPaths)}} fail to parse commaSeparatedPaths if a comma is 
> included in the file path. (e.g. Path: {{/path/file,with,comma}})
> We should deprecate these methods and document to use {{setInputPaths(JobConf 
> conf, Path... inputPaths)}} and {{addInputPaths(JobConf conf, Path... 
> inputPaths)}} instead.





[jira] [Updated] (MAPREDUCE-5309) 2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server

2014-05-13 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5309:
--

Attachment: MAPREDUCE-5309.patch

> 2.0.4 JobHistoryParser can't parse certain failed job history files generated 
> by 2.0.3 history server
> -
>
> Key: MAPREDUCE-5309
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5309
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver, mrv2
>Affects Versions: 2.0.4-alpha
>Reporter: Vrushali C
>Assignee: Rushabh S Shah
> Attachments: MAPREDUCE-5309.patch, Test20JobHistoryParsing.java, 
> job_2_0_3-KILLED.jhist
>
>
> When the 2.0.4 JobHistoryParser tries to parse a job history file generated 
> by hadoop 2.0.3, the JobHistoryParser throws an error: 
> java.lang.ClassCastException: org.apache.avro.generic.GenericData$Array 
> cannot be cast to org.apache.hadoop.mapreduce.jobhistory.JhCounters
> at 
> org.apache.hadoop.mapreduce.jobhistory.TaskAttemptUnsuccessfulCompletion.put(TaskAttemptUnsuccessfulCompletion.java:58)
> at org.apache.avro.generic.GenericData.setField(GenericData.java:463)
> at 
> org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
> at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
> at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
> at 
> org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
> at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
> at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
> at 
> org.apache.hadoop.mapreduce.jobhistory.EventReader.getNextEvent(EventReader.java:93)
> at 
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:111)
> at 
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:156)
> at 
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:142)
> at 
> com.twitter.somepackage.Test20JobHistoryParsing.testFileAvro(Test20JobHistoryParsing.java:23)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
> at 
> org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
> at 
> org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
> at 
> org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
> at 
> org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
> at 
> org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
> at 
> org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
> Test code and the job history file are attached.
> Test code:
> package com.twitter.somepackagel;
> import java.io.IOException;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser;
> import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.JobInfo;
> import org.junit.Test;
> import org.apache.hadoop.yarn.YarnException;
> public class Test20JobHistoryParsing {
>
>   @Test
>   public void testFileAvro() throws IOException
>   {
>   Path local_path2 = new Path("/tmp/job_2_0_3-KILLE

[jira] [Commented] (MAPREDUCE-5706) toBeDeleted parent directories aren't being cleaned up

2014-05-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995755#comment-13995755
 ] 

Hudson commented on MAPREDUCE-5706:
---

FAILURE: Integrated in Hadoop-Mapreduce-22-branch #116 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-22-branch/116/])
MAPREDUCE-5706. toBeDeleted parent directories aren't being cleaned up. (Robert 
Kanter via kasha) (kasha: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1593894)
* /hadoop/common/branches/branch-0.22/mapreduce/CHANGES.txt
* 
/hadoop/common/branches/branch-0.22/mapreduce/src/java/org/apache/hadoop/mapreduce/util/MRAsyncDiskService.java


> toBeDeleted parent directories aren't being cleaned up
> --
>
> Key: MAPREDUCE-5706
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5706
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.22.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Fix For: 0.22.1
>
> Attachments: MAPREDUCE-5706.patch
>
>
> When security is enabled on 0.22, MRAsyncDiskService doesn't always delete 
> the parent directories under {{toBeDeleted}}.
> MRAsyncDiskService goes through {{toBeDeleted}} and creates "tasks" to delete 
> the directories under there using the LinuxTaskController. It chooses which 
> user to run as by looking at who owns that directory.
> For example:
> {noformat}
> ls -al /mapred/local/toBeDeleted/2013-07-05_05-37-49.052_0
> total 12
> drwxr-xr-x 3 mapred mapred 4096 Jul  5 05:37 .
> drwxr-xr-x 5 mapred mapred 4096 Dec 19 10:15 ..
> drwxr-s--- 4 test   mapred 4096 Jul  2 02:54 test
> {noformat}
> It would create a task to use "test" user to delete 
> /mapred/local/toBeDeleted/2013-07-05_05-37-49.052_0/test (there could be more 
> in there for other users). It then creates a task to use "mapred" user to 
> delete /mapred/local/toBeDeleted/2013-07-05_05-37-49.052_0.
> So, the problem is that we normally configure "mapred" to not be allowed by 
> the LinuxTaskController in the 
> /etc/hadoop/conf.cloudera.mapreduce1/taskcontroller.cfg.  The permissions on 
> the toBeDeleted dir is drwxr-xr-x mapred:mapred, which means that only 
> "mapred" can delete things in it (i.e. the timestamped dirs).  However, the 
> MRAsyncDiskService is already running as the mapred user, so there's no 
> reason to use the LinuxTaskController for impersonation anyway; we can 
> directly do it from the Java code.
> Another issue is that {{MRAsyncDiskService#deletePathsInSecureCluster}} 
> expects an absolute file path (e.g. 
> {{/mapred/local/toBeDeleted/2013-07-05_05-37-49.052_0}}, but 
> {{MRAsyncDiskService#moveAndDeleteRelativePath}} passes in a relative path 
> (e.g. {{toBeDeleted/2013-07-05_05-37-49.052_0}}).  
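An illustrative sketch of the suggested fix (not the attached patch): since MRAsyncDiskService already runs as the "mapred" user, the timestamped directories it owns under toBeDeleted can be removed directly from Java, with no LinuxTaskController impersonation. A recursive delete is shown with java.io.File for brevity.

```java
import java.io.File;
import java.nio.file.Files;

public class DirectDeleteSketch {
    // Recursive delete performed directly as the current (mapred) user;
    // no task-controller round trip is involved.
    static boolean deleteRecursively(File dir) {
        File[] children = dir.listFiles(); // null if dir is not a directory
        if (children != null) {
            for (File child : children) {
                deleteRecursively(child);
            }
        }
        return dir.delete();
    }

    public static void main(String[] args) throws Exception {
        // Mimic a timestamped toBeDeleted directory with one child.
        File root = Files.createTempDirectory("toBeDeleted").toFile();
        new File(root, "2013-07-05_05-37-49.052_0").mkdirs();
        System.out.println(deleteRecursively(root)); // true
        System.out.println(root.exists());           // false
    }
}
```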





[jira] [Created] (MAPREDUCE-5888) Failed job leaves hung AM after it unregisters

2014-05-13 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-5888:
-

 Summary: Failed job leaves hung AM after it unregisters 
 Key: MAPREDUCE-5888
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5888
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am
Affects Versions: 2.2.0
Reporter: Jason Lowe
Assignee: Jason Lowe


When a job fails the AM hangs during shutdown.  A non-daemon thread pool 
executor thread prevents the JVM teardown from completing, and the AM lingers 
on the cluster for the AM expiry interval in the FINISHING state until 
eventually the RM expires it and kills the container.  If application limits on 
the queue are relatively low (e.g.: small queue or small cluster) this can 
cause unnecessary delays in resource scheduling on the cluster.





[jira] [Assigned] (MAPREDUCE-5309) 2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server

2014-05-13 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah reassigned MAPREDUCE-5309:
-

Assignee: Rushabh S Shah

> 2.0.4 JobHistoryParser can't parse certain failed job history files generated 
> by 2.0.3 history server
> -
>
> Key: MAPREDUCE-5309
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5309
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver, mrv2
>Affects Versions: 2.0.4-alpha
>Reporter: Vrushali C
>Assignee: Rushabh S Shah
> Attachments: Test20JobHistoryParsing.java, job_2_0_3-KILLED.jhist
>
>
> When the 2.0.4 JobHistoryParser tries to parse a job history file generated 
> by hadoop 2.0.3, the JobHistoryParser throws an error: 
> java.lang.ClassCastException: org.apache.avro.generic.GenericData$Array 
> cannot be cast to org.apache.hadoop.mapreduce.jobhistory.JhCounters
> at 
> org.apache.hadoop.mapreduce.jobhistory.TaskAttemptUnsuccessfulCompletion.put(TaskAttemptUnsuccessfulCompletion.java:58)
> at org.apache.avro.generic.GenericData.setField(GenericData.java:463)
> at 
> org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
> at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
> at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
> at 
> org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
> at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
> at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
> at 
> org.apache.hadoop.mapreduce.jobhistory.EventReader.getNextEvent(EventReader.java:93)
> at 
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:111)
> at 
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:156)
> at 
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:142)
> at 
> com.twitter.somepackage.Test20JobHistoryParsing.testFileAvro(Test20JobHistoryParsing.java:23)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
> at 
> org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
> at 
> org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
> at 
> org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
> at 
> org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
> at 
> org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
> at 
> org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
> Test code and the job history file are attached.
> Test code:
> package com.twitter.somepackagel;
> import java.io.IOException;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser;
> import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.JobInfo;
> import org.junit.Test;
> import org.apache.hadoop.yarn.YarnException;
> public class Test20JobHistoryParsing {
>
>   @Test
>   public void testFileAvro() throws IOException
>   {
>   Path local_path2 = new Path("/tmp/job_2_0_3-KILLED.jhist");
>  JobHistor

[jira] [Updated] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings

2014-05-13 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated MAPREDUCE-5885:
---

Target Version/s: 0.23.11, 2.5.0
  Status: Patch Available  (was: Open)

> build/test/test.mapred.spill causes release audit warnings
> --
>
> Key: MAPREDUCE-5885
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5885
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: trunk
>Reporter: Jason Lowe
>Assignee: Chen He
> Attachments: MAPREDUCE-5885.patch
>
>
> Multiple unit tests are creating files under 
> hadoop-mapreduce-client-jobclient/build/test/test.mapred.spill which are 
> causing release audit warnings during Jenkins patch precommit builds.  In 
> addition to being in a poor location for test output and not cleaning up 
> after the test, there are multiple tests using this location which will cause 
> conflicts if tests are run in parallel.
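The usual remedy for a shared, non-parallel-safe test output location is a unique per-test temporary directory. A minimal sketch of that approach (an assumption about the direction of the fix, not the attached patch):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class PerTestTempDir {
    // Returns a fresh directory on every call, so tests running in
    // parallel cannot collide on a shared path such as
    // build/test/test.mapred.spill.
    public static Path newSpillDir(String testName) throws IOException {
        return Files.createTempDirectory("test.mapred.spill-" + testName + "-");
    }

    public static void main(String[] args) throws IOException {
        Path a = newSpillDir("TestMapRed");
        Path b = newSpillDir("TestMapRed");
        System.out.println(!a.equals(b)); // two calls yield distinct dirs
        Files.delete(a);
        Files.delete(b);
    }
}
```

Because the directories live under the system temp root rather than the source tree, they also stop tripping the release audit.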



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts

2014-05-13 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5652:
--

Attachment: MAPREDUCE-5652-v10.patch

Updated patch to trunk.

> NM Recovery. ShuffleHandler should handle NM restarts
> -
>
> Key: MAPREDUCE-5652
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5652
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Karthik Kambatla
>Assignee: Jason Lowe
>  Labels: shuffle
> Attachments: MAPREDUCE-5652-v10.patch, MAPREDUCE-5652-v2.patch, 
> MAPREDUCE-5652-v3.patch, MAPREDUCE-5652-v4.patch, MAPREDUCE-5652-v5.patch, 
> MAPREDUCE-5652-v6.patch, MAPREDUCE-5652-v7.patch, MAPREDUCE-5652-v8.patch, 
> MAPREDUCE-5652-v9-and-YARN-1987.patch, MAPREDUCE-5652.patch
>
>
> ShuffleHandler should work across NM restarts and not require re-running 
> map tasks. Currently, on NM restart the map outputs are cleaned up, forcing 
> re-execution of map tasks; this should be avoided.





[jira] [Commented] (MAPREDUCE-5888) Failed job leaves hung AM after it unregisters

2014-05-13 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996534#comment-13996534
 ] 

Jason Lowe commented on MAPREDUCE-5888:
---

This was caused by MAPREDUCE-5317.  The ScheduledThreadPoolExecutor that was 
added was not marked to create daemon threads.

> Failed job leaves hung AM after it unregisters 
> ---
>
> Key: MAPREDUCE-5888
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5888
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 2.2.0
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>
> When a job fails the AM hangs during shutdown.  A non-daemon thread pool 
> executor thread prevents the JVM teardown from completing, and the AM lingers 
> on the cluster for the AM expiry interval in the FINISHING state until 
> eventually the RM expires it and kills the container.  If application limits 
> on the queue are relatively low (e.g.: small queue or small cluster) this can 
> cause unnecessary delays in resource scheduling on the cluster.
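The hang described above is the classic non-daemon executor pitfall: the JVM will not exit while any non-daemon thread is alive. A hedged sketch of the general fix (class and method names here are illustrative, not the actual MAPREDUCE-5888 patch):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.ThreadFactory;

public class DaemonSchedulerDemo {
    // Wraps the default factory so every pool thread is a daemon thread;
    // daemon threads do not prevent JVM teardown after main() returns.
    public static ScheduledThreadPoolExecutor newDaemonScheduler(int poolSize) {
        ThreadFactory daemonFactory = r -> {
            Thread t = Executors.defaultThreadFactory().newThread(r);
            t.setDaemon(true);
            return t;
        };
        return new ScheduledThreadPoolExecutor(poolSize, daemonFactory);
    }

    public static void main(String[] args) {
        ScheduledThreadPoolExecutor scheduler = newDaemonScheduler(1);
        Thread probe = scheduler.getThreadFactory().newThread(() -> { });
        System.out.println(probe.isDaemon()); // prints true
        scheduler.shutdown();
    }
}
```

An explicit shutdown() in the AM's exit path would work too, but daemon threads are the safer backstop when teardown can be skipped on failure paths.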





[jira] [Commented] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings

2014-05-13 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996518#comment-13996518
 ] 

Chen He commented on MAPREDUCE-5885:


As well as TestMapReduce.java. 

> build/test/test.mapred.spill causes release audit warnings
> --
>
> Key: MAPREDUCE-5885
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5885
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: trunk
>Reporter: Jason Lowe
>Assignee: Chen He
>
> Multiple unit tests are creating files under 
> hadoop-mapreduce-client-jobclient/build/test/test.mapred.spill which are 
> causing release audit warnings during Jenkins patch precommit builds.  In 
> addition to being in a poor location for test output and not cleaning up 
> after the test, there are multiple tests using this location which will cause 
> conflicts if tests are run in parallel.





[jira] [Commented] (MAPREDUCE-5081) Backport DistCpV2 and the related JIRAs to branch-1

2014-05-13 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996487#comment-13996487
 ] 

Tsz Wo Nicholas Sze commented on MAPREDUCE-5081:


By hdfs2, do you mean HDFS in branch-2?  The DistCpV2 here is for branch-1.  
You may try it in your setup.

> Backport DistCpV2 and the related JIRAs to branch-1
> ---
>
> Key: MAPREDUCE-5081
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5081
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: distcp
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Fix For: 1.2.0
>
> Attachments: DistCp.java.diff, m5081_20130328.patch, 
> m5081_20130328b.patch, m5981_20130321.patch, m5981_20130321b.patch, 
> m5981_20130323.patch
>
>
> Here is a list of DistCpV2 JIRAs:
> - MAPREDUCE-2765: DistCpV2 main jira
> - HADOOP-8703: turn CRC checking off for 0 byte size 
> - HDFS-3054: distcp -skipcrccheck has no effect.
> - HADOOP-8431: Running distcp without args throws IllegalArgumentException
> - HADOOP-8775: non-positive value to -bandwidth
> - MAPREDUCE-4654: TestDistCp is ignored
> - HADOOP-9022: distcp fails to copy file if -m 0 specified
> - HADOOP-9025: TestCopyListing failing
> - MAPREDUCE-5075: DistCp leaks input file handles
> - distcp part of HADOOP-8341: Fix findbugs issues in hadoop-tools
> - MAPREDUCE-5014: custom CopyListing





[jira] [Updated] (MAPREDUCE-5874) Creating MapReduce REST API section

2014-05-13 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated MAPREDUCE-5874:
--

Attachment: MAPREDUCE-5874.3.patch

Thanks for your suggestion, Akira. Removed a link to top page from both docs.

> Creating MapReduce REST API section
> ---
>
> Key: MAPREDUCE-5874
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5874
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.4.0
>Reporter: Ravi Prakash
>Assignee: Tsuyoshi OZAWA
> Attachments: MAPREDUCE-5874.2.patch, MAPREDUCE-5874.3.patch, 
> YARN-1999.1.patch
>
>
> Now that we have the YARN HistoryServer, perhaps we should move 
> HistoryServerRest.apt.vm and MapRedAppMasterRest.apt.vm into the MapReduce 
> section where they really belong?





[jira] [Commented] (MAPREDUCE-5878) some standard JDK APIs are not part of system classes defaults

2014-05-13 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13994090#comment-13994090
 ] 

Sangjin Lee commented on MAPREDUCE-5878:


This could in theory cause issues in areas such as JAXB.

> some standard JDK APIs are not part of system classes defaults
> --
>
> Key: MAPREDUCE-5878
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5878
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.4.0
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>
> There are some standard JDK APIs that are not part of the 
> mapreduce.job.classloader.system.classes property value.
> Currently the default value covers only "java.,javax." from the JDK. However, 
> there are other APIs that are as well-established as these, such as 
> org.w3c.dom and org.xml.sax. In other similar systems (e.g. OSGi), it is a 
> standard practice to include both of these packages in the system classes. We 
> should add these to the default values.
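As a hedged illustration, a user can already widen the property today in mapred-site.xml along these lines; note this shows only the JDK-related portion, and the shipped default value also lists Hadoop's own packages, which must be retained:

```xml
<property>
  <name>mapreduce.job.classloader.system.classes</name>
  <!-- Illustrative override: JDK packages plus the org.w3c.dom and
       org.xml.sax APIs discussed above. -->
  <value>java.,javax.,org.w3c.dom.,org.xml.sax.</value>
</property>
```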





[jira] [Updated] (MAPREDUCE-5867) Possible NPE in KillAMPreemptionPolicy related to ProportionalCapacityPreemptionPolicy

2014-05-13 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated MAPREDUCE-5867:
---

Attachment: MapReduce-5867-updated.patch

Thank you very much [~devaraj.k] for the comments.
I have reworked the patch as per the comments and updated it.
Please review.

> Possible NPE in KillAMPreemptionPolicy related to 
> ProportionalCapacityPreemptionPolicy
> --
>
> Key: MAPREDUCE-5867
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5867
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: MapReduce-5867-updated.patch, MapReduce-5867.2.patch, 
> MapReduce-5867.3.patch, Yarn-1980.1.patch
>
>
> I configured KillAMPreemptionPolicy for My Application Master and tried to 
> check preemption of queues.
> In one scenario I have seen below NPE in my AM
> 2014-04-24 15:11:08,860 ERROR [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN 
> CONTACTING RM. 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.mapreduce.v2.app.rm.preemption.KillAMPreemptionPolicy.preempt(KillAMPreemptionPolicy.java:57)
>   at 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:662)
>   at 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:246)
>   at 
> org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:267)
>   at java.lang.Thread.run(Thread.java:662)
> I was using 2.2.0 and merged MAPREDUCE-5189 to see how AM preemption works.
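The NPE at KillAMPreemptionPolicy.preempt suggests the policy dereferences the RM's preemption message without a null check. The following self-contained illustration shows that failure mode and the defensive guard; it is a sketch of the idea, not the submitted patch, and PreemptionMessage here is a hypothetical stand-in for the YARN type:

```java
public class PreemptGuardDemo {
    // Stand-in for the YARN PreemptionMessage carried in the allocate
    // response; it is null on heartbeats with nothing to preempt.
    static class PreemptionMessage { }

    static int handled = 0;

    public static void preempt(PreemptionMessage msg) {
        if (msg == null) {
            return; // no preemption requested this heartbeat; avoids the NPE
        }
        handled++; // act on the preemption contract
    }

    public static void main(String[] args) {
        preempt(null);                    // previously the NPE path
        preempt(new PreemptionMessage()); // normal path
        System.out.println(handled);      // prints 1
    }
}
```

The guard simply treats a null message as "nothing to preempt on this heartbeat," which matches how the RM omits the message when no contract is pending.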





[jira] [Commented] (MAPREDUCE-5874) Creating MapReduce REST API section

2014-05-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995694#comment-13995694
 ] 

Hadoop QA commented on MAPREDUCE-5874:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12644468/MAPREDUCE-5874.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation patch that doesn't require tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4598//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4598//console

This message is automatically generated.

> Creating MapReduce REST API section
> ---
>
> Key: MAPREDUCE-5874
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5874
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.4.0
>Reporter: Ravi Prakash
>Assignee: Tsuyoshi OZAWA
> Attachments: MAPREDUCE-5874.2.patch, MAPREDUCE-5874.3.patch, 
> YARN-1999.1.patch
>
>
> Now that we have the YARN HistoryServer, perhaps we should move 
> HistoryServerRest.apt.vm and MapRedAppMasterRest.apt.vm into the MapReduce 
> section where they really belong?





[jira] [Updated] (MAPREDUCE-5867) Possible NPE in KillAMPreemptionPolicy related to ProportionalCapacityPreemptionPolicy

2014-05-13 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated MAPREDUCE-5867:
---

Status: Patch Available  (was: Open)

> Possible NPE in KillAMPreemptionPolicy related to 
> ProportionalCapacityPreemptionPolicy
> --
>
> Key: MAPREDUCE-5867
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5867
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: MapReduce-5867-updated.patch, MapReduce-5867.2.patch, 
> MapReduce-5867.3.patch, Yarn-1980.1.patch
>
>
> I configured KillAMPreemptionPolicy for My Application Master and tried to 
> check preemption of queues.
> In one scenario I have seen below NPE in my AM
> 2014-04-24 15:11:08,860 ERROR [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN 
> CONTACTING RM. 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.mapreduce.v2.app.rm.preemption.KillAMPreemptionPolicy.preempt(KillAMPreemptionPolicy.java:57)
>   at 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:662)
>   at 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:246)
>   at 
> org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:267)
>   at java.lang.Thread.run(Thread.java:662)
> I was using 2.2.0 and merged MAPREDUCE-5189 to see how AM preemption works.





[jira] [Created] (MAPREDUCE-5889) Deprecate FileInputFormat.setInputPaths(JobConf, String) and FileInputFormat.addInputPaths(JobConf, String)

2014-05-13 Thread Akira AJISAKA (JIRA)
Akira AJISAKA created MAPREDUCE-5889:


 Summary: Deprecate FileInputFormat.setInputPaths(JobConf, String) 
and FileInputFormat.addInputPaths(JobConf, String)
 Key: MAPREDUCE-5889
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5889
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Akira AJISAKA
Priority: Minor


{{FileInputFormat.setInputPaths(JobConf conf, String commaSeparatedPaths)}} and 
{{FileInputFormat.addInputPaths(JobConf conf, String commaSeparatedPaths)}} 
fail to parse commaSeparatedPaths if a comma is included in the file path. 
(e.g. Path: {{/path/file,with,comma}})
We should deprecate these methods and document to use {{setInputPaths(JobConf 
conf, Path... inputPaths)}} and {{addInputPaths(JobConf conf, Path... 
inputPaths)}} instead.
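Why the String overloads fail is easy to see: they must split their argument on commas, and a comma inside a path is indistinguishable from a separator. A minimal self-contained illustration (not the Hadoop implementation) of the described behavior:

```java
public class CommaPathSplitDemo {
    // Mimics what a String-based setInputPaths/addInputPaths must do:
    // split the comma-separated argument on commas.
    static String[] parseCommaSeparated(String commaSeparatedPaths) {
        return commaSeparatedPaths.split(",");
    }

    public static void main(String[] args) {
        // One intended path silently becomes three bogus ones:
        String[] parts = parseCommaSeparated("/path/file,with,comma");
        System.out.println(parts.length); // prints 3
        // The Path... overloads avoid this entirely, since each Path is
        // already a distinct argument.
    }
}
```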





[jira] [Updated] (MAPREDUCE-5831) Old MR client is not compatible with new MR application

2014-05-13 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-5831:
---

Target Version/s: 2.5.0, 2.4.1

> Old MR client is not compatible with new MR application
> ---
>
> Key: MAPREDUCE-5831
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5831
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client, mr-am
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Zhijie Shen
>Assignee: Tan, Wangda
>Priority: Critical
>
> Recently, we saw the following scenario:
> 1. The user set up a cluster of Hadoop 2.3, which contains YARN 2.3 and MR 
> 2.3.
> 2. The user ran the client on a machine where MR 2.2 is installed and on 
> the classpath.
> Then, when the user submitted a simple wordcount job, he saw the following 
> message:
> {code}
> 16:00:41,027  INFO main mapreduce.Job:1345 -  map 100% reduce 100%
> 16:00:41,036  INFO main mapreduce.Job:1356 - Job job_1396468045458_0006 
> completed successfully
> 16:02:20,535  WARN main mapreduce.JobRunner:212 - Cannot start job 
> [wordcountJob]
> java.lang.IllegalArgumentException: No enum constant 
> org.apache.hadoop.mapreduce.JobCounter.MB_MILLIS_REDUCES
>   at java.lang.Enum.valueOf(Enum.java:236)
>   at 
> org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.valueOf(FrameworkCounterGroup.java:148)
>   at 
> org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.findCounter(FrameworkCounterGroup.java:182)
>   at 
> org.apache.hadoop.mapreduce.counters.AbstractCounters.findCounter(AbstractCounters.java:154)
>   at 
> org.apache.hadoop.mapreduce.TypeConverter.fromYarn(TypeConverter.java:240)
>   at 
> org.apache.hadoop.mapred.ClientServiceDelegate.getJobCounters(ClientServiceDelegate.java:370)
>   at 
> org.apache.hadoop.mapred.YARNRunner.getJobCounters(YARNRunner.java:511)
>   at org.apache.hadoop.mapreduce.Job$7.run(Job.java:756)
>   at org.apache.hadoop.mapreduce.Job$7.run(Job.java:753)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>   at org.apache.hadoop.mapreduce.Job.getCounters(Job.java:753)
>   at org.apache.hadoop.mapreduce.Job.monitorAndPrintJob(Job.java:1361)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1289)
> . . .
> {code}
> The problem is that the wordcount job was running on one or more than one 
> nodes of the YARN cluster, where MR 2.3 libs were installed, and 
> JobCounter.MB_MILLIS_REDUCES is available in the counters. On the other side, 
> due to the classpath setting, the client was likely to run with MR 2.2 libs. 
> After the client retrieved the counters from MR AM, it tried to construct the 
> Counter object with the received counter name. Unfortunately, the enum didn't 
> exist in the client's classpath. Therefore, a "No enum constant" exception 
> is thrown here.
> JobCounter.MB_MILLIS_REDUCES is brought to MR2 via MAPREDUCE-5464 since 
> Hadoop 2.3.
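The root cause generalizes: Enum.valueOf throws IllegalArgumentException whenever the wire carries a constant name that the local enum class predates. A self-contained sketch of the failure mode, where OldJobCounter is a hypothetical stand-in for the 2.2-era JobCounter enum:

```java
public class EnumCompatDemo {
    // Stand-in for a 2.2-era JobCounter that predates MB_MILLIS_REDUCES.
    enum OldJobCounter { SLOTS_MILLIS_MAPS, SLOTS_MILLIS_REDUCES }

    // Returns whether the locally compiled enum knows the given name,
    // instead of letting valueOf() propagate up the call stack.
    public static boolean hasConstant(String name) {
        try {
            OldJobCounter.valueOf(name);
            return true;
        } catch (IllegalArgumentException e) {
            // Surfaces as "No enum constant ..." in the client log above.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasConstant("SLOTS_MILLIS_MAPS")); // prints true
        System.out.println(hasConstant("MB_MILLIS_REDUCES")); // prints false
    }
}
```

A compatibility-minded client would catch the exception and skip or bucket unknown counters rather than fail the whole job query.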





[jira] [Commented] (MAPREDUCE-5081) Backport DistCpV2 and the related JIRAs to branch-1

2014-05-13 Thread Tianying Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995579#comment-13995579
 ] 

Tianying Chang commented on MAPREDUCE-5081:
---

[~szetszwo] is this DistCpV2 usable if I am running HDFS 2 with MapReduce 1? It 
seems there is a breaking change in this API:  
org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(Lorg/apache/hadoop/mapreduce/Cluster;Lorg/apache/hadoop/conf/Configuration;)
BTW, we are using Hadoop 2.0.0-cdh4.2.0 



> Backport DistCpV2 and the related JIRAs to branch-1
> ---
>
> Key: MAPREDUCE-5081
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5081
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: distcp
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Fix For: 1.2.0
>
> Attachments: DistCp.java.diff, m5081_20130328.patch, 
> m5081_20130328b.patch, m5981_20130321.patch, m5981_20130321b.patch, 
> m5981_20130323.patch
>
>
> Here is a list of DistCpV2 JIRAs:
> - MAPREDUCE-2765: DistCpV2 main jira
> - HADOOP-8703: turn CRC checking off for 0 byte size 
> - HDFS-3054: distcp -skipcrccheck has no effect.
> - HADOOP-8431: Running distcp without args throws IllegalArgumentException
> - HADOOP-8775: non-positive value to -bandwidth
> - MAPREDUCE-4654: TestDistCp is ignored
> - HADOOP-9022: distcp fails to copy file if -m 0 specified
> - HADOOP-9025: TestCopyListing failing
> - MAPREDUCE-5075: DistCp leaks input file handles
> - distcp part of HADOOP-8341: Fix findbugs issues in hadoop-tools
> - MAPREDUCE-5014: custom CopyListing





[jira] [Commented] (MAPREDUCE-5886) Allow wordcount example job to accept multiple input paths.

2014-05-13 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996154#comment-13996154
 ] 

Akira AJISAKA commented on MAPREDUCE-5886:
--

Thanks [~jira.shegalov] for the comment. Filed MAPREDUCE-5889 to deprecate 
{{FIF.addInputPaths(Job job, String commaSeparatedPaths)}}.

> Allow wordcount example job to accept multiple input paths.
> ---
>
> Key: MAPREDUCE-5886
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5886
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: examples
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
>Priority: Minor
> Attachments: MAPREDUCE-5886.1.patch, MAPREDUCE-5886.2.patch
>
>
> It would be convenient if the wordcount example MapReduce job could accept 
> multiple input paths and run the word count on all of them.
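One common convention for such example jobs is to treat every argument but the last as an input path and the last as the output directory. A hedged, self-contained sketch of that argument handling (illustrative only, not the attached patch):

```java
import java.util.Arrays;
import java.util.List;

public class MultiInputArgsDemo {
    // All arguments except the last are input paths; each would be passed
    // to FileInputFormat.addInputPath in the real job driver.
    static List<String> inputPaths(String[] args) {
        return Arrays.asList(args).subList(0, args.length - 1);
    }

    // The final argument is the single output directory.
    static String outputPath(String[] args) {
        return args[args.length - 1];
    }

    public static void main(String[] args) {
        String[] demo = {"/in1", "/in2", "/in3", "/out"};
        System.out.println(inputPaths(demo)); // prints [/in1, /in2, /in3]
        System.out.println(outputPath(demo)); // prints /out
    }
}
```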


