[jira] [Commented] (MAPREDUCE-5143) TestLineRecordReader was no test case for compressed files
[ https://issues.apache.org/jira/browse/MAPREDUCE-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13717986#comment-13717986 ]

Tsuyoshi OZAWA commented on MAPREDUCE-5143:
---

[~gelesh], I was able to create the review request on Review Board: https://reviews.apache.org/r/12892/ Thanks for your help :-) The reason my earlier attempt to attach the diff failed was its format: I needed to run the git diff command with the "--full-index" option, but I didn't.

> TestLineRecordReader was no test case for compressed files
> --
>
> Key: MAPREDUCE-5143
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5143
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 3.0.0, trunk, 2.1.0-beta
> Reporter: Sonu Prathap
> Assignee: Tsuyoshi OZAWA
> Priority: Minor
> Attachments: MAPREDUCE-5143.1.patch, MAPREDUCE-5143.2.patch
>
> TestLineRecordReader was no test case for compressed files

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
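The effect of the flag the comment mentions can be seen in a throwaway repository; the directory, file name, and commit below are purely illustrative, not taken from the actual patch:

```shell
# Build a disposable git repository (all names here are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
echo "v1" > TestLineRecordReader.java
git add TestLineRecordReader.java
git -c user.name=demo -c user.email=demo@example.com commit -qm "base"
echo "v2" > TestLineRecordReader.java

# A plain "git diff" abbreviates the blob hashes on its "index" line;
# "--full-index" emits the full 40-character hashes, which Review Board
# relies on to resolve the parent blobs when rendering the diff.
git diff --full-index > full.patch
grep '^index' full.patch
```

With the flag, the index line carries two full 40-hex-digit hashes (`index <sha1>..<sha1> 100644`) instead of the abbreviated form a plain `git diff` produces.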
[jira] [Resolved] (MAPREDUCE-5412) Change MR to use multiple containers API of ContainerManager after YARN-926
[ https://issues.apache.org/jira/browse/MAPREDUCE-5412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli resolved MAPREDUCE-5412.
---

Resolution: Fixed
Fix Version/s: 2.1.0-beta
Hadoop Flags: Reviewed

Committed this together with YARN-926. Closing.

> Change MR to use multiple containers API of ContainerManager after YARN-926
> --
>
> Key: MAPREDUCE-5412
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5412
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Jian He
> Assignee: Jian He
> Fix For: 2.1.0-beta
>
> Attachments: MAPREDUCE-5412.txt
[jira] [Updated] (MAPREDUCE-5412) Change MR to use multiple containers API of ContainerManager after YARN-926
[ https://issues.apache.org/jira/browse/MAPREDUCE-5412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated MAPREDUCE-5412:
---

Attachment: MAPREDUCE-5412.txt

Attaching the MR part of YARN-926 on Jian's behalf. Reviewing and committing this as part of YARN-926.

> Change MR to use multiple containers API of ContainerManager after YARN-926
> --
>
> Key: MAPREDUCE-5412
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5412
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Jian He
> Assignee: Jian He
>
> Attachments: MAPREDUCE-5412.txt
[jira] [Commented] (MAPREDUCE-5372) ControlledJob#getMapredJobID capitalization is inconsistent between MR1 and MR2
[ https://issues.apache.org/jira/browse/MAPREDUCE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13717863#comment-13717863 ]

Akira AJISAKA commented on MAPREDUCE-5372:
---

[~zjshen], thanks for the review. I don't want to make an incompatible change again, so I think this issue should be resolved as "Won't Fix". [~sandyr], what do you think?

> ControlledJob#getMapredJobID capitalization is inconsistent between MR1 and MR2
> --
>
> Key: MAPREDUCE-5372
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5372
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 2.1.0-beta
> Reporter: Sandy Ryza
> Assignee: Akira AJISAKA
> Labels: newbie
> Attachments: MAPREDUCE-5372-1.patch, MAPREDUCE-5372-2.patch, MAPREDUCE-5372-3.patch, MAPREDUCE-5372-4.patch
>
> In MR2, the 'd' in Id is lowercase, but in MR1, it is capitalized. While ControlledJob is marked as Evolving, there is no reason to be inconsistent here.
[jira] [Updated] (MAPREDUCE-5413) Killing mapred job in Initial stage leads to java.lang.NullPointerException
[ https://issues.apache.org/jira/browse/MAPREDUCE-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yeshavora updated MAPREDUCE-5413:
---

Attachment: syslog_MapredKillTask(1).docx

> Killing mapred job in Initial stage leads to java.lang.NullPointerException
> --
>
> Key: MAPREDUCE-5413
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5413
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: yeshavora
> Assignee: Omkar Vinit Joshi
> Attachments: syslog_MapredKillTask(1).docx
>
> Run a MR job and kill it as soon as jobId is known. After killing the job, try to kill its attempts.
> Steps:
> 1) Start a map reduce job:
> hadoop jar hadoop-mapreduce-client-jobclient-tests.jar sleep -m 10 -r 10 -mt 5 -rt 5
> 2) Kill the job:
> hadoop job -kill
> 3) Try to kill the attempts for the above job.
> Result:
> mapred job was not able to shut down properly. Attaching syslog. It does not kill its attempts also.
> Above steps leads to below exception,
> org.apache.hadoop.util.ShutdownHookManager: ShutdownHook 'MRAppMasterShutdownHook' failed, java.lang.NullPointerException
> java.lang.NullPointerException
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.setSignalled(MRAppMaster.java:811)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1249)
> at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
[jira] [Updated] (MAPREDUCE-5413) Killing mapred job in Initial stage leads to java.lang.NullPointerException
[ https://issues.apache.org/jira/browse/MAPREDUCE-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Omkar Vinit Joshi updated MAPREDUCE-5413:
---

Attachment: (was: syslog_MapredKillTask.docx)

> Killing mapred job in Initial stage leads to java.lang.NullPointerException
> --
>
> Key: MAPREDUCE-5413
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5413
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: yeshavora
> Assignee: Omkar Vinit Joshi
>
> Run a MR job and kill it as soon as jobId is known. After killing the job, try to kill its attempts.
> Steps:
> 1) Start a map reduce job:
> hadoop jar hadoop-mapreduce-client-jobclient-tests.jar sleep -m 10 -r 10 -mt 5 -rt 5
> 2) Kill the job:
> hadoop job -kill
> 3) Try to kill the attempts for the above job.
> Result:
> mapred job was not able to shut down properly. Attaching syslog. It does not kill its attempts also.
> Above steps leads to below exception,
> org.apache.hadoop.util.ShutdownHookManager: ShutdownHook 'MRAppMasterShutdownHook' failed, java.lang.NullPointerException
> java.lang.NullPointerException
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.setSignalled(MRAppMaster.java:811)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1249)
> at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
[jira] [Updated] (MAPREDUCE-5413) Killing mapred job in Initial stage leads to java.lang.NullPointerException
[ https://issues.apache.org/jira/browse/MAPREDUCE-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yeshavora updated MAPREDUCE-5413:
---

Description:
Run a MR job and kill it as soon as jobId is known. After killing the job, try to kill its attempts.
Steps:
1) Start a map reduce job:
hadoop jar hadoop-mapreduce-client-jobclient-tests.jar sleep -m 10 -r 10 -mt 5 -rt 5
2) Kill the job:
hadoop job -kill
3) Try to kill the attempts for the above job.
Result:
mapred job was not able to shut down properly. Attaching syslog. It does not kill its attempts also.
Above steps leads to below exception,
org.apache.hadoop.util.ShutdownHookManager: ShutdownHook 'MRAppMasterShutdownHook' failed, java.lang.NullPointerException
java.lang.NullPointerException
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.setSignalled(MRAppMaster.java:811)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1249)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)

was:
Run a MR job and kill it as soon as jobId is known. After killing the job, try to kill its attempts.
Steps:
1) Start a map reduce job:
/usr/lib/hadoop/bin/hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-2.0.3.22-alpha-tests.jar sleep -m 10 -r 10 -mt 5 -rt 5
2) Kill the job:
/usr/bin/mapred -kill
3) Try to kill the attempts for the above job.
Result:
mapred job was not able to shut down properly. Attaching syslog. It does not kill its attempts also.
Above steps leads to below exception,
org.apache.hadoop.util.ShutdownHookManager: ShutdownHook 'MRAppMasterShutdownHook' failed, java.lang.NullPointerException
java.lang.NullPointerException
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.setSignalled(MRAppMaster.java:811)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1249)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)

> Killing mapred job in Initial stage leads to java.lang.NullPointerException
> --
>
> Key: MAPREDUCE-5413
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5413
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: yeshavora
> Assignee: Omkar Vinit Joshi
> Attachments: syslog_MapredKillTask.docx
>
> Run a MR job and kill it as soon as jobId is known. After killing the job, try to kill its attempts.
> Steps:
> 1) Start a map reduce job:
> hadoop jar hadoop-mapreduce-client-jobclient-tests.jar sleep -m 10 -r 10 -mt 5 -rt 5
> 2) Kill the job:
> hadoop job -kill
> 3) Try to kill the attempts for the above job.
> Result:
> mapred job was not able to shut down properly. Attaching syslog. It does not kill its attempts also.
> Above steps leads to below exception,
> org.apache.hadoop.util.ShutdownHookManager: ShutdownHook 'MRAppMasterShutdownHook' failed, java.lang.NullPointerException
> java.lang.NullPointerException
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.setSignalled(MRAppMaster.java:811)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1249)
> at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
[jira] [Updated] (MAPREDUCE-5413) Killing mapred job in Initial stage leads to java.lang.NullPointerException
[ https://issues.apache.org/jira/browse/MAPREDUCE-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Omkar Vinit Joshi updated MAPREDUCE-5413:
---

Attachment: syslog_MapredKillTask.docx

> Killing mapred job in Initial stage leads to java.lang.NullPointerException
> --
>
> Key: MAPREDUCE-5413
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5413
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: yeshavora
> Assignee: Omkar Vinit Joshi
> Attachments: syslog_MapredKillTask.docx
>
> Run a MR job and kill it as soon as jobId is known. After killing the job, try to kill its attempts.
> Steps:
> 1) Start a map reduce job:
> /usr/lib/hadoop/bin/hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-2.0.3.22-alpha-tests.jar sleep -m 10 -r 10 -mt 5 -rt 5
> 2) Kill the job:
> /usr/bin/mapred -kill
> 3) Try to kill the attempts for the above job.
> Result:
> mapred job was not able to shut down properly. Attaching syslog. It does not kill its attempts also.
> Above steps leads to below exception,
> org.apache.hadoop.util.ShutdownHookManager: ShutdownHook 'MRAppMasterShutdownHook' failed, java.lang.NullPointerException
> java.lang.NullPointerException
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.setSignalled(MRAppMaster.java:811)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1249)
> at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
[jira] [Updated] (MAPREDUCE-5413) Killing mapred job in Initial stage leads to java.lang.NullPointerException
[ https://issues.apache.org/jira/browse/MAPREDUCE-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Omkar Vinit Joshi updated MAPREDUCE-5413:
---

Description:
Run a MR job and kill it as soon as jobId is known. After killing the job, try to kill its attempts.
Steps:
1) Start a map reduce job:
/usr/lib/hadoop/bin/hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-2.0.3.22-alpha-tests.jar sleep -m 10 -r 10 -mt 5 -rt 5
2) Kill the job:
/usr/bin/mapred -kill
3) Try to kill the attempts for the above job.
Result:
mapred job was not able to shut down properly. Attaching syslog. It does not kill its attempts also.
Above steps leads to below exception,
org.apache.hadoop.util.ShutdownHookManager: ShutdownHook 'MRAppMasterShutdownHook' failed, java.lang.NullPointerException
java.lang.NullPointerException
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.setSignalled(MRAppMaster.java:811)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1249)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)

was:
Run a MR job and kill it as soon as jobId is known. After killing the job, try to kill its attempts.
Above steps leads to below exception,
org.apache.hadoop.util.ShutdownHookManager: ShutdownHook 'MRAppMasterShutdownHook' failed, java.lang.NullPointerException
java.lang.NullPointerException
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.setSignalled(MRAppMaster.java:811)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1249)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)

> Killing mapred job in Initial stage leads to java.lang.NullPointerException
> --
>
> Key: MAPREDUCE-5413
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5413
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: yeshavora
> Assignee: Omkar Vinit Joshi
>
> Run a MR job and kill it as soon as jobId is known. After killing the job, try to kill its attempts.
> Steps:
> 1) Start a map reduce job:
> /usr/lib/hadoop/bin/hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-2.0.3.22-alpha-tests.jar sleep -m 10 -r 10 -mt 5 -rt 5
> 2) Kill the job:
> /usr/bin/mapred -kill
> 3) Try to kill the attempts for the above job.
> Result:
> mapred job was not able to shut down properly. Attaching syslog. It does not kill its attempts also.
> Above steps leads to below exception,
> org.apache.hadoop.util.ShutdownHookManager: ShutdownHook 'MRAppMasterShutdownHook' failed, java.lang.NullPointerException
> java.lang.NullPointerException
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.setSignalled(MRAppMaster.java:811)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1249)
> at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
[jira] [Updated] (MAPREDUCE-5413) Killing mapred job in Initial stage leads to java.lang.NullPointerException
[ https://issues.apache.org/jira/browse/MAPREDUCE-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yeshavora updated MAPREDUCE-5413:
---

Target Version/s: 2.1.0-beta

> Killing mapred job in Initial stage leads to java.lang.NullPointerException
> --
>
> Key: MAPREDUCE-5413
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5413
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: yeshavora
> Assignee: Omkar Vinit Joshi
>
> Run a MR job and kill it as soon as jobId is known. After killing the job, try to kill its attempts.
> Above steps leads to below exception,
> org.apache.hadoop.util.ShutdownHookManager: ShutdownHook 'MRAppMasterShutdownHook' failed, java.lang.NullPointerException
> java.lang.NullPointerException
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.setSignalled(MRAppMaster.java:811)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1249)
> at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
[jira] [Updated] (MAPREDUCE-5413) Killing mapred job in Initial stage leads to java.lang.NullPointerException
[ https://issues.apache.org/jira/browse/MAPREDUCE-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yeshavora updated MAPREDUCE-5413:
---

Assignee: Omkar Vinit Joshi

> Killing mapred job in Initial stage leads to java.lang.NullPointerException
> --
>
> Key: MAPREDUCE-5413
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5413
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: yeshavora
> Assignee: Omkar Vinit Joshi
>
> Run a MR job and kill it as soon as jobId is known. After killing the job, try to kill its attempts.
> Above steps leads to below exception,
> org.apache.hadoop.util.ShutdownHookManager: ShutdownHook 'MRAppMasterShutdownHook' failed, java.lang.NullPointerException
> java.lang.NullPointerException
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.setSignalled(MRAppMaster.java:811)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1249)
> at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
[jira] [Created] (MAPREDUCE-5413) Killing mapred job in Initial stage leads to java.lang.NullPointerException
yeshavora created MAPREDUCE-5413:
---

Summary: Killing mapred job in Initial stage leads to java.lang.NullPointerException
Key: MAPREDUCE-5413
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5413
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: yeshavora

Run a MR job and kill it as soon as the jobId is known. After killing the job, try to kill its attempts.

The above steps lead to the exception below:
org.apache.hadoop.util.ShutdownHookManager: ShutdownHook 'MRAppMasterShutdownHook' failed, java.lang.NullPointerException
java.lang.NullPointerException
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.setSignalled(MRAppMaster.java:811)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1249)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
[jira] [Commented] (MAPREDUCE-5367) Local jobs all use same local working directory
[ https://issues.apache.org/jira/browse/MAPREDUCE-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13717751#comment-13717751 ]

Hadoop QA commented on MAPREDUCE-5367:
---

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12593094/MAPREDUCE-5367-b1.patch
against trunk revision .

{color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3887//console

This message is automatically generated.

> Local jobs all use same local working directory
> --
>
> Key: MAPREDUCE-5367
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5367
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Affects Versions: 1.2.0
> Reporter: Sandy Ryza
> Assignee: Sandy Ryza
> Attachments: MAPREDUCE-5367-b1.patch
>
> This means that local jobs, even in different JVMs, can't run concurrently because they might delete each other's files during work directory setup.
[jira] [Updated] (MAPREDUCE-5367) Local jobs all use same local working directory
[ https://issues.apache.org/jira/browse/MAPREDUCE-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandy Ryza updated MAPREDUCE-5367:
---

Status: Patch Available (was: Open)

> Local jobs all use same local working directory
> --
>
> Key: MAPREDUCE-5367
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5367
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Affects Versions: 1.2.0
> Reporter: Sandy Ryza
> Assignee: Sandy Ryza
> Attachments: MAPREDUCE-5367-b1.patch
>
> This means that local jobs, even in different JVMs, can't run concurrently because they might delete each other's files during work directory setup.
[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service
[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13717725#comment-13717725 ]

Alejandro Abdelnur commented on MAPREDUCE-4049:
---

[~avnerb], sorry for the delay. I've just tried applying the patch to branch-1 and there is one hunk failing; it looks like a trivial rebase. Mind rebasing the patch against the current HEAD of branch-1? Once you do that, it can go in.

> plugin for generic shuffle service
> --
>
> Key: MAPREDUCE-4049
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Components: performance, task, tasktracker
> Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0
> Reporter: Avner BenHanoch
> Assignee: Avner BenHanoch
> Labels: merge, plugin, rdma, shuffle
> Fix For: 2.0.3-alpha
>
> Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf, MAPREDUCE-4049--branch-1.patch, MAPREDUCE-4049--branch-1.patch, MAPREDUCE-4049--branch-1.patch, mapreduce-4049.patch
>
> Support a generic shuffle service as a set of two plugins: ShuffleProvider & ShuffleConsumer.
> This will satisfy the following needs:
> # Better shuffle and merge performance. For example: we are working on a shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, or Infiniband) instead of using the current HTTP shuffle. Based on the fast RDMA shuffle, the plugin can also utilize a suitable merge approach during the intermediate merges, hence getting much better performance.
> # Satisfy MAPREDUCE-3060 - a generic shuffle service for avoiding the hidden dependency of the NodeManager on a specific version of the mapreduce shuffle (currently targeted to 0.24.0).
> References:
> # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu from Auburn University with others, [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
> # I am attaching 2 documents with a suggested top-level design for both plugins (currently based on the 1.0 branch)
> # I am providing a link for downloading UDA - Mellanox's open source plugin that implements the generic shuffle service using RDMA and levitated merge. Note: at this phase, the code is in C++ through JNI and you should consider it beta only. Still, it can serve anyone who wants to implement or contribute to levitated merge. (Please be advised that levitated merge is mostly suited to very fast networks) - [http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=144&menu_section=69]
[jira] [Commented] (MAPREDUCE-5251) Reducer should not implicate map attempt if it has insufficient space to fetch map output
[ https://issues.apache.org/jira/browse/MAPREDUCE-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13717714#comment-13717714 ]

Hadoop QA commented on MAPREDUCE-5251:
---

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12593790/MAPREDUCE-5251-6.txt
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3886//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3886//console

This message is automatically generated.

> Reducer should not implicate map attempt if it has insufficient space to fetch map output
> --
>
> Key: MAPREDUCE-5251
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5251
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Affects Versions: 0.23.7, 2.0.4-alpha
> Reporter: Jason Lowe
> Assignee: Ashwin Shankar
> Attachments: MAPREDUCE-5251-2.txt, MAPREDUCE-5251-3.txt, MAPREDUCE-5251-4.txt, MAPREDUCE-5251-5.txt, MAPREDUCE-5251-6.txt
>
> A job can fail if a reducer happens to run on a node with insufficient space to hold a map attempt's output. The reducer keeps reporting the map attempt as bad, and if the map attempt ends up being re-launched too many times before the reducer decides maybe it is the real problem the job can fail.
> In that scenario it would be better to re-launch the reduce attempt and hopefully it will run on another node that has sufficient space to complete the shuffle. Reporting the map attempt is bad and relaunching the map task doesn't change the fact that the reducer can't hold the output.
[jira] [Commented] (MAPREDUCE-5403) yarn.application.classpath requires client to know service internals
[ https://issues.apache.org/jira/browse/MAPREDUCE-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13717687#comment-13717687 ]

Sandy Ryza commented on MAPREDUCE-5403:
---

Test failures are related to YARN-960.

> yarn.application.classpath requires client to know service internals
> --
>
> Key: MAPREDUCE-5403
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5403
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: client
> Affects Versions: 2.0.5-alpha
> Reporter: Sandy Ryza
> Assignee: Sandy Ryza
> Attachments: MAPREDUCE-5403.patch
>
> yarn.application.classpath is a confusing property because it is used by MapReduce and not YARN, and MapReduce already has mapreduce.application.classpath, which provides the same functionality.
[jira] [Commented] (MAPREDUCE-5403) yarn.application.classpath requires client to know service internals
[ https://issues.apache.org/jira/browse/MAPREDUCE-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13717685#comment-13717685 ]

Hadoop QA commented on MAPREDUCE-5403:
---

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12593775/MAPREDUCE-5403.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
org.apache.hadoop.mapreduce.security.TestBinaryTokenFile
org.apache.hadoop.mapreduce.security.TestMRCredentials
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3884//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3884//console

This message is automatically generated.

> yarn.application.classpath requires client to know service internals
> --
>
> Key: MAPREDUCE-5403
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5403
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: client
> Affects Versions: 2.0.5-alpha
> Reporter: Sandy Ryza
> Assignee: Sandy Ryza
> Attachments: MAPREDUCE-5403.patch
>
> yarn.application.classpath is a confusing property because it is used by MapReduce and not YARN, and MapReduce already has mapreduce.application.classpath, which provides the same functionality.
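As the MAPREDUCE-5403 description notes, MapReduce already has mapreduce.application.classpath. A hedged sketch of how a deployment might set it in mapred-site.xml; the property name is the real one, but the value shown only mirrors the kind of default shipped in Hadoop 2.x and can differ per release and installation:

```xml
<!-- Illustrative mapred-site.xml fragment (value is an assumption,
     modeled on the Hadoop 2.x mapred-default.xml default). -->
<property>
  <name>mapreduce.application.classpath</name>
  <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
</property>
```

Keeping the classpath under a mapreduce.* property means the MapReduce client does not need to read YARN's yarn.application.classpath, which is the service-internals coupling the issue complains about.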
[jira] [Updated] (MAPREDUCE-5251) Reducer should not implicate map attempt if it has insufficient space to fetch map output
[ https://issues.apache.org/jira/browse/MAPREDUCE-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashwin Shankar updated MAPREDUCE-5251: -- Attachment: MAPREDUCE-5251-6.txt Makes sense, both comments are addressed in the latest patch. > Reducer should not implicate map attempt if it has insufficient space to > fetch map output > - > > Key: MAPREDUCE-5251 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5251 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 0.23.7, 2.0.4-alpha >Reporter: Jason Lowe >Assignee: Ashwin Shankar > Attachments: MAPREDUCE-5251-2.txt, MAPREDUCE-5251-3.txt, > MAPREDUCE-5251-4.txt, MAPREDUCE-5251-5.txt, MAPREDUCE-5251-6.txt > > > A job can fail if a reducer happens to run on a node with insufficient space > to hold a map attempt's output. The reducer keeps reporting the map attempt > as bad, and if the map attempt ends up being re-launched too many times > before the reducer decides maybe it is the real problem the job can fail. > In that scenario it would be better to re-launch the reduce attempt and > hopefully it will run on another node that has sufficient space to complete > the shuffle. Reporting the map attempt is bad and relaunching the map task > doesn't change the fact that the reducer can't hold the output. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-1981) Improve getSplits performance by using listFiles, the new FileSystem API
[ https://issues.apache.org/jira/browse/MAPREDUCE-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13717615#comment-13717615 ] Hadoop QA commented on MAPREDUCE-1981: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12593773/MAPREDUCE-1981.branch-0.23.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3885//console This message is automatically generated. > Improve getSplits performance by using listFiles, the new FileSystem API > > > Key: MAPREDUCE-1981 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1981 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: job submission >Affects Versions: 0.23.0 >Reporter: Hairong Kuang >Assignee: Hairong Kuang > Attachments: mapredListFiles1.patch, mapredListFiles2.patch, > mapredListFiles3.patch, mapredListFiles4.patch, mapredListFiles5.patch, > mapredListFiles.patch, MAPREDUCE-1981.branch-0.23.patch, MAPREDUCE-1981.patch > > > This jira will make FileInputFormat and CombinedFileInputForm to use the new > API, thus reducing the number of RPCs to HDFS NameNode. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5403) yarn.application.classpath requires client to know service internals
[ https://issues.apache.org/jira/browse/MAPREDUCE-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated MAPREDUCE-5403: -- Summary: yarn.application.classpath requires client to know service internals (was: Get rid of yarn.application.classpath) > yarn.application.classpath requires client to know service internals > > > Key: MAPREDUCE-5403 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5403 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: client >Affects Versions: 2.0.5-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Attachments: MAPREDUCE-5403.patch > > > yarn.application.classpath is a confusing property because it is used by > MapReduce and not YARN, and MapReduce already has > mapreduce.application.classpath, which provides the same functionality. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5403) Get rid of yarn.application.classpath
[ https://issues.apache.org/jira/browse/MAPREDUCE-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated MAPREDUCE-5403: -- Status: Patch Available (was: Open) > Get rid of yarn.application.classpath > - > > Key: MAPREDUCE-5403 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5403 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: client >Affects Versions: 2.0.5-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Attachments: MAPREDUCE-5403.patch > > > yarn.application.classpath is a confusing property because it is used by > MapReduce and not YARN, and MapReduce already has > mapreduce.application.classpath, which provides the same functionality. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5403) Get rid of yarn.application.classpath
[ https://issues.apache.org/jira/browse/MAPREDUCE-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated MAPREDUCE-5403: -- Attachment: MAPREDUCE-5403.patch > Get rid of yarn.application.classpath > - > > Key: MAPREDUCE-5403 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5403 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: client >Affects Versions: 2.0.5-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Attachments: MAPREDUCE-5403.patch > > > yarn.application.classpath is a confusing property because it is used by > MapReduce and not YARN, and MapReduce already has > mapreduce.application.classpath, which provides the same functionality. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5403) Get rid of yarn.application.classpath
[ https://issues.apache.org/jira/browse/MAPREDUCE-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13717603#comment-13717603 ] Sandy Ryza commented on MAPREDUCE-5403: --- bq. I wanted to point out there are other interesting tidbits of information in yarn-site.xml besides the classpath that clients may want to access, and I'm wondering what criteria qualifies a client-consumed property to graduate to an environment variable or some other mechanism for determining the value besides parsing yarn-site.xml. My saying that YARN configs should not be client configs was imprecise, and I totally agree that the reality has more shades of gray. Regarding the criteria, my general ideal would be to only keep client configs that relate to how to locate and communicate with the YARN services. Though that might need to be amended to cover some requirements I'm not thinking of? Uploading a patch that makes yarn.application.classpath a server-side config. Whatever is placed into a NodeManager's yarn.application.classpath gets placed in the container environment as $YARN_APPLICATION_CLASSPATH. > Get rid of yarn.application.classpath > - > > Key: MAPREDUCE-5403 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5403 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: client >Affects Versions: 2.0.5-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > > yarn.application.classpath is a confusing property because it is used by > MapReduce and not YARN, and MapReduce already has > mapreduce.application.classpath, which provides the same functionality. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
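The mechanism Sandy describes can be sketched as follows. This is an illustrative sketch only, not code from the patch: the helper name buildContainerEnv and its surrounding setup are hypothetical, while the YARN_APPLICATION_CLASSPATH variable name comes from the comment above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ContainerEnvSketch {
    // Hypothetical helper mirroring the described behavior: whatever the
    // NodeManager has configured locally is injected into each container's
    // environment, so clients never need to know the server's classpath.
    static Map<String, String> buildContainerEnv(String nmConfiguredClasspath) {
        Map<String, String> env = new LinkedHashMap<>();
        env.put("YARN_APPLICATION_CLASSPATH", nmConfiguredClasspath);
        return env;
    }

    public static void main(String[] args) {
        Map<String, String> env =
            buildContainerEnv("$HADOOP_CONF_DIR:$HADOOP_COMMON_HOME/share/hadoop/common/*");
        System.out.println(env.get("YARN_APPLICATION_CLASSPATH"));
    }
}
```

The design choice is that the classpath becomes a server-side detail: the client reads an environment variable inside the container instead of parsing yarn-site.xml.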
[jira] [Updated] (MAPREDUCE-1981) Improve getSplits performance by using listFiles, the new FileSystem API
[ https://issues.apache.org/jira/browse/MAPREDUCE-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-1981: -- Attachment: MAPREDUCE-1981.branch-0.23.patch Thanks for the review, Kihwal. Here's the equivalent patch for branch-0.23. > Improve getSplits performance by using listFiles, the new FileSystem API > > > Key: MAPREDUCE-1981 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1981 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: job submission >Affects Versions: 0.23.0 >Reporter: Hairong Kuang >Assignee: Hairong Kuang > Attachments: mapredListFiles1.patch, mapredListFiles2.patch, > mapredListFiles3.patch, mapredListFiles4.patch, mapredListFiles5.patch, > mapredListFiles.patch, MAPREDUCE-1981.branch-0.23.patch, MAPREDUCE-1981.patch > > > This jira will make FileInputFormat and CombinedFileInputForm to use the new > API, thus reducing the number of RPCs to HDFS NameNode. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4959) bundling classpath into jar manifest on Windows does not expand environment variables or wildcards
[ https://issues.apache.org/jira/browse/MAPREDUCE-4959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated MAPREDUCE-4959: - Target Version/s: 1-win, 1.3.0 (was: 1-win) Expanding scope of target versions to include next unreleased 1.x version, 1.3.0. HBase and others would benefit from having this code in branch-1, as per discussion on YARN-358. [~ndimiduk], you mentioned needing this in 1.1.x. Would it be acceptable for HBase to upgrade to 1.3.0 to pick up this change, or would it really need to be a new 1.1.x version? > bundling classpath into jar manifest on Windows does not expand environment > variables or wildcards > -- > > Key: MAPREDUCE-4959 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4959 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: tasktracker >Affects Versions: 1-win >Reporter: Chris Nauroth > > To support long classpaths on Windows, the class path entries get bundled > into a small temporary jar with a manifest that has a Class-Path attribute. > When a classpath is specified in a jar manifest like this, it does not expand > environment variables (i.e. %HADOOP_COMMON_HOME%), and it does not expand > wildcards (i.e. lib/*.jar). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
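The limitation described in this issue is a property of how the JVM reads the Class-Path manifest attribute: entries are taken as literal relative URLs, with no environment-variable or wildcard expansion. A minimal sketch (the class name and sample entries are made up for illustration; only the Class-Path attribute semantics are from the issue):

```java
import java.util.jar.Attributes;
import java.util.jar.Manifest;

public class ManifestClasspathSketch {
    // Builds a manifest the way a long-classpath workaround jar would, with
    // the entries placed verbatim into the Class-Path attribute.
    static String classPathAttribute(String entries) {
        Manifest mf = new Manifest();
        Attributes attrs = mf.getMainAttributes();
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        attrs.put(Attributes.Name.CLASS_PATH, entries);
        // The JVM later reads this value back as literal relative URLs:
        // %HADOOP_COMMON_HOME% and lib/*.jar are NOT expanded.
        return attrs.getValue(Attributes.Name.CLASS_PATH);
    }

    public static void main(String[] args) {
        System.out.println(classPathAttribute("%HADOOP_COMMON_HOME%/lib/a.jar lib/*.jar"));
    }
}
```

This is why the caller must expand environment variables and wildcards itself before writing the manifest.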
[jira] [Created] (MAPREDUCE-5412) Change MR to use multiple containers API of ContainerManager after YARN-926
Jian He created MAPREDUCE-5412: -- Summary: Change MR to use multiple containers API of ContainerManager after YARN-926 Key: MAPREDUCE-5412 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5412 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Jian He Assignee: Jian He -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5251) Reducer should not implicate map attempt if it has insufficient space to fetch map output
[ https://issues.apache.org/jira/browse/MAPREDUCE-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13717531#comment-13717531 ] Jason Lowe commented on MAPREDUCE-5251: --- I see reportLocalError is now throwing UnknownHostException. Unfortunately since that is an IOException, if it ever does do that it will end up being caught by the outer try-catch block in copyMapOutput and a map attempt will be blamed for it. Also now that I think of it, we arguably should be incrementing the ioErrs counter before calling reportLocalError since this is an I/O error during the shuffle that prevented a successful map output transfer. > Reducer should not implicate map attempt if it has insufficient space to > fetch map output > - > > Key: MAPREDUCE-5251 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5251 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 0.23.7, 2.0.4-alpha >Reporter: Jason Lowe >Assignee: Ashwin Shankar > Attachments: MAPREDUCE-5251-2.txt, MAPREDUCE-5251-3.txt, > MAPREDUCE-5251-4.txt, MAPREDUCE-5251-5.txt > > > A job can fail if a reducer happens to run on a node with insufficient space > to hold a map attempt's output. The reducer keeps reporting the map attempt > as bad, and if the map attempt ends up being re-launched too many times > before the reducer decides maybe it is the real problem the job can fail. > In that scenario it would be better to re-launch the reduce attempt and > hopefully it will run on another node that has sufficient space to complete > the shuffle. Reporting the map attempt is bad and relaunching the map task > doesn't change the fact that the reducer can't hold the output. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
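The pitfall Jason points out follows from the Java exception hierarchy: java.net.UnknownHostException extends java.io.IOException, so an outer catch (IOException) absorbs it. A minimal stand-alone sketch (the method names mirror the reportLocalError and copyMapOutput methods discussed above, but the bodies are simplified and hypothetical):

```java
import java.io.IOException;
import java.net.UnknownHostException;

public class CatchOrderSketch {
    // Hypothetical stand-in for reportLocalError, which can throw
    // UnknownHostException while resolving the local hostname.
    static void reportLocalError() throws UnknownHostException {
        throw new UnknownHostException("local hostname lookup failed");
    }

    static String copyMapOutput() {
        try {
            reportLocalError();
            return "ok";
        } catch (IOException ioe) {
            // UnknownHostException IS-A IOException, so it lands here and a
            // purely local failure ends up blamed on the map attempt.
            return "blamed map attempt: " + ioe.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(copyMapOutput());
    }
}
```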
[jira] [Commented] (MAPREDUCE-5251) Reducer should not implicate map attempt if it has insufficient space to fetch map output
[ https://issues.apache.org/jira/browse/MAPREDUCE-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13717519#comment-13717519 ] Hadoop QA commented on MAPREDUCE-5251: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12593765/MAPREDUCE-5251-5.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3883//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3883//console This message is automatically generated. 
> Reducer should not implicate map attempt if it has insufficient space to > fetch map output > - > > Key: MAPREDUCE-5251 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5251 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 0.23.7, 2.0.4-alpha >Reporter: Jason Lowe >Assignee: Ashwin Shankar > Attachments: MAPREDUCE-5251-2.txt, MAPREDUCE-5251-3.txt, > MAPREDUCE-5251-4.txt, MAPREDUCE-5251-5.txt > > > A job can fail if a reducer happens to run on a node with insufficient space > to hold a map attempt's output. The reducer keeps reporting the map attempt > as bad, and if the map attempt ends up being re-launched too many times > before the reducer decides maybe it is the real problem the job can fail. > In that scenario it would be better to re-launch the reduce attempt and > hopefully it will run on another node that has sufficient space to complete > the shuffle. Reporting the map attempt is bad and relaunching the map task > doesn't change the fact that the reducer can't hold the output. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5356) Ability to refresh aggregated log retention period and check interval
[ https://issues.apache.org/jira/browse/MAPREDUCE-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-5356: -- Resolution: Fixed Fix Version/s: 2.3.0 3.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks, Ashwin! I committed this to trunk and branch-2. > Ability to refresh aggregated log retention period and check interval > -- > > Key: MAPREDUCE-5356 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5356 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Components: jobhistoryserver >Affects Versions: 2.1.0-beta >Reporter: Ashwin Shankar >Assignee: Ashwin Shankar > Labels: features > Fix For: 3.0.0, 2.3.0 > > Attachments: MAPREDUCE-5266-2.txt, MAPREDUCE-5266-3.txt, > MAPREDUCE-5266-4.txt, MAPREDUCE-5356-5.txt, MAPREDUCE-5356-5.txt, > WHOLE_PATCH_NOT_TO_BE_CHKEDIN-MAPREDUCE-5356-5.txt > > > We want to be able to refresh log aggregation retention time > and 'check interval' time on the fly by changing configs so that we dont have > to bounce history server. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5356) Ability to refresh aggregated log retention period and check interval
[ https://issues.apache.org/jira/browse/MAPREDUCE-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-5356: -- Summary: Ability to refresh aggregated log retention period and check interval (was: Refresh Log aggregation 'retention period' and 'check interval' ) > Ability to refresh aggregated log retention period and check interval > -- > > Key: MAPREDUCE-5356 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5356 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Components: jobhistoryserver >Affects Versions: 2.1.0-beta >Reporter: Ashwin Shankar >Assignee: Ashwin Shankar > Labels: features > Attachments: MAPREDUCE-5266-2.txt, MAPREDUCE-5266-3.txt, > MAPREDUCE-5266-4.txt, MAPREDUCE-5356-5.txt, MAPREDUCE-5356-5.txt, > WHOLE_PATCH_NOT_TO_BE_CHKEDIN-MAPREDUCE-5356-5.txt > > > We want to be able to refresh log aggregation retention time > and 'check interval' time on the fly by changing configs so that we dont have > to bounce history server. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5251) Reducer should not implicate map attempt if it has insufficient space to fetch map output
[ https://issues.apache.org/jira/browse/MAPREDUCE-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashwin Shankar updated MAPREDUCE-5251: -- Attachment: MAPREDUCE-5251-5.txt Thanks, patch updated. > Reducer should not implicate map attempt if it has insufficient space to > fetch map output > - > > Key: MAPREDUCE-5251 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5251 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 0.23.7, 2.0.4-alpha >Reporter: Jason Lowe >Assignee: Ashwin Shankar > Attachments: MAPREDUCE-5251-2.txt, MAPREDUCE-5251-3.txt, > MAPREDUCE-5251-4.txt, MAPREDUCE-5251-5.txt > > > A job can fail if a reducer happens to run on a node with insufficient space > to hold a map attempt's output. The reducer keeps reporting the map attempt > as bad, and if the map attempt ends up being re-launched too many times > before the reducer decides maybe it is the real problem the job can fail. > In that scenario it would be better to re-launch the reduce attempt and > hopefully it will run on another node that has sufficient space to complete > the shuffle. Reporting the map attempt is bad and relaunching the map task > doesn't change the fact that the reducer can't hold the output. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5356) Refresh Log aggregation 'retention period' and 'check interval'
[ https://issues.apache.org/jira/browse/MAPREDUCE-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13717494#comment-13717494 ] Jason Lowe commented on MAPREDUCE-5356: --- +1, lgtm. Committing this. > Refresh Log aggregation 'retention period' and 'check interval' > > > Key: MAPREDUCE-5356 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5356 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Components: jobhistoryserver >Affects Versions: 2.1.0-beta >Reporter: Ashwin Shankar >Assignee: Ashwin Shankar > Labels: features > Attachments: MAPREDUCE-5266-2.txt, MAPREDUCE-5266-3.txt, > MAPREDUCE-5266-4.txt, MAPREDUCE-5356-5.txt, MAPREDUCE-5356-5.txt, > WHOLE_PATCH_NOT_TO_BE_CHKEDIN-MAPREDUCE-5356-5.txt > > > We want to be able to refresh log aggregation retention time > and 'check interval' time on the fly by changing configs so that we dont have > to bounce history server. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5251) Reducer should not implicate map attempt if it has insufficient space to fetch map output
[ https://issues.apache.org/jira/browse/MAPREDUCE-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13717487#comment-13717487 ] Jason Lowe commented on MAPREDUCE-5251: --- Thanks for the update, Ashwin. Couple of minor things: * reportLocalError probably should just compute the hostname itself rather than requiring callers to do so * there is whitespace missing between arguments added in the latest patch (which will be fixed if we remove the reduceHost arg to reportLocalError) > Reducer should not implicate map attempt if it has insufficient space to > fetch map output > - > > Key: MAPREDUCE-5251 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5251 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 0.23.7, 2.0.4-alpha >Reporter: Jason Lowe >Assignee: Ashwin Shankar > Attachments: MAPREDUCE-5251-2.txt, MAPREDUCE-5251-3.txt, > MAPREDUCE-5251-4.txt > > > A job can fail if a reducer happens to run on a node with insufficient space > to hold a map attempt's output. The reducer keeps reporting the map attempt > as bad, and if the map attempt ends up being re-launched too many times > before the reducer decides maybe it is the real problem the job can fail. > In that scenario it would be better to re-launch the reduce attempt and > hopefully it will run on another node that has sufficient space to complete > the shuffle. Reporting the map attempt is bad and relaunching the map task > doesn't change the fact that the reducer can't hold the output. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5356) Refresh Log aggregation 'retention period' and 'check interval'
[ https://issues.apache.org/jira/browse/MAPREDUCE-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13717472#comment-13717472 ] Hadoop QA commented on MAPREDUCE-5356: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12593755/MAPREDUCE-5356-5.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3882//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3882//console This message is automatically generated. 
> Refresh Log aggregation 'retention period' and 'check interval' > > > Key: MAPREDUCE-5356 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5356 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Components: jobhistoryserver >Affects Versions: 2.1.0-beta >Reporter: Ashwin Shankar >Assignee: Ashwin Shankar > Labels: features > Attachments: MAPREDUCE-5266-2.txt, MAPREDUCE-5266-3.txt, > MAPREDUCE-5266-4.txt, MAPREDUCE-5356-5.txt, MAPREDUCE-5356-5.txt, > WHOLE_PATCH_NOT_TO_BE_CHKEDIN-MAPREDUCE-5356-5.txt > > > We want to be able to refresh log aggregation retention time > and 'check interval' time on the fly by changing configs so that we dont have > to bounce history server. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5251) Reducer should not implicate map attempt if it has insufficient space to fetch map output
[ https://issues.apache.org/jira/browse/MAPREDUCE-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13716704#comment-13716704 ] Hadoop QA commented on MAPREDUCE-5251: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12593044/MAPREDUCE-5251-4.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3881//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3881//console This message is automatically generated. 
> Reducer should not implicate map attempt if it has insufficient space to > fetch map output > - > > Key: MAPREDUCE-5251 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5251 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 0.23.7, 2.0.4-alpha >Reporter: Jason Lowe >Assignee: Ashwin Shankar > Attachments: MAPREDUCE-5251-2.txt, MAPREDUCE-5251-3.txt, > MAPREDUCE-5251-4.txt > > > A job can fail if a reducer happens to run on a node with insufficient space > to hold a map attempt's output. The reducer keeps reporting the map attempt > as bad, and if the map attempt ends up being re-launched too many times > before the reducer decides maybe it is the real problem the job can fail. > In that scenario it would be better to re-launch the reduce attempt and > hopefully it will run on another node that has sufficient space to complete > the shuffle. Reporting the map attempt is bad and relaunching the map task > doesn't change the fact that the reducer can't hold the output. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4366) mapred metrics shows negative count of waiting maps and reduces
[ https://issues.apache.org/jira/browse/MAPREDUCE-4366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13716701#comment-13716701 ] Alejandro Abdelnur commented on MAPREDUCE-4366: --- [~acmurthy], if you don't have any further comments/concerns, I'll commit this later this week. > mapred metrics shows negative count of waiting maps and reduces > --- > > Key: MAPREDUCE-4366 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4366 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: jobtracker >Affects Versions: 1.0.2 >Reporter: Thomas Graves >Assignee: Sandy Ryza > Attachments: MAPREDUCE-4366-branch-1-1.patch, > MAPREDUCE-4366-branch-1.patch > > > Negative waiting_maps and waiting_reduces count is observed in the mapred > metrics. MAPREDUCE-1238 partially fixed this but it appears there is still > issues as we are seeing it, but not as bad. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5356) Refresh Log aggregation 'retention period' and 'check interval'
[ https://issues.apache.org/jira/browse/MAPREDUCE-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-5356: -- Attachment: MAPREDUCE-5356-5.txt Now that MAPREDUCE-5265 is in, re-attaching latest version of patch so Jenkins can comment. > Refresh Log aggregation 'retention period' and 'check interval' > > > Key: MAPREDUCE-5356 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5356 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Components: jobhistoryserver >Affects Versions: 2.1.0-beta >Reporter: Ashwin Shankar >Assignee: Ashwin Shankar > Labels: features > Attachments: MAPREDUCE-5266-2.txt, MAPREDUCE-5266-3.txt, > MAPREDUCE-5266-4.txt, MAPREDUCE-5356-5.txt, MAPREDUCE-5356-5.txt, > WHOLE_PATCH_NOT_TO_BE_CHKEDIN-MAPREDUCE-5356-5.txt > > > We want to be able to refresh log aggregation retention time > and 'check interval' time on the fly by changing configs so that we dont have > to bounce history server. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-1981) Improve getSplits performance by using listFiles, the new FileSystem API
[ https://issues.apache.org/jira/browse/MAPREDUCE-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13716688#comment-13716688 ] Kihwal Lee commented on MAPREDUCE-1981: --- +1 The patch looks good. I also ran some tests and they worked successfully. Thanks for fixing both mapred and mapreduce. > Improve getSplits performance by using listFiles, the new FileSystem API > > > Key: MAPREDUCE-1981 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1981 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: job submission >Affects Versions: 0.23.0 >Reporter: Hairong Kuang >Assignee: Hairong Kuang > Attachments: mapredListFiles1.patch, mapredListFiles2.patch, > mapredListFiles3.patch, mapredListFiles4.patch, mapredListFiles5.patch, > mapredListFiles.patch, MAPREDUCE-1981.patch > > > This jira will make FileInputFormat and CombinedFileInputForm to use the new > API, thus reducing the number of RPCs to HDFS NameNode. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5251) Reducer should not implicate map attempt if it has insufficient space to fetch map output
[ https://issues.apache.org/jira/browse/MAPREDUCE-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-5251: -- Target Version/s: 3.0.0, 2.3.0, 0.23.10 (was: 3.0.0, 2.1.0-beta, 0.23.10) Status: Patch Available (was: Open) > Reducer should not implicate map attempt if it has insufficient space to > fetch map output > - > > Key: MAPREDUCE-5251 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5251 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 2.0.4-alpha, 0.23.7 >Reporter: Jason Lowe >Assignee: Ashwin Shankar > Attachments: MAPREDUCE-5251-2.txt, MAPREDUCE-5251-3.txt, > MAPREDUCE-5251-4.txt > > > A job can fail if a reducer happens to run on a node with insufficient space > to hold a map attempt's output. The reducer keeps reporting the map attempt > as bad, and if the map attempt ends up being re-launched too many times > before the reducer decides maybe it is the real problem the job can fail. > In that scenario it would be better to re-launch the reduce attempt and > hopefully it will run on another node that has sufficient space to complete > the shuffle. Reporting the map attempt is bad and relaunching the map task > doesn't change the fact that the reducer can't hold the output. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5409) MRAppMaster throws InvalidStateTransitonException: Invalid event: TA_TOO_MANY_FETCH_FAILURE at KILLED for TaskAttemptImpl
[ https://issues.apache.org/jira/browse/MAPREDUCE-5409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13716639#comment-13716639 ] Zhijie Shen commented on MAPREDUCE-5409: [~devaraj.k], would you mind sharing more context of the exception? For example, the state of TaskAttemptImpl before KILLED. My guess is that before TaskAttemptImpl entered KILLED, it was probably at RUNNING or COMMIT_PENDING, where StatusUpdater transition could happen. This transition would send a JOB_TASK_ATTEMPT_FETCH_FAILURE event to JobImpl, TaskAttemptFetchFailureTransition would be triggered, and a TA_TOO_MANY_FETCH_FAILURE would be sent back to TaskAttemptImpl. Before this event was processed, TaskAttemptImpl went through KilledTransition as it received a TA_CONTAINER_CLEANED event in advance. Therefore, TA_TOO_MANY_FETCH_FAILURE at KILLED happened. > MRAppMaster throws InvalidStateTransitonException: Invalid event: > TA_TOO_MANY_FETCH_FAILURE at KILLED for TaskAttemptImpl > - > > Key: MAPREDUCE-5409 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5409 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 2.0.5-alpha >Reporter: Devaraj K >Assignee: Devaraj K > > {code:xml} > 2013-07-23 12:28:05,217 INFO [IPC Server handler 29 on 50796] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1374560536158_0003_m_40_0 is : 0.0 > 2013-07-23 12:28:05,221 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Too many fetch-failures > for output of task attempt: attempt_1374560536158_0003_m_07_0 ... 
raising > fetch failure to map > 2013-07-23 12:28:05,222 ERROR [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle > this event at current state for attempt_1374560536158_0003_m_07_0 > org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: > TA_TOO_MANY_FETCH_FAILURE at KILLED > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:445) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1032) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:143) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1123) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1115) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:130) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:77) > at java.lang.Thread.run(Thread.java:662) > 2013-07-23 12:28:05,249 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: > job_1374560536158_0003Job Transitioned from RUNNING to ERROR > 2013-07-23 12:28:05,338 INFO [IPC Server handler 16 on 50796] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from > attempt_1374560536158_0003_m_40_0 > {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
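The sequence Zhijie describes can be sketched with a toy transition table (a simplified model for illustration only, not Hadoop's actual StateMachineFactory API): an event with no transition registered for the current state produces exactly the "Invalid event ... at KILLED" failure in the log above, because TA_TOO_MANY_FETCH_FAILURE was queued while the attempt was still RUNNING but dispatched only after TA_CONTAINER_CLEANED had already moved it to KILLED.

```java
import java.util.EnumMap;
import java.util.Map;

/**
 * Toy model of the race described above; NOT Hadoop's real TaskAttemptImpl
 * state machine. An event that has no transition registered for the current
 * state throws, mirroring the InvalidStateTransitonException in the log.
 */
public class TaskAttemptStateDemo {
    public enum State { RUNNING, KILLED }
    public enum Event { TA_CONTAINER_CLEANED, TA_TOO_MANY_FETCH_FAILURE }

    private static final Map<State, Map<Event, State>> TRANSITIONS =
            new EnumMap<>(State.class);
    static {
        Map<Event, State> fromRunning = new EnumMap<>(Event.class);
        // container cleaned while running -> attempt is killed (KilledTransition)
        fromRunning.put(Event.TA_CONTAINER_CLEANED, State.KILLED);
        // fetch-failure handling is valid while still running
        fromRunning.put(Event.TA_TOO_MANY_FETCH_FAILURE, State.RUNNING);
        TRANSITIONS.put(State.RUNNING, fromRunning);
        // no transitions registered at KILLED -> late events are invalid
        TRANSITIONS.put(State.KILLED, new EnumMap<>(Event.class));
    }

    /** Returns the post-state, or throws if the event is invalid in this state. */
    public static State transition(State current, Event event) {
        State next = TRANSITIONS.get(current).get(event);
        if (next == null) {
            throw new IllegalStateException(
                    "Invalid event: " + event + " at " + current);
        }
        return next;
    }
}
```

Under these assumptions, one possible remedy would be to register an ignore-style transition for TA_TOO_MANY_FETCH_FAILURE in the KILLED state, so the late event is dropped instead of escalating to a job ERROR.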
[jira] [Commented] (MAPREDUCE-5411) Refresh size of loaded job cache on history server
[ https://issues.apache.org/jira/browse/MAPREDUCE-5411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13716609#comment-13716609 ] Hadoop QA commented on MAPREDUCE-5411: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12593729/LOADED_JOB_CACHE_MR5411-1.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3880//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3880//console This message is automatically generated. 
> Refresh size of loaded job cache on history server > -- > > Key: MAPREDUCE-5411 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5411 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Components: jobhistoryserver >Affects Versions: 2.1.0-beta >Reporter: Ashwin Shankar >Assignee: Ashwin Shankar > Labels: features > Attachments: LOADED_JOB_CACHE_MR5411-1.txt > > > We want to be able to refresh size of the loaded job > cache(mapreduce.jobhistory.loadedjobs.cache.size) of history server > through history server's admin interface. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5402) DynamicInputFormat should allow overriding of MAX_CHUNKS_TOLERABLE
[ https://issues.apache.org/jira/browse/MAPREDUCE-5402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13716584#comment-13716584 ] Hadoop QA commented on MAPREDUCE-5402: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12593727/MAPREDUCE-5402.3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-tools/hadoop-distcp. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3879//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3879//console This message is automatically generated. 
> DynamicInputFormat should allow overriding of MAX_CHUNKS_TOLERABLE > -- > > Key: MAPREDUCE-5402 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5402 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: distcp, mrv2 >Reporter: David Rosenstrauch >Assignee: Tsuyoshi OZAWA > Attachments: MAPREDUCE-5402.1.patch, MAPREDUCE-5402.2.patch, > MAPREDUCE-5402.3.patch > > > In MAPREDUCE-2765, which provided the design spec for DistCpV2, the author > describes the implementation of DynamicInputFormat, with one of the main > motivations cited being to reduce the chance of long-tails where a few > leftover mappers run much longer than the rest. > However, I today ran into a situation where I experienced exactly such a long > tail using DistCpV2 and DynamicInputFormat. And when I tried to alleviate > the problem by overriding the number of mappers and the split ratio used by > the DynamicInputFormat, I was prevented from doing so by the hard-coded limit > set in the code by the MAX_CHUNKS_TOLERABLE constant. (Currently set to 400.) > This constant is actually set quite low for production use. (See a > description of my use case below.) And although MAPREDUCE-2765 states that > this is an "overridable maximum", when reading through the code there does > not actually appear to be any mechanism available to override it. > This should be changed. It should be possible to expand the maximum # of > chunks beyond this arbitrary limit. > For example, here is the situation I ran into today: > I ran a distcpv2 job on a cluster with 8 machines containing 128 map slots. > The job consisted of copying ~2800 files from HDFS to Amazon S3. I overrode > the number of mappers for the job from the default of 20 to 128, so as to > more properly parallelize the copy across the cluster. The number of chunk > files created was calculated as 241, and mapred.num.entries.per.chunk was > calculated as 12. 
> As the job ran on, it reached a point where there were only 4 remaining map > tasks, which had each been running for over 2 hours. The reason for this was > that each of the 12 files that those mappers were copying were quite large > (several hundred megabytes in size) and took ~20 minutes each. However, > during this time, all the other 124 mappers sat idle. > In theory I should be able to alleviate this problem with DynamicInputFormat. > If I were able to, say, quadruple the number of chunk files created, that > would have made each chunk contain only 3 files, and these large files would > have gotten distributed better around the cluster and copied in parallel. > However, when I tried to do that - by overriding mapred.listing.split.ratio > to, say, 10 - DynamicInputFormat responded with an exception ("Too many > chunks created with splitRatio:10, numMaps:128. Reduce numMaps or decrease > split-ratio to proceed.") - presumably because I exceeded the > MAX_CHUNKS_TOLERABLE value of 400. > Is there any particular logic behind this MAX_CHUNKS_TOLERABLE limit? I > can't personally s
ee any.
[jira] [Updated] (MAPREDUCE-5317) Stale files left behind for failed jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-5317: -- Resolution: Fixed Status: Resolved (was: Patch Available) Thanks, Ravi. I committed this to trunk, branch-2, and branch-0.23. > Stale files left behind for failed jobs > --- > > Key: MAPREDUCE-5317 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5317 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 3.0.0, 2.0.4-alpha, 0.23.8 >Reporter: Ravi Prakash >Assignee: Ravi Prakash > Fix For: 3.0.0, 2.3.0, 0.23.10 > > Attachments: MAPREDUCE-5317.branch-0.23.patch, > MAPREDUCE-5317.branch-0.23.patch, MAPREDUCE-5317.branch-0.23.patch, > MAPREDUCE-5317.patch, MAPREDUCE-5317.patch, MAPREDUCE-5317.patch, > MAPREDUCE-5317.patch, MAPREDUCE-5317.patch, MAPREDUCE-5317.patch > > > Courtesy [~amar_kamat]! > {quote} > We are seeing _temporary files left behind in the output folder if the job > fails. > The job were failed due to hitting quota issue. > I simply ran the randomwriter (from hadoop examples) with the default setting. > That failed and left behind some stray files. > {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5411) Refresh size of loaded job cache on history server
[ https://issues.apache.org/jira/browse/MAPREDUCE-5411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashwin Shankar updated MAPREDUCE-5411: -- Attachment: LOADED_JOB_CACHE_MR5411-1.txt > Refresh size of loaded job cache on history server > -- > > Key: MAPREDUCE-5411 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5411 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Components: jobhistoryserver >Affects Versions: 2.1.0-beta >Reporter: Ashwin Shankar >Assignee: Ashwin Shankar > Labels: features > Attachments: LOADED_JOB_CACHE_MR5411-1.txt > > > We want to be able to refresh size of the loaded job > cache(mapreduce.jobhistory.loadedjobs.cache.size) of history server > through history server's admin interface. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5411) Refresh size of loaded job cache on history server
[ https://issues.apache.org/jira/browse/MAPREDUCE-5411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashwin Shankar updated MAPREDUCE-5411: -- Status: Patch Available (was: Open) Added a new command on history server's admin interface 'refreshLoadedJobCache' which refreshes the size of loaded job cache. > Refresh size of loaded job cache on history server > -- > > Key: MAPREDUCE-5411 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5411 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Components: jobhistoryserver >Affects Versions: 2.1.0-beta >Reporter: Ashwin Shankar >Assignee: Ashwin Shankar > Labels: features > Attachments: LOADED_JOB_CACHE_MR5411-1.txt > > > We want to be able to refresh size of the loaded job > cache(mapreduce.jobhistory.loadedjobs.cache.size) of history server > through history server's admin interface. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
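The refresh-without-restart behavior described here can be modeled with a cache whose capacity is mutable at runtime. This is a self-contained sketch of the idea only; the history server's actual loaded-job cache and the refreshLoadedJobCache admin command are not shown, and the class name is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Simplified model (not the HistoryFileManager implementation) of a job cache
 * whose maximum size can be "refreshed" on the fly, as MAPREDUCE-5411 proposes
 * for mapreduce.jobhistory.loadedjobs.cache.size.
 */
public class ResizableJobCache<K, V> {
    private int capacity;
    private final LinkedHashMap<K, V> cache;

    public ResizableJobCache(int initialCapacity) {
        this.capacity = initialCapacity;
        // accessOrder=true gives LRU eviction order
        this.cache = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    public synchronized void put(K key, V value) { cache.put(key, value); }
    public synchronized V get(K key) { return cache.get(key); }
    public synchronized int size() { return cache.size(); }

    /** Apply a new size limit without a restart, evicting LRU entries if needed. */
    public synchronized void refreshCapacity(int newCapacity) {
        this.capacity = newCapacity;
        while (cache.size() > capacity) {
            K eldest = cache.keySet().iterator().next();
            cache.remove(eldest);
        }
    }
}
```

The design point this illustrates: the admin command only needs to re-read the configured size and call something like refreshCapacity, so the history server never has to be bounced.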
[jira] [Updated] (MAPREDUCE-5402) DynamicInputFormat should allow overriding of MAX_CHUNKS_TOLERABLE
[ https://issues.apache.org/jira/browse/MAPREDUCE-5402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi OZAWA updated MAPREDUCE-5402: -- Attachment: MAPREDUCE-5402.3.patch Fixed to pass compile. > DynamicInputFormat should allow overriding of MAX_CHUNKS_TOLERABLE > -- > > Key: MAPREDUCE-5402 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5402 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: distcp, mrv2 >Reporter: David Rosenstrauch >Assignee: Tsuyoshi OZAWA > Attachments: MAPREDUCE-5402.1.patch, MAPREDUCE-5402.2.patch, > MAPREDUCE-5402.3.patch > > > In MAPREDUCE-2765, which provided the design spec for DistCpV2, the author > describes the implementation of DynamicInputFormat, with one of the main > motivations cited being to reduce the chance of long-tails where a few > leftover mappers run much longer than the rest. > However, I today ran into a situation where I experienced exactly such a long > tail using DistCpV2 and DynamicInputFormat. And when I tried to alleviate > the problem by overriding the number of mappers and the split ratio used by > the DynamicInputFormat, I was prevented from doing so by the hard-coded limit > set in the code by the MAX_CHUNKS_TOLERABLE constant. (Currently set to 400.) > This constant is actually set quite low for production use. (See a > description of my use case below.) And although MAPREDUCE-2765 states that > this is an "overridable maximum", when reading through the code there does > not actually appear to be any mechanism available to override it. > This should be changed. It should be possible to expand the maximum # of > chunks beyond this arbitrary limit. > For example, here is the situation I ran into today: > I ran a distcpv2 job on a cluster with 8 machines containing 128 map slots. > The job consisted of copying ~2800 files from HDFS to Amazon S3. 
I overrode > the number of mappers for the job from the default of 20 to 128, so as to > more properly parallelize the copy across the cluster. The number of chunk > files created was calculated as 241, and mapred.num.entries.per.chunk was > calculated as 12. > As the job ran on, it reached a point where there were only 4 remaining map > tasks, which had each been running for over 2 hours. The reason for this was > that each of the 12 files that those mappers were copying were quite large > (several hundred megabytes in size) and took ~20 minutes each. However, > during this time, all the other 124 mappers sat idle. > In theory I should be able to alleviate this problem with DynamicInputFormat. > If I were able to, say, quadruple the number of chunk files created, that > would have made each chunk contain only 3 files, and these large files would > have gotten distributed better around the cluster and copied in parallel. > However, when I tried to do that - by overriding mapred.listing.split.ratio > to, say, 10 - DynamicInputFormat responded with an exception ("Too many > chunks created with splitRatio:10, numMaps:128. Reduce numMaps or decrease > split-ratio to proceed.") - presumably because I exceeded the > MAX_CHUNKS_TOLERABLE value of 400. > Is there any particular logic behind this MAX_CHUNKS_TOLERABLE limit? I > can't personally see any. > If this limit has no particular logic behind it, then it should be > overridable - or even better: removed altogether. After all, I'm not sure I > see any need for it. Even if numMaps * splitRatio resulted in an > extraordinarily large number, if the code were modified so that the number of > chunks got calculated as Math.min( numMaps * splitRatio, numFiles), then > there would be no need for MAX_CHUNKS_TOLERABLE. 
In this worst-case scenario > where the product of numMaps and splitRatio is large, capping the number of > chunks at the number of files (numberOfChunks = numberOfFiles) would result > in 1 file per chunk - the maximum parallelization possible. That may not be > the best-tuned solution for some users, but I would think that it should be > left up to the user to deal with the potential consequence of not having > tuned their job properly. Certainly that would be better than having an > arbitrary hard-coded limit that *prevents* proper parallelization when > dealing with large files and/or large numbers of mappers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
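The arithmetic in this report is easy to check (illustrative only; the constant and the Math.min cap are taken from the description above, not from DistCp's source): with numMaps=128 and splitRatio=10 the requested chunk count is 1280, which exceeds the hard-coded MAX_CHUNKS_TOLERABLE of 400 and triggers the "Too many chunks" exception, while the proposed Math.min(numMaps * splitRatio, numFiles) cap would simply bound it by the ~2800 files.

```java
/**
 * Illustrative arithmetic for the chunk-count discussion above; NOT DistCp's
 * actual implementation. MAX_CHUNKS_TOLERABLE mirrors the hard-coded limit
 * cited in the report.
 */
public class ChunkMath {
    static final int MAX_CHUNKS_TOLERABLE = 400; // current hard-coded limit

    /** Chunk count as requested today: numMaps * splitRatio. */
    static int requestedChunks(int numMaps, int splitRatio) {
        return numMaps * splitRatio;
    }

    /** The reporter's proposal: never create more chunks than files. */
    static int proposedChunks(int numMaps, int splitRatio, int numFiles) {
        return Math.min(numMaps * splitRatio, numFiles);
    }

    /** Files per chunk, rounded up. */
    static int entriesPerChunk(int numFiles, int numChunks) {
        return (numFiles + numChunks - 1) / numChunks;
    }
}
```

With the reporter's original settings the numbers also line up: 2800 files spread over 241 chunks gives the observed 12 entries per chunk, and capping chunks at the file count degrades gracefully to one file per chunk in the worst case.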
[jira] [Created] (MAPREDUCE-5411) Refresh size of loaded job cache on history server
Ashwin Shankar created MAPREDUCE-5411: - Summary: Refresh size of loaded job cache on history server Key: MAPREDUCE-5411 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5411 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: jobhistoryserver Affects Versions: 2.1.0-beta Reporter: Ashwin Shankar Assignee: Ashwin Shankar We want to be able to refresh size of the loaded job cache(mapreduce.jobhistory.loadedjobs.cache.size) of history server through history server's admin interface. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Comment Edited] (MAPREDUCE-5408) CLONE - The logging level of the tasks should be configurable by the job
[ https://issues.apache.org/jira/browse/MAPREDUCE-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13716490#comment-13716490 ] Arun C Murthy edited comment on MAPREDUCE-5408 at 7/23/13 3:57 PM: --- Thanks Hitesh, I've fixed default level to be INFO. W.r.t the first comment, let's keep backport as the same to ensure we are compatible with branch-2 (for binary compat). was (Author: acmurthy): Thanks Hitesh, I've fixed default level to be INFO. W.r.t the first comment, let's keep backport as the same. > CLONE - The logging level of the tasks should be configurable by the job > > > Key: MAPREDUCE-5408 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5408 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Owen O'Malley >Assignee: Arun C Murthy > Fix For: 1.3.0 > > Attachments: MAPREDUCE-336_branch1.patch, MAPREDUCE-336_branch1.patch > > > It would be nice to be able to configure the logging level of the Task JVM's > separately from the server JVM's. Reducing logging substantially increases > performance and reduces the consumption of local disk on the task trackers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5317) Stale files left behind for failed jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13716512#comment-13716512 ] Jason Lowe commented on MAPREDUCE-5317: --- +1 > Stale files left behind for failed jobs > --- > > Key: MAPREDUCE-5317 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5317 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 3.0.0, 2.0.4-alpha, 0.23.8 >Reporter: Ravi Prakash >Assignee: Ravi Prakash > Fix For: 3.0.0, 2.3.0, 0.23.10 > > Attachments: MAPREDUCE-5317.branch-0.23.patch, > MAPREDUCE-5317.branch-0.23.patch, MAPREDUCE-5317.branch-0.23.patch, > MAPREDUCE-5317.patch, MAPREDUCE-5317.patch, MAPREDUCE-5317.patch, > MAPREDUCE-5317.patch, MAPREDUCE-5317.patch, MAPREDUCE-5317.patch > > > Courtesy [~amar_kamat]! > {quote} > We are seeing _temporary files left behind in the output folder if the job > fails. > The job were failed due to hitting quota issue. > I simply ran the randomwriter (from hadoop examples) with the default setting. > That failed and left behind some stray files. > {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5403) Get rid of yarn.application.classpath
[ https://issues.apache.org/jira/browse/MAPREDUCE-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13716503#comment-13716503 ] Jason Lowe commented on MAPREDUCE-5403: --- bq. I don't think yarn.resourcemanager.address should be an environment variable. I think there is a conceptual difference between a client knowing how to contact a service and a client knowing about details of the service's internals. Do you disagree? I'm just pointing out that locating the YARN jars is not the only thing clients may need to do to communicate with YARN. e.g.: a non-Java YARN client still needs to locate the ResourceManager address somehow, and currently it can do this via parsing yarn-site.xml. To me, yarn-site.xml is a site-specific config regardless of whether it's a YARN server or YARN client processing it. For a site that doesn't host any YARN daemons (e.g.: gateway or launcher node), that becomes essentially a client-side-only config. I can see where you're coming from, and maybe for the classpath we should do something different (e.g.: environment variable and/or clients should use "yarn classpath" to get the classpath instead). I wanted to point out there are other interesting tidbits of information in yarn-site.xml besides the classpath that clients may want to access, and I'm wondering what criteria qualifies a client-consumed property to graduate to an environment variable or some other mechanism for determining the value besides parsing yarn-site.xml. > Get rid of yarn.application.classpath > - > > Key: MAPREDUCE-5403 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5403 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: client >Affects Versions: 2.0.5-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > > yarn.application.classpath is a confusing property because it is used by > MapReduce and not YARN, and MapReduce already has > mapreduce.application.classpath, which provides the same functionality. 
[jira] [Updated] (MAPREDUCE-5408) CLONE - The logging level of the tasks should be configurable by the job
[ https://issues.apache.org/jira/browse/MAPREDUCE-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated MAPREDUCE-5408: - Attachment: MAPREDUCE-336_branch1.patch Thanks Hitesh, I've fixed default level to be INFO. W.r.t the first comment, let's keep backport as the same. > CLONE - The logging level of the tasks should be configurable by the job > > > Key: MAPREDUCE-5408 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5408 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Owen O'Malley >Assignee: Arun C Murthy > Fix For: 1.3.0 > > Attachments: MAPREDUCE-336_branch1.patch, MAPREDUCE-336_branch1.patch > > > It would be nice to be able to configure the logging level of the Task JVM's > separately from the server JVM's. Reducing logging substantially increases > performance and reduces the consumption of local disk on the task trackers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5408) CLONE - The logging level of the tasks should be configurable by the job
[ https://issues.apache.org/jira/browse/MAPREDUCE-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated MAPREDUCE-5408: - Resolution: Fixed Status: Resolved (was: Patch Available) I just committed this. > CLONE - The logging level of the tasks should be configurable by the job > > > Key: MAPREDUCE-5408 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5408 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Owen O'Malley >Assignee: Arun C Murthy > Fix For: 1.3.0 > > Attachments: MAPREDUCE-336_branch1.patch, MAPREDUCE-336_branch1.patch > > > It would be nice to be able to configure the logging level of the Task JVM's > separately from the server JVM's. Reducing logging substantially increases > performance and reduces the consumption of local disk on the task trackers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5402) DynamicInputFormat should allow overriding of MAX_CHUNKS_TOLERABLE
[ https://issues.apache.org/jira/browse/MAPREDUCE-5402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13716472#comment-13716472 ] Hadoop QA commented on MAPREDUCE-5402: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12593707/MAPREDUCE-5402.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 javac{color:red}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3878//console This message is automatically generated. > DynamicInputFormat should allow overriding of MAX_CHUNKS_TOLERABLE > -- > > Key: MAPREDUCE-5402 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5402 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: distcp, mrv2 >Reporter: David Rosenstrauch >Assignee: Tsuyoshi OZAWA > Attachments: MAPREDUCE-5402.1.patch, MAPREDUCE-5402.2.patch > > > In MAPREDUCE-2765, which provided the design spec for DistCpV2, the author > describes the implementation of DynamicInputFormat, with one of the main > motivations cited being to reduce the chance of long-tails where a few > leftover mappers run much longer than the rest. > However, I today ran into a situation where I experienced exactly such a long > tail using DistCpV2 and DynamicInputFormat. And when I tried to alleviate > the problem by overriding the number of mappers and the split ratio used by > the DynamicInputFormat, I was prevented from doing so by the hard-coded limit > set in the code by the MAX_CHUNKS_TOLERABLE constant. (Currently set to 400.) 
> This constant is actually set quite low for production use. (See a > description of my use case below.) And although MAPREDUCE-2765 states that > this is an "overridable maximum", when reading through the code there does > not actually appear to be any mechanism available to override it. > This should be changed. It should be possible to expand the maximum # of > chunks beyond this arbitrary limit. > For example, here is the situation I ran into today: > I ran a distcpv2 job on a cluster with 8 machines containing 128 map slots. > The job consisted of copying ~2800 files from HDFS to Amazon S3. I overrode > the number of mappers for the job from the default of 20 to 128, so as to > more properly parallelize the copy across the cluster. The number of chunk > files created was calculated as 241, and mapred.num.entries.per.chunk was > calculated as 12. > As the job ran on, it reached a point where there were only 4 remaining map > tasks, which had each been running for over 2 hours. The reason for this was > that each of the 12 files that those mappers were copying were quite large > (several hundred megabytes in size) and took ~20 minutes each. However, > during this time, all the other 124 mappers sat idle. > In theory I should be able to alleviate this problem with DynamicInputFormat. > If I were able to, say, quadruple the number of chunk files created, that > would have made each chunk contain only 3 files, and these large files would > have gotten distributed better around the cluster and copied in parallel. > However, when I tried to do that - by overriding mapred.listing.split.ratio > to, say, 10 - DynamicInputFormat responded with an exception ("Too many > chunks created with splitRatio:10, numMaps:128. Reduce numMaps or decrease > split-ratio to proceed.") - presumably because I exceeded the > MAX_CHUNKS_TOLERABLE value of 400. > Is there any particular logic behind this MAX_CHUNKS_TOLERABLE limit? I > can't personally see any. 
> If this limit has no particular logic behind it, then it should be > overridable - or even better: removed altogether. After all, I'm not sure I > see any need for it. Even if numMaps * splitRatio resulted in an > extraordinarily large number, if the code were modified so that the number of > chunks got calculated as Math.min( numMaps * splitRatio, numFiles), then > there would be no need for MAX_CHUNKS_TOLERABLE. In this worst-case scenario > where the product of numMaps and splitRatio is large, capping the number of > chunks at the number of files (numberOfChunks = numberOfFiles) would result > in 1 file per chunk - the maximum parallelization possible. That may not be > the best-tuned solution for some users, but I would think that it should be > left up to the user to deal with the potential consequence of not having > tuned their job properly. Certainly that would be better than having an > arbitrary hard-coded limit that *prevents* proper parallelization when > dealing with large files and/or large numbers of mappers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5408) CLONE - The logging level of the tasks should be configurable by the job
[ https://issues.apache.org/jira/browse/MAPREDUCE-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13716470#comment-13716470 ] Hitesh Shah commented on MAPREDUCE-5408: Mostly looks good. A couple of minor comments: - DEFAULT_LOG_LEVEL could be renamed to DEFAULT_TASK_LOG_LEVEL and the type changed to a string. Having the type as Level is not buying much, as it always ends up being converted to a string when used. If the intention is to retain the backport as is, this comment can be ignored for now. - Level.toLevel() has an API which takes in a default value. In the event that the user has a typo, the current usage falls back to using DEBUG, whereas the default-based API can be made to fall back to INFO. > CLONE - The logging level of the tasks should be configurable by the job > > > Key: MAPREDUCE-5408 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5408 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Owen O'Malley >Assignee: Arun C Murthy > Fix For: 1.3.0 > > Attachments: MAPREDUCE-336_branch1.patch > > > It would be nice to be able to configure the logging level of the Task JVM's > separately from the server JVM's. Reducing logging substantially increases > performance and reduces the consumption of local disk on the task trackers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
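Hitesh's second point - falling back to a sane default when the configured level string contains a typo - can be sketched with a small self-contained helper. This is a hypothetical stand-in for illustration only; log4j 1.x's own Level.toLevel(String, Level) overload provides exactly this default-value behavior:

```java
import java.util.Set;

public class LevelParseDemo {
    private static final Set<String> KNOWN =
        Set.of("TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL", "ALL", "OFF");

    // Mimics Level.toLevel(String, Level): return the recognized level name,
    // or the supplied default when the string is unrecognized (e.g. a typo),
    // rather than silently falling back to DEBUG.
    static String toLevel(String name, String defaultLevel) {
        if (name != null && KNOWN.contains(name.toUpperCase())) {
            return name.toUpperCase();
        }
        return defaultLevel;
    }

    public static void main(String[] args) {
        System.out.println(toLevel("warn", "INFO"));  // a valid level parses as-is
        System.out.println(toLevel("INOF", "INFO"));  // a typo falls back to INFO, not DEBUG
    }
}
```

With the default-based overload, a misconfigured job degrades to INFO logging instead of the far more verbose DEBUG.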
[jira] [Updated] (MAPREDUCE-5402) DynamicInputFormat should allow overriding of MAX_CHUNKS_TOLERABLE
[ https://issues.apache.org/jira/browse/MAPREDUCE-5402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi OZAWA updated MAPREDUCE-5402: -- Attachment: MAPREDUCE-5402.2.patch Fixed to pass findbugs warnings. > DynamicInputFormat should allow overriding of MAX_CHUNKS_TOLERABLE > -- > > Key: MAPREDUCE-5402 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5402 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: distcp, mrv2 >Reporter: David Rosenstrauch >Assignee: Tsuyoshi OZAWA > Attachments: MAPREDUCE-5402.1.patch, MAPREDUCE-5402.2.patch > > > In MAPREDUCE-2765, which provided the design spec for DistCpV2, the author > describes the implementation of DynamicInputFormat, with one of the main > motivations cited being to reduce the chance of long-tails where a few > leftover mappers run much longer than the rest. > However, I today ran into a situation where I experienced exactly such a long > tail using DistCpV2 and DynamicInputFormat. And when I tried to alleviate > the problem by overriding the number of mappers and the split ratio used by > the DynamicInputFormat, I was prevented from doing so by the hard-coded limit > set in the code by the MAX_CHUNKS_TOLERABLE constant. (Currently set to 400.) > This constant is actually set quite low for production use. (See a > description of my use case below.) And although MAPREDUCE-2765 states that > this is an "overridable maximum", when reading through the code there does > not actually appear to be any mechanism available to override it. > This should be changed. It should be possible to expand the maximum # of > chunks beyond this arbitrary limit. > For example, here is the situation I ran into today: > I ran a distcpv2 job on a cluster with 8 machines containing 128 map slots. > The job consisted of copying ~2800 files from HDFS to Amazon S3. 
I overrode > the number of mappers for the job from the default of 20 to 128, so as to > more properly parallelize the copy across the cluster. The number of chunk > files created was calculated as 241, and mapred.num.entries.per.chunk was > calculated as 12. > As the job ran on, it reached a point where there were only 4 remaining map > tasks, which had each been running for over 2 hours. The reason for this was > that each of the 12 files that those mappers were copying were quite large > (several hundred megabytes in size) and took ~20 minutes each. However, > during this time, all the other 124 mappers sat idle. > In theory I should be able to alleviate this problem with DynamicInputFormat. > If I were able to, say, quadruple the number of chunk files created, that > would have made each chunk contain only 3 files, and these large files would > have gotten distributed better around the cluster and copied in parallel. > However, when I tried to do that - by overriding mapred.listing.split.ratio > to, say, 10 - DynamicInputFormat responded with an exception ("Too many > chunks created with splitRatio:10, numMaps:128. Reduce numMaps or decrease > split-ratio to proceed.") - presumably because I exceeded the > MAX_CHUNKS_TOLERABLE value of 400. > Is there any particular logic behind this MAX_CHUNKS_TOLERABLE limit? I > can't personally see any. > If this limit has no particular logic behind it, then it should be > overridable - or even better: removed altogether. After all, I'm not sure I > see any need for it. Even if numMaps * splitRatio resulted in an > extraordinarily large number, if the code were modified so that the number of > chunks got calculated as Math.min( numMaps * splitRatio, numFiles), then > there would be no need for MAX_CHUNKS_TOLERABLE. 
In this worst-case scenario > where the product of numMaps and splitRatio is large, capping the number of > chunks at the number of files (numberOfChunks = numberOfFiles) would result > in 1 file per chunk - the maximum parallelization possible. That may not be > the best-tuned solution for some users, but I would think that it should be > left up to the user to deal with the potential consequence of not having > tuned their job properly. Certainly that would be better than having an > arbitrary hard-coded limit that *prevents* proper parallelization when > dealing with large files and/or large numbers of mappers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
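The capping rule David proposes above (numberOfChunks = Math.min(numMaps * splitRatio, numFiles)) is simple enough to sketch. The class and method names below are hypothetical, not the actual DistCp DynamicInputFormat code:

```java
public class ChunkCountSketch {
    // Proposed replacement for the hard-coded MAX_CHUNKS_TOLERABLE check:
    // never create more chunks than there are files, since one file per
    // chunk is already the maximum parallelism possible.
    static int numberOfChunks(int numMaps, int splitRatio, int numFiles) {
        return Math.min(numMaps * splitRatio, numFiles);
    }

    public static void main(String[] args) {
        // The reporter's job: 128 maps, split ratio 10, ~2800 files.
        // Instead of failing against the 400-chunk limit, this yields 1280 chunks.
        System.out.println(numberOfChunks(128, 10, 2800));   // 1280
        // Even an extreme ratio is capped at one file per chunk.
        System.out.println(numberOfChunks(128, 100, 2800));  // 2800
    }
}
```

Under this rule the worst case is numFiles chunks, so the constant becomes unnecessary.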
[jira] [Commented] (MAPREDUCE-5379) Include FS delegation token ID in job conf
[ https://issues.apache.org/jira/browse/MAPREDUCE-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13716386#comment-13716386 ] Alejandro Abdelnur commented on MAPREDUCE-5379: --- [~daryn], ping. > Include FS delegation token ID in job conf > -- > > Key: MAPREDUCE-5379 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5379 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: job submission, security >Affects Versions: 2.1.0-beta >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Attachments: MAPREDUCE-5379-1.patch, MAPREDUCE-5379.patch > > > Making a job's FS delegation token ID accessible will allow external services > to associate it with the file system operations it performs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-5410) MapReduce output issue
Mullangi created MAPREDUCE-5410: --- Summary: MapReduce output issue Key: MAPREDUCE-5410 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5410 Project: Hadoop Map/Reduce Issue Type: Bug Components: examples, job submission Affects Versions: 1.0.3 Environment: ubuntu Reporter: Mullangi
Hi, I am new to Hadoop concepts. While practicing with a custom MapReduce program, I found the result is not as expected when executing the code on an HDFS-based file. Please note that when I execute the same program on a local (Unix) file, I get the expected result. Below are the details of my code.
MapReduce in Java
==
import java.io.IOException;
import java.util.*;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.*;

public class WordCount1 {

  public static class Map extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(LongWritable key, Text value,
        OutputCollector<Text, IntWritable> output, Reporter reporter)
        throws IOException {
      String line = value.toString();
      String tokenedZone = null;
      StringTokenizer tokenizer = new StringTokenizer(line);
      while (tokenizer.hasMoreTokens()) {
        tokenedZone = tokenizer.nextToken();
        word.set(tokenedZone);
        output.collect(word, one);
      }
    }
  }

  public static class Reduce extends MapReduceBase
      implements Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterator<IntWritable> values,
        OutputCollector<Text, IntWritable> output, Reporter reporter)
        throws IOException {
      int sum = 0;
      int val = 0;
      while (values.hasNext()) {
        val = values.next().get();
        sum += val;
      }
      if (sum > 1)
        output.collect(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf();
    conf.setJarByClass(WordCount1.class);
    conf.setJobName("wordcount1");
    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);
    conf.setMapperClass(Map.class);
    conf.setCombinerClass(Reduce.class);
    conf.setReducerClass(Reduce.class);
    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(TextOutputFormat.class);
    Path inPath = new Path(args[0]);
    Path outPath = new Path(args[0]);
    FileInputFormat.setInputPaths(conf, inPath);
    FileOutputFormat.setOutputPath(conf, outPath);
    JobClient.runJob(conf);
  }
}

input File
==
test my program during test and my hadoop your during get program

hadoop generated output file on HDFS file system
==
during 2
my 2
test 2

hadoop generated output file on local file system
==
during 2
my 2
program 2
test 2

Please help me with this issue. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
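The missing words in the HDFS run are consistent with a classic combiner pitfall: the Reduce class is registered as the combiner, but its "if (sum > 1)" guard then runs per split, dropping any word that appears only once in a given split before the real reducer ever sees it. A word occurring once in each of several splits is therefore lost entirely, while a single local file is processed as one unit and nothing is dropped. (Note also that outPath is built from args[0], the input path - presumably a transcription slip for args[1].) A self-contained sketch of the effect, in plain Java rather than the MapReduce API, with hypothetical split contents:

```java
import java.util.*;

public class CombinerFilterDemo {
    // Count words in one split; 'filter' mimics the report's "if (sum > 1)" guard
    // running inside the combiner.
    static Map<String, Integer> combine(List<String> words, boolean filter) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) counts.merge(w, 1, Integer::sum);
        if (filter) counts.values().removeIf(c -> c <= 1);  // drops per-split singletons
        return counts;
    }

    // Final reduce: merge per-split counts, then apply the > 1 filter exactly once.
    static Map<String, Integer> reduce(List<Map<String, Integer>> perSplit) {
        Map<String, Integer> total = new HashMap<>();
        for (Map<String, Integer> m : perSplit) {
            m.forEach((w, c) -> total.merge(w, c, Integer::sum));
        }
        total.values().removeIf(c -> c <= 1);
        return total;
    }

    public static void main(String[] args) {
        List<String> splitA = List.of("test", "my", "program", "during", "test");
        List<String> splitB = List.of("my", "program", "during");

        // Buggy: filtering inside the combiner loses "program" (once per split).
        Map<String, Integer> buggy =
            reduce(List.of(combine(splitA, true), combine(splitB, true)));
        // Correct: the combiner only sums; the filter runs once, in the reducer.
        Map<String, Integer> ok =
            reduce(List.of(combine(splitA, false), combine(splitB, false)));

        System.out.println(buggy.containsKey("program"));  // false
        System.out.println(ok.get("program"));             // 2
    }
}
```

The general rule: a combiner may run zero, one, or many times, so it must only pre-aggregate; any filtering or thresholding belongs solely in the reducer.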
[jira] [Created] (MAPREDUCE-5409) MRAppMaster throws InvalidStateTransitonException: Invalid event: TA_TOO_MANY_FETCH_FAILURE at KILLED for TaskAttemptImpl
Devaraj K created MAPREDUCE-5409: Summary: MRAppMaster throws InvalidStateTransitonException: Invalid event: TA_TOO_MANY_FETCH_FAILURE at KILLED for TaskAttemptImpl Key: MAPREDUCE-5409 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5409 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.0.5-alpha Reporter: Devaraj K Assignee: Devaraj K
{code:xml}
2013-07-23 12:28:05,217 INFO [IPC Server handler 29 on 50796] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1374560536158_0003_m_40_0 is : 0.0
2013-07-23 12:28:05,221 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Too many fetch-failures for output of task attempt: attempt_1374560536158_0003_m_07_0 ... raising fetch failure to map
2013-07-23 12:28:05,222 ERROR [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle this event at current state for attempt_1374560536158_0003_m_07_0
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: TA_TOO_MANY_FETCH_FAILURE at KILLED
	at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
	at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
	at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:445)
	at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1032)
	at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:143)
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1123)
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1115)
	at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:130)
	at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:77)
	at java.lang.Thread.run(Thread.java:662)
2013-07-23 12:28:05,249 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1374560536158_0003Job Transitioned from RUNNING to ERROR
2013-07-23 12:28:05,338 INFO [IPC Server handler 16 on 50796] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1374560536158_0003_m_40_0
{code}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
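One common fix pattern for this class of bug - hedged, as it is not necessarily the patch that resolved this issue - is to register an explicit no-op (self) transition so that a TA_TOO_MANY_FETCH_FAILURE arriving after the attempt is already KILLED is ignored rather than rejected. A minimal self-contained sketch of the idea, with hypothetical enum and class names standing in for the YARN state-machine classes:

```java
import java.util.*;

public class LateEventDemo {
    enum State { RUNNING, SUCCEEDED, KILLED }
    enum Event { TA_DONE, TA_KILL, TA_TOO_MANY_FETCH_FAILURE }

    private final Map<State, Map<Event, State>> table = new EnumMap<>(State.class);
    private State current = State.RUNNING;

    LateEventDemo() {
        table.put(State.RUNNING, new EnumMap<>(Map.of(
            Event.TA_DONE, State.SUCCEEDED,
            Event.TA_KILL, State.KILLED)));
        // The fix: KILLED explicitly maps the late fetch-failure event to a
        // self-transition, instead of having no entry and throwing.
        table.put(State.KILLED, new EnumMap<>(Map.of(
            Event.TA_TOO_MANY_FETCH_FAILURE, State.KILLED)));
        table.put(State.SUCCEEDED, new EnumMap<>(Event.class));
    }

    State handle(Event e) {
        State next = table.getOrDefault(current, Map.of()).get(e);
        if (next == null) {
            // Corresponds to the InvalidStateTransitonException in the log.
            throw new IllegalStateException("Invalid event: " + e + " at " + current);
        }
        return current = next;
    }

    public static void main(String[] args) {
        LateEventDemo sm = new LateEventDemo();
        sm.handle(Event.TA_KILL);  // RUNNING -> KILLED
        // A late fetch-failure event is now absorbed instead of crashing the AM.
        System.out.println(sm.handle(Event.TA_TOO_MANY_FETCH_FAILURE));  // KILLED
    }
}
```

In an asynchronous dispatcher, events can always race a terminal transition, so terminal states generally need explicit "ignore" entries for events that may arrive late.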
[jira] [Commented] (MAPREDUCE-5408) CLONE - The logging level of the tasks should be configurable by the job
[ https://issues.apache.org/jira/browse/MAPREDUCE-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13716202#comment-13716202 ] Hadoop QA commented on MAPREDUCE-5408: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12593650/MAPREDUCE-336_branch1.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3877//console This message is automatically generated. > CLONE - The logging level of the tasks should be configurable by the job > > > Key: MAPREDUCE-5408 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5408 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Owen O'Malley >Assignee: Arun C Murthy > Fix For: 1.3.0 > > Attachments: MAPREDUCE-336_branch1.patch > > > It would be nice to be able to configure the logging level of the Task JVM's > separately from the server JVM's. Reducing logging substantially increases > performance and reduces the consumption of local disk on the task trackers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5408) CLONE - The logging level of the tasks should be configurable by the job
[ https://issues.apache.org/jira/browse/MAPREDUCE-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated MAPREDUCE-5408: - Status: Patch Available (was: Open) > CLONE - The logging level of the tasks should be configurable by the job > > > Key: MAPREDUCE-5408 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5408 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Owen O'Malley >Assignee: Arun C Murthy > Fix For: 1.3.0 > > Attachments: MAPREDUCE-336_branch1.patch > > > It would be nice to be able to configure the logging level of the Task JVM's > separately from the server JVM's. Reducing logging substantially increases > performance and reduces the consumption of local disk on the task trackers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira