[jira] [Created] (MAPREDUCE-4717) Mapreduce job fails to run after configuring multiple namespaces [HDFS Federation]

2012-10-09 Thread Sagar Shimpi (JIRA)
Sagar Shimpi created MAPREDUCE-4717:
---

 Summary: Mapreduce job fails to run after configuring multiple 
namespaces [HDFS Federation]
 Key: MAPREDUCE-4717
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4717
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 0.20.2
 Environment: 2 standalone desktops with 360GB HDD and 4GB RAM - acting 
as two NameNodes
2 virtual machines with 60GB HDD and 1GB RAM - acting as JobTracker and 
ZooKeeper
Reporter: Sagar Shimpi


I have a setup of 4 nodes with the following details -

Standalone Desktop-1 -> 
NameNode1, TaskTracker, ZooKeeper, JobTracker, DataNode, HMaster

Standalone Desktop-2 -> NameNode2, TaskTracker, DataNode, RegionServer

Virtual Machine-1 -> NameNode3, DataNode, TaskTracker

Virtual Machine-2 -> NameNode4, DataNode, TaskTracker


I have configured HDFS Federation with the following name services -
a) nameservice1
b) oss-hadoop-nameservice

While executing a MapReduce job, I get the following error -


-bash-4.1$ id
uid=496(hdfs) gid=496(hdfs) groups=496(hdfs),497(hadoop)
-bash-4.1$ hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar 
wordcount /hbase/install.log.syslog /hbase/testing
12/10/10 12:30:21 ERROR security.UserGroupInformation: 
PriviledgedActionException as:hdfs (auth:SIMPLE) cause:java.io.IOException: 
viewfs://cluster6/
java.io.IOException: viewfs://cluster6/
at org.apache.hadoop.fs.viewfs.InodeTree.<init>(InodeTree.java:338)
at 
org.apache.hadoop.fs.viewfs.ViewFileSystem$1.<init>(ViewFileSystem.java:178)
at 
org.apache.hadoop.fs.viewfs.ViewFileSystem.initialize(ViewFileSystem.java:178)
at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2150)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:80)
at 
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2184)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2166)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:302)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:194)
at 
org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:103)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:844)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at 
org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:844)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:481)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:511)
at org.apache.hadoop.examples.WordCount.main(WordCount.java:67)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
-bash-4.1$
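
The IOException raised from the InodeTree constructor usually means that the 
client-side ViewFs mount table named "cluster6" has no link entries, so the 
viewfs:// default filesystem cannot be initialized. A minimal sketch of 
populating such a mount table, assuming the mount-table name cluster6 and the 
two nameservices listed above; the mount points and link targets are 
illustrative only, not the reporter's actual layout:

{code}
// Illustrative sketch: ViewFs resolves viewfs://cluster6/ against a client-side
// mount table, and an empty table typically produces the IOException shown above.
// The link targets below are hypothetical.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ViewFsMountTableSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "viewfs://cluster6/");
    // One link per top-level directory, each pointing at a federated nameservice.
    conf.set("fs.viewfs.mounttable.cluster6.link./hbase",
        "hdfs://nameservice1/hbase");
    conf.set("fs.viewfs.mounttable.cluster6.link./user",
        "hdfs://oss-hadoop-nameservice/user");
    FileSystem fs = FileSystem.get(conf);   // now initializes ViewFileSystem cleanly
    System.out.println(fs.exists(new Path("/hbase/install.log.syslog")));
  }
}
{code}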


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4671) AM does not tell the RM about container requests that are no longer needed

2012-10-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473035#comment-13473035
 ] 

Hadoop QA commented on MAPREDUCE-4671:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12548529/MAPREDUCE-4671.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2923//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2923//console

This message is automatically generated.

> AM does not tell the RM about container requests that are no longer needed
> --
>
> Key: MAPREDUCE-4671
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4671
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.23.3, 2.0.0-alpha
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Attachments: MAPREDUCE-4671.1.patch, MAPREDUCE-4671.2.patch, 
> MAPREDUCE-4671.3.patch
>
>
> Say the AM wanted a container at hosts h1, h2, h3. After getting a container 
> at h1 it should tell RM that it no longer needs containers at h2, h3. 
> Otherwise on the RM h2, h3 remain valid allocation locations.
> The AM RMContainerAllocator does remove these resource requests internally. 
> When the resource request container count drops to 0 then it drops the 
> resource request from its tables but forgets to send the 0 sized request to 
> the RM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-3655) Exception from launching allocated container

2012-10-09 Thread Li Ming (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Ming updated MAPREDUCE-3655:
---

Affects Version/s: 2.0.1-alpha

> Exception from launching allocated container
> 
>
> Key: MAPREDUCE-3655
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3655
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster
>Affects Versions: 0.23.0, 2.0.1-alpha
>Reporter: Bing Jiang
>
> I use Hadoop-Yarn to deploy my real-time distributed computation system, and 
> I got a reply from mapreduce-u...@hadoop.apache.org pointing me to the guides 
> below:
>  
> http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/YARN.html
>  
> http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
> When I followed the steps to construct my Client and ApplicationMaster, an 
> issue occurred where the NM failed to launch a Container because of 
> java.io.FileNotFoundException.
> The relevant part of the NM log is attached below:
>  
> 2011-12-29 15:49:16,250 INFO org.apache.hadoop.yarn.server.
> nodemanager.containermanager.application.Application: Adding 
> container_1325062142731_0006_01_01 to application 
> application_1325062142731_0006
> 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Dispatching the event 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.event.ApplicationLocalizationEvent.EventType:
>  INIT_APPLICATION_RESOURCES
> 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Dispatching the event 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationInitedEvent.EventType:
>  APPLICATION_INITED
> 2011-12-29 15:49:16,250 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
>  Processing application_1325062142731_0006 of type APPLICATION_INITED
> 2011-12-29 15:49:16,250 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
>  Application application_1325062142731_0006 transitioned from INITING to 
> RUNNING
> 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Dispatching the event 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.event.LogHandlerAppStartedEvent.EventType:
>  APPLICATION_STARTED
> 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Dispatching the event 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerInitEvent.EventType:
>  INIT_CONTAINER
> 2011-12-29 15:49:16,250 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>  Processing container_1325062142731_0006_01_01 of type INIT_CONTAINER
> 2011-12-29 15:49:16,250 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>  Container container_1325062142731_0006_01_01 transitioned from NEW to 
> LOCALIZED
> 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Dispatching the event 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncherEvent.EventType:
>  LAUNCH_CONTAINER
> 2011-12-29 15:49:16,287 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Dispatching the event 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerEvent.EventType:
>  CONTAINER_LAUNCHED
> 2011-12-29 15:49:16,287 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>  Processing container_1325062142731_0006_01_01 of type CONTAINER_LAUNCHED
> 2011-12-29 15:49:16,287 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>  Container container_1325062142731_0006_01_01 transitioned from LOCALIZED 
> to RUNNING
> 2011-12-29 15:49:16,288 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Dispatching the event 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerStartMonitoringEvent.EventType:
>  START_MONITORING_CONTAINER
> 2011-12-29 15:49:16,289 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch:
>  Failed to launch container
> java.io.FileNotFoundException: File 
> /tmp/nm-local-dir/usercache/jiangbing/appcache/application_1325062142731_0006 
> does not exist
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:431)
> at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:815)
> at 
> org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:143)
> at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
> at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:700)
> at 

[jira] [Commented] (MAPREDUCE-3655) Exception from launching allocated container

2012-10-09 Thread Li Ming (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473030#comment-13473030
 ] 

Li Ming commented on MAPREDUCE-3655:


This also happens on 2.0.1-alpha; it seems related to resource 
localization. In the DistributedShell example, the ContainerLaunchContext of the 
AM has LocalResources (the AppMaster.jar), but the other task containers do 
not. Only a container with local resources will create a directory like 
/tmp/nm-local-dir/usercache/jiangbing/appcache/application_1325062142731_0006, 
so the non-AM containers fail when they try to use these directories.
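
A minimal sketch, modeled loosely on the DistributedShell client, of attaching a 
LocalResource to a task container's ContainerLaunchContext so that localization 
runs for that container and creates the per-application appcache directory; the 
jar path and resource key are hypothetical:

{code}
// Hypothetical sketch: give a non-AM container at least one LocalResource so the
// NodeManager performs localization (which creates the appcache directory) for it.
import java.util.Collections;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.LocalResource;
import org.apache.hadoop.yarn.api.records.LocalResourceType;
import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
import org.apache.hadoop.yarn.util.ConverterUtils;
import org.apache.hadoop.yarn.util.Records;

public class TaskContainerContextSketch {
  public static ContainerLaunchContext newContext(FileSystem fs, Path taskJar)
      throws Exception {
    FileStatus status = fs.getFileStatus(taskJar);

    LocalResource jar = Records.newRecord(LocalResource.class);
    jar.setResource(ConverterUtils.getYarnUrlFromPath(taskJar)); // where the NM fetches from
    jar.setSize(status.getLen());
    jar.setTimestamp(status.getModificationTime());
    jar.setType(LocalResourceType.FILE);
    jar.setVisibility(LocalResourceVisibility.APPLICATION);

    ContainerLaunchContext ctx = Records.newRecord(ContainerLaunchContext.class);
    ctx.setLocalResources(Collections.singletonMap("task.jar", jar));
    return ctx;
  }
}
{code}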

> Exception from launching allocated container
> 
>
> Key: MAPREDUCE-3655
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3655
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster
>Affects Versions: 0.23.0
>Reporter: Bing Jiang
>
> I use Hadoop-Yarn to deploy my real-time distributed computation system, and 
> I got a reply from mapreduce-u...@hadoop.apache.org pointing me to the guides 
> below:
>  
> http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/YARN.html
>  
> http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
> When I followed the steps to construct my Client and ApplicationMaster, an 
> issue occurred where the NM failed to launch a Container because of 
> java.io.FileNotFoundException.
> The relevant part of the NM log is attached below:
>  
> 2011-12-29 15:49:16,250 INFO org.apache.hadoop.yarn.server.
> nodemanager.containermanager.application.Application: Adding 
> container_1325062142731_0006_01_01 to application 
> application_1325062142731_0006
> 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Dispatching the event 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.event.ApplicationLocalizationEvent.EventType:
>  INIT_APPLICATION_RESOURCES
> 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Dispatching the event 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationInitedEvent.EventType:
>  APPLICATION_INITED
> 2011-12-29 15:49:16,250 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
>  Processing application_1325062142731_0006 of type APPLICATION_INITED
> 2011-12-29 15:49:16,250 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
>  Application application_1325062142731_0006 transitioned from INITING to 
> RUNNING
> 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Dispatching the event 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.event.LogHandlerAppStartedEvent.EventType:
>  APPLICATION_STARTED
> 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Dispatching the event 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerInitEvent.EventType:
>  INIT_CONTAINER
> 2011-12-29 15:49:16,250 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>  Processing container_1325062142731_0006_01_01 of type INIT_CONTAINER
> 2011-12-29 15:49:16,250 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>  Container container_1325062142731_0006_01_01 transitioned from NEW to 
> LOCALIZED
> 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Dispatching the event 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncherEvent.EventType:
>  LAUNCH_CONTAINER
> 2011-12-29 15:49:16,287 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Dispatching the event 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerEvent.EventType:
>  CONTAINER_LAUNCHED
> 2011-12-29 15:49:16,287 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>  Processing container_1325062142731_0006_01_01 of type CONTAINER_LAUNCHED
> 2011-12-29 15:49:16,287 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>  Container container_1325062142731_0006_01_01 transitioned from LOCALIZED 
> to RUNNING
> 2011-12-29 15:49:16,288 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Dispatching the event 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerStartMonitoringEvent.EventType:
>  START_MONITORING_CONTAINER
> 2011-12-29 15:49:16,289 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch:
>  Failed to launch container
> java.io.FileNotFoundException: File 
> /tmp/nm-local-dir/u

[jira] [Commented] (MAPREDUCE-4671) AM does not tell the RM about container requests that are no longer needed

2012-10-09 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473028#comment-13473028
 ] 

Bikas Saha commented on MAPREDUCE-4671:
---

New patch.
Checks for the resource request count falling below zero.
Changes the ask list to use a custom comparator to avoid duplication of 
resource requests.
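
A hypothetical sketch of the de-duplication idea: keep the outstanding asks in a 
set ordered by a comparator over (priority, location, capability), so that 
re-adding an updated request for the same key replaces the old entry instead of 
duplicating it. The Ask class and its fields are illustrative stand-ins, not the 
actual RMContainerRequestor types:

{code}
// Hypothetical sketch of de-duplicating asks with a custom comparator.
import java.util.Comparator;
import java.util.TreeSet;

public class AskListSketch {
  // Simplified stand-in for a resource request; not the real YARN record.
  static class Ask {
    final int priority;
    final String location;   // host, rack, or "*"
    final int memoryMb;
    int numContainers;
    Ask(int priority, String location, int memoryMb, int numContainers) {
      this.priority = priority; this.location = location;
      this.memoryMb = memoryMb; this.numContainers = numContainers;
    }
  }

  // Two asks with the same (priority, location, capability) compare as equal,
  // so the set keeps at most one entry per key.
  static final Comparator<Ask> KEY_ORDER = (a, b) -> {
    int c = Integer.compare(a.priority, b.priority);
    if (c == 0) c = a.location.compareTo(b.location);
    if (c == 0) c = Integer.compare(a.memoryMb, b.memoryMb);
    return c;
  };

  private final TreeSet<Ask> asks = new TreeSet<>(KEY_ORDER);

  /** Records the latest count for a key; an explicit 0-sized ask is kept so it gets sent. */
  public void addOrReplace(Ask ask) {
    asks.remove(ask);   // drop any stale entry with the same key
    asks.add(ask);
  }
}
{code}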

> AM does not tell the RM about container requests that are no longer needed
> --
>
> Key: MAPREDUCE-4671
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4671
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.23.3, 2.0.0-alpha
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Attachments: MAPREDUCE-4671.1.patch, MAPREDUCE-4671.2.patch, 
> MAPREDUCE-4671.3.patch
>
>
> Say the AM wanted a container at hosts h1, h2, h3. After getting a container 
> at h1 it should tell RM that it no longer needs containers at h2, h3. 
> Otherwise on the RM h2, h3 remain valid allocation locations.
> The AM RMContainerAllocator does remove these resource requests internally. 
> When the resource request container count drops to 0 then it drops the 
> resource request from its tables but forgets to send the 0 sized request to 
> the RM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4671) AM does not tell the RM about container requests that are no longer needed

2012-10-09 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated MAPREDUCE-4671:
--

Status: Patch Available  (was: Open)

> AM does not tell the RM about container requests that are no longer needed
> --
>
> Key: MAPREDUCE-4671
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4671
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha, 0.23.3
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Attachments: MAPREDUCE-4671.1.patch, MAPREDUCE-4671.2.patch, 
> MAPREDUCE-4671.3.patch
>
>
> Say the AM wanted a container at hosts h1, h2, h3. After getting a container 
> at h1 it should tell RM that it no longer needs containers at h2, h3. 
> Otherwise on the RM h2, h3 remain valid allocation locations.
> The AM RMContainerAllocator does remove these resource requests internally. 
> When the resource request container count drops to 0 then it drops the 
> resource request from its tables but forgets to send the 0 sized request to 
> the RM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4671) AM does not tell the RM about container requests that are no longer needed

2012-10-09 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated MAPREDUCE-4671:
--

Attachment: MAPREDUCE-4671.3.patch

> AM does not tell the RM about container requests that are no longer needed
> --
>
> Key: MAPREDUCE-4671
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4671
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.23.3, 2.0.0-alpha
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Attachments: MAPREDUCE-4671.1.patch, MAPREDUCE-4671.2.patch, 
> MAPREDUCE-4671.3.patch
>
>
> Say the AM wanted a container at hosts h1, h2, h3. After getting a container 
> at h1 it should tell RM that it no longer needs containers at h2, h3. 
> Otherwise on the RM h2, h3 remain valid allocation locations.
> The AM RMContainerAllocator does remove these resource requests internally. 
> When the resource request container count drops to 0 then it drops the 
> resource request from its tables but forgets to send the 0 sized request to 
> the RM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4568) Throw "early" exception when duplicate files or archives are found in distributed cache

2012-10-09 Thread Mohammad Kamrul Islam (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472981#comment-13472981
 ] 

Mohammad Kamrul Islam commented on MAPREDUCE-4568:
--

In addition, it would be better if there were a way to check whether a file has 
already been added to the Distributed Cache.
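
In the meantime, a client-side duplicate check is possible with the existing Job 
API; a minimal sketch, assuming org.apache.hadoop.mapreduce.Job (the helper name 
is hypothetical):

{code}
// Sketch: skip adding a cache file that the job already references, so a later
// duplicate-detecting submission step has nothing to reject.
import java.io.IOException;
import java.net.URI;
import java.util.Arrays;
import org.apache.hadoop.mapreduce.Job;

public final class CacheFileUtil {
  private CacheFileUtil() {}

  /** Adds uri to the distributed cache unless an identical entry is already present. */
  public static void addCacheFileIfAbsent(Job job, URI uri) throws IOException {
    URI[] existing = job.getCacheFiles();   // may be null if nothing was added yet
    boolean present = existing != null && Arrays.asList(existing).contains(uri);
    if (!present) {
      job.addCacheFile(uri);
    }
  }
}
{code}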


> Throw "early" exception when duplicate files or archives are found in 
> distributed cache
> ---
>
> Key: MAPREDUCE-4568
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4568
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Mohammad Kamrul Islam
>Assignee: Arun C Murthy
>
> According to MAPREDUCE-4549, Hadoop 2.x throws an exception if duplicates are found 
> in cacheFiles or cacheArchives. The exception is thrown during job submission.
> This JIRA is to throw the exception ==early==, when the entry is first added to the 
> Distributed Cache through addCacheFile or addFileToClassPath.
> It will help the client decide whether to fail fast or continue w/o the 
> duplicated entries.
> Alternatively, Hadoop could provide a knob where the user chooses whether to 
> throw an error (new behavior) or silently ignore it (old behavior).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-1806) CombineFileInputFormat does not work with paths not on default FS

2012-10-09 Thread Gera Shegalov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472965#comment-13472965
 ] 

Gera Shegalov commented on MAPREDUCE-1806:
--

Hi Ivan, thanks for looking at the patch. I was not working on it in the 
meantime. Backport should be straightforward and I'll check out the failing 
test.

> CombineFileInputFormat does not work with paths not on default FS
> -
>
> Key: MAPREDUCE-1806
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1806
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: harchive
>Affects Versions: 0.22.0, 0.23.1
>Reporter: Paul Yang
> Attachments: MAPREDUCE-1806.patch, MAPREDUCE-1806.rev2.patch, 
> MAPREDUCE-1806.rev3.patch
>
>
> In generating the splits in CombineFileInputFormat, the scheme and authority 
> are stripped out. This creates problems when trying to access the files while 
> generating the splits, as without the har:/, the file won't be accessed 
> through the HarFileSystem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4436) AppRejectedTransition does not unregister app from master service and scheduler

2012-10-09 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472872#comment-13472872
 ] 

Siddharth Seth commented on MAPREDUCE-4436:
---

+1. Thanks Bikas.

> AppRejectedTransition does not unregister app from master service and 
> scheduler
> ---
>
> Key: MAPREDUCE-4436
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4436
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.23.1, 2.0.0-alpha, 3.0.0
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Attachments: MAPREDUCE-4436.1.patch
>
>
> AttemptStartedTransition() adds the app to the ApplicationMasterService and 
> scheduler. When the scheduler rejects the app, AppRejectedTransition() 
> forgets to unregister it from the ApplicationMasterService.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4671) AM does not tell the RM about container requests that are no longer needed

2012-10-09 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472865#comment-13472865
 ] 

Siddharth Seth commented on MAPREDUCE-4671:
---

IIRC, blacklisting the node changes the request directly in the 
RMContainerRequestor. When the RMContainerAllocator removes a previous ask 
(assigned task, etc.), it includes the blacklisted host in this request.
Asserting that the value never goes below 0 would require changing this 
behaviour; otherwise a simple check-and-ignore is sufficient.
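
For concreteness, the check-and-ignore option amounts to clamping the count at 
zero instead of asserting; a hypothetical fragment (the names are illustrative, 
not the real RMContainerRequestor code):

{code}
// Hypothetical clamp-at-zero helper, as opposed to asserting the count never goes negative.
public final class AskMath {
  private AskMath() {}

  /** Returns the new container count, ignoring decrements that would drop below 0. */
  public static int decrement(int currentCount, int toRelease) {
    // A blacklist-driven adjustment may already have lowered the count;
    // ignore the extra decrement instead of failing an assert.
    return Math.max(0, currentCount - toRelease);
  }
}
{code}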

> AM does not tell the RM about container requests that are no longer needed
> --
>
> Key: MAPREDUCE-4671
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4671
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.23.3, 2.0.0-alpha
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Attachments: MAPREDUCE-4671.1.patch, MAPREDUCE-4671.2.patch
>
>
> Say the AM wanted a container at hosts h1, h2, h3. After getting a container 
> at h1 it should tell RM that it no longer needs containers at h2, h3. 
> Otherwise on the RM h2, h3 remain valid allocation locations.
> The AM RMContainerAllocator does remove these resource requests internally. 
> When the resource request container count drops to 0 then it drops the 
> resource request from its tables but forgets to send the 0 sized request to 
> the RM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4716) TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails with jdk7

2012-10-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472850#comment-13472850
 ] 

Hadoop QA commented on MAPREDUCE-4716:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12548428/MAPREDUCE-4716.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2922//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2922//console

This message is automatically generated.

> TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails with jdk7
> 
>
> Key: MAPREDUCE-4716
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4716
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Affects Versions: 0.23.3, 3.0.0, 2.0.2-alpha
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>  Labels: java7
> Attachments: MAPREDUCE-4716.patch
>
>
> Using jdk7 TestHsWebServicesJobsQuery.testJobsQueryStateInvalid  fails.
> It looks like the string changed from "const class" to "constant" in jdk7.
> Tests run: 25, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 9.713 sec 
> <<< FAILURE!
> testJobsQueryStateInvalid(org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobsQuery)
>   Time elapsed: 0.371 sec  <<< FAILURE!
> java.lang.AssertionError: exception message doesn't match, got: No enum 
> constant org.apache.hadoop.mapreduce.v2.api.records.JobState.InvalidState 
> expected: No enum const class 
> org.apache.hadoop.mapreduce.v2.api.records.JobState.InvalidState
> at org.junit.Assert.fail(Assert.java:91)at 
> org.junit.Assert.assertTrue(Assert.java:43)
> at 
> org.apache.hadoop.yarn.webapp.WebServicesTestUtils.checkStringMatch(WebServicesTestUtils.java:77)
> at 
> org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobsQuery.testJobsQueryStateInvalid(TestHsWebServicesJobsQuery.java:286)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4716) TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails with jdk7

2012-10-09 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-4716:
---

Hadoop Flags: Reviewed
  Status: Patch Available  (was: Open)

YARN-30 is in. Kicking Jenkins.

> TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails with jdk7
> 
>
> Key: MAPREDUCE-4716
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4716
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Affects Versions: 0.23.3, 3.0.0, 2.0.2-alpha
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>  Labels: java7
> Attachments: MAPREDUCE-4716.patch
>
>
> Using jdk7 TestHsWebServicesJobsQuery.testJobsQueryStateInvalid  fails.
> It looks like the string changed from "const class" to "constant" in jdk7.
> Tests run: 25, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 9.713 sec 
> <<< FAILURE!
> testJobsQueryStateInvalid(org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobsQuery)
>   Time elapsed: 0.371 sec  <<< FAILURE!
> java.lang.AssertionError: exception message doesn't match, got: No enum 
> constant org.apache.hadoop.mapreduce.v2.api.records.JobState.InvalidState 
> expected: No enum const class 
> org.apache.hadoop.mapreduce.v2.api.records.JobState.InvalidState
> at org.junit.Assert.fail(Assert.java:91)at 
> org.junit.Assert.assertTrue(Assert.java:43)
> at 
> org.apache.hadoop.yarn.webapp.WebServicesTestUtils.checkStringMatch(WebServicesTestUtils.java:77)
> at 
> org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobsQuery.testJobsQueryStateInvalid(TestHsWebServicesJobsQuery.java:286)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4661) Add HTTPS for JobTracker and TaskTracker

2012-10-09 Thread Plamen Jeliazkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Plamen Jeliazkov updated MAPREDUCE-4661:


Attachment: MAPREDUCE-4661.patch

This latest patch removes a lot of the unrelated code. It is focused on just the 
HTTPS support for the web UIs. I can confirm it compiles on top of HDP 1 currently. 
I will create a patch for trunk once I can validate with some testing that this 
patch works.

> Add HTTPS for JobTracker and TaskTracker
> 
>
> Key: MAPREDUCE-4661
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4661
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: webapps
>Affects Versions: 1.0.3
>Reporter: Plamen Jeliazkov
>Assignee: Plamen Jeliazkov
> Fix For: 1.0.4
>
> Attachments: https.patch, MAPREDUCE-4461.patch, MAPREDUCE-4661.patch, 
> MAPREDUCE-4661.patch
>
>
> In order to provide full security around the cluster, the webUI should also 
> be secure if desired to prevent cookie theft and user masquerading. 
> Here is my proposed work. Currently I can only add HTTPS support. I do not 
> know how to switch reliance of the HttpServer from HTTP to HTTPS fully.
> In order to facilitate this change I propose the following configuration 
> additions:
> CONFIG PROPERTY -> DEFAULT VALUE
> mapred.https.enable -> false
> mapred.https.need.client.auth -> false
> mapred.https.server.keystore.resource -> "ssl-server.xml"
> mapred.job.tracker.https.port -> 50035
> mapred.job.tracker.https.address -> ":50035"
> mapred.task.tracker.https.port -> 50065
> mapred.task.tracker.https.address -> ":50065"
> I tested this on my local box after using keytool to generate an SSL 
> certificate. You will need to change ssl-server.xml to point to the .keystore 
> file afterwards. The truststore may not be necessary; you can just point it to 
> the keystore.
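
A sketch of how the configuration keys proposed above might be set; these 
mapred.https.* properties are the proposal in this issue, not keys that ship 
with Hadoop 1.x today, and the values are illustrative:

{code}
// Illustrative only: the mapred.https.* keys are the ones proposed in this issue.
import org.apache.hadoop.conf.Configuration;

public class ProposedHttpsConfigSketch {
  public static Configuration proposedDefaults() {
    Configuration conf = new Configuration();
    conf.setBoolean("mapred.https.enable", true);                  // proposed default: false
    conf.setBoolean("mapred.https.need.client.auth", false);
    conf.set("mapred.https.server.keystore.resource", "ssl-server.xml");
    conf.set("mapred.job.tracker.https.address", "0.0.0.0:50035");
    conf.set("mapred.task.tracker.https.address", "0.0.0.0:50065");
    return conf;
  }
}
{code}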

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4278) cannot run two local jobs in parallel from the same gateway.

2012-10-09 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472773#comment-13472773
 ] 

Sandy Ryza commented on MAPREDUCE-4278:
---

I was able to reproduce the issue using Pig and verify that the patch fixed it.

> cannot run two local jobs in parallel from the same gateway.
> 
>
> Key: MAPREDUCE-4278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.20.205.0
>Reporter: Araceli Henley
> Attachments: MAPREDUCE-4278-branch1.patch
>
>
> I cannot run two local-mode jobs from Pig in parallel from the same gateway, 
> which is a typical use case. If I re-run the tests sequentially, then the tests 
> pass. This seems to be a problem in Hadoop.
> Additionally, the Pig harness expects to be able to run 
> Pig-version-undertest against Pig-version-stable from the same gateway.
> To replicate the error:
> I have two clusters running from the same gateway.
> If I run the Pig regression suite nightly.conf in local mode in parallel - 
> once on each cluster - conflicts in M/R local mode result in failures in the 
> tests. 
> ERROR1:
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
> output/file.out in any of the configured local directories
> at
> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:429)
> at
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:160)
> at
> org.apache.hadoop.mapred.MapOutputFile.getOutputFile(MapOutputFile.java:56)
> at org.apache.hadoop.mapred.Task.calculateOutputSize(Task.java:944)
> at org.apache.hadoop.mapred.Task.sendLastUpdate(Task.java:924)
> at org.apache.hadoop.mapred.Task.done(Task.java:875)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:374)
> ---
> ERROR2:
> 2012-05-17 20:25:36,762 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> -
> HadoopJobId: job_local_0001
> 2012-05-17 20:25:36,778 [Thread-3] INFO  org.apache.hadoop.mapred.Task -
> Using ResourceCalculatorPlugin : org.apache.
> hadoop.util.LinuxResourceCalculatorPlugin@ffa490e
> 2012-05-17 20:25:36,837 [Thread-3] WARN
> org.apache.hadoop.mapred.LocalJobRunner - job_local_0001
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
> at java.util.ArrayList.RangeCheck(ArrayList.java:547)
> at java.util.ArrayList.get(ArrayList.java:322)
> at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getLoadFunc(PigInputFormat.java
> :153)
> at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.createRecordReader(PigInputForm
> at.java:106)
> at
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:489)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:731)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> at
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
> 2012-05-17 20:25:41,291 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4659) Confusing output when running "hadoop version" from one hadoop installation when HADOOP_HOME points to another

2012-10-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472732#comment-13472732
 ] 

Hadoop QA commented on MAPREDUCE-4659:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12548443/MAPREDUCE-4659-5.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2921//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2921//console

This message is automatically generated.

> Confusing output when running "hadoop version" from one hadoop installation 
> when HADOOP_HOME points to another
> --
>
> Key: MAPREDUCE-4659
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4659
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.20.2, 2.0.1-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: MAPREDUCE-4659-2.patch, MAPREDUCE-4659-3.patch, 
> MAPREDUCE-4659-4.patch, MAPREDUCE-4659-5.patch, MAPREDUCE-4659.patch
>
>
> Hadoop version X is downloaded to ~/hadoop-x, and Hadoop version Y is 
> downloaded to ~/hadoop-y.  HADOOP_HOME is set to hadoop-x.  A user running 
> hadoop-y/bin/hadoop might expect to be running the hadoop-y jars, but, 
> because of HADOOP_HOME, will actually be running hadoop-x jars.
> "hadoop version" could help clear this up a little by reporting the current 
> HADOOP_HOME.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4716) TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails with jdk7

2012-10-09 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472713#comment-13472713
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-4716:


+1, looks good. Will wake up Jenkins once YARN-30 is in.

> TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails with jdk7
> 
>
> Key: MAPREDUCE-4716
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4716
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Affects Versions: 0.23.3, 3.0.0, 2.0.2-alpha
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>  Labels: java7
> Attachments: MAPREDUCE-4716.patch
>
>
> Using jdk7 TestHsWebServicesJobsQuery.testJobsQueryStateInvalid  fails.
> It looks like the string changed from "const class" to "constant" in jdk7.
> Tests run: 25, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 9.713 sec 
> <<< FAILURE!
> testJobsQueryStateInvalid(org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobsQuery)
>   Time elapsed: 0.371 sec  <<< FAILURE!
> java.lang.AssertionError: exception message doesn't match, got: No enum 
> constant org.apache.hadoop.mapreduce.v2.api.records.JobState.InvalidState 
> expected: No enum const class 
> org.apache.hadoop.mapreduce.v2.api.records.JobState.InvalidState
> at org.junit.Assert.fail(Assert.java:91)at 
> org.junit.Assert.assertTrue(Assert.java:43)
> at 
> org.apache.hadoop.yarn.webapp.WebServicesTestUtils.checkStringMatch(WebServicesTestUtils.java:77)
> at 
> org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobsQuery.testJobsQueryStateInvalid(TestHsWebServicesJobsQuery.java:286)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4676) Add test for job history cleaner

2012-10-09 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472675#comment-13472675
 ] 

Sandy Ryza commented on MAPREDUCE-4676:
---

I should have two copies of the same method?  It's a pretty large method.  I 
could break the method into smaller parts and have the test call the parts that 
it needs?

> Add test for job history cleaner
> 
>
> Key: MAPREDUCE-4676
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4676
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Affects Versions: 1.0.3, 2.0.1-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: MAPREDUCE-4676.patch, MAPREDUCE-4676-trunk.patch
>
>
> Add a test to TestJobHistory that verifies that the HistoryCleaner cleans up 
> the job history

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4659) Confusing output when running "hadoop version" from one hadoop installation when HADOOP_HOME points to another

2012-10-09 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated MAPREDUCE-4659:
--

Attachment: MAPREDUCE-4659-5.patch

> Confusing output when running "hadoop version" from one hadoop installation 
> when HADOOP_HOME points to another
> --
>
> Key: MAPREDUCE-4659
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4659
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.20.2, 2.0.1-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: MAPREDUCE-4659-2.patch, MAPREDUCE-4659-3.patch, 
> MAPREDUCE-4659-4.patch, MAPREDUCE-4659-5.patch, MAPREDUCE-4659.patch
>
>
> Hadoop version X is downloaded to ~/hadoop-x, and Hadoop version Y is 
> downloaded to ~/hadoop-y.  HADOOP_HOME is set to hadoop-x.  A user running 
> hadoop-y/bin/hadoop might expect to be running the hadoop-y jars, but, 
> because of HADOOP_HOME, will actually be running hadoop-x jars.
> "hadoop version" could help clear this up a little by reporting the current 
> HADOOP_HOME.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-695) MiniMRCluster while shutting down should not wait for currently running jobs to finish

2012-10-09 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472666#comment-13472666
 ] 

Steve Loughran commented on MAPREDUCE-695:
--

Although this doesn't apply to trunk, it does work on branch-1. It is testable, 
though - the patch really needs one to catch regressions.

> MiniMRCluster while shutting down should not wait for currently running jobs 
> to finish
> --
>
> Key: MAPREDUCE-695
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-695
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 1.0.3
>Reporter: Sreekanth Ramakrishnan
>Priority: Minor
> Attachments: mapreduce-695.patch
>
>
> Currently in {{org.apache.hadoop.mapred.MiniMRCluster.shutdown()}} we do a 
> {{waitTaskTrackers()}} which can cause {{MiniMRCluster}} to hang indefinitely 
> when used in conjunction with Controlled jobs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4712) mr-jobhistory-daemon.sh doesn't accept --config

2012-10-09 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-4712:
-

Fix Version/s: (was: 3.0.0)

> mr-jobhistory-daemon.sh doesn't accept --config
> ---
>
> Key: MAPREDUCE-4712
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4712
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Affects Versions: 2.0.2-alpha
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
> Fix For: 2.0.3-alpha
>
> Attachments: MAPREDUCE-4712-20121005.txt
>
>
> It says
> {code}
> $ $HADOOP_MAPRED_HOME/sbin/mr-jobhistory-daemon.sh --config 
> /Users/vinodkv/tmp/conf/ start historyserver
> Usage: mr-jobhistory-daemon.sh [--config <conf-dir>] (start|stop) 
> <mapred-command>
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-3936) Clients should not enforce counter limits

2012-10-09 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472659#comment-13472659
 ] 

Robert Joseph Evans commented on MAPREDUCE-3936:


MAPREDUCE-3061 is only a concept right now.  The JIRA was created over a year 
ago, and the only update since was someone asking for more clarification about 
the requirements to which no one responded.  I don't really want to wait for a 
JIRA that is likely to be far off in the future to fix a very real problem that we 
have right now.

Additionally I don't see splitting the history server into two independent 
parts as being something that will solve this problem.  It could help, and any 
changes we make should ideally have this split in mind, but it will not just 
solve the issue.  The issue is how much data can the history server cache in 
memory vs. leave in HDFS and reconstruct on demand. And what is the granularity 
of that caching.  Right now the caching is happening on a per job basis, which 
is way too large.

We could fix this by not caching at all. Every time a page is loaded, a web 
service call is made, or an RPC call comes in we parse the job history log and 
reconstruct just the data for that request and nothing else.  On some very 
large jobs (50,000+ tasks) I have seen parsing the log take 10 seconds, so this 
would have a negative impact on page load times.  Also, what kind of extra load 
would we be placing on HDFS doing this every time? It really depends on how 
heavily used the history server becomes.

The final solution really has to be some middle ground where we can cache a 
known quantity of data, and then reconstruct everything else on demand as 
needed.  This is a lot of work, and so in the short term I would prefer to see 
something that allows the history server to not crash with an OOM, but will 
still provide most of the needed functionality until something better can be 
written.

I know that the History Server can easily get OOMs when loading large jobs with 
lots of tasks, which is a far bigger concern to me than the counters are right 
now, simply because the AM still tries to enforce the counter limits.
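
As an illustration of the "cache a known quantity of data and reconstruct the 
rest on demand" middle ground, a minimal sketch of a size-bounded LRU cache in 
front of a history-file loader; the loader and value type are hypothetical 
stand-ins, not the history server's actual classes:

{code}
// Hypothetical sketch: bound how many parsed jobs stay in memory; anything evicted
// is re-parsed from the history file in HDFS on the next request.
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

public class BoundedHistoryCacheSketch<V> {
  private final int maxEntries;
  private final Function<String, V> reloadFromHdfs;   // slow path: re-parse the log
  private final Map<String, V> cache;

  public BoundedHistoryCacheSketch(int maxEntries, Function<String, V> reloadFromHdfs) {
    this.maxEntries = maxEntries;
    this.reloadFromHdfs = reloadFromHdfs;
    // Access-ordered map: the least recently used job is evicted first.
    this.cache = new LinkedHashMap<String, V>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
        return size() > BoundedHistoryCacheSketch.this.maxEntries;
      }
    };
  }

  public synchronized V get(String jobId) {
    // Cached jobs are served from memory; evicted ones are re-parsed on demand.
    return cache.computeIfAbsent(jobId, reloadFromHdfs);
  }
}
{code}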

> Clients should not enforce counter limits 
> --
>
> Key: MAPREDUCE-3936
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3936
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv1
>Reporter: Tom White
>Assignee: Tom White
> Attachments: MAPREDUCE-3936.patch, MAPREDUCE-3936.patch
>
>
> The code for enforcing counter limits (from MAPREDUCE-1943) creates a static 
> JobConf instance to load the limits, which may throw an exception if the 
> client limit is set to be lower than the limit on the cluster (perhaps 
> because the cluster limit was raised from the default).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4659) Confusing output when running "hadoop version" from one hadoop installation when HADOOP_HOME points to another

2012-10-09 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472629#comment-13472629
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4659:
---

Looks good. The only nit is that in hadoop-common the standard practice is to 
use the private annotation @InterfaceAudience.Private; would you please 
update the patch accordingly?

> Confusing output when running "hadoop version" from one hadoop installation 
> when HADOOP_HOME points to another
> --
>
> Key: MAPREDUCE-4659
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4659
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.20.2, 2.0.1-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: MAPREDUCE-4659-2.patch, MAPREDUCE-4659-3.patch, 
> MAPREDUCE-4659-4.patch, MAPREDUCE-4659.patch
>
>
> Hadoop version X is downloaded to ~/hadoop-x, and Hadoop version Y is 
> downloaded to ~/hadoop-y.  HADOOP_HOME is set to hadoop-x.  A user running 
> hadoop-y/bin/hadoop might expect to be running the hadoop-y jars, but, 
> because of HADOOP_HOME, will actually be running hadoop-x jars.
> "hadoop version" could help clear this up a little by reporting the current 
> HADOOP_HOME.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4716) TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails with jdk7

2012-10-09 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated MAPREDUCE-4716:
-

Labels: java7  (was: )

> TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails with jdk7
> 
>
> Key: MAPREDUCE-4716
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4716
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Affects Versions: 0.23.3, 3.0.0, 2.0.2-alpha
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>  Labels: java7
> Attachments: MAPREDUCE-4716.patch
>
>
> Using jdk7 TestHsWebServicesJobsQuery.testJobsQueryStateInvalid  fails.
> It looks like the string changed from "const class" to "constant" in jdk7.
> Tests run: 25, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 9.713 sec 
> <<< FAILURE!
> testJobsQueryStateInvalid(org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobsQuery)
>   Time elapsed: 0.371 sec  <<< FAILURE!
> java.lang.AssertionError: exception message doesn't match, got: No enum 
> constant org.apache.hadoop.mapreduce.v2.api.records.JobState.InvalidState 
> expected: No enum const class 
> org.apache.hadoop.mapreduce.v2.api.records.JobState.InvalidState
> at org.junit.Assert.fail(Assert.java:91)at 
> org.junit.Assert.assertTrue(Assert.java:43)
> at 
> org.apache.hadoop.yarn.webapp.WebServicesTestUtils.checkStringMatch(WebServicesTestUtils.java:77)
> at 
> org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobsQuery.testJobsQueryStateInvalid(TestHsWebServicesJobsQuery.java:286)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4716) TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails with jdk7

2012-10-09 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated MAPREDUCE-4716:
-

Attachment: MAPREDUCE-4716.patch

Patch attached, but it requires YARN-30 functionality, so I will wait to mark it 
Patch Available until YARN-30 goes in.

> TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails with jdk7
> 
>
> Key: MAPREDUCE-4716
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4716
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Affects Versions: 0.23.3, 3.0.0, 2.0.2-alpha
>Reporter: Thomas Graves
>Assignee: Thomas Graves
> Attachments: MAPREDUCE-4716.patch
>
>
> Using jdk7 TestHsWebServicesJobsQuery.testJobsQueryStateInvalid  fails.
> It looks like the string changed from "const class" to "constant" in jdk7.
> Tests run: 25, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 9.713 sec 
> <<< FAILURE!
> testJobsQueryStateInvalid(org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobsQuery)
>   Time elapsed: 0.371 sec  <<< FAILURE!
> java.lang.AssertionError: exception message doesn't match, got: No enum 
> constant org.apache.hadoop.mapreduce.v2.api.records.JobState.InvalidState 
> expected: No enum const class 
> org.apache.hadoop.mapreduce.v2.api.records.JobState.InvalidState
> at org.junit.Assert.fail(Assert.java:91)at 
> org.junit.Assert.assertTrue(Assert.java:43)
> at 
> org.apache.hadoop.yarn.webapp.WebServicesTestUtils.checkStringMatch(WebServicesTestUtils.java:77)
> at 
> org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobsQuery.testJobsQueryStateInvalid(TestHsWebServicesJobsQuery.java:286)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (MAPREDUCE-4716) TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails with jdk7

2012-10-09 Thread Thomas Graves (JIRA)
Thomas Graves created MAPREDUCE-4716:


 Summary: TestHsWebServicesJobsQuery.testJobsQueryStateInvalid 
fails with jdk7
 Key: MAPREDUCE-4716
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4716
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 0.23.3, 3.0.0, 2.0.2-alpha
Reporter: Thomas Graves
Assignee: Thomas Graves


Using jdk7, TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails.

It looks like the exception message changed from "const class" to "constant" in jdk7.


Tests run: 25, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 9.713 sec <<< 
FAILURE!
testJobsQueryStateInvalid(org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobsQuery)
  Time elapsed: 0.371 sec  <<< FAILURE!
java.lang.AssertionError: exception message doesn't match, got: No enum 
constant org.apache.hadoop.mapreduce.v2.api.records.JobState.InvalidState 
expected: No enum const class 
org.apache.hadoop.mapreduce.v2.api.records.JobState.InvalidState
at org.junit.Assert.fail(Assert.java:91)
at org.junit.Assert.assertTrue(Assert.java:43)
at 
org.apache.hadoop.yarn.webapp.WebServicesTestUtils.checkStringMatch(WebServicesTestUtils.java:77)
at 
org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobsQuery.testJobsQueryStateInvalid(TestHsWebServicesJobsQuery.java:286)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4278) cannot run two local jobs in parallel from the same gateway.

2012-10-09 Thread Araceli Henley (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472539#comment-13472539
 ] 

Araceli Henley commented on MAPREDUCE-4278:
---

I was able to reproduce the problem easily by kicking off the Pig end-to-end
tests in parallel. For example, if you kick off nightly.conf in parallel from
the same gateway, there are always conflicts. The specific conflict varies from
run to run depending on the timing, but there are always conflicts.





> cannot run two local jobs in parallel from the same gateway.
> 
>
> Key: MAPREDUCE-4278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.20.205.0
>Reporter: Araceli Henley
> Attachments: MAPREDUCE-4278-branch1.patch
>
>
> I cannot run two local-mode jobs from Pig in parallel from the same gateway, 
> which is a typical use case. If I re-run the tests sequentially, then the tests 
> pass. This seems to be a problem in Hadoop.
> Additionally, the Pig harness expects to be able to run 
> Pig-version-undertest against Pig-version-stable from the same gateway.
> To replicate the error:
> I have two clusters running from the same gateway.
> If I run the Pig regression suite nightly.conf in local mode in parallel, 
> once on each cluster, conflicts in M/R local mode result in failures in the 
> tests. 
> ERROR1:
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
> output/file.out in any of the configured local directories
> at
> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:429)
> at
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:160)
> at
> org.apache.hadoop.mapred.MapOutputFile.getOutputFile(MapOutputFile.java:56)
> at org.apache.hadoop.mapred.Task.calculateOutputSize(Task.java:944)
> at org.apache.hadoop.mapred.Task.sendLastUpdate(Task.java:924)
> at org.apache.hadoop.mapred.Task.done(Task.java:875)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:374)
> ---
> ERROR2:
> 2012-05-17 20:25:36,762 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> -
> HadoopJobId: job_local_0001
> 2012-05-17 20:25:36,778 [Thread-3] INFO  org.apache.hadoop.mapred.Task -
> Using ResourceCalculatorPlugin : org.apache.
> hadoop.util.LinuxResourceCalculatorPlugin@ffa490e
> 2012-05-17 20:25:36,837 [Thread-3] WARN
> org.apache.hadoop.mapred.LocalJobRunner - job_local_0001
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
> at java.util.ArrayList.RangeCheck(ArrayList.java:547)
> at java.util.ArrayList.get(ArrayList.java:322)
> at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getLoadFunc(PigInputFormat.java
> :153)
> at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.createRecordReader(PigInputForm
> at.java:106)
> at
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.(MapTask.java:489)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:731)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> at
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
> 2012-05-17 20:25:41,291 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4568) Throw "early" exception when duplicate files or archives are found in distributed cache

2012-10-09 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472526#comment-13472526
 ] 

Robert Joseph Evans commented on MAPREDUCE-4568:


I spoke with Virag about this before he filed the JIRA.  The main goal here is 
to provide a way for Oozie to maintain a bit more of a semblance of backwards 
compatibility even after MAPREDUCE-4549 goes in.  They essentially want to 
de-dupe the entries in the dist cache that would cause an error.  We originally 
decided on having an exception thrown because it would allow other errors/checks 
that may show up in the future to be added in as well.  I don't think there would 
be a problem with adding a new API that throws an exception, provided that API 
was also added to the 1.x line, where it perhaps would not throw anything because 
the same limitations do not exist there.

I realize that adding new APIs is not ideal, especially since we already have 3 
classes with these types of APIs in them, but it is the only way to maintain 
backwards compatibility and evolve the API.
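
To make the fail-early idea concrete, here is a purely hypothetical sketch (the 
class and its behavior are illustrations, not an existing or proposed Hadoop API): 
reject a duplicate URI at the moment it is added rather than at submit time.

{code}
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

import org.apache.hadoop.mapreduce.Job;

public class StrictCacheAdder {
  private final Set<URI> seen = new HashSet<URI>();

  /** Adds a file to the job's distributed cache, failing fast on duplicates. */
  public void addCacheFile(Job job, URI uri) {
    if (!seen.add(uri)) {
      // Fail early: the caller learns about the duplicate at add time,
      // not during job submission.
      throw new IllegalArgumentException("Duplicate distributed cache entry: " + uri);
    }
    job.addCacheFile(uri);
  }
}
{code}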

> Throw "early" exception when duplicate files or archives are found in 
> distributed cache
> ---
>
> Key: MAPREDUCE-4568
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4568
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Mohammad Kamrul Islam
>Assignee: Arun C Murthy
>
> According to MAPREDUCE-4549, Hadoop 2.x throws an exception if duplicates are 
> found in cacheFiles or cacheArchives. The exception is thrown during job submission.
> This JIRA is to throw the exception "early", when the entry is first added to the 
> Distributed Cache through addCacheFile or addFileToClassPath.
> It will help the client decide whether to fail fast or continue without the 
> duplicated entries.
> Alternatively, Hadoop could provide a knob where the user chooses whether to 
> throw an error (new behavior) or silently ignore it (old behavior).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4676) Add test for job history cleaner

2012-10-09 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472519#comment-13472519
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4676:
---

After looking at the code, it would be quite difficult to mock things without a 
big refactoring. Thus, I agree with Tom: I'd suggest creating a method with the 
old signature, using that in the production code, and keeping the method with the 
boolean param for testing, annotated with @VisibleForTesting.
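
A minimal sketch of that pattern, with made-up names (HistoryCleanerExample, 
cleanHistory, force); only the shape of the suggestion is shown here, not the 
actual job history code:

{code}
import com.google.common.annotations.VisibleForTesting;

public class HistoryCleanerExample {

  /** Production callers keep the old, zero-argument signature. */
  public void cleanHistory() {
    cleanHistory(false);
  }

  /** Parameterized variant exposed only so tests can force a cleanup pass. */
  @VisibleForTesting
  void cleanHistory(boolean force) {
    // ... delete history files older than the retention window,
    // skipping the age check when 'force' is true (test hook).
  }
}
{code}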

> Add test for job history cleaner
> 
>
> Key: MAPREDUCE-4676
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4676
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Affects Versions: 1.0.3, 2.0.1-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: MAPREDUCE-4676.patch, MAPREDUCE-4676-trunk.patch
>
>
> Add a test to TestJobHistory that verifies that the HistoryCleaner cleans up 
> the job history

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-695) MiniMRCluster while shutting down should not wait for currently running jobs to finish

2012-10-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472502#comment-13472502
 ] 

Hadoop QA commented on MAPREDUCE-695:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12412383/mapreduce-695.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2920//console

This message is automatically generated.

> MiniMRCluster while shutting down should not wait for currently running jobs 
> to finish
> --
>
> Key: MAPREDUCE-695
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-695
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 1.0.3
>Reporter: Sreekanth Ramakrishnan
>Priority: Minor
> Attachments: mapreduce-695.patch
>
>
> Currently in {{org.apache.hadoop.mapred.MiniMRCluster.shutdown()}} we do a 
> {{waitTaskTrackers()}} which can cause {{MiniMRCluster}} to hang indefinitely 
> when used in conjunction with Controlled jobs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-695) MiniMRCluster while shutting down should not wait for currently running jobs to finish

2012-10-09 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated MAPREDUCE-695:
-

Status: Patch Available  (was: Open)

submitting this as a patch

> MiniMRCluster while shutting down should not wait for currently running jobs 
> to finish
> --
>
> Key: MAPREDUCE-695
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-695
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 1.0.3
>Reporter: Sreekanth Ramakrishnan
>Priority: Minor
> Attachments: mapreduce-695.patch
>
>
> Currently in {{org.apache.hadoop.mapred.MiniMRCluster.shutdown()}} we do a 
> {{waitTaskTrackers()}} which can cause {{MiniMRCluster}} to hang indefinitely 
> when used in conjunction with Controlled jobs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-2454) Allow external sorter plugin for MR

2012-10-09 Thread Sean Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472494#comment-13472494
 ] 

Sean Zhang commented on MAPREDUCE-2454:
---

This is going to enable a new set of possibilities for Hadoop. I hope this can be 
committed soon. 

> Allow external sorter plugin for MR
> ---
>
> Key: MAPREDUCE-2454
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2454
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 2.0.0-alpha, 3.0.0, 2.0.2-alpha
>Reporter: Mariappan Asokan
>Assignee: Mariappan Asokan
>Priority: Minor
>  Labels: features, performance, plugin, sort
> Attachments: HadoopSortPlugin.pdf, HadoopSortPlugin.pdf, 
> KeyValueIterator.java, MapOutputSorterAbstract.java, MapOutputSorter.java, 
> mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, 
> mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, 
> mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, 
> mapreduce-2454.patch, mapreduce-2454.patch, 
> mr-2454-on-mr-279-build82.patch.gz, MR-2454-trunkPatchPreview.gz, 
> ReduceInputSorter.java
>
>
> Define interfaces and some abstract classes in the Hadoop framework to 
> facilitate external sorter plugins both on the Map and Reduce sides.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4711) Append time elapsed since job-start-time for finished tasks

2012-10-09 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472493#comment-13472493
 ] 

Ravi Prakash commented on MAPREDUCE-4711:
-

My main purpose in filing this JIRA was to present ALL the information that was 
available in branch-1 in the web UI of 0.23. After this JIRA, for retired jobs, 
nothing will be missing.

> Append time elapsed since job-start-time for finished tasks
> ---
>
> Key: MAPREDUCE-4711
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4711
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Affects Versions: 0.23.3
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
> Attachments: MAPREDUCE-4711.branch-0.23.patch
>
>
> In 0.20.x/1.x, the analyze job link gave this information
> bq. The last Map task task_ finished at (relative to the Job launch 
> time): 5/10 20:23:10 (1hrs, 27mins, 54sec)
> The time it took for the last task to finish needs to be calculated mentally 
> in 0.23. I believe we should print it next to the finish time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4711) Append time elapsed since job-start-time for finished tasks

2012-10-09 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472490#comment-13472490
 ] 

Ravi Prakash commented on MAPREDUCE-4711:
-

Bobby! I agree with you. The table was already crowded, and I've made it more so. 
I am going to be working on adding more intuitive visual representations of the 
job progress, graphs and all. My vision is that the table will only serve as a 
repository to look up all the raw data about job start and finish times. If you 
want to let this JIRA stay uncommitted until I add in the visual 
representation, I'm fine with that.

Like Jason already pointed out in our discussion, hiding/showing columns will 
result in "column hell" (I believe he can be credited with coining this term 
;-) ), where URLs shared between users might not incorporate the information 
about which columns are shown and which are hidden. We could incorporate 
that into the URL, but I'm guessing there are better ways to present the 
information. We could also avoid printing the date for jobs which finish on the 
same day as they started. Like you suggested, we can address that in a separate 
JIRA.

An "Analyze the job" link in my mind would incorporate something like Vaidya so 
I'm wary of adding it right now.

I'll file the JIRA to fix the start time.

> Append time elapsed since job-start-time for finished tasks
> ---
>
> Key: MAPREDUCE-4711
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4711
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Affects Versions: 0.23.3
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
> Attachments: MAPREDUCE-4711.branch-0.23.patch
>
>
> In 0.20.x/1.x, the analyze job link gave this information
> bq. The last Map task task_ finished at (relative to the Job launch 
> time): 5/10 20:23:10 (1hrs, 27mins, 54sec)
> The time it took for the last task to finish needs to be calculated mentally 
> in 0.23. I believe we should print it next to the finish time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4654) TestDistCp is @ignored

2012-10-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472465#comment-13472465
 ] 

Hudson commented on MAPREDUCE-4654:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #2859 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2859/])
MAPREDUCE-4654. TestDistCp is ignored. Contributed by Sandy Ryza. (Revision 
1396047)

 Result = FAILURE
tomwhite : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1396047
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestDistCp.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestIntegration.java


> TestDistCp is @ignored
> --
>
> Key: MAPREDUCE-4654
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4654
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.0.2-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Sandy Ryza
>Priority: Critical
> Fix For: 2.0.3-alpha
>
> Attachments: MAPREDUCE-4654.patch
>
>
> We should fix TestDistCp so that it actually runs, rather than being ignored.
> {code}
> @Ignore
> public class TestDistCp {
>   private static final Log LOG = LogFactory.getLog(TestDistCp.class);
>   private static List<Path> pathList = new ArrayList<Path>();
>   ...
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4654) TestDistCp is @ignored

2012-10-09 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated MAPREDUCE-4654:
-

   Resolution: Fixed
Fix Version/s: 2.0.3-alpha
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I just committed this. Thanks Sandy!

> TestDistCp is @ignored
> --
>
> Key: MAPREDUCE-4654
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4654
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.0.2-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Sandy Ryza
>Priority: Critical
> Fix For: 2.0.3-alpha
>
> Attachments: MAPREDUCE-4654.patch
>
>
> We should fix TestDistCp so that it actually runs, rather than being ignored.
> {code}
> @Ignore
> public class TestDistCp {
>   private static final Log LOG = LogFactory.getLog(TestDistCp.class);
>   private static List<Path> pathList = new ArrayList<Path>();
>   ...
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4654) TestDistCp is @ignored

2012-10-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472435#comment-13472435
 ] 

Hudson commented on MAPREDUCE-4654:
---

Integrated in Hadoop-Common-trunk-Commit #2836 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2836/])
MAPREDUCE-4654. TestDistCp is ignored. Contributed by Sandy Ryza. (Revision 
1396047)

 Result = SUCCESS
tomwhite : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1396047
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestDistCp.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestIntegration.java


> TestDistCp is @ignored
> --
>
> Key: MAPREDUCE-4654
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4654
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.0.2-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Sandy Ryza
>Priority: Critical
> Attachments: MAPREDUCE-4654.patch
>
>
> We should fix TestDistCp so that it actually runs, rather than being ignored.
> {code}
> @Ignore
> public class TestDistCp {
>   private static final Log LOG = LogFactory.getLog(TestDistCp.class);
>   private static List<Path> pathList = new ArrayList<Path>();
>   ...
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4654) TestDistCp is @ignored

2012-10-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472433#comment-13472433
 ] 

Hudson commented on MAPREDUCE-4654:
---

Integrated in Hadoop-Hdfs-trunk-Commit #2898 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2898/])
MAPREDUCE-4654. TestDistCp is ignored. Contributed by Sandy Ryza. (Revision 
1396047)

 Result = SUCCESS
tomwhite : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1396047
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestDistCp.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestIntegration.java


> TestDistCp is @ignored
> --
>
> Key: MAPREDUCE-4654
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4654
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.0.2-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Sandy Ryza
>Priority: Critical
> Attachments: MAPREDUCE-4654.patch
>
>
> We should fix TestDistCp so that it actually runs, rather than being ignored.
> {code}
> @Ignore
> public class TestDistCp {
>   private static final Log LOG = LogFactory.getLog(TestDistCp.class);
>   private static List<Path> pathList = new ArrayList<Path>();
>   ...
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4715) Heap memory growing when JVM reuse option enabled

2012-10-09 Thread Vladimir Klimontovich (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472432#comment-13472432
 ] 

Vladimir Klimontovich commented on MAPREDUCE-4715:
--

The FileSystem class keeps a static cache of all FileSystem objects; see the 
FileSystem#CACHE static field. The cache key is a pair of the file system URI 
(string) and the Configuration object. To fix this I'd suggest running 
FileSystem.closeAll() (which clears the cache) somewhere in the cleanup section 
of TaskRunner.java
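
A minimal sketch of that suggestion, assuming it is wired into the task-JVM 
teardown path (the helper class and the exact call site are assumptions, not the 
actual TaskRunner change):

{code}
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;

final class TaskJvmCleanup {
  /** Closes and evicts every cached FileSystem instance in this JVM. */
  static void releaseFileSystemCache() {
    try {
      FileSystem.closeAll();
    } catch (IOException e) {
      // Best effort: log and keep shutting the task JVM down.
      System.err.println("Failed to close cached FileSystems: " + e);
    }
  }
}
{code}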

> Heap memory growing when JVM reuse option enabled
> -
>
> Key: MAPREDUCE-4715
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4715
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: job submission
>Affects Versions: 0.20.2
> Environment: Linux, cdh3u5
>Reporter: Boris Ryakhovskiy
>Priority: Minor
> Attachments: VisualVM_screenshot.png
>
>
> When the mapred.job.reuse.jvm.num.tasks option is set to 100 or more tasks, the 
> JVM fails with a "java.lang.OutOfMemoryError: Java heap space" error. The jobs 
> themselves are small and the job heap size is set to -Xmx200m.
> It looks like the cause of the issue is the FileSystem$Cache object, which 
> collects JobConf objects from all jobs executed in the current JVM. The GC 
> root for the FileSystem$Cache is the ApplicationShutdownHooks class (cannot 
> attach a screenshot from visualvm).
> In my case each JobConf object allocates ~500K (50M total in the case of 100 
> jobs).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-3678) The Map tasks logs should have the value of input split it processed

2012-10-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472427#comment-13472427
 ] 

Hudson commented on MAPREDUCE-3678:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #2858 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2858/])
MAPREDUCE-3678. The Map tasks logs should have the value of input split it 
processed. Contributed by Harsh J. (harsh) (Revision 1396032)

 Result = FAILURE
harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1396032
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapTask.java


> The Map tasks logs should have the value of input split it processed
> 
>
> Key: MAPREDUCE-3678
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3678
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv1, mrv2
>Affects Versions: 1.0.0, 2.0.0-alpha
>Reporter: Bejoy KS
>Assignee: Harsh J
> Fix For: 1.2.0, 2.0.3-alpha
>
> Attachments: MAPREDUCE-3678-branch-1.patch, MAPREDUCE-3678.patch
>
>
> It would be easier to debug some corner cases in tasks if we knew what input 
> split was processed by that task. The map/reduce task tracker log should 
> record this. Also, in the jobdetails web UI, the split should be 
> displayed along with the Split Locations. 
> Sample:
> Input Split
> hdfs://myserver:9000/userdata/sampleapp/inputdir/file1.csv -  no>/
> This would be very beneficial for nailing down data quality issues in large 
> data volume processing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4278) cannot run two local jobs in parallel from the same gateway.

2012-10-09 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472419#comment-13472419
 ] 

Tom White commented on MAPREDUCE-4278:
--

You're right - it's not easy to create a unit test where the job IDs collide 
with the current code. Can you run a manual test without the patch that runs 
two jobs and produces a collision, and then test that with the patch there is 
no collision as a sanity check?

> Also, I realized that with my approach the randids could get mixed if two 
> jobs were submitted concurrently using the same LocalJobRunner. Is this a 
> concern?

LocalJobRunner doesn't support running multiple jobs concurrently, so I don't 
think your change makes things worse. We could add some class javadoc to 
clarify what it supports (i.e. use an instance of LJR per job to run multiple 
jobs in a single JVM). 
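
A possible class-level note along the lines suggested here (the wording below is 
an illustration, not committed javadoc):

{code}
/**
 * Runs MapReduce jobs in-process, in the local JVM.
 *
 * <p>A LocalJobRunner instance does not support running multiple jobs
 * concurrently. To run several local jobs in parallel within one JVM,
 * use a separate LocalJobRunner (or local-mode JobClient) per job.
 */
{code}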

> cannot run two local jobs in parallel from the same gateway.
> 
>
> Key: MAPREDUCE-4278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.20.205.0
>Reporter: Araceli Henley
> Attachments: MAPREDUCE-4278-branch1.patch
>
>
> I cannot run two local-mode jobs from Pig in parallel from the same gateway, 
> which is a typical use case. If I re-run the tests sequentially, then the tests 
> pass. This seems to be a problem in Hadoop.
> Additionally, the Pig harness expects to be able to run 
> Pig-version-undertest against Pig-version-stable from the same gateway.
> To replicate the error:
> I have two clusters running from the same gateway.
> If I run the Pig regression suite nightly.conf in local mode in parallel, 
> once on each cluster, conflicts in M/R local mode result in failures in the 
> tests. 
> ERROR1:
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
> output/file.out in any of the configured local directories
> at
> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:429)
> at
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:160)
> at
> org.apache.hadoop.mapred.MapOutputFile.getOutputFile(MapOutputFile.java:56)
> at org.apache.hadoop.mapred.Task.calculateOutputSize(Task.java:944)
> at org.apache.hadoop.mapred.Task.sendLastUpdate(Task.java:924)
> at org.apache.hadoop.mapred.Task.done(Task.java:875)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:374)
> ---
> ERROR2:
> 2012-05-17 20:25:36,762 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> -
> HadoopJobId: job_local_0001
> 2012-05-17 20:25:36,778 [Thread-3] INFO  org.apache.hadoop.mapred.Task -
> Using ResourceCalculatorPlugin : org.apache.
> hadoop.util.LinuxResourceCalculatorPlugin@ffa490e
> 2012-05-17 20:25:36,837 [Thread-3] WARN
> org.apache.hadoop.mapred.LocalJobRunner - job_local_0001
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
> at java.util.ArrayList.RangeCheck(ArrayList.java:547)
> at java.util.ArrayList.get(ArrayList.java:322)
> at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getLoadFunc(PigInputFormat.java
> :153)
> at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.createRecordReader(PigInputForm
> at.java:106)
> at
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.(MapTask.java:489)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:731)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> at
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
> 2012-05-17 20:25:41,291 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4715) Heap memory growing when JVM reuse option enabled

2012-10-09 Thread Boris Ryakhovskiy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boris Ryakhovskiy updated MAPREDUCE-4715:
-

Attachment: VisualVM_screenshot.png

VisualVM screenshot attached

> Heap memory growing when JVM reuse option enabled
> -
>
> Key: MAPREDUCE-4715
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4715
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: job submission
>Affects Versions: 0.20.2
> Environment: Linux, cdh3u5
>Reporter: Boris Ryakhovskiy
>Priority: Minor
> Attachments: VisualVM_screenshot.png
>
>
> When the mapred.job.reuse.jvm.num.tasks option is set to 100 or more tasks, the 
> JVM fails with a "java.lang.OutOfMemoryError: Java heap space" error. The jobs 
> themselves are small and the job heap size is set to -Xmx200m.
> It looks like the cause of the issue is the FileSystem$Cache object, which 
> collects JobConf objects from all jobs executed in the current JVM. The GC 
> root for the FileSystem$Cache is the ApplicationShutdownHooks class (cannot 
> attach a screenshot from visualvm).
> In my case each JobConf object allocates ~500K (50M total in the case of 100 
> jobs).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (MAPREDUCE-4715) Heap memory growing when JVM reuse option enabled

2012-10-09 Thread Boris Ryakhovskiy (JIRA)
Boris Ryakhovskiy created MAPREDUCE-4715:


 Summary: Heap memory growing when JVM reuse option enabled
 Key: MAPREDUCE-4715
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4715
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission
Affects Versions: 0.20.2
 Environment: Linux, cdh3u5
Reporter: Boris Ryakhovskiy
Priority: Minor
 Attachments: VisualVM_screenshot.png

When the mapred.job.reuse.jvm.num.tasks option is set to 100 or more tasks, the 
JVM fails with a "java.lang.OutOfMemoryError: Java heap space" error. The jobs 
themselves are small and the job heap size is set to -Xmx200m.
It looks like the cause of the issue is the FileSystem$Cache object, which 
collects JobConf objects from all jobs executed in the current JVM. The GC root 
for the FileSystem$Cache is the ApplicationShutdownHooks class (cannot attach a 
screenshot from visualvm).
In my case each JobConf object allocates ~500K (50M total in the case of 100 
jobs).
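
For reference, a minimal driver-side sketch of the configuration that triggers 
the reported growth; the property name comes from the report above, and the value 
of 100 is illustrative:

{code}
import org.apache.hadoop.mapred.JobConf;

public class JvmReuseConfig {
  public static void main(String[] args) {
    JobConf conf = new JobConf();
    // Reuse each task JVM for up to 100 tasks (-1 would mean "no limit").
    // Equivalent to conf.setInt("mapred.job.reuse.jvm.num.tasks", 100).
    conf.setNumTasksToExecutePerJvm(100);
    // Per the report, every job run in a reused JVM leaves a ~500K JobConf
    // reachable through FileSystem$Cache until the JVM exits.
  }
}
{code}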

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-10-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472403#comment-13472403
 ] 

Hudson commented on MAPREDUCE-4554:
---

Integrated in Hadoop-Mapreduce-trunk #1221 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1221/])
MAPREDUCE-4554. Job Credentials are not transmitted if security is turned 
off (Benoy Antony via bobby) (Revision 1395769)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1395769
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestStagingCleanup.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/YARNRunner.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/security/CredentialsTestJob.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/security/TestMRCredentials.java


> Job Credentials are not transmitted if security is turned off
> -
>
> Key: MAPREDUCE-4554
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: job submission, security
>Affects Versions: 2.0.0-alpha
>Reporter: Benoy Antony
>Assignee: Benoy Antony
> Fix For: 3.0.0, 2.0.3-alpha, 0.23.5
>
> Attachments: MR_4554_trunk.patch, MR_4554_trunk.patch, 
> MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch, 
> MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch, 
> MR_4554_trunk.patch
>
>
> Credentials (secret keys) can be passed to a job via 
> mapreduce.job.credentials.json or mapreduce.job.credentials.binary.
> These credentials get submitted during job submission and are made available 
> to the task processes.
> In HADOOP 1, these credentials get submitted and routed to task processes 
> even if security was off.
> In HADOOP 2, these credentials are transmitted only when security is 
> turned on.
> This should be changed for two reasons:
> 1) It is not backward compatible.
> 2) Credentials should be passed even if security is turned off.
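
For illustration, a small driver-side sketch of attaching a secret to a job's 
credentials; the alias and value are made up, and the commented-out property is 
the one named in the description above:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

public class JobCredentialsExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Alternatively, point at a pre-built credentials file:
    // conf.set("mapreduce.job.credentials.binary", "/path/to/creds.bin");
    Job job = Job.getInstance(conf, "credentials-demo");
    // Secret keys added here are expected to travel with the job submission
    // and be visible to tasks through their UserGroupInformation credentials.
    job.getCredentials().addSecretKey(new Text("my.secret.alias"),
        "s3cr3t".getBytes("UTF-8"));
    // ... configure mapper/reducer/input/output as usual, then submit.
  }
}
{code}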

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4712) mr-jobhistory-daemon.sh doesn't accept --config

2012-10-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472402#comment-13472402
 ] 

Hudson commented on MAPREDUCE-4712:
---

Integrated in Hadoop-Mapreduce-trunk #1221 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1221/])
MAPREDUCE-4712. mr-jobhistory-daemon.sh doesn't accept --config (Vinod 
Kumar Vavilapalli via tgraves) (Revision 1395724)

 Result = SUCCESS
tgraves : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1395724
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/bin/mapred-config.sh
* /hadoop/common/trunk/hadoop-mapreduce-project/bin/mr-jobhistory-daemon.sh


> mr-jobhistory-daemon.sh doesn't accept --config
> ---
>
> Key: MAPREDUCE-4712
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4712
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Affects Versions: 2.0.2-alpha
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: MAPREDUCE-4712-20121005.txt
>
>
> It says
> {code}
> $ $HADOOP_MAPRED_HOME/sbin/mr-jobhistory-daemon.sh --config 
> /Users/vinodkv/tmp/conf/ start historyserver
> Usage: mr-jobhistory-daemon.sh [--config ] (start|stop) 
> 
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4574) Fix TotalOrderParitioner to work with non-WritableComparable key types

2012-10-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472400#comment-13472400
 ] 

Hudson commented on MAPREDUCE-4574:
---

Integrated in Hadoop-Mapreduce-trunk #1221 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1221/])
MAPREDUCE-4574. Fix TotalOrderParitioner to work with 
non-WritableComparable key types. Contributed by Harsh J. (harsh) (Revision 
1395936)

 Result = SUCCESS
harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1395936
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/TotalOrderPartitioner.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/partition/TotalOrderPartitioner.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/partition/TestTotalOrderPartitioner.java


> Fix TotalOrderParitioner to work with non-WritableComparable key types
> --
>
> Key: MAPREDUCE-4574
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4574
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Trivial
> Fix For: 3.0.0
>
> Attachments: MAPREDUCE-4574.patch, MAPREDUCE-4574.patch
>
>
> The current TotalOrderPartitioner class will not work with an alternative 
> serialization library such as Avro.
> To make it work, we may edit the readPartitions bits in it to support 
> non-WritableComparable keys and also remove the WritableComparable check in 
> the class types definition.
> That is, since we do not use the values at all (NullWritable), we may as well 
> do:
> {code}
>   private K[] readPartitions(FileSystem fs, Path p, Class<K> keyClass,
>   Configuration conf) throws IOException {
> …
> while ((key = (K) reader.next(key)) != null) {
>   parts.add(key);
>   key = ReflectionUtils.newInstance(keyClass, conf);
> }
> …
>   }
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4705) Historyserver links expire before the history data does

2012-10-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472389#comment-13472389
 ] 

Hudson commented on MAPREDUCE-4705:
---

Integrated in Hadoop-Mapreduce-trunk #1221 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1221/])
MAPREDUCE-4705. Fix a bug in job history lookup, which makes older jobs 
inaccessible despite the presence of a valid history file. (Contributed by 
Jason Lowe) (Revision 1395850)

 Result = SUCCESS
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1395850
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryFileManager.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/TestJobHistoryParsing.java


> Historyserver links expire before the history data does
> ---
>
> Key: MAPREDUCE-4705
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4705
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver, mrv2
>Affects Versions: 0.23.3
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Critical
> Fix For: 2.0.3-alpha, 0.23.5
>
> Attachments: MAPREDUCE-4705.patch
>
>
> The historyserver can serve up links to jobs that become useless well before 
> the job history files are purged.  For example on a large, heavily used 
> cluster we can end up rotating through the maximum number of jobs the 
> historyserver can track fairly quickly.  If a user was investigating an issue 
> with a job using a saved historyserver URL, that URL can become useless 
> because the historyserver has forgotten about the job even though the history 
> files are still sitting in HDFS.
> We can tell the historyserver to keep track of more jobs by increasing 
> {{mapreduce.jobhistory.joblist.cache.size}}, but this has a direct impact on 
> the responsiveness of the main historyserver page since it serves up all the 
> entries to the client at once.  It looks like Hadoop 1.x avoided this issue 
> by encoding the history file location into the URLs served up by the 
> historyserver, so it didn't have to track a mapping between job ID and 
> history file location.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-695) MiniMRCluster while shutting down should not wait for currently running jobs to finish

2012-10-09 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated MAPREDUCE-695:
-

 Priority: Minor  (was: Major)
Affects Version/s: 1.0.3

tagging as still existing in 1.0.3

> MiniMRCluster while shutting down should not wait for currently running jobs 
> to finish
> --
>
> Key: MAPREDUCE-695
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-695
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 1.0.3
>Reporter: Sreekanth Ramakrishnan
>Priority: Minor
> Attachments: mapreduce-695.patch
>
>
> Currently in {{org.apache.hadoop.mapred.MiniMRCluster.shutdown()}} we do a 
> {{waitTaskTrackers()}} which can cause {{MiniMRCluster}} to hang indefinitely 
> when used in conjunction with Controlled jobs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-3678) The Map tasks logs should have the value of input split it processed

2012-10-09 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated MAPREDUCE-3678:
---

Component/s: (was: nodemanager)
 (was: tasktracker)
 mrv2
 mrv1

> The Map tasks logs should have the value of input split it processed
> 
>
> Key: MAPREDUCE-3678
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3678
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv1, mrv2
>Affects Versions: 1.0.0, 2.0.0-alpha
>Reporter: Bejoy KS
>Assignee: Harsh J
> Fix For: 1.2.0, 2.0.3-alpha
>
> Attachments: MAPREDUCE-3678-branch-1.patch, MAPREDUCE-3678.patch
>
>
> It would be easier to debug some corner cases in tasks if we knew what input 
> split was processed by that task. The map/reduce task tracker log should 
> record this. Also, in the jobdetails web UI, the split should be 
> displayed along with the Split Locations. 
> Sample:
> Input Split
> hdfs://myserver:9000/userdata/sampleapp/inputdir/file1.csv -  no>/
> This would be very beneficial for nailing down data quality issues in large 
> data volume processing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-3678) The Map tasks logs should have the value of input split it processed

2012-10-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472375#comment-13472375
 ] 

Hudson commented on MAPREDUCE-3678:
---

Integrated in Hadoop-Common-trunk-Commit #2835 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2835/])
MAPREDUCE-3678. The Map tasks logs should have the value of input split it 
processed. Contributed by Harsh J. (harsh) (Revision 1396032)

 Result = SUCCESS
harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1396032
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapTask.java


> The Map tasks logs should have the value of input split it processed
> 
>
> Key: MAPREDUCE-3678
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3678
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv1, mrv2
>Affects Versions: 1.0.0, 2.0.0-alpha
>Reporter: Bejoy KS
>Assignee: Harsh J
> Fix For: 1.2.0, 2.0.3-alpha
>
> Attachments: MAPREDUCE-3678-branch-1.patch, MAPREDUCE-3678.patch
>
>
> It would be easier to debug some corner cases in tasks if we knew what input 
> split was processed by that task. The map/reduce task tracker log should 
> record this. Also, in the jobdetails web UI, the split should be 
> displayed along with the Split Locations. 
> Sample:
> Input Split
> hdfs://myserver:9000/userdata/sampleapp/inputdir/file1.csv -  no>/
> This would be very beneficial for nailing down data quality issues in large 
> data volume processing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-3678) The Map tasks logs should have the value of input split it processed

2012-10-09 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated MAPREDUCE-3678:
---

  Resolution: Fixed
   Fix Version/s: 2.0.3-alpha
  1.2.0
Target Version/s:   (was: 1.2.0, 2.0.2-alpha)
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Thanks Tom. I committed this to trunk, branch-2 and branch-1.

> The Map tasks logs should have the value of input split it processed
> 
>
> Key: MAPREDUCE-3678
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3678
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv1, mrv2
>Affects Versions: 1.0.0, 2.0.0-alpha
>Reporter: Bejoy KS
>Assignee: Harsh J
> Fix For: 1.2.0, 2.0.3-alpha
>
> Attachments: MAPREDUCE-3678-branch-1.patch, MAPREDUCE-3678.patch
>
>
> It would be easier to debug some corner cases in tasks if we knew what input 
> split was processed by that task. The map/reduce task tracker log should 
> record this. Also, in the jobdetails web UI, the split should be 
> displayed along with the Split Locations. 
> Sample:
> Input Split
> hdfs://myserver:9000/userdata/sampleapp/inputdir/file1.csv -  no>/
> This would be very beneficial for nailing down data quality issues in large 
> data volume processing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-3678) The Map tasks logs should have the value of input split it processed

2012-10-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472372#comment-13472372
 ] 

Hudson commented on MAPREDUCE-3678:
---

Integrated in Hadoop-Hdfs-trunk-Commit #2897 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2897/])
MAPREDUCE-3678. The Map tasks logs should have the value of input split it 
processed. Contributed by Harsh J. (harsh) (Revision 1396032)

 Result = SUCCESS
harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1396032
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapTask.java


> The Map tasks logs should have the value of input split it processed
> 
>
> Key: MAPREDUCE-3678
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3678
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: nodemanager, tasktracker
>Affects Versions: 1.0.0, 2.0.0-alpha
>Reporter: Bejoy KS
>Assignee: Harsh J
> Attachments: MAPREDUCE-3678-branch-1.patch, MAPREDUCE-3678.patch
>
>
> It would be easier to debug some corner cases in tasks if we knew what input 
> split was processed by that task. The map/reduce task tracker log should 
> record this. Also, in the jobdetails web UI, the split should be 
> displayed along with the Split Locations. 
> Sample:
> Input Split
> hdfs://myserver:9000/userdata/sampleapp/inputdir/file1.csv -  no>/
> This would be very beneficial for nailing down data quality issues in large 
> data volume processing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-2584) Check for serializers early, and give out more information regarding missing serializers

2012-10-09 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472368#comment-13472368
 ] 

Tom White commented on MAPREDUCE-2584:
--

This looks like a good check to me. Can you avoid using the @SuppressWarnings 
annotation?

> Check for serializers early, and give out more information regarding missing 
> serializers
> 
>
> Key: MAPREDUCE-2584
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2584
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: task
>Affects Versions: 0.20.2
>Reporter: Harsh J
>Assignee: Harsh J
>  Labels: serializers, tasks
> Attachments: MAPREDUCE-2584.r2.diff, MAPREDUCE-2584.r3.diff, 
> MAPREDUCE-2584.r4.diff, MAPREDUCE-2584.r5.diff, MAPREDUCE-2584.r6.diff, 
> MAPREDUCE-2584.r7.diff, MAPREDUCE-2584.r7.diff
>
>
> As discussed on HADOOP-7328, MapReduce can handle serializers in a much 
> better way in case of bad configuration, improper imports (Some odd Text 
> class instead of the Writable Text set as key), etc..
> This issue covers the MapReduce parts of the improvements (made to IFile, 
> MapOutputBuffer, etc. and possible early-check of serializer availability 
> pre-submit) that provide more information than just an NPE as is the current 
> case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4676) Add test for job history cleaner

2012-10-09 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472360#comment-13472360
 ] 

Tom White commented on MAPREDUCE-4676:
--

> I'm not very convinced on modifying methods signature for testing purposes 
> only.

Use Guava's @VisibleForTesting when there's no obvious alternative.

> Add test for job history cleaner
> 
>
> Key: MAPREDUCE-4676
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4676
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Affects Versions: 1.0.3, 2.0.1-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: MAPREDUCE-4676.patch, MAPREDUCE-4676-trunk.patch
>
>
> Add a test to TestJobHistory that verifies that the HistoryCleaner cleans up 
> the job history

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4574) Fix TotalOrderParitioner to work with non-WritableComparable key types

2012-10-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472343#comment-13472343
 ] 

Hudson commented on MAPREDUCE-4574:
---

Integrated in Hadoop-Hdfs-trunk #1190 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1190/])
MAPREDUCE-4574. Fix TotalOrderParitioner to work with 
non-WritableComparable key types. Contributed by Harsh J. (harsh) (Revision 
1395936)

 Result = SUCCESS
harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1395936
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/TotalOrderPartitioner.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/partition/TotalOrderPartitioner.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/partition/TestTotalOrderPartitioner.java


> Fix TotalOrderParitioner to work with non-WritableComparable key types
> --
>
> Key: MAPREDUCE-4574
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4574
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Trivial
> Fix For: 3.0.0
>
> Attachments: MAPREDUCE-4574.patch, MAPREDUCE-4574.patch
>
>
> The current TotalOrderPartitioner class will not work with an alternative 
> serialization library such as Avro.
> To make it work, we may edit the readPartitions bits in it to support 
> non-WritableComparable keys and also remove the WritableComparable check in 
> the class types definition.
> That is, since we do not use the values at all (NullWritable), we may as well 
> do:
> {code}
>   private K[] readPartitions(FileSystem fs, Path p, Class<K> keyClass,
>   Configuration conf) throws IOException {
> …
> while ((key = (K) reader.next(key)) != null) {
>   parts.add(key);
>   key = ReflectionUtils.newInstance(keyClass, conf);
> }
> …
>   }
> {code}
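
A hedged usage sketch (not code from the patch) of what the change enables: once the WritableComparable bound is removed, a job can use TotalOrderPartitioner with any key type for which a serialization and a raw comparator are configured, for example Avro keys. The class and method names below are placeholders.

{code}
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner;

public class TotalOrderSetup {
  static void useTotalOrder(Job job, Path partitionFile) {
    // The partition file is a SequenceFile of sampled keys; with the fix its
    // keys only need to be readable through the configured serialization,
    // not WritableComparable.
    job.setPartitionerClass(TotalOrderPartitioner.class);
    TotalOrderPartitioner.setPartitionFile(job.getConfiguration(), partitionFile);
  }
}
{code}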

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4712) mr-jobhistory-daemon.sh doesn't accept --config

2012-10-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472345#comment-13472345
 ] 

Hudson commented on MAPREDUCE-4712:
---

Integrated in Hadoop-Hdfs-trunk #1190 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1190/])
MAPREDUCE-4712. mr-jobhistory-daemon.sh doesn't accept --config (Vinod 
Kumar Vavilapalli via tgraves) (Revision 1395724)

 Result = SUCCESS
tgraves : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1395724
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/bin/mapred-config.sh
* /hadoop/common/trunk/hadoop-mapreduce-project/bin/mr-jobhistory-daemon.sh


> mr-jobhistory-daemon.sh doesn't accept --config
> ---
>
> Key: MAPREDUCE-4712
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4712
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Affects Versions: 2.0.2-alpha
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: MAPREDUCE-4712-20121005.txt
>
>
> It says
> {code}
> $ $HADOOP_MAPRED_HOME/sbin/mr-jobhistory-daemon.sh --config 
> /Users/vinodkv/tmp/conf/ start historyserver
> Usage: mr-jobhistory-daemon.sh [--config <conf-dir>] (start|stop) 
> <mapred-command>
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-10-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472346#comment-13472346
 ] 

Hudson commented on MAPREDUCE-4554:
---

Integrated in Hadoop-Hdfs-trunk #1190 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1190/])
MAPREDUCE-4554. Job Credentials are not transmitted if security is turned 
off (Benoy Antony via bobby) (Revision 1395769)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1395769
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestStagingCleanup.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/YARNRunner.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/security/CredentialsTestJob.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/security/TestMRCredentials.java


> Job Credentials are not transmitted if security is turned off
> -
>
> Key: MAPREDUCE-4554
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: job submission, security
>Affects Versions: 2.0.0-alpha
>Reporter: Benoy Antony
>Assignee: Benoy Antony
> Fix For: 3.0.0, 2.0.3-alpha, 0.23.5
>
> Attachments: MR_4554_trunk.patch, MR_4554_trunk.patch, 
> MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch, 
> MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch, 
> MR_4554_trunk.patch
>
>
> Credentials (secret keys) can be passed to a job via 
> mapreduce.job.credentials.json or mapreduce.job.credentials.binary.
> These credentials are submitted during job submission and made available to 
> the task processes.
> In Hadoop 1, these credentials were submitted and routed to task processes 
> even when security was off.
> In Hadoop 2, these credentials are transmitted only when security is turned 
> on.
> This should be changed for two reasons:
> 1) It is not backward compatible.
> 2) Credentials should be passed even if security is turned off.
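
For context, a minimal client-side sketch (not from the patch) of how a secret key is attached to a job's Credentials; whether those bytes actually reach the tasks when security is disabled is exactly what this issue addresses. The alias and secret value are placeholders.

{code}
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

public class JobSecretExample {
  static void attachSecret(Job job) {
    byte[] secret = "example-secret".getBytes();   // placeholder value
    job.getCredentials().addSecretKey(new Text("my.service.key"), secret);
    // A task could later read it back via
    // context.getCredentials().getSecretKey(new Text("my.service.key")).
  }
}
{code}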

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4705) Historyserver links expire before the history data does

2012-10-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472332#comment-13472332
 ] 

Hudson commented on MAPREDUCE-4705:
---

Integrated in Hadoop-Hdfs-trunk #1190 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1190/])
MAPREDUCE-4705. Fix a bug in job history lookup, which makes older jobs 
inaccessible despite the presence of a valid history file. (Contributed by 
Jason Lowe) (Revision 1395850)

 Result = SUCCESS
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1395850
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryFileManager.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/TestJobHistoryParsing.java


> Historyserver links expire before the history data does
> ---
>
> Key: MAPREDUCE-4705
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4705
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver, mrv2
>Affects Versions: 0.23.3
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Critical
> Fix For: 2.0.3-alpha, 0.23.5
>
> Attachments: MAPREDUCE-4705.patch
>
>
> The historyserver can serve up links to jobs that become useless well before 
> the job history files are purged.  For example on a large, heavily used 
> cluster we can end up rotating through the maximum number of jobs the 
> historyserver can track fairly quickly.  If a user was investigating an issue 
> with a job using a saved historyserver URL, that URL can become useless 
> because the historyserver has forgotten about the job even though the history 
> files are still sitting in HDFS.
> We can tell the historyserver to keep track of more jobs by increasing 
> {{mapreduce.jobhistory.joblist.cache.size}}, but this has a direct impact on 
> the responsiveness of the main historyserver page since it serves up all the 
> entries to the client at once.  It looks like Hadoop 1.x avoided this issue 
> by encoding the history file location into the URLs served up by the 
> historyserver, so it didn't have to track a mapping between job ID and 
> history file location.
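
As a sketch of the workaround mentioned above (normally this property would be set in mapred-site.xml on the historyserver; the numeric value here is only an example and trades memory and page size for link longevity):

{code}
import org.apache.hadoop.conf.Configuration;

public class HistoryServerTuning {
  static Configuration withLargerJobListCache() {
    // Example only: raise the number of jobs the historyserver tracks.
    Configuration conf = new Configuration();
    conf.setInt("mapreduce.jobhistory.joblist.cache.size", 50000);
    return conf;
  }
}
{code}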

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4705) Historyserver links expire before the history data does

2012-10-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472312#comment-13472312
 ] 

Hudson commented on MAPREDUCE-4705:
---

Integrated in Hadoop-Hdfs-0.23-Build #399 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/399/])
merge MAPREDUCE-4705 from trunk. Fix a bug in job history lookup, which 
makes older jobs inaccessible despite the presence of a valid history file. 
(Contributed by Jason Lowe) (Revision 1395852)

 Result = SUCCESS
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1395852
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryFileManager.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/TestJobHistoryParsing.java


> Historyserver links expire before the history data does
> ---
>
> Key: MAPREDUCE-4705
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4705
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver, mrv2
>Affects Versions: 0.23.3
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Critical
> Fix For: 2.0.3-alpha, 0.23.5
>
> Attachments: MAPREDUCE-4705.patch
>
>
> The historyserver can serve up links to jobs that become useless well before 
> the job history files are purged.  For example on a large, heavily used 
> cluster we can end up rotating through the maximum number of jobs the 
> historyserver can track fairly quickly.  If a user was investigating an issue 
> with a job using a saved historyserver URL, that URL can become useless 
> because the historyserver has forgotten about the job even though the history 
> files are still sitting in HDFS.
> We can tell the historyserver to keep track of more jobs by increasing 
> {{mapreduce.jobhistory.joblist.cache.size}}, but this has a direct impact on 
> the responsiveness of the main historyserver page since it serves up all the 
> entries to the client at once.  It looks like Hadoop 1.x avoided this issue 
> by encoding the history file location into the URLs served up by the 
> historyserver, so it didn't have to track a mapping between job ID and 
> history file location.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-10-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472313#comment-13472313
 ] 

Hudson commented on MAPREDUCE-4554:
---

Integrated in Hadoop-Hdfs-0.23-Build #399 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/399/])
svn merge -c 1395769 FIXES: MAPREDUCE-4554. Job Credentials are not 
transmitted if security is turned off (Benoy Antony via bobby) (Revision 
1395774)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1395774
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestStagingCleanup.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/YARNRunner.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/security/CredentialsTestJob.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/security/TestMRCredentials.java


> Job Credentials are not transmitted if security is turned off
> -
>
> Key: MAPREDUCE-4554
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: job submission, security
>Affects Versions: 2.0.0-alpha
>Reporter: Benoy Antony
>Assignee: Benoy Antony
> Fix For: 3.0.0, 2.0.3-alpha, 0.23.5
>
> Attachments: MR_4554_trunk.patch, MR_4554_trunk.patch, 
> MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch, 
> MR_4554_trunk.patch, MR_4554_trunk.patch, MR_4554_trunk.patch, 
> MR_4554_trunk.patch
>
>
> Credentials (secret keys) can be passed to a job via 
> mapreduce.job.credentials.json or mapreduce.job.credentials.binary.
> These credentials are submitted during job submission and made available to 
> the task processes.
> In Hadoop 1, these credentials were submitted and routed to task processes 
> even when security was off.
> In Hadoop 2, these credentials are transmitted only when security is turned 
> on.
> This should be changed for two reasons:
> 1) It is not backward compatible.
> 2) Credentials should be passed even if security is turned off.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-3678) The Map tasks logs should have the value of input split it processed

2012-10-09 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472275#comment-13472275
 ] 

Tom White commented on MAPREDUCE-3678:
--

+1

> The Map tasks logs should have the value of input split it processed
> 
>
> Key: MAPREDUCE-3678
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3678
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: nodemanager, tasktracker
>Affects Versions: 1.0.0, 2.0.0-alpha
>Reporter: Bejoy KS
>Assignee: Harsh J
> Attachments: MAPREDUCE-3678-branch-1.patch, MAPREDUCE-3678.patch
>
>
> It would be easier to debug some corner cases in tasks if we knew which 
> input split was processed by that task. The MapReduce task/tasktracker logs 
> should include this information. Also, in the jobdetails web UI, the split 
> should be displayed along with the Split Locations. 
> Sample:
> Input Split
> hdfs://myserver:9000/userdata/sampleapp/inputdir/file1.csv - <split no>
> This would be very helpful for nailing down data quality issues when 
> processing large data volumes.
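
Until such a feature lands, a job can log its own split from the mapper. A minimal sketch, assuming a file-based input format so the split can be cast to FileSplit (the class name is a placeholder):

{code}
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class SplitLoggingMapper
    extends Mapper<LongWritable, Text, Text, LongWritable> {
  @Override
  protected void setup(Context context)
      throws IOException, InterruptedException {
    // Log which input split this map task is processing.
    FileSplit split = (FileSplit) context.getInputSplit();
    System.err.println("Input split: " + split.getPath()
        + " start=" + split.getStart() + " length=" + split.getLength());
  }
}
{code}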

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4574) Fix TotalOrderParitioner to work with non-WritableComparable key types

2012-10-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472266#comment-13472266
 ] 

Hudson commented on MAPREDUCE-4574:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #2857 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2857/])
MAPREDUCE-4574. Fix TotalOrderParitioner to work with 
non-WritableComparable key types. Contributed by Harsh J. (harsh) (Revision 
1395936)

 Result = FAILURE
harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1395936
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/TotalOrderPartitioner.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/partition/TotalOrderPartitioner.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/partition/TestTotalOrderPartitioner.java


> Fix TotalOrderParitioner to work with non-WritableComparable key types
> --
>
> Key: MAPREDUCE-4574
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4574
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Trivial
> Fix For: 3.0.0
>
> Attachments: MAPREDUCE-4574.patch, MAPREDUCE-4574.patch
>
>
> The current TotalOrderPartitioner class will not work with an alternative 
> serialization library such as Avro.
> To make it work, we may edit the readPartitions bits in it to support 
> non-WritableComparable keys and also remove the WritableComparable check in 
> the class types definition.
> That is, since we do not use the values at all (NullWritable), we may as well 
> do:
> {code}
>   private K[] readPartitions(FileSystem fs, Path p, Class<K> keyClass,
>   Configuration conf) throws IOException {
> …
> while ((key = (K) reader.next(key)) != null) {
>   parts.add(key);
>   key = ReflectionUtils.newInstance(keyClass, conf);
> }
> …
>   }
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4574) Fix TotalOrderParitioner to work with non-WritableComparable key types

2012-10-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472256#comment-13472256
 ] 

Hudson commented on MAPREDUCE-4574:
---

Integrated in Hadoop-Common-trunk-Commit #2834 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2834/])
MAPREDUCE-4574. Fix TotalOrderParitioner to work with 
non-WritableComparable key types. Contributed by Harsh J. (harsh) (Revision 
1395936)

 Result = SUCCESS
harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1395936
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/TotalOrderPartitioner.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/partition/TotalOrderPartitioner.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/partition/TestTotalOrderPartitioner.java


> Fix TotalOrderParitioner to work with non-WritableComparable key types
> --
>
> Key: MAPREDUCE-4574
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4574
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Trivial
> Fix For: 3.0.0
>
> Attachments: MAPREDUCE-4574.patch, MAPREDUCE-4574.patch
>
>
> The current TotalOrderPartitioner class will not work with an alternative 
> serialization library such as Avro.
> To make it work, we may edit the readPartitions bits in it to support 
> non-WritableComparable keys and also remove the WritableComparable check in 
> the class types definition.
> That is, since we do not use the values at all (NullWritable), we may as well 
> do:
> {code}
>   private K[] readPartitions(FileSystem fs, Path p, Class<K> keyClass,
>   Configuration conf) throws IOException {
> …
> while ((key = (K) reader.next(key)) != null) {
>   parts.add(key);
>   key = ReflectionUtils.newInstance(keyClass, conf);
> }
> …
>   }
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4574) Fix TotalOrderParitioner to work with non-WritableComparable key types

2012-10-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472255#comment-13472255
 ] 

Hudson commented on MAPREDUCE-4574:
---

Integrated in Hadoop-Hdfs-trunk-Commit #2896 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2896/])
MAPREDUCE-4574. Fix TotalOrderParitioner to work with 
non-WritableComparable key types. Contributed by Harsh J. (harsh) (Revision 
1395936)

 Result = SUCCESS
harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1395936
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/TotalOrderPartitioner.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/partition/TotalOrderPartitioner.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/partition/TestTotalOrderPartitioner.java


> Fix TotalOrderParitioner to work with non-WritableComparable key types
> --
>
> Key: MAPREDUCE-4574
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4574
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Trivial
> Fix For: 3.0.0
>
> Attachments: MAPREDUCE-4574.patch, MAPREDUCE-4574.patch
>
>
> The current TotalOrderPartitioner class will not work with an alternative 
> serialization library such as Avro.
> To make it work, we may edit the readPartitions bits in it to support 
> non-WritableComparable keys and also remove the WritableComparable check in 
> the class types definition.
> That is, since we do not use the values at all (NullWritable), we may as well 
> do:
> {code}
>   private K[] readPartitions(FileSystem fs, Path p, Class<K> keyClass,
>   Configuration conf) throws IOException {
> …
> while ((key = (K) reader.next(key)) != null) {
>   parts.add(key);
>   key = ReflectionUtils.newInstance(keyClass, conf);
> }
> …
>   }
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4574) Fix TotalOrderParitioner to work with non-WritableComparable key types

2012-10-09 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated MAPREDUCE-4574:
---

   Resolution: Fixed
Fix Version/s: 3.0.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Thanks Doug! I went ahead and committed this to trunk.

> Fix TotalOrderParitioner to work with non-WritableComparable key types
> --
>
> Key: MAPREDUCE-4574
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4574
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Trivial
> Fix For: 3.0.0
>
> Attachments: MAPREDUCE-4574.patch, MAPREDUCE-4574.patch
>
>
> The current TotalOrderPartitioner class will not work with an alternative 
> serialization library such as Avro.
> To make it work, we may edit the readPartitions bits in it to support 
> non-WritableComparable keys and also remove the WritableComparable check in 
> the class types definition.
> That is, since we do not use the values at all (NullWritable), we may as well 
> do:
> {code}
>   private K[] readPartitions(FileSystem fs, Path p, Class<K> keyClass,
>   Configuration conf) throws IOException {
> …
> while ((key = (K) reader.next(key)) != null) {
>   parts.add(key);
>   key = ReflectionUtils.newInstance(keyClass, conf);
> }
> …
>   }
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira