[jira] [Updated] (MAPREDUCE-4671) AM does not tell the RM about container requests that are no longer needed

2012-10-10 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated MAPREDUCE-4671:
--

Attachment: MAPREDUCE-4671.3.patch

 AM does not tell the RM about container requests that are no longer needed
 --

 Key: MAPREDUCE-4671
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4671
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.3, 2.0.0-alpha
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: MAPREDUCE-4671.1.patch, MAPREDUCE-4671.2.patch, 
 MAPREDUCE-4671.3.patch


 Say the AM wanted a container at hosts h1, h2, h3. After getting a container 
 at h1, it should tell the RM that it no longer needs containers at h2 and h3; 
 otherwise h2 and h3 remain valid allocation locations on the RM.
 The AM's RMContainerAllocator does remove these resource requests internally: 
 when a resource request's container count drops to 0 it drops the 
 resource request from its tables, but it forgets to send the 0-sized request 
 to the RM.
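 The bookkeeping described above can be sketched with a simplified model. This 
 is illustrative Python, not the actual RMContainerAllocator code; the table 
 layout (a map keyed by priority and location) is an assumption.

```python
# Simplified, hypothetical model of the AM-side request table; the real
# RMContainerAllocator keys requests by priority, resource name, and
# capability, and sends queued asks to the RM on each heartbeat.

class RequestTable:
    def __init__(self):
        self.requests = {}  # (priority, location) -> outstanding count
        self.ask = {}       # (priority, location) -> count queued for the RM

    def add(self, priority, location, count=1):
        key = (priority, location)
        self.requests[key] = self.requests.get(key, 0) + count
        self.ask[key] = self.requests[key]

    def decrement(self, priority, location):
        key = (priority, location)
        if self.requests.get(key, 0) <= 0:
            # Guard described in the patch: never let the count go negative.
            raise ValueError("resource request count would fall below zero")
        self.requests[key] -= 1
        # The fix: queue the updated count even when it is zero, so the RM
        # learns the location is no longer wanted. The bug was dropping the
        # table entry without queueing this 0-sized ask.
        self.ask[key] = self.requests[key]
        if self.requests[key] == 0:
            del self.requests[key]

    def heartbeat(self):
        """Return the pending asks (what goes to the RM) and clear the queue."""
        asks, self.ask = self.ask, {}
        return asks
```

 After a container is allocated at h1, decrementing h2 and h3 leaves 0-sized 
 asks in the next heartbeat, which is exactly what the unpatched code never sent.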

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4671) AM does not tell the RM about container requests that are no longer needed

2012-10-10 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473028#comment-13473028
 ] 

Bikas Saha commented on MAPREDUCE-4671:
---

New patch. It adds a check for the resource request count falling below zero, 
and changes the ask list to use a custom comparator to avoid duplicate 
resource requests.
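The comparator change can be illustrated as follows: two asks are "equal" when their identity fields match, so a newer ask replaces an older one instead of both being sent. Below is a Python sketch using a keyed map in place of Java's sorted set with a custom Comparator; the field names are hypothetical.

```python
# Hypothetical ask records; the real ResourceRequest carries a priority,
# a resource name (host/rack/*), a capability, and a container count.

def ask_key(req):
    # Identity fields only: the container count is deliberately excluded,
    # so an updated count replaces the stale entry instead of duplicating it.
    return (req["priority"], req["location"], req["memory_mb"])

def queue_ask(asks, req):
    asks[ask_key(req)] = req  # newest ask wins
    return asks

asks = {}
queue_ask(asks, {"priority": 20, "location": "h2", "memory_mb": 1024, "count": 1})
queue_ask(asks, {"priority": 20, "location": "h2", "memory_mb": 1024, "count": 0})
```

With a plain list, both entries would be sent to the RM; with the keyed collection, only the final 0-sized ask survives.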




[jira] [Updated] (MAPREDUCE-4671) AM does not tell the RM about container requests that are no longer needed

2012-10-10 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated MAPREDUCE-4671:
--

Status: Patch Available  (was: Open)




[jira] [Commented] (MAPREDUCE-3655) Exception from launching allocated container

2012-10-10 Thread Li Ming (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473030#comment-13473030
 ] 

Li Ming commented on MAPREDUCE-3655:


This also happens on 2.0.1-alpha, and it seems related to resource 
localization. In the DistributedShell example, the ContainerLaunchContext of the AM 
has LocalResources (the AppMaster.jar), but the other task containers do 
not. Only a container with local resources creates a 
directory like 
/tmp/nm-local-dir/usercache/jiangbing/appcache/application_1325062142731_0006, 
so the non-AM containers fail to use these directories.
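The behavior described above can be modeled in a few lines: the per-application directory is only created as a side effect of localizing resources, so a container with an empty LocalResources map finds no directory at launch time. This is a conceptual Python sketch, not NodeManager code; the function names are invented.

```python
import os
import tempfile

def localize(app_dir, resources):
    # The NM creates the per-app directory as part of resource localization;
    # with no resources to localize, the directory is never created.
    if resources:
        os.makedirs(app_dir, exist_ok=True)
        for name in resources:
            open(os.path.join(app_dir, name), "w").close()

def launch(app_dir):
    # Launching needs the app directory to write container scripts into;
    # it fails if localization never created it.
    if not os.path.isdir(app_dir):
        raise FileNotFoundError("File %s does not exist" % app_dir)
```

Under this model, the AM container (which localizes AppMaster.jar) launches fine, while a task container with no local resources hits the FileNotFoundException seen in the NM log.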

 Exception from launching allocated container
 

 Key: MAPREDUCE-3655
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3655
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Affects Versions: 0.23.0
Reporter: Bing Jiang

 I use Hadoop YARN to deploy my real-time distributed computation system, and 
 I got a reply from mapreduce-u...@hadoop.apache.org pointing me to these 
 guides:
  
 http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/YARN.html
  
 http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
 When I follow the steps to construct my Client and ApplicationMaster, the 
 NM fails to launch a Container because of a 
 java.io.FileNotFoundException.
 The relevant part of the NM log is attached below:
  
 2011-12-29 15:49:16,250 INFO org.apache.hadoop.yarn.server.
 nodemanager.containermanager.application.Application: Adding 
 container_1325062142731_0006_01_01 to application 
 application_1325062142731_0006
 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
 Dispatching the event 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.event.ApplicationLocalizationEvent.EventType:
  INIT_APPLICATION_RESOURCES
 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
 Dispatching the event 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationInitedEvent.EventType:
  APPLICATION_INITED
 2011-12-29 15:49:16,250 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
  Processing application_1325062142731_0006 of type APPLICATION_INITED
 2011-12-29 15:49:16,250 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
  Application application_1325062142731_0006 transitioned from INITING to 
 RUNNING
 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
 Dispatching the event 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.event.LogHandlerAppStartedEvent.EventType:
  APPLICATION_STARTED
 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
 Dispatching the event 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerInitEvent.EventType:
  INIT_CONTAINER
 2011-12-29 15:49:16,250 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
  Processing container_1325062142731_0006_01_01 of type INIT_CONTAINER
 2011-12-29 15:49:16,250 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
  Container container_1325062142731_0006_01_01 transitioned from NEW to 
 LOCALIZED
 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
 Dispatching the event 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncherEvent.EventType:
  LAUNCH_CONTAINER
 2011-12-29 15:49:16,287 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
 Dispatching the event 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerEvent.EventType:
  CONTAINER_LAUNCHED
 2011-12-29 15:49:16,287 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
  Processing container_1325062142731_0006_01_01 of type CONTAINER_LAUNCHED
 2011-12-29 15:49:16,287 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
  Container container_1325062142731_0006_01_01 transitioned from LOCALIZED 
 to RUNNING
 2011-12-29 15:49:16,288 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
 Dispatching the event 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerStartMonitoringEvent.EventType:
  START_MONITORING_CONTAINER
 2011-12-29 15:49:16,289 WARN 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch:
  Failed to launch container
 java.io.FileNotFoundException: File 
 /tmp/nm-local-dir/usercache/jiangbing/appcache/application_1325062142731_0006 
 does not exist
 

[jira] [Updated] (MAPREDUCE-3655) Exception from launching allocated container

2012-10-10 Thread Li Ming (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Ming updated MAPREDUCE-3655:
---

Affects Version/s: 2.0.1-alpha

 Exception from launching allocated container
 

 Key: MAPREDUCE-3655
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3655
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Affects Versions: 0.23.0, 2.0.1-alpha
Reporter: Bing Jiang

 I use Hadoop YARN to deploy my real-time distributed computation system, and 
 I got a reply from mapreduce-u...@hadoop.apache.org pointing me to these 
 guides:
  
 http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/YARN.html
  
 http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
 When I follow the steps to construct my Client and ApplicationMaster, the 
 NM fails to launch a Container because of a 
 java.io.FileNotFoundException.
 The relevant part of the NM log is attached below:
  
 2011-12-29 15:49:16,250 INFO org.apache.hadoop.yarn.server.
 nodemanager.containermanager.application.Application: Adding 
 container_1325062142731_0006_01_01 to application 
 application_1325062142731_0006
 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
 Dispatching the event 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.event.ApplicationLocalizationEvent.EventType:
  INIT_APPLICATION_RESOURCES
 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
 Dispatching the event 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationInitedEvent.EventType:
  APPLICATION_INITED
 2011-12-29 15:49:16,250 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
  Processing application_1325062142731_0006 of type APPLICATION_INITED
 2011-12-29 15:49:16,250 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
  Application application_1325062142731_0006 transitioned from INITING to 
 RUNNING
 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
 Dispatching the event 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.event.LogHandlerAppStartedEvent.EventType:
  APPLICATION_STARTED
 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
 Dispatching the event 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerInitEvent.EventType:
  INIT_CONTAINER
 2011-12-29 15:49:16,250 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
  Processing container_1325062142731_0006_01_01 of type INIT_CONTAINER
 2011-12-29 15:49:16,250 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
  Container container_1325062142731_0006_01_01 transitioned from NEW to 
 LOCALIZED
 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
 Dispatching the event 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncherEvent.EventType:
  LAUNCH_CONTAINER
 2011-12-29 15:49:16,287 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
 Dispatching the event 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerEvent.EventType:
  CONTAINER_LAUNCHED
 2011-12-29 15:49:16,287 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
  Processing container_1325062142731_0006_01_01 of type CONTAINER_LAUNCHED
 2011-12-29 15:49:16,287 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
  Container container_1325062142731_0006_01_01 transitioned from LOCALIZED 
 to RUNNING
 2011-12-29 15:49:16,288 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
 Dispatching the event 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerStartMonitoringEvent.EventType:
  START_MONITORING_CONTAINER
 2011-12-29 15:49:16,289 WARN 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch:
  Failed to launch container
 java.io.FileNotFoundException: File 
 /tmp/nm-local-dir/usercache/jiangbing/appcache/application_1325062142731_0006 
 does not exist
 at 
 org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:431)
 at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:815)
 at 
 org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:143)
 at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
 at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:700)
 at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:697)
at 
 

[jira] [Created] (MAPREDUCE-4717) Mapreduce job fails to run after configuring multiple namespaces [HDFS Federation]

2012-10-10 Thread Sagar Shimpi (JIRA)
Sagar Shimpi created MAPREDUCE-4717:
---

 Summary: Mapreduce job fails to run after configuring multiple 
namespaces [HDFS Federation]
 Key: MAPREDUCE-4717
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4717
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 0.20.2
 Environment: 2 standalone desktops with 360GB HDD and 4GB RAM, acting 
as two NameNodes
2 virtual machines with 60GB HDD and 1GB RAM, acting as JobTracker and 
ZooKeeper
Reporter: Sagar Shimpi


I have a setup of 4 nodes with the following details -

Standalone Desktop-1 - 
NameNode1,Tasktracker,Zookeeper,Jobtracker,datanode,HMaster

Standalone Desktop-2 - NameNode2,Tasktracker,datanode,RegionServer

Virtual Machine-1 - Namenode3,Datanode,Tasktracker

Virtual Machine-2 - Namenode4,Datanode,Tasktracker


I have configured HDFS Federation with the following name services -
a) nameservice1
b) oss-hadoop-nameservice

While executing a MapReduce job I get the following error -


-bash-4.1$ id
uid=496(hdfs) gid=496(hdfs) groups=496(hdfs),497(hadoop)
-bash-4.1$ hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar 
wordcount /hbase/install.log.syslog /hbase/testing
12/10/10 12:30:21 ERROR security.UserGroupInformation: 
PriviledgedActionException as:hdfs (auth:SIMPLE) cause:java.io.IOException: 
viewfs://cluster6/
java.io.IOException: viewfs://cluster6/
at org.apache.hadoop.fs.viewfs.InodeTree.init(InodeTree.java:338)
at 
org.apache.hadoop.fs.viewfs.ViewFileSystem$1.init(ViewFileSystem.java:178)
at 
org.apache.hadoop.fs.viewfs.ViewFileSystem.initialize(ViewFileSystem.java:178)
at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2150)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:80)
at 
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2184)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2166)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:302)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:194)
at 
org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:103)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:844)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at 
org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:844)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:481)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:511)
at org.apache.hadoop.examples.WordCount.main(WordCount.java:67)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
-bash-4.1$
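A `java.io.IOException: viewfs://cluster6/` thrown from `InodeTree.init` usually indicates that the client's default filesystem points at a viewfs mount table that has no link entries defined for it. A hedged core-site.xml sketch of what a mount table for the two name services might look like (mount points and target paths are illustrative, not taken from this cluster):

```xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>viewfs://cluster6</value>
  </property>
  <!-- Without fs.viewfs.mounttable.cluster6.link.* entries, resolving
       viewfs://cluster6/ fails in InodeTree.init with the IOException
       shown in the stack trace above. -->
  <property>
    <name>fs.viewfs.mounttable.cluster6.link./hbase</name>
    <value>hdfs://nameservice1/hbase</value>
  </property>
  <property>
    <name>fs.viewfs.mounttable.cluster6.link./user</name>
    <value>hdfs://oss-hadoop-nameservice/user</value>
  </property>
</configuration>
```

Each mount point must map a viewfs path to one of the federated name services; job submission then resolves the staging directory through the mount table instead of failing at `JobSubmissionFiles.getStagingDir`.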




[jira] [Commented] (MAPREDUCE-4671) AM does not tell the RM about container requests that are no longer needed

2012-10-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473035#comment-13473035
 ] 

Hadoop QA commented on MAPREDUCE-4671:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12548529/MAPREDUCE-4671.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2923//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2923//console

This message is automatically generated.




[jira] [Commented] (MAPREDUCE-4451) fairscheduler fail to init job with kerberos authentication configured

2012-10-10 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473076#comment-13473076
 ] 

Karthik Kambatla commented on MAPREDUCE-4451:
-

Given that the patch doesn't have tests, I was planning on running a secure 
cluster to ascertain the behavior, but haven't been able to get to it.

Will you be able to validate the behavior and report back? Otherwise, I'll 
see if I can do the same in the next couple of days.

 fairscheduler fail to init job with kerberos authentication configured
 --

 Key: MAPREDUCE-4451
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4451
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/fair-share
Affects Versions: 1.0.3
Reporter: Erik.fang
 Attachments: MAPREDUCE-4451_branch-1.patch, 
 MAPREDUCE-4451_branch-1.patch, MAPREDUCE-4451_branch-1.patch, 
 MAPREDUCE-4451_branch-1.patch, MAPREDUCE-4451_branch-1.patch


 Using FairScheduler in Hadoop 1.0.3 with Kerberos authentication configured, 
 job initialization fails:
 {code}
 2012-07-17 15:15:09,220 ERROR org.apache.hadoop.mapred.JobTracker: Job 
 initialization failed:
 java.io.IOException: Call to /192.168.7.80:8020 failed on local exception: 
 java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed 
 [Caused by GSSException: No valid credentials provided (Mechanism level: 
 Failed to find any Kerberos tgt)]
 at org.apache.hadoop.ipc.Client.wrapException(Client.java:1129)
 at org.apache.hadoop.ipc.Client.call(Client.java:1097)
 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
 at $Proxy7.getProtocolVersion(Unknown Source)
 at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:411)
 at 
 org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:125)
 at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:329)
 at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:294)
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:100)
 at 
 org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1411)
 at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
 at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1429)
 at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
 at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
 at 
 org.apache.hadoop.security.Credentials.writeTokenStorageFile(Credentials.java:169)
 at 
 org.apache.hadoop.mapred.JobInProgress.generateAndStoreTokens(JobInProgress.java:3558)
 at 
 org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:696)
 at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:3911)
 at 
 org.apache.hadoop.mapred.FairScheduler$JobInitializer$InitJob.run(FairScheduler.java:301)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS 
 initiate failed [Caused by GSSException: No valid credentials provided 
 (Mechanism level: Failed to find any Kerberos tgt)]
 at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:543)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
 at 
 org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:488)
 at 
 org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:590)
 at 
 org.apache.hadoop.ipc.Client$Connection.access$2100(Client.java:187)
 at org.apache.hadoop.ipc.Client.getConnection(Client.java:1228)
 at org.apache.hadoop.ipc.Client.call(Client.java:1072)
 ... 20 more
 Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by 
 GSSException: No valid credentials provided (Mechanism level: Failed to find 
 any Kerberos tgt)]
 at 
 com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:194)
 at 
 org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:134)
 at 
 org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:385)
 at 
 org.apache.hadoop.ipc.Client$Connection.access$1200(Client.java:187)
 at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:583)
 at 

[jira] [Commented] (MAPREDUCE-4451) fairscheduler fail to init job with kerberos authentication configured

2012-10-10 Thread Erik.fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473090#comment-13473090
 ] 

Erik.fang commented on MAPREDUCE-4451:
--

Before uploading the patch, I tested it in a 4-node dev cluster with 
hadoop-1.0.3 to make sure it works. Maybe it is better to validate the patch 
with branch-1 compiled jars; I can do that today or tomorrow and post the result.

However, I can only post some JobTracker logs showing that job initialization 
fails before applying the patch and everything works fine after applying it. 
Is that enough, or do you have other ideas?


[jira] [Created] (MAPREDUCE-4718) MapReduce fails If I pass a parameter as a S3 folder

2012-10-10 Thread Benjamin Kim (JIRA)
Benjamin Kim created MAPREDUCE-4718:
---

 Summary: MapReduce fails If I pass a parameter as a S3 folder
 Key: MAPREDUCE-4718
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4718
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission
Affects Versions: 1.0.3, 1.0.0
 Environment: Hadoop with default configurations
Reporter: Benjamin Kim


I'm running a wordcount MR job as follows:

hadoop jar WordCount.jar wordcount.WordCountDriver s3n://bucket/wordcount/input 
s3n://bucket/wordcount/output
 
s3n://bucket/wordcount/input is an S3 object that contains other input files.

However, I get the following NPE:

12/10/02 18:56:23 INFO mapred.JobClient:  map 0% reduce 0%
12/10/02 18:56:54 INFO mapred.JobClient:  map 50% reduce 0%
12/10/02 18:56:56 INFO mapred.JobClient: Task Id : 
attempt_201210021853_0001_m_01_0, Status : FAILED
java.lang.NullPointerException
at 
org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.close(NativeS3FileSystem.java:106)
at java.io.BufferedInputStream.close(BufferedInputStream.java:451)
at java.io.FilterInputStream.close(FilterInputStream.java:155)
at org.apache.hadoop.util.LineReader.close(LineReader.java:83)
at 
org.apache.hadoop.mapreduce.lib.input.LineRecordReader.close(LineRecordReader.java:144)
at 
org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.close(MapTask.java:497)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:765)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.Child.main(Child.java:249)

MR runs fine if I specify a more specific input path such as 
s3n://bucket/wordcount/input/file.txt

MR fails if I pass an S3 folder as a parameter.


In summary,
This works
 hadoop jar ./hadoop-examples-1.0.3.jar wordcount /user/hadoop/wordcount/input/ 
s3n://bucket/wordcount/output/

This doesn't work
 hadoop jar ./hadoop-examples-1.0.3.jar wordcount s3n://bucket/wordcount/input/ 
s3n://bucket/wordcount/output/

(both input paths are directories)
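The NPE at NativeS3FsInputStream.close() in the stack trace above suggests the wrapped stream was never opened when the key addressed a folder rather than a file. A minimal sketch of the defensive close this points at (class and field names here are hypothetical stand-ins, not Hadoop's actual code):

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;

// Illustrative only: models a stream wrapper whose delegate may be null when
// the S3 key resolves to a folder marker instead of a real object. A null
// check in close() avoids the NPE seen in the stack trace above.
class NullSafeS3Stream implements Closeable {
  private InputStream in; // may be null if the key resolved to a folder

  NullSafeS3Stream(InputStream in) {
    this.in = in;
  }

  boolean isClosed() {
    return in == null;
  }

  @Override
  public void close() { // narrower than Closeable's throws clause is allowed
    if (in != null) {
      try {
        in.close();
      } catch (IOException ignored) {
        // best-effort close in this sketch
      }
      in = null; // also makes close() idempotent
    }
  }
}
```

With this guard, closing a stream that was never opened (or closing twice) is a no-op instead of an NPE.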



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-3678) The Map tasks logs should have the value of input split it processed

2012-10-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473198#comment-13473198
 ] 

Hudson commented on MAPREDUCE-3678:
---

Integrated in Hadoop-Hdfs-trunk #1191 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1191/])
MAPREDUCE-3678. The Map tasks logs should have the value of input split it 
processed. Contributed by Harsh J. (harsh) (Revision 1396032)

 Result = SUCCESS
harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1396032
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapTask.java


 The Map tasks logs should have the value of input split it processed
 

 Key: MAPREDUCE-3678
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3678
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: mrv1, mrv2
Affects Versions: 1.0.0, 2.0.0-alpha
Reporter: Bejoy KS
Assignee: Harsh J
 Fix For: 1.2.0, 2.0.3-alpha

 Attachments: MAPREDUCE-3678-branch-1.patch, MAPREDUCE-3678.patch


 It would be easier to debug some corner cases in tasks if we knew the 
 input split processed by that task. The MapReduce TaskTracker log should 
 include this information. Also, in the jobdetails web UI, the split should be 
 displayed along with the Split Locations. 
 For example:
 Input Split
 hdfs://myserver:9000/userdata/sampleapp/inputdir/file1.csv - split 
 no./offset from beginning of file
 This would be very helpful for nailing down data quality issues when 
 processing large data volumes.
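 For reference, the kind of log line this change adds can be sketched as below (the exact code in MapTask.java may differ; the assumption here is that a FileSplit renders as path:start+length):

```java
import java.util.logging.Logger;

// Self-contained illustration of logging the input split at task start.
// The real change lives in Hadoop's MapTask; names here are stand-ins.
class SplitLoggingDemo {
  private static final Logger LOG = Logger.getLogger("MapTask");

  // Mimics the assumed FileSplit.toString() form: "<path>:<start>+<length>"
  static String describeSplit(String file, long start, long length) {
    return file + ":" + start + "+" + length;
  }

  public static void main(String[] args) {
    LOG.info("Processing split: " + describeSplit(
        "hdfs://myserver:9000/userdata/sampleapp/inputdir/file1.csv",
        0L, 64L * 1024 * 1024));
  }
}
```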

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4654) TestDistCp is @ignored

2012-10-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473199#comment-13473199
 ] 

Hudson commented on MAPREDUCE-4654:
---

Integrated in Hadoop-Hdfs-trunk #1191 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1191/])
MAPREDUCE-4654. TestDistCp is ignored. Contributed by Sandy Ryza. (Revision 
1396047)

 Result = SUCCESS
tomwhite : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1396047
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestDistCp.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestIntegration.java


 TestDistCp is @ignored
 --

 Key: MAPREDUCE-4654
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4654
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.2-alpha
Reporter: Colin Patrick McCabe
Assignee: Sandy Ryza
Priority: Critical
 Fix For: 2.0.3-alpha

 Attachments: MAPREDUCE-4654.patch


 We should fix TestDistCp so that it actually runs, rather than being ignored.
 {code}
 @Ignore
 public class TestDistCp {
   private static final Log LOG = LogFactory.getLog(TestDistCp.class);
   private static List<Path> pathList = new ArrayList<Path>();
   ...
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (MAPREDUCE-4719) mapred.TaskInProgress should be public

2012-10-10 Thread Dave Beech (JIRA)
Dave Beech created MAPREDUCE-4719:
-

 Summary: mapred.TaskInProgress should be public
 Key: MAPREDUCE-4719
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4719
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Dave Beech
Priority: Minor


In Cloudera's CDH3 distributions, mapred.TaskInProgress has been made public 
along with its generateSingleReport() and getDiagnosticInfo() methods.

Should this change be brought back into the main source tree?



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4654) TestDistCp is @ignored

2012-10-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473238#comment-13473238
 ] 

Hudson commented on MAPREDUCE-4654:
---

Integrated in Hadoop-Mapreduce-trunk #1222 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1222/])
MAPREDUCE-4654. TestDistCp is ignored. Contributed by Sandy Ryza. (Revision 
1396047)

 Result = SUCCESS
tomwhite : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1396047
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestDistCp.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestIntegration.java


 TestDistCp is @ignored
 --

 Key: MAPREDUCE-4654
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4654
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.2-alpha
Reporter: Colin Patrick McCabe
Assignee: Sandy Ryza
Priority: Critical
 Fix For: 2.0.3-alpha

 Attachments: MAPREDUCE-4654.patch


 We should fix TestDistCp so that it actually runs, rather than being ignored.
 {code}
 @Ignore
 public class TestDistCp {
   private static final Log LOG = LogFactory.getLog(TestDistCp.class);
   private static List<Path> pathList = new ArrayList<Path>();
   ...
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-3678) The Map tasks logs should have the value of input split it processed

2012-10-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473237#comment-13473237
 ] 

Hudson commented on MAPREDUCE-3678:
---

Integrated in Hadoop-Mapreduce-trunk #1222 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1222/])
MAPREDUCE-3678. The Map tasks logs should have the value of input split it 
processed. Contributed by Harsh J. (harsh) (Revision 1396032)

 Result = SUCCESS
harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1396032
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapTask.java


 The Map tasks logs should have the value of input split it processed
 

 Key: MAPREDUCE-3678
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3678
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: mrv1, mrv2
Affects Versions: 1.0.0, 2.0.0-alpha
Reporter: Bejoy KS
Assignee: Harsh J
 Fix For: 1.2.0, 2.0.3-alpha

 Attachments: MAPREDUCE-3678-branch-1.patch, MAPREDUCE-3678.patch


 It would be easier to debug some corner cases in tasks if we knew the 
 input split processed by that task. The MapReduce TaskTracker log should 
 include this information. Also, in the jobdetails web UI, the split should be 
 displayed along with the Split Locations. 
 For example:
 Input Split
 hdfs://myserver:9000/userdata/sampleapp/inputdir/file1.csv - split 
 no./offset from beginning of file
 This would be very helpful for nailing down data quality issues when 
 processing large data volumes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (MAPREDUCE-4720) Browser thinks History Server main page JS is taking too long

2012-10-10 Thread Robert Joseph Evans (JIRA)
Robert Joseph Evans created MAPREDUCE-4720:
--

 Summary: Browser thinks History Server main page JS is taking too 
long
 Key: MAPREDUCE-4720
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4720
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.3
Reporter: Robert Joseph Evans


The main History Server page, with the default setting of 20,000 jobs, can cause 
browsers to think that the JS on the page is stuck and ask whether you want to 
kill it. This is a big usability problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4568) Throw early exception when duplicate files or archives are found in distributed cache

2012-10-10 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473271#comment-13473271
 ] 

Robert Joseph Evans commented on MAPREDUCE-4568:


Adding a true duplicate (the exact same file multiple times) to the dist cache 
will not result in an error under YARN.  The MR client will just dedupe them 
before submitting the request to YARN.  The issue is when there are different 
files that both map to the same key in the dist cache map (the key is the name 
of the symlink created in the working directory of the task/container).  That 
is where it will throw an exception under 2.0.
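A toy model of the distinction described above, assuming nothing about Hadoop's actual classes: identical URIs collapse harmlessly in a set, while different URIs sharing a final path component collide on the symlink key.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Toy model only -- not Hadoop code. Shows why exact duplicates are harmless
// (they dedupe) while distinct files with the same base name collide on the
// symlink name used in the task working directory.
class DistCacheKeys {
  // True duplicates collapse in a set, so no error is needed for them.
  static Set<URI> dedupe(URI... uris) {
    return new LinkedHashSet<>(Arrays.asList(uris));
  }

  // The symlink key is the last path component; two different files can
  // share it, and that is the collision case that throws under 2.0.
  static String symlinkKey(URI uri) {
    String path = uri.getPath();
    return path.substring(path.lastIndexOf('/') + 1);
  }
}
```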

 Throw early exception when duplicate files or archives are found in 
 distributed cache
 ---

 Key: MAPREDUCE-4568
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4568
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Mohammad Kamrul Islam
Assignee: Arun C Murthy

 According to MAPREDUCE-4549, Hadoop 2.x throws an exception if duplicates are 
 found in cacheFiles or cacheArchives. The exception is thrown during job 
 submission.
 This JIRA is to throw the exception ==early==, when the entry is first added to 
 the Distributed Cache through addCacheFile or addFileToClassPath.
 It will help the client decide whether to fail fast or continue without the 
 duplicated entries.
 Alternatively, Hadoop could provide a knob where the user chooses whether to 
 throw an error (new behavior) or silently ignore it (old behavior).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4568) Throw early exception when duplicate files or archives are found in distributed cache

2012-10-10 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473277#comment-13473277
 ] 

Jason Lowe commented on MAPREDUCE-4568:
---

bq. In addition, it will be better, if there is a way of checking whether some 
file is already added in DC.

Would adding an interface so the client can query the contents of the DC before 
job submission be sufficient?  This seems like a reasonable enhancement that 
doesn't overlap with existing interfaces.  Or do you think it's still a 
requirement to throw early when adding a collision?  Throwing will require 
adding a new interface for adding to the DC which overlaps with existing 
functionality and adds to the pile of APIs we already have for adding things to 
the DC.

 Throw early exception when duplicate files or archives are found in 
 distributed cache
 ---

 Key: MAPREDUCE-4568
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4568
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Mohammad Kamrul Islam
Assignee: Arun C Murthy

 According to MAPREDUCE-4549, Hadoop 2.x throws an exception if duplicates are 
 found in cacheFiles or cacheArchives. The exception is thrown during job 
 submission.
 This JIRA is to throw the exception ==early==, when the entry is first added to 
 the Distributed Cache through addCacheFile or addFileToClassPath.
 It will help the client decide whether to fail fast or continue without the 
 duplicated entries.
 Alternatively, Hadoop could provide a knob where the user chooses whether to 
 throw an error (new behavior) or silently ignore it (old behavior).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4719) mapred.TaskInProgress should be public

2012-10-10 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473282#comment-13473282
 ] 

Todd Lipcon commented on MAPREDUCE-4719:


Hi Dave. What's the use case you're trying to address that needs them public? 
We did this in CDH back in early 2010 for use in a contrib plugin, but I'd like 
to hear why you need it before forward-porting the change. In more recent CDH, 
those plugins no longer exist as they've been supplanted by other APIs.

 mapred.TaskInProgress should be public
 --

 Key: MAPREDUCE-4719
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4719
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Dave Beech
Priority: Minor

 In Cloudera's CDH3 distributions, mapred.TaskInProgress has been made public 
 along with its generateSingleReport() and getDiagnosticInfo() methods.
 Should this change be brought back into the main source tree?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4719) mapred.TaskInProgress should be public

2012-10-10 Thread Dave Beech (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473286#comment-13473286
 ] 

Dave Beech commented on MAPREDUCE-4719:
---

Hi Todd. I have no use case for this, actually. It's just an inconsistency I 
noticed and discussed with Steve Loughran (you may have seen the messages on 
Twitter!). Happy to have this closed as not a problem.

 mapred.TaskInProgress should be public
 --

 Key: MAPREDUCE-4719
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4719
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Dave Beech
Priority: Minor

 In Cloudera's CDH3 distributions, mapred.TaskInProgress has been made public 
 along with its generateSingleReport() and getDiagnosticInfo() methods.
 Should this change be brought back into the main source tree?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (MAPREDUCE-4719) mapred.TaskInProgress should be public

2012-10-10 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved MAPREDUCE-4719.


Resolution: Not A Problem

Gotcha. Let's resolve as not-a-problem then for now, and if someone disagrees, 
we can re-open.

 mapred.TaskInProgress should be public
 --

 Key: MAPREDUCE-4719
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4719
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Dave Beech
Priority: Minor

 In Cloudera's CDH3 distributions, mapred.TaskInProgress has been made public 
 along with its generateSingleReport() and getDiagnosticInfo() methods.
 Should this change be brought back into the main source tree?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (MAPREDUCE-4717) Mapreduce job fails to run after configuring multiple namespaces [HDFS Federation]

2012-10-10 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers resolved MAPREDUCE-4717.
---

Resolution: Not A Problem

Hi Sagar, this looks to me to be most likely a configuration error, in which 
case you should try emailing a user mailing list. I'm guessing from the facts 
that you say you configured federation and you're using MR1 that you're using 
CDH, in which case you should email cdh-u...@cloudera.org. If I'm wrong about 
that and you're somehow using a straight Apache release, then you should email 
u...@hadoop.apache.org.

 Mapreduce job fails to run after configuring multiple namespaces [HDFS 
 Federation]
 --

 Key: MAPREDUCE-4717
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4717
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 0.20.2
 Environment: 2 Standalone Desktop with 360Gb HDD and 4GB RAM - Acting 
 as two Namenodes
 2 Virtual Machine with 60GB HDD and 1GB RAM - Acting as Job tracker and 
 zookeeper
Reporter: Sagar Shimpi

 I have a setup of 4 nodes with the following details -
 Standalone Desktop-1 - 
 NameNode1,Tasktracker,Zookeeper,Jobtracker,datanode,HMaster
 Standalone Desktop-2 - NameNode2,Tasktracker,datanode,RegionServer
 Virtual Machine-1 - Namenode3,Datanode,Tasktracker
 Virtual Machine-2 - Namenode4,Datanode,Tasktracker
 I have configured HDFS Federation with following name service -
 a) nameservice1
 b) oss-hadoop-nameservice
 While executing Mapreduce job I am getting following error -
 
 -bash-4.1$ id
 uid=496(hdfs) gid=496(hdfs) groups=496(hdfs),497(hadoop)
 -bash-4.1$ hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar 
 wordcount /hbase/install.log.syslog /hbase/testing
 12/10/10 12:30:21 ERROR security.UserGroupInformation: 
 PriviledgedActionException as:hdfs (auth:SIMPLE) cause:java.io.IOException: 
 viewfs://cluster6/
 java.io.IOException: viewfs://cluster6/
 at org.apache.hadoop.fs.viewfs.InodeTree.<init>(InodeTree.java:338)
 at 
 org.apache.hadoop.fs.viewfs.ViewFileSystem$1.<init>(ViewFileSystem.java:178)
 at 
 org.apache.hadoop.fs.viewfs.ViewFileSystem.initialize(ViewFileSystem.java:178)
 at 
 org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2150)
 at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:80)
 at 
 org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2184)
 at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2166)
 at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:302)
 at org.apache.hadoop.fs.Path.getFileSystem(Path.java:194)
 at 
 org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:103)
 at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
 at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:844)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
 at 
 org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:844)
 at org.apache.hadoop.mapreduce.Job.submit(Job.java:481)
 at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:511)
 at org.apache.hadoop.examples.WordCount.main(WordCount.java:67)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
 at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
 at 
 org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
 -bash-4.1$
 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-4398) Fix mapred.system.dir permission error with FairScheduler

2012-10-10 Thread Arpit Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13473380#comment-13473380
 ] 

Arpit Gupta commented on MAPREDUCE-4398:


The following stack trace was seen when using the fair scheduler with the 1.0.3 release:

{code}
Generating 100 using 2 maps with step of 50
12/10/09 19:04:09 INFO mapred.JobClient: Running job: job_201210091900_0002
12/10/09 19:04:10 INFO mapred.JobClient:  map 0% reduce 0%
12/10/09 19:04:10 INFO mapred.JobClient: Job complete: job_201210091900_0002
12/10/09 19:04:10 INFO mapred.JobClient: Counters: 0
12/10/09 19:04:10 INFO mapred.JobClient: Job Failed: Job initialization failed:
org.apache.hadoop.security.AccessControlException: 
org.apache.hadoop.security.AccessControlException: Permission denied: 
user=robing, access=EXECUTE, inode=system:mapred:hadoop:rwx--
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at 
org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:95)
at 
org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:3251)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:713)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:182)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:555)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:536)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:443)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:435)
at 
org.apache.hadoop.security.Credentials.writeTokenStorageFile(Credentials.java:169)
at 
org.apache.hadoop.mapred.JobInProgress.generateAndStoreTokens(JobInProgress.java:3537)
at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:696)
at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:4207)
at 
org.apache.hadoop.mapred.FairScheduler$JobInitializer$InitJob.run(FairScheduler.java:291)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
{code}

It looks like when using the fair scheduler the job token file is written (in 
mapred.system.dir) as the user running the job, whereas with the default 
scheduler that file is written as the user running the MR daemons.
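The stack trace suggests FairScheduler's InitJob runs on its own thread pool without re-entering the daemon user's security context, so the token file inherits the submitter's identity. A toy model of that thread-context dependence (all names hypothetical; this is not Hadoop's UserGroupInformation):

```java
import java.util.function.Supplier;

// Toy model of doAs-style user context -- hypothetical, not Hadoop code.
// Whatever user is current on the initializing thread becomes the owner of
// the token file; if init runs inside the submitter's context rather than
// the daemon's, the submitter's identity wins.
class UserContext {
  private static final ThreadLocal<String> CURRENT =
      ThreadLocal.withInitial(() -> "mapred"); // daemon user by default

  static <T> T doAs(String user, Supplier<T> action) {
    String prev = CURRENT.get();
    CURRENT.set(user);
    try {
      return action.get();
    } finally {
      CURRENT.set(prev); // restore the previous identity
    }
  }

  // Stands in for writing the job token file: returns the effective owner.
  static String writeTokenFileOwner() {
    return CURRENT.get();
  }
}
```

Under this model, the default scheduler's path corresponds to writing the file in the daemon's context (owner "mapred"), while the fair-scheduler path corresponds to the submitter's doAs leaking into initialization.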

 Fix mapred.system.dir permission error with FairScheduler
 -

 Key: MAPREDUCE-4398
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4398
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/fair-share
Affects Versions: 1.0.3
Reporter: Luke Lu
Assignee: Yu Gao

 Incorrect job initialization logic in FairScheduler causes mysterious 
 intermittent mapred.system.dir permission errors.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-2454) Allow external sorter plugin for MR

2012-10-10 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473411#comment-13473411
 ] 

Alejandro Abdelnur commented on MAPREDUCE-2454:
---

Initial feedback on the patch (I'll do a more detailed review):

* Nice work
* patch needs rebase, TestReduceTask.java has been moved to 
hadoop-mapreduce-client-jobclient/
* remove the unused imports introduced throughout the patch
* reformat lines over 80 chars throughout the patch

I'm not thrilled about how we are mixing mapred and mapreduce classes in the APIs 
of pluggable sort. But given how the current MR implementation is done, I 
don't think it is possible to avoid that without a major cleanup/refactoring of 
much bigger scope.

One thing that would be quite useful, and I'd say a prerequisite before 
committing it, is a performance comparison of terasort with and without the 
patch; we shouldn't be introducing a noticeable performance penalty.

 Allow external sorter plugin for MR
 ---

 Key: MAPREDUCE-2454
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2454
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 2.0.0-alpha, 3.0.0, 2.0.2-alpha
Reporter: Mariappan Asokan
Assignee: Mariappan Asokan
Priority: Minor
  Labels: features, performance, plugin, sort
 Attachments: HadoopSortPlugin.pdf, HadoopSortPlugin.pdf, 
 KeyValueIterator.java, MapOutputSorterAbstract.java, MapOutputSorter.java, 
 mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, 
 mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, 
 mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, 
 mapreduce-2454.patch, mapreduce-2454.patch, 
 mr-2454-on-mr-279-build82.patch.gz, MR-2454-trunkPatchPreview.gz, 
 ReduceInputSorter.java


 Define interfaces and some abstract classes in the Hadoop framework to 
 facilitate external sorter plugins both on the Map and Reduce sides.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN

2012-10-10 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473422#comment-13473422
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4495:
---

+1. Following up: as I've said before, I think this is a good starting point and 
I'd like to commit it to trunk (and only move it to a release branch once it is 
in good shape).


 Workflow Application Master in YARN
 ---

 Key: MAPREDUCE-4495
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 2.0.0-alpha
Reporter: Bo Wang
Assignee: Bo Wang
 Attachments: MAPREDUCE-4495-v1.1.patch, MAPREDUCE-4495-v1.patch, 
 MapReduceWorkflowAM.pdf


 It is useful to have a workflow application master, which will be capable of 
 running a DAG of jobs. The workflow client submits a DAG request to the AM 
 and then the AM will manage the life cycle of this application in terms of 
 requesting the needed resources from the RM, and starting, monitoring and 
 retrying the application's individual tasks.
 Compared to running Oozie with the current MapReduce Application Master, 
 these are some of the advantages:
  - Fewer resources consumed, since only one application master will 
 be spawned for the whole workflow.
  - Reuse of resources, since the same resources can be used by multiple 
 consecutive jobs in the workflow (no need to request/wait for resources for 
 every individual job from the central RM).
  - More optimization opportunities in terms of collective resource requests.
  - Optimization opportunities in terms of rewriting and composing jobs in the 
 workflow (e.g. pushing down Mappers).
  - This Application Master can be reused/extended by higher-level systems like 
 Pig and Hive to provide an optimized way of running their workflows.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN

2012-10-10 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473431#comment-13473431
 ] 

Arun C Murthy commented on MAPREDUCE-4495:
--

Tucu, I'm a little disappointed. We had a chat last week and I told you that 
I'd get back to you - I'll look at this soon, please wait. Thanks.

 Workflow Application Master in YARN
 ---

 Key: MAPREDUCE-4495
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 2.0.0-alpha
Reporter: Bo Wang
Assignee: Bo Wang
 Attachments: MAPREDUCE-4495-v1.1.patch, MAPREDUCE-4495-v1.patch, 
 MapReduceWorkflowAM.pdf


 It is useful to have a workflow application master, which will be capable of 
 running a DAG of jobs. The workflow client submits a DAG request to the AM 
 and then the AM will manage the life cycle of this application in terms of 
 requesting the needed resources from the RM, and starting, monitoring and 
 retrying the application's individual tasks.
 Compared to running Oozie with the current MapReduce Application Master, 
 these are some of the advantages:
  - Fewer resources consumed, since only one application master will 
 be spawned for the whole workflow.
  - Reuse of resources, since the same resources can be used by multiple 
 consecutive jobs in the workflow (no need to request/wait for resources for 
 every individual job from the central RM).
  - More optimization opportunities in terms of collective resource requests.
  - Optimization opportunities in terms of rewriting and composing jobs in the 
 workflow (e.g. pushing down Mappers).
  - This Application Master can be reused/extended by higher-level systems like 
 Pig and Hive to provide an optimized way of running their workflows.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN

2012-10-10 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473434#comment-13473434
 ] 

Arun C Murthy commented on MAPREDUCE-4495:
--

I'll ask the same question I asked you personally last week: 

Why aren't we putting this in an incubator project rather than importing code 
from Oozie etc. into MapReduce?

What is the need for the complex event system here? Why is it needed if we 
only need MR jobs? Why aren't we using the JobControl API?

I've also asked this same question before:

Why aren't we using the JobControl API, since it already exists?




[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN

2012-10-10 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473467#comment-13473467
 ] 

Arun C Murthy commented on MAPREDUCE-4495:
--

Let me restate what I've been saying all along:

WFAM has a very wide scope and imports a whole new bunch (500KB) of code from 
Oozie, i.e. workflowlib.

Given that, it belongs in a separate project by itself; there is no need to 
extend Hadoop to incorporate Oozie.

---

OTOH, if you merely want to support a DAG of MR jobs, we already have JobControl - 
we can, trivially, change JobControl to run in an AM without any need for 
workflowlib. So, let's not import that in.



Let's not blow Hadoop up into an even bigger umbrella project by importing 
Oozie into it. Let's do it in an incubator project.

I have a proposal which I'll share; you can be part of it from day one. 
Makes sense? Thanks.






[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN

2012-10-10 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473475#comment-13473475
 ] 

Arun C Murthy commented on MAPREDUCE-4495:
--

Tucu - I remember talking with you about discussing this at the contributor 
meetup (Friday); maybe there was a misunderstanding. Anyway, it doesn't matter. 
Thanks.



[jira] [Commented] (MAPREDUCE-4451) fairscheduler fail to init job with kerberos authentication configured

2012-10-10 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473495#comment-13473495
 ] 

Karthik Kambatla commented on MAPREDUCE-4451:
-

Thanks Erik. In my opinion, that should be good enough.

 fairscheduler fail to init job with kerberos authentication configured
 --

 Key: MAPREDUCE-4451
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4451
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/fair-share
Affects Versions: 1.0.3
Reporter: Erik.fang
 Attachments: MAPREDUCE-4451_branch-1.patch, 
 MAPREDUCE-4451_branch-1.patch, MAPREDUCE-4451_branch-1.patch, 
 MAPREDUCE-4451_branch-1.patch, MAPREDUCE-4451_branch-1.patch


 Using FairScheduler in Hadoop 1.0.3 with kerberos authentication configured. 
 Job initialization fails:
 {code}
 2012-07-17 15:15:09,220 ERROR org.apache.hadoop.mapred.JobTracker: Job 
 initialization failed:
 java.io.IOException: Call to /192.168.7.80:8020 failed on local exception: 
 java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed 
 [Caused by GSSException: No valid credentials provided (Mechanism level: 
 Failed to find any Kerberos tgt)]
 at org.apache.hadoop.ipc.Client.wrapException(Client.java:1129)
 at org.apache.hadoop.ipc.Client.call(Client.java:1097)
 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
 at $Proxy7.getProtocolVersion(Unknown Source)
 at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:411)
 at 
 org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:125)
 at org.apache.hadoop.hdfs.DFSClient.&lt;init&gt;(DFSClient.java:329)
 at org.apache.hadoop.hdfs.DFSClient.&lt;init&gt;(DFSClient.java:294)
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:100)
 at 
 org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1411)
 at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
 at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1429)
 at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
 at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
 at 
 org.apache.hadoop.security.Credentials.writeTokenStorageFile(Credentials.java:169)
 at 
 org.apache.hadoop.mapred.JobInProgress.generateAndStoreTokens(JobInProgress.java:3558)
 at 
 org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:696)
 at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:3911)
 at 
 org.apache.hadoop.mapred.FairScheduler$JobInitializer$InitJob.run(FairScheduler.java:301)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS 
 initiate failed [Caused by GSSException: No valid credentials provided 
 (Mechanism level: Failed to find any Kerberos tgt)]
 at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:543)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
 at 
 org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:488)
 at 
 org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:590)
 at 
 org.apache.hadoop.ipc.Client$Connection.access$2100(Client.java:187)
 at org.apache.hadoop.ipc.Client.getConnection(Client.java:1228)
 at org.apache.hadoop.ipc.Client.call(Client.java:1072)
 ... 20 more
 Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by 
 GSSException: No valid credentials provided (Mechanism level: Failed to find 
 any Kerberos tgt)]
 at 
 com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:194)
 at 
 org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:134)
 at 
 org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:385)
 at 
 org.apache.hadoop.ipc.Client$Connection.access$1200(Client.java:187)
 at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:583)
 at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:580)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 

[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN

2012-10-10 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473513#comment-13473513
 ] 

Robert Joseph Evans commented on MAPREDUCE-4495:


I really do like the idea of having an AM that can run a workflow.  I think 
that there is a huge potential here and I want to see this move forward, but 
the size and scope of this change is a lot to take in. There are 11,734 lines 
in the patch.  I realize that a lot of this was taken from Oozie itself, but 
then how are we going to keep the two in sync?  What happens when Oozie finds a 
bug?  How are we going to be sure that the bug is pulled into mapred?  I really 
would prefer to see a more agile approach to these changes, and hopefully some 
of them can correspond to MR, YARN, and HDFS splitting apart after 2.0 has 
stabilized, so Arun's fears about Hadoop returning to be a project of projects 
can be alleviated.

Can we look at moving the parts that can be common between Oozie and the 
workflow AM into a separate project? That project I would expect to eventually 
own the complete Workflow AM, but in the short term it would just provide a 
place for this workflow library. In parallel with that, we can move forward and 
put in a simple AM that allows the existing JobControl API to run in an AM. 
This would allow us to validate that the MR AM is thread safe, and keep it 
that way. It would also offer a potentially huge benefit to Pig, which uses 
that API currently. I would expect most of the initial code for this 
JobControl workflow AM to be replaced as it moves to use the common workflow 
library.
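The JobControl dependency semantics referenced here can be sketched in a few lines of plain Java; this is a hedged, self-contained model of the state transition (loosely patterned on the WAITING/READY/DEPENDENT_FAILED states of Hadoop's ControlledJob), not the actual org.apache.hadoop.mapreduce.lib.jobcontrol code:

```java
import java.util.*;

// Self-contained model of JobControl-style dependency gating: a WAITING job
// becomes READY once every dependency has SUCCEEDED, and DEPENDENT_FAILED as
// soon as any dependency fails. This mirrors the idea, not Hadoop's classes.
public class DepGate {
    public enum State { WAITING, READY, SUCCESS, FAILED, DEPENDENT_FAILED }

    // Decide the next state of a WAITING job from its dependencies' states.
    public static State check(List<State> depStates) {
        boolean allDone = true;
        for (State s : depStates) {
            if (s == State.FAILED || s == State.DEPENDENT_FAILED)
                return State.DEPENDENT_FAILED;   // give up: an upstream job failed
            if (s != State.SUCCESS)
                allDone = false;                 // still waiting on this dependency
        }
        return allDone ? State.READY : State.WAITING;
    }
}
```

A workflow AM restarting after a crash would re-derive each job's state from this check plus the recorded outcomes of already-finished jobs, which is part of the state-persistence problem discussed above.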

By doing this in an agile fashion it would also allow us to work out a number 
of potential issues I see when moving this from Oozie which uses a DB to store 
its state to a workflow AM where that is not possible.  By doing an initial 
simple JobControl AM we can work out some of the issues with restarting the AM 
after it crashes.  What is more by keeping the changes small, it is much more 
likely to be something that can be merged into branch 2 so that the branches do 
not diverge nearly as much.



[jira] [Created] (MAPREDUCE-4721) Task startup time in JHS is same as job startup time.

2012-10-10 Thread Ravi Prakash (JIRA)
Ravi Prakash created MAPREDUCE-4721:
---

 Summary: Task startup time in JHS is same as job startup time.
 Key: MAPREDUCE-4721
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4721
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 0.23.3
Reporter: Ravi Prakash


As Bobby pointed out in 
https://issues.apache.org/jira/browse/MAPREDUCE-4711?focusedCommentId=13471696&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13471696

In the Map and Reduce tasks page, it should print the earliest task attempt 
launch time as TaskImpl:getLaunchTime() does.



[jira] [Assigned] (MAPREDUCE-4721) Task startup time in JHS is same as job startup time.

2012-10-10 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash reassigned MAPREDUCE-4721:
---

Assignee: Ravi Prakash



[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN

2012-10-10 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473635#comment-13473635
 ] 

Siddharth Seth commented on MAPREDUCE-4495:
---

It'd be very nice to have an AM which is capable of processing workflows. I'm 
not sure that this belongs under the MR project, though. Along with bringing in 
a lot of Oozie / workflow code into MR, it also seems to restrict the AM's scope 
to running Java / MapReduce actions.
I like Bobby's suggestion of having the workflow library live in a separate 
project which can be used by Oozie as well as the WFAM. I'd be happy to 
contribute to this.





[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN

2012-10-10 Thread Mayank Bansal (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473727#comment-13473727
 ] 

Mayank Bansal commented on MAPREDUCE-4495:
--

Since the beginning I have been in favor of adding a workflow DAG AM, and I have 
already shown interest in contributing to that. I was earlier in favor of adding 
this work to MR and YARN, but Bobby correctly pointed out that it would be very 
difficult for us to keep Oozie and WFAM in sync.

I don't think we should add this WFAM as part of Oozie either, because if we do, 
then other projects like Pig and Hive that want to use it - or any future 
project that wants workflow functionality - would have a problem.

Ideally it should be a library/project that other projects like Oozie, Pig and 
Hive depend upon. One can argue that it should then be part of MR and YARN so 
that every other project can depend on it, but if we do that, then every new 
piece of functionality that wants to implement the AM interface has to be part 
of YARN, which does not seem right to me at this point. I think the basic idea 
behind creating an AM is that any new application/project can implement it and 
use the YARN framework; that does not necessarily mean it should be part of the 
YARN framework itself.

Based on my understanding, I think we should create a new project and move 
forward; I am very much willing to contribute to that. It would be easier for us 
to innovate and move forward in a separate project than as part of YARN.

That's just my understanding.

Thoughts?


