[jira] [Updated] (MAPREDUCE-4671) AM does not tell the RM about container requests that are no longer needed
[ https://issues.apache.org/jira/browse/MAPREDUCE-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bikas Saha updated MAPREDUCE-4671:
----------------------------------
Attachment: MAPREDUCE-4671.3.patch

AM does not tell the RM about container requests that are no longer needed
--------------------------------------------------------------------------
Key: MAPREDUCE-4671
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4671
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 0.23.3, 2.0.0-alpha
Reporter: Bikas Saha
Assignee: Bikas Saha
Attachments: MAPREDUCE-4671.1.patch, MAPREDUCE-4671.2.patch, MAPREDUCE-4671.3.patch

Say the AM wanted a container at hosts h1, h2, h3. After getting a container at h1, it should tell the RM that it no longer needs containers at h2 and h3; otherwise h2 and h3 remain valid allocation locations on the RM. The AM's RMContainerAllocator does remove these resource requests internally: when a resource request's container count drops to 0 it drops the request from its tables, but it forgets to send the 0-sized request to the RM.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
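The bug and the fix described above can be sketched with a minimal, self-contained model of the AM-side ask table. This is not the actual RMContainerAllocator code; the class and method names (AskTable, decrementAsk, pendingCount) are hypothetical. The two points it illustrates are from this issue: a request whose count reaches zero must still be queued for the next heartbeat so the RM learns it was cancelled, and the ask set uses a comparator keyed on (priority, host) so an update replaces a stale entry instead of duplicating it.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical model of the AM-side ask table (not the real RMContainerAllocator).
public class AskTable {
    // A simplified resource request: how many containers are wanted at a host.
    static final class Ask {
        final int priority;
        final String host;
        int count;
        Ask(int priority, String host, int count) {
            this.priority = priority; this.host = host; this.count = count;
        }
    }

    private final Map<String, Ask> table = new HashMap<>();
    // Custom comparator keyed on (priority, host): an updated request for the
    // same location replaces the stale one instead of duplicating it.
    private final TreeSet<Ask> pendingAsks = new TreeSet<>(
        Comparator.comparingInt((Ask a) -> a.priority).thenComparing(a -> a.host));

    public void addAsk(int priority, String host, int count) {
        Ask ask = new Ask(priority, host, count);
        table.put(host, ask);
        queue(ask);
    }

    // Called when a container is allocated at a host. Crucially, the updated
    // request is queued even when its count reaches zero, so the RM hears
    // about the cancellation on the next heartbeat.
    public void decrementAsk(String host) {
        Ask ask = table.get(host);
        if (ask == null) return;
        ask.count = Math.max(0, ask.count - 1);  // guard against going below zero
        if (ask.count == 0) {
            table.remove(host);                  // drop from the local tables...
        }
        queue(ask);                              // ...but still send the 0-sized ask
    }

    private void queue(Ask ask) {
        pendingAsks.remove(ask);  // replaces any stale entry for (priority, host)
        pendingAsks.add(ask);
    }

    // Count of the pending ask queued for a host, or -1 if none is queued.
    public int pendingCount(String host) {
        for (Ask a : pendingAsks) if (a.host.equals(host)) return a.count;
        return -1;
    }
}
```

After an allocation at h2, the pending ask list still carries a zero-count entry for h2 rather than silently dropping it.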
[jira] [Commented] (MAPREDUCE-4671) AM does not tell the RM about container requests that are no longer needed
[ https://issues.apache.org/jira/browse/MAPREDUCE-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473028#comment-13473028 ]

Bikas Saha commented on MAPREDUCE-4671:
---------------------------------------
New patch. Checks for the resource request count falling below zero, and changes the ask list to use a custom comparator to avoid duplicate resource requests.
[jira] [Updated] (MAPREDUCE-4671) AM does not tell the RM about container requests that are no longer needed
[ https://issues.apache.org/jira/browse/MAPREDUCE-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bikas Saha updated MAPREDUCE-4671:
----------------------------------
Status: Patch Available (was: Open)
[jira] [Commented] (MAPREDUCE-3655) Exception from launching allocated container
[ https://issues.apache.org/jira/browse/MAPREDUCE-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473030#comment-13473030 ]

Li Ming commented on MAPREDUCE-3655:
------------------------------------
This also happens on 2.0.1-alpha, and it seems related to resource localization. In the DistributedShell example, the AM's ContainerLaunchContext has LocalResources (the AppMaster.jar), but the other task containers do not. Only a container with local resources creates a directory like /tmp/nm-local-dir/usercache/jiangbing/appcache/application_1325062142731_0006, so the non-AM containers fail when they try to use these directories.

Exception from launching allocated container
--------------------------------------------
Key: MAPREDUCE-3655
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3655
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: applicationmaster
Affects Versions: 0.23.0
Reporter: Bing Jiang

I use Hadoop YARN to deploy my real-time distributed computation system, and I got a reply from mapreduce-u...@hadoop.apache.org advising me to follow these guides:
http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/YARN.html
http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
When I follow the steps to construct my Client and ApplicationMaster, the NM fails to launch a container because of a java.io.FileNotFoundException. The relevant part of the NM log is attached below:

2011-12-29 15:49:16,250 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Adding container_1325062142731_0006_01_01 to application application_1325062142731_0006
2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.event.ApplicationLocalizationEvent.EventType: INIT_APPLICATION_RESOURCES
2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationInitedEvent.EventType: APPLICATION_INITED
2011-12-29 15:49:16,250 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Processing application_1325062142731_0006 of type APPLICATION_INITED
2011-12-29 15:49:16,250 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1325062142731_0006 transitioned from INITING to RUNNING
2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.event.LogHandlerAppStartedEvent.EventType: APPLICATION_STARTED
2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerInitEvent.EventType: INIT_CONTAINER
2011-12-29 15:49:16,250 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Processing container_1325062142731_0006_01_01 of type INIT_CONTAINER
2011-12-29 15:49:16,250 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1325062142731_0006_01_01 transitioned from NEW to LOCALIZED
2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncherEvent.EventType: LAUNCH_CONTAINER
2011-12-29 15:49:16,287 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerEvent.EventType: CONTAINER_LAUNCHED
2011-12-29 15:49:16,287 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Processing container_1325062142731_0006_01_01 of type CONTAINER_LAUNCHED
2011-12-29 15:49:16,287 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1325062142731_0006_01_01 transitioned from LOCALIZED to RUNNING
2011-12-29 15:49:16,288 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerStartMonitoringEvent.EventType: START_MONITORING_CONTAINER
2011-12-29 15:49:16,289 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Failed to launch container
java.io.FileNotFoundException: File /tmp/nm-local-dir/usercache/jiangbing/appcache/application_1325062142731_0006 does not exist
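The failure mode Li Ming describes above is a missing per-application appcache directory: the localizer only creates it for containers that carry LocalResources. A minimal sketch of the guard a launcher would need is below, assuming hypothetical names (AppCacheDirGuard, ensureAppDir); it is not actual NodeManager code, just an illustration of creating the directory defensively before use.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch, not actual NodeManager code: the per-application
// appcache directory is only created by the resource localizer when a
// container carries LocalResources, so a container without any must
// ensure it exists before writing its launch script there.
public class AppCacheDirGuard {
    // Returns the application directory, creating it (and parents) if absent.
    public static Path ensureAppDir(Path localDirRoot, String user, String appId)
            throws IOException {
        Path appDir = localDirRoot.resolve(Paths.get("usercache", user, "appcache", appId));
        // createDirectories is a no-op for directories that already exist, so
        // this is also safe for containers that did trigger localization.
        return Files.createDirectories(appDir);
    }
}
```

With such a guard in place, the FileNotFoundException for the appcache path cannot occur regardless of whether the container carried local resources.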
[jira] [Updated] (MAPREDUCE-3655) Exception from launching allocated container
[ https://issues.apache.org/jira/browse/MAPREDUCE-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Li Ming updated MAPREDUCE-3655:
-------------------------------
Affects Version/s: 2.0.1-alpha

java.io.FileNotFoundException: File /tmp/nm-local-dir/usercache/jiangbing/appcache/application_1325062142731_0006 does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:431)
at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:815)
at org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:143)
at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:700)
at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:697)
at
[jira] [Created] (MAPREDUCE-4717) Mapreduce job fails to run after configuring multiple namespaces [HDFS Federation]
Sagar Shimpi created MAPREDUCE-4717:
------------------------------------
Summary: Mapreduce job fails to run after configuring multiple namespaces [HDFS Federation]
Key: MAPREDUCE-4717
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4717
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv1
Affects Versions: 0.20.2
Environment: 2 standalone desktops with 360GB HDD and 4GB RAM, acting as two namenodes; 2 virtual machines with 60GB HDD and 1GB RAM, acting as job tracker and zookeeper
Reporter: Sagar Shimpi

I have a setup of 4 nodes with the following roles:
Standalone Desktop-1 - NameNode1, Tasktracker, Zookeeper, Jobtracker, Datanode, HMaster
Standalone Desktop-2 - NameNode2, Tasktracker, Datanode, RegionServer
Virtual Machine-1 - NameNode3, Datanode, Tasktracker
Virtual Machine-2 - NameNode4, Datanode, Tasktracker

I have configured HDFS Federation with the following name services:
a) nameservice1
b) oss-hadoop-nameservice

While executing a MapReduce job I get the following error:

-bash-4.1$ id
uid=496(hdfs) gid=496(hdfs) groups=496(hdfs),497(hadoop)
-bash-4.1$ hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar wordcount /hbase/install.log.syslog /hbase/testing
12/10/10 12:30:21 ERROR security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:SIMPLE) cause:java.io.IOException: viewfs://cluster6/
java.io.IOException: viewfs://cluster6/
at org.apache.hadoop.fs.viewfs.InodeTree.init(InodeTree.java:338)
at org.apache.hadoop.fs.viewfs.ViewFileSystem$1.init(ViewFileSystem.java:178)
at org.apache.hadoop.fs.viewfs.ViewFileSystem.initialize(ViewFileSystem.java:178)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2150)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:80)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2184)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2166)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:302)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:194)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:103)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:844)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:844)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:481)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:511)
at org.apache.hadoop.examples.WordCount.main(WordCount.java:67)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
-bash-4.1$
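An IOException whose message is just the viewfs URI, thrown from InodeTree.init, typically indicates that the client resolving viewfs://cluster6/ has no mount-table entries for that cluster name. The sketch below shows the general shape of the mount-table configuration, using a plain java.util.Properties as a stand-in for core-site.xml; the link paths and target namespace paths are made-up placeholders, not taken from the reported cluster.

```java
import java.util.Properties;

// Sketch of the client-side viewfs mount table that a default filesystem of
// viewfs://cluster6/ requires. Properties stands in for core-site.xml; the
// mount points and target paths below are hypothetical placeholders.
public class ViewFsMountTableSketch {
    public static Properties mountTable() {
        Properties conf = new Properties();
        conf.setProperty("fs.defaultFS", "viewfs://cluster6/");
        // Each fs.viewfs.mounttable.<cluster>.link.<path> entry maps one viewfs
        // path to a directory in one of the federated namespaces.
        conf.setProperty("fs.viewfs.mounttable.cluster6.link./hbase",
                         "hdfs://nameservice1/hbase");
        conf.setProperty("fs.viewfs.mounttable.cluster6.link./user",
                         "hdfs://oss-hadoop-nameservice/user");
        // Job submission writes staging files under the user's home/staging
        // directory, so that path must also resolve to some namespace.
        conf.setProperty("fs.viewfs.mounttable.cluster6.link./tmp",
                         "hdfs://nameservice1/tmp");
        return conf;
    }
}
```

The design point is that viewfs itself stores nothing: every path the job client touches (input, output, staging) must fall under some link, or filesystem initialization fails as in the trace above.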
[jira] [Commented] (MAPREDUCE-4671) AM does not tell the RM about container requests that are no longer needed
[ https://issues.apache.org/jira/browse/MAPREDUCE-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473035#comment-13473035 ]

Hadoop QA commented on MAPREDUCE-4671:
--------------------------------------
{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12548529/MAPREDUCE-4671.3.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2923//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2923//console

This message is automatically generated.
[jira] [Commented] (MAPREDUCE-4451) fairscheduler fail to init job with kerberos authentication configured
[ https://issues.apache.org/jira/browse/MAPREDUCE-4451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473076#comment-13473076 ]

Karthik Kambatla commented on MAPREDUCE-4451:
---------------------------------------------
Given that the patch doesn't have tests, I was planning on running a secure cluster to ascertain the behavior, but haven't been able to get to it. Will you be able to validate the behavior and report back? Otherwise, I'll see if I can do the same in the next couple of days.

fairscheduler fail to init job with kerberos authentication configured
----------------------------------------------------------------------
Key: MAPREDUCE-4451
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4451
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/fair-share
Affects Versions: 1.0.3
Reporter: Erik.fang
Attachments: MAPREDUCE-4451_branch-1.patch, MAPREDUCE-4451_branch-1.patch, MAPREDUCE-4451_branch-1.patch, MAPREDUCE-4451_branch-1.patch, MAPREDUCE-4451_branch-1.patch

Using FairScheduler in Hadoop 1.0.3 with kerberos authentication configured, job initialization fails:
{code}
2012-07-17 15:15:09,220 ERROR org.apache.hadoop.mapred.JobTracker: Job initialization failed:
java.io.IOException: Call to /192.168.7.80:8020 failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1129)
at org.apache.hadoop.ipc.Client.call(Client.java:1097)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
at $Proxy7.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:411)
at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:125)
at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:329)
at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:294)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:100)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1411)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1429)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
at org.apache.hadoop.security.Credentials.writeTokenStorageFile(Credentials.java:169)
at org.apache.hadoop.mapred.JobInProgress.generateAndStoreTokens(JobInProgress.java:3558)
at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:696)
at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:3911)
at org.apache.hadoop.mapred.FairScheduler$JobInitializer$InitJob.run(FairScheduler.java:301)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:543)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:488)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:590)
at org.apache.hadoop.ipc.Client$Connection.access$2100(Client.java:187)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1228)
at org.apache.hadoop.ipc.Client.call(Client.java:1072)
... 20 more
Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:194)
at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:134)
at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:385)
at org.apache.hadoop.ipc.Client$Connection.access$1200(Client.java:187)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:583)
at
{code}
[jira] [Commented] (MAPREDUCE-4451) fairscheduler fail to init job with kerberos authentication configured
[ https://issues.apache.org/jira/browse/MAPREDUCE-4451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473090#comment-13473090 ]

Erik.fang commented on MAPREDUCE-4451:
--------------------------------------
Before uploading the patch, I tested it in a 4-node dev cluster with hadoop-1.0.3 to make sure it works. It may be better to validate the patch with branch-1 compiled jars; I can do that today or tomorrow and post the result. However, I can only post some jobtracker logs showing that job initialization fails before applying the patch and everything works fine after applying it. Is that enough, or are there other ideas?
[jira] [Created] (MAPREDUCE-4718) MapReduce fails If I pass a parameter as a S3 folder
Benjamin Kim created MAPREDUCE-4718:
------------------------------------
Summary: MapReduce fails if I pass a parameter as a S3 folder
Key: MAPREDUCE-4718
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4718
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: job submission
Affects Versions: 1.0.0, 1.0.3
Environment: Hadoop with default configurations
Reporter: Benjamin Kim

I'm running a wordcount MR job as follows:
hadoop jar WordCount.jar wordcount.WordCountDriver s3n://bucket/wordcount/input s3n://bucket/wordcount/output
s3n://bucket/wordcount/input is an S3 object that contains other input files. However, I get the following NPE:
12/10/02 18:56:23 INFO mapred.JobClient: map 0% reduce 0%
12/10/02 18:56:54 INFO mapred.JobClient: map 50% reduce 0%
12/10/02 18:56:56 INFO mapred.JobClient: Task Id : attempt_201210021853_0001_m_01_0, Status : FAILED
java.lang.NullPointerException
at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.close(NativeS3FileSystem.java:106)
at java.io.BufferedInputStream.close(BufferedInputStream.java:451)
at java.io.FilterInputStream.close(FilterInputStream.java:155)
at org.apache.hadoop.util.LineReader.close(LineReader.java:83)
at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.close(LineRecordReader.java:144)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.close(MapTask.java:497)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:765)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
The MR job runs fine if I specify a more specific input path such as s3n://bucket/wordcount/input/file.txt; it fails if I pass an S3 folder as a parameter.
In summary, this works:
hadoop jar ./hadoop-examples-1.0.3.jar wordcount /user/hadoop/wordcount/input/ s3n://bucket/wordcount/output/
This doesn't:
hadoop jar ./hadoop-examples-1.0.3.jar wordcount s3n://bucket/wordcount/input/ s3n://bucket/wordcount/output/
(both input paths are directories)
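The stack trace in this report points at NativeS3FsInputStream.close dereferencing its wrapped stream. A null-tolerant, idempotent close() is the usual defensive pattern for wrapper streams closed from a finally block (as LineRecordReader.close is). The sketch below is a generic illustration of that pattern; GuardedStream is a hypothetical stand-in, not the actual NativeS3FileSystem code.

```java
import java.io.IOException;
import java.io.InputStream;

// Generic illustration of a null-safe, idempotent close() for a wrapper stream.
// GuardedStream is a hypothetical stand-in, not NativeS3FsInputStream itself.
public class GuardedStream extends InputStream {
    private InputStream in;  // may be null if the stream was never opened

    public GuardedStream(InputStream in) {
        this.in = in;
    }

    @Override
    public int read() throws IOException {
        if (in == null) throw new IOException("Stream closed or never opened");
        return in.read();
    }

    @Override
    public void close() throws IOException {
        // Guard against a null delegate so that close() called from a finally
        // block cannot throw a NullPointerException.
        if (in != null) {
            try {
                in.close();
            } finally {
                in = null;  // also makes a second close() a harmless no-op
            }
        }
    }
}
```

With the guard, closing a reader whose underlying S3 stream was never opened (or was already closed) is a no-op instead of an NPE that masks the task's real outcome.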
[jira] [Commented] (MAPREDUCE-3678) The Map tasks logs should have the value of input split it processed
[ https://issues.apache.org/jira/browse/MAPREDUCE-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473198#comment-13473198 ]

Hudson commented on MAPREDUCE-3678:
-----------------------------------
Integrated in Hadoop-Hdfs-trunk #1191 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1191/])
MAPREDUCE-3678. The Map tasks logs should have the value of input split it processed. Contributed by Harsh J. (harsh) (Revision 1396032)
Result = SUCCESS
harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1396032
Files:
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapTask.java

The Map tasks logs should have the value of input split it processed
--------------------------------------------------------------------
Key: MAPREDUCE-3678
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3678
Project: Hadoop Map/Reduce
Issue Type: New Feature
Components: mrv1, mrv2
Affects Versions: 1.0.0, 2.0.0-alpha
Reporter: Bejoy KS
Assignee: Harsh J
Fix For: 1.2.0, 2.0.3-alpha
Attachments: MAPREDUCE-3678-branch-1.patch, MAPREDUCE-3678.patch

It would be easier to debug some corner cases in tasks if we knew the input split processed by that task; the map/reduce task tracker log should accommodate this. Also, in the jobdetails web UI, the split should be displayed along with the split locations, for example:
Input Split: hdfs://myserver:9000/userdata/sampleapp/inputdir/file1.csv - split no/offset from beginning of file
This would be very beneficial for nailing down data quality issues in large-volume data processing.
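The feature above amounts to one descriptive log line emitted before the task consumes its split. A minimal sketch of that formatting is below; the committed change lives in MapTask.java, while the SplitInfo holder and describe method here are hypothetical stand-ins used only to show the shape of the message.

```java
// Hypothetical sketch of formatting a map task's input-split log line.
// The real change is in MapTask.java; SplitInfo is a stand-in holder.
public class SplitLogSketch {
    static final class SplitInfo {
        final String path;
        final long offset;
        final long length;
        SplitInfo(String path, long offset, long length) {
            this.path = path; this.offset = offset; this.length = length;
        }
    }

    // Formats the message a task would log before processing its split:
    // file path, starting offset, and split length.
    public static String describe(SplitInfo s) {
        return "Processing split: " + s.path + ":" + s.offset + "+" + s.length;
    }
}
```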
[jira] [Commented] (MAPREDUCE-4654) TestDistCp is @ignored
[ https://issues.apache.org/jira/browse/MAPREDUCE-4654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473199#comment-13473199 ] Hudson commented on MAPREDUCE-4654: --- Integrated in Hadoop-Hdfs-trunk #1191 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1191/]) MAPREDUCE-4654. TestDistCp is ignored. Contributed by Sandy Ryza. (Revision 1396047) Result = SUCCESS tomwhite : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1396047 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestDistCp.java * /hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestIntegration.java TestDistCp is @ignored -- Key: MAPREDUCE-4654 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4654 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 2.0.2-alpha Reporter: Colin Patrick McCabe Assignee: Sandy Ryza Priority: Critical Fix For: 2.0.3-alpha Attachments: MAPREDUCE-4654.patch We should fix TestDistCp so that it actually runs, rather than being ignored.
{code}
@Ignore
public class TestDistCp {
  private static final Log LOG = LogFactory.getLog(TestDistCp.class);
  private static List<Path> pathList = new ArrayList<Path>();
  ...
{code}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-4719) mapred.TaskInProgress should be public
Dave Beech created MAPREDUCE-4719: - Summary: mapred.TaskInProgress should be public Key: MAPREDUCE-4719 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4719 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Dave Beech Priority: Minor In Cloudera's CDH3 distributions, mapred.TaskInProgress has been made public along with its generateSingleReport() and getDiagnosticInfo() methods. Should this change be brought back into the main source tree? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4654) TestDistCp is @ignored
[ https://issues.apache.org/jira/browse/MAPREDUCE-4654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473238#comment-13473238 ] Hudson commented on MAPREDUCE-4654: --- Integrated in Hadoop-Mapreduce-trunk #1222 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1222/]) MAPREDUCE-4654. TestDistCp is ignored. Contributed by Sandy Ryza. (Revision 1396047) Result = SUCCESS tomwhite : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1396047 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestDistCp.java * /hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestIntegration.java TestDistCp is @ignored -- Key: MAPREDUCE-4654 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4654 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 2.0.2-alpha Reporter: Colin Patrick McCabe Assignee: Sandy Ryza Priority: Critical Fix For: 2.0.3-alpha Attachments: MAPREDUCE-4654.patch We should fix TestDistCp so that it actually runs, rather than being ignored.
{code}
@Ignore
public class TestDistCp {
  private static final Log LOG = LogFactory.getLog(TestDistCp.class);
  private static List<Path> pathList = new ArrayList<Path>();
  ...
{code}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3678) The Map tasks logs should have the value of input split it processed
[ https://issues.apache.org/jira/browse/MAPREDUCE-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473237#comment-13473237 ] Hudson commented on MAPREDUCE-3678: --- Integrated in Hadoop-Mapreduce-trunk #1222 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1222/]) MAPREDUCE-3678. The Map tasks logs should have the value of input split it processed. Contributed by Harsh J. (harsh) (Revision 1396032) Result = SUCCESS harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1396032 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapTask.java The Map tasks logs should have the value of input split it processed Key: MAPREDUCE-3678 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3678 Project: Hadoop Map/Reduce Issue Type: New Feature Components: mrv1, mrv2 Affects Versions: 1.0.0, 2.0.0-alpha Reporter: Bejoy KS Assignee: Harsh J Fix For: 1.2.0, 2.0.3-alpha Attachments: MAPREDUCE-3678-branch-1.patch, MAPREDUCE-3678.patch It would be easier to debug some corner cases in tasks if we knew what was the input split processed by that task. The MapReduce TaskTracker log should accommodate the same. Also in the jobdetails web UI, the split should be displayed along with the Split Locations. Sample: Input Split hdfs://myserver:9000/userdata/sampleapp/inputdir/file1.csv - split no/offset from beginning of file This would be much beneficial to nail down some data quality issues in large data volume processing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-4720) Browser thinks History Server main page JS is taking too long
Robert Joseph Evans created MAPREDUCE-4720: -- Summary: Browser thinks History Server main page JS is taking too long Key: MAPREDUCE-4720 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4720 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.3 Reporter: Robert Joseph Evans The main History Server page with the default settings of 20,000 jobs can cause browsers to think that the JS on the page is stuck and ask you if you want to kill it. This is a big usability problem. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4568) Throw early exception when duplicate files or archives are found in distributed cache
[ https://issues.apache.org/jira/browse/MAPREDUCE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473271#comment-13473271 ] Robert Joseph Evans commented on MAPREDUCE-4568: Adding a true duplicate, the exact same file multiple times, to the dist cache will not result in an error under YARN. The MR client will just dedupe them before submitting the request to YARN. The issue is when there are different files that will both map to the same key in the dist cache map (the key is the name of the symlink created in the working directory of the task/container). That is where it will throw an exception under 2.0. Throw early exception when duplicate files or archives are found in distributed cache --- Key: MAPREDUCE-4568 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4568 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Mohammad Kamrul Islam Assignee: Arun C Murthy According to #MAPREDUCE-4549, Hadoop 2.x throws an exception if duplicates are found in cacheFiles or cacheArchives. The exception is thrown during job submission. This JIRA is to throw the exception ==early== when it is first added to the Distributed Cache through addCacheFile or addFileToClassPath. It will help the client to decide whether to fail fast or continue w/o the duplicated entries. Alternatively, Hadoop could provide a knob where the user will choose whether to throw an error (coming behavior) or silently ignore (old behavior). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
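The dedupe-versus-collision distinction Bob describes can be sketched in plain Java. This is only an illustration of the behavior, not the actual MR client code; the class and method names (SymlinkCheck, addCacheUri, linkNameOf) are invented here, and the real client derives the link name from the URI fragment when one is given, else from the file name:

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: exact-duplicate URIs are silently de-duped,
// while two *different* URIs that would share one symlink name throw.
public class SymlinkCheck {
    private final Map<String, URI> byLinkName = new HashMap<>();

    // Symlink key: the URI fragment if present, else the last path segment.
    static String linkNameOf(URI uri) {
        if (uri.getFragment() != null) {
            return uri.getFragment();
        }
        String path = uri.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    // Returns true if added, false if it was an exact duplicate (de-duped).
    // Throws if a different URI already claimed the same symlink name.
    boolean addCacheUri(URI uri) {
        String link = linkNameOf(uri);
        URI existing = byLinkName.get(link);
        if (existing == null) {
            byLinkName.put(link, uri);
            return true;
        }
        if (existing.equals(uri)) {
            return false; // same file twice: harmless, just de-dupe
        }
        throw new IllegalArgumentException(
            "Two distinct cache files would share symlink '" + link + "'");
    }

    public static void main(String[] args) {
        SymlinkCheck c = new SymlinkCheck();
        c.addCacheUri(URI.create("hdfs://nn/data/dict.txt"));   // added
        c.addCacheUri(URI.create("hdfs://nn/data/dict.txt"));   // exact duplicate, ignored
        try {
            c.addCacheUri(URI.create("hdfs://nn/other/dict.txt")); // collision on "dict.txt"
        } catch (IllegalArgumentException e) {
            System.out.println("collision: " + e.getMessage());
        }
    }
}
```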
[jira] [Commented] (MAPREDUCE-4568) Throw early exception when duplicate files or archives are found in distributed cache
[ https://issues.apache.org/jira/browse/MAPREDUCE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13473277#comment-13473277 ] Jason Lowe commented on MAPREDUCE-4568: --- bq. In addition, it will be better, if there is a way of checking whether some file is already added in DC. Would adding an interface so the client can query the contents of the DC before job submission be sufficient? This seems like a reasonable enhancement that doesn't overlap with existing interfaces. Or do you think it's still a requirement to throw early when adding a collision? Throwing will require adding a new interface for adding to the DC which overlaps with existing functionality and adds to the pile of APIs we already have for adding things to the DC. Throw early exception when duplicate files or archives are found in distributed cache --- Key: MAPREDUCE-4568 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4568 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Mohammad Kamrul Islam Assignee: Arun C Murthy According to #MAPREDUCE-4549, Hadoop 2.x throws exception if duplicates found in cacheFiles or cacheArchives. The exception throws during job submission. This JIRA is to throw the exception ==early== when it is first added to the Distributed Cache through addCacheFile or addFileToClassPath. It will help the client to decide whether to fail-fast or continue w/o the duplicated entries. Alternatively, Hadoop could provide a knob where user will choose whether to throw error( coming behavior) or silently ignore (old behavior). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4719) mapred.TaskInProgress should be public
[ https://issues.apache.org/jira/browse/MAPREDUCE-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13473282#comment-13473282 ] Todd Lipcon commented on MAPREDUCE-4719: Hi Dave. What's the use case you're trying to address that needs them public? We did this in CDH back in early 2010 for use in a contrib plugin, but I'd like to hear why you need it before forward-porting the change. In more recent CDH, those plugins no longer exist as they've been supplanted by other APIs. mapred.TaskInProgress should be public -- Key: MAPREDUCE-4719 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4719 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Dave Beech Priority: Minor In Cloudera's CDH3 distributions, mapred.TaskInProgress has been made public along with its generateSingleReport() and getDiagnosticInfo() methods. Should this change be brought back into the main source tree? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4719) mapred.TaskInProgress should be public
[ https://issues.apache.org/jira/browse/MAPREDUCE-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473286#comment-13473286 ] Dave Beech commented on MAPREDUCE-4719: --- Hi Todd. I have no use case for this, actually. It's just an inconsistency I noticed and discussed with Steve Loughran (you may have seen the messages on Twitter!). Happy to have this closed as not a problem. mapred.TaskInProgress should be public -- Key: MAPREDUCE-4719 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4719 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Dave Beech Priority: Minor In Cloudera's CDH3 distributions, mapred.TaskInProgress has been made public along with its generateSingleReport() and getDiagnosticInfo() methods. Should this change be brought back into the main source tree? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (MAPREDUCE-4719) mapred.TaskInProgress should be public
[ https://issues.apache.org/jira/browse/MAPREDUCE-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon resolved MAPREDUCE-4719. Resolution: Not A Problem Gotcha. Let's resolve as not-a-problem then for now, and if someone disagrees, we can re-open. mapred.TaskInProgress should be public -- Key: MAPREDUCE-4719 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4719 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Dave Beech Priority: Minor In Cloudera's CDH3 distributions, mapred.TaskInProgress has been made public along with its generateSingleReport() and getDiagnosticInfo() methods. Should this change be brought back into the main source tree? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (MAPREDUCE-4717) Mapreduce job fails to run after configuring multiple namespaces [HDFS Federation]
[ https://issues.apache.org/jira/browse/MAPREDUCE-4717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers resolved MAPREDUCE-4717. --- Resolution: Not A Problem Hi Sagar, this looks to me to be most likely a configuration error, in which case you should try emailing a user mailing list. I'm guessing from the facts that you say you configured federation and you're using MR1 that you're using CDH, in which case you should email cdh-u...@cloudera.org. If I'm wrong about that and you're somehow using a straight Apache release, then you should email u...@hadoop.apache.org. Mapreduce job fails to run after configuring multiple namespaces [HDFS Federation] -- Key: MAPREDUCE-4717 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4717 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1 Affects Versions: 0.20.2 Environment: 2 Standalone Desktops with 360GB HDD and 4GB RAM - acting as two Namenodes; 2 Virtual Machines with 60GB HDD and 1GB RAM - acting as Jobtracker and Zookeeper Reporter: Sagar Shimpi I have a setup of 4 nodes with the following details - Standalone Desktop-1 - NameNode1, Tasktracker, Zookeeper, Jobtracker, Datanode, HMaster Standalone Desktop-2 - NameNode2, Tasktracker, Datanode, RegionServer Virtual Machine-1 - Namenode3, Datanode, Tasktracker Virtual Machine-2 - Namenode4, Datanode, Tasktracker I have configured HDFS Federation with the following name services - a) nameservice1 b) oss-hadoop-nameservice While executing a Mapreduce job I am getting the following error -
-bash-4.1$ id
uid=496(hdfs) gid=496(hdfs) groups=496(hdfs),497(hadoop)
-bash-4.1$ hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar wordcount /hbase/install.log.syslog /hbase/testing
12/10/10 12:30:21 ERROR security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:SIMPLE) cause:java.io.IOException: viewfs://cluster6/
java.io.IOException: viewfs://cluster6/
at org.apache.hadoop.fs.viewfs.InodeTree.<init>(InodeTree.java:338)
at org.apache.hadoop.fs.viewfs.ViewFileSystem$1.<init>(ViewFileSystem.java:178)
at org.apache.hadoop.fs.viewfs.ViewFileSystem.initialize(ViewFileSystem.java:178)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2150)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:80)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2184)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2166)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:302)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:194)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:103)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:844)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:844)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:481)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:511)
at org.apache.hadoop.examples.WordCount.main(WordCount.java:67)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
-bash-4.1$
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see:
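An IOException naming only the viewfs URI, thrown from the InodeTree constructor, is typically what ViewFileSystem produces when fs.defaultFS points at a viewfs mount table that has no mount links configured. A minimal sketch of the relevant core-site.xml entries follows; the mount-table name matches the error above, but the link paths and namenode addresses are illustrative assumptions, not Sagar's actual configuration:

```xml
<!-- Illustrative viewfs mount table for a federated cluster named "cluster6".
     Hostnames, ports, and link paths are hypothetical. -->
<property>
  <name>fs.defaultFS</name>
  <value>viewfs://cluster6/</value>
</property>
<property>
  <name>fs.viewfs.mounttable.cluster6.link./hbase</name>
  <value>hdfs://namenode1:8020/hbase</value>
</property>
<property>
  <name>fs.viewfs.mounttable.cluster6.link./user</name>
  <value>hdfs://namenode2:8020/user</value>
</property>
```

Each `fs.viewfs.mounttable.<name>.link.<path>` property maps a client-visible path under the viewfs root to a concrete nameservice; with no such links defined, initializing viewfs://cluster6/ fails as in the trace.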
[jira] [Commented] (MAPREDUCE-4398) Fix mapred.system.dir permission error with FairScheduler
[ https://issues.apache.org/jira/browse/MAPREDUCE-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473380#comment-13473380 ] Arpit Gupta commented on MAPREDUCE-4398: The following stack trace was seen when using the fair scheduler with the 1.0.3 release
{code}
Generating 100 using 2 maps with step of 50
12/10/09 19:04:09 INFO mapred.JobClient: Running job: job_201210091900_0002
12/10/09 19:04:10 INFO mapred.JobClient: map 0% reduce 0%
12/10/09 19:04:10 INFO mapred.JobClient: Job complete: job_201210091900_0002
12/10/09 19:04:10 INFO mapred.JobClient: Counters: 0
12/10/09 19:04:10 INFO mapred.JobClient: Job Failed: Job initialization failed:
org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=robing, access=EXECUTE, inode=system:mapred:hadoop:rwx--
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:95)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:3251)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:713)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:182)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:555)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:536)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:443)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:435)
at org.apache.hadoop.security.Credentials.writeTokenStorageFile(Credentials.java:169)
at org.apache.hadoop.mapred.JobInProgress.generateAndStoreTokens(JobInProgress.java:3537)
at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:696)
at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:4207)
at org.apache.hadoop.mapred.FairScheduler$JobInitializer$InitJob.run(FairScheduler.java:291)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
{code}
It looks like when using the fair scheduler the job token file is being written (in mapred.system.dir) as the user running the job, whereas if we use the default scheduler that file is being written as the user running MR. Fix mapred.system.dir permission error with FairScheduler - Key: MAPREDUCE-4398 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4398 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/fair-share Affects Versions: 1.0.3 Reporter: Luke Lu Assignee: Yu Gao Incorrect job initialization logic in FairScheduler causes mysterious intermittent mapred.system.dir permission errors. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2454) Allow external sorter plugin for MR
[ https://issues.apache.org/jira/browse/MAPREDUCE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473411#comment-13473411 ] Alejandro Abdelnur commented on MAPREDUCE-2454: --- Initial feedback on the patch (I'll do a more detailed review): * Nice work * patch needs rebase, TestReduceTask.java has been moved to hadoop-mapreduce-client-jobclient/ * remove introduced unused imports throughout the patch * reformat lines with over 80 chars throughout the patch I'm not thrilled about how we are mixing mapred and mapreduce classes in the APIs of the pluggable sort. But given how the current MR implementation is done, I don't think it is possible to avoid that without a major cleanup/refactoring of much bigger scope. One thing that would be quite useful, and I'd say a pre-requisite before committing, is a performance comparison of terasort with and without the patch; we shouldn't be introducing a noticeable performance penalty. Allow external sorter plugin for MR --- Key: MAPREDUCE-2454 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2454 Project: Hadoop Map/Reduce Issue Type: New Feature Affects Versions: 2.0.0-alpha, 3.0.0, 2.0.2-alpha Reporter: Mariappan Asokan Assignee: Mariappan Asokan Priority: Minor Labels: features, performance, plugin, sort Attachments: HadoopSortPlugin.pdf, HadoopSortPlugin.pdf, KeyValueIterator.java, MapOutputSorterAbstract.java, MapOutputSorter.java, mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, mr-2454-on-mr-279-build82.patch.gz, MR-2454-trunkPatchPreview.gz, ReduceInputSorter.java Define interfaces and some abstract classes in the Hadoop framework to facilitate external sorter plugins both on the Map and Reduce sides. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
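For readers without the patch handy, the general shape of a pluggable map-output sorter can be sketched as below. This is an illustrative stand-in, not the interfaces from the attached patch (MapOutputSorter.java etc.); RecordSorter and InMemorySorter are invented names, and a real external sorter would additionally handle serialization, spills, and combiner invocation:

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

public class SortPluginSketch {

    // Hypothetical plugin contract: given buffered map-output records and
    // the job's key comparator, return the records in sorted order.
    interface RecordSorter<K, V> {
        List<Map.Entry<K, V>> sort(List<Map.Entry<K, V>> records,
                                   Comparator<? super K> keyComparator);
    }

    // Default in-memory implementation; an external sorter (e.g. one that
    // spills to disk) would implement the same contract.
    static class InMemorySorter<K, V> implements RecordSorter<K, V> {
        public List<Map.Entry<K, V>> sort(List<Map.Entry<K, V>> records,
                                          Comparator<? super K> keyComparator) {
            List<Map.Entry<K, V>> out = new ArrayList<>(records);
            out.sort(Map.Entry.comparingByKey(keyComparator));
            return out;
        }
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> recs = new ArrayList<>();
        recs.add(new SimpleEntry<>("banana", 2));
        recs.add(new SimpleEntry<>("apple", 1));
        List<Map.Entry<String, Integer>> sorted =
            new InMemorySorter<String, Integer>().sort(recs, Comparator.naturalOrder());
        System.out.println(sorted.get(0).getKey()); // "apple" sorts first
    }
}
```

The point of the indirection is the one under discussion: the framework would pick the RecordSorter implementation from configuration, so a benchmark like the terasort comparison Alejandro asks for measures only the cost of the plugin boundary itself.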
[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN
[ https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473422#comment-13473422 ] Alejandro Abdelnur commented on MAPREDUCE-4495: --- +1 Following up: as I've said before, I think this is a good starting point and I'd like to commit it to trunk (only moving it to a release branch once it is in good shape). Workflow Application Master in YARN --- Key: MAPREDUCE-4495 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495 Project: Hadoop Map/Reduce Issue Type: New Feature Affects Versions: 2.0.0-alpha Reporter: Bo Wang Assignee: Bo Wang Attachments: MAPREDUCE-4495-v1.1.patch, MAPREDUCE-4495-v1.patch, MapReduceWorkflowAM.pdf It is useful to have a workflow application master, which will be capable of running a DAG of jobs. The workflow client submits a DAG request to the AM and then the AM will manage the life cycle of this application in terms of requesting the needed resources from the RM, and starting, monitoring and retrying the application's individual tasks. Compared to running Oozie with the current MapReduce Application Master, these are some of the advantages: - Less number of consumed resources, since only one application master will be spawned for the whole workflow. - Reuse of resources, since the same resources can be used by multiple consecutive jobs in the workflow (no need to request/wait for resources for every individual job from the central RM). - More optimization opportunities in terms of collective resource requests. - Optimization opportunities in terms of rewriting and composing jobs in the workflow (e.g. pushing down Mappers). - This Application Master can be reused/extended by higher systems like Pig and hive to provide an optimized way of running their workflows. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN
[ https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13473431#comment-13473431 ] Arun C Murthy commented on MAPREDUCE-4495: -- Tucu, I'm a little disappointed. We had a chat last week and I told you that I'd get back to you - I'll look at this soon, please wait. Thanks. Workflow Application Master in YARN --- Key: MAPREDUCE-4495 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495 Project: Hadoop Map/Reduce Issue Type: New Feature Affects Versions: 2.0.0-alpha Reporter: Bo Wang Assignee: Bo Wang Attachments: MAPREDUCE-4495-v1.1.patch, MAPREDUCE-4495-v1.patch, MapReduceWorkflowAM.pdf It is useful to have a workflow application master, which will be capable of running a DAG of jobs. The workflow client submits a DAG request to the AM and then the AM will manage the life cycle of this application in terms of requesting the needed resources from the RM, and starting, monitoring and retrying the application's individual tasks. Compared to running Oozie with the current MapReduce Application Master, these are some of the advantages: - Less number of consumed resources, since only one application master will be spawned for the whole workflow. - Reuse of resources, since the same resources can be used by multiple consecutive jobs in the workflow (no need to request/wait for resources for every individual job from the central RM). - More optimization opportunities in terms of collective resource requests. - Optimization opportunities in terms of rewriting and composing jobs in the workflow (e.g. pushing down Mappers). - This Application Master can be reused/extended by higher systems like Pig and hive to provide an optimized way of running their workflows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN
[ https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13473434#comment-13473434 ] Arun C Murthy commented on MAPREDUCE-4495: -- I'll ask the same question I asked you personally last week: Why aren't we putting this in an incubator project rather than importing code from Oozie etc. into MapReduce? What is the need for the complex event system here? Why is that needed if we only need MR jobs? Why aren't we using JobControl api? I've also asked this same question before: Why aren't we using JobControl api since it already exists? Workflow Application Master in YARN --- Key: MAPREDUCE-4495 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495 Project: Hadoop Map/Reduce Issue Type: New Feature Affects Versions: 2.0.0-alpha Reporter: Bo Wang Assignee: Bo Wang Attachments: MAPREDUCE-4495-v1.1.patch, MAPREDUCE-4495-v1.patch, MapReduceWorkflowAM.pdf It is useful to have a workflow application master, which will be capable of running a DAG of jobs. The workflow client submits a DAG request to the AM and then the AM will manage the life cycle of this application in terms of requesting the needed resources from the RM, and starting, monitoring and retrying the application's individual tasks. Compared to running Oozie with the current MapReduce Application Master, these are some of the advantages: - Less number of consumed resources, since only one application master will be spawned for the whole workflow. - Reuse of resources, since the same resources can be used by multiple consecutive jobs in the workflow (no need to request/wait for resources for every individual job from the central RM). - More optimization opportunities in terms of collective resource requests. - Optimization opportunities in terms of rewriting and composing jobs in the workflow (e.g. pushing down Mappers). 
- This Application Master can be reused/extended by higher systems like Pig and hive to provide an optimized way of running their workflows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN
[ https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473467#comment-13473467 ] Arun C Murthy commented on MAPREDUCE-4495: -- Let me restate what I've been saying all along: WFAM has very wide scope and is importing a whole new bunch (500KB) of code from Oozie, i.e. workflowlib. Given that, it belongs in a separate project by itself; there is no need to extend Hadoop to incorporate Oozie. --- OTOH, if you want to merely support a DAG of MR jobs we already have JobControl - we can, trivially, change JobControl to run in an AM without any need for workflowlib. So, let's not import that in. Let's not blow up Hadoop into an even bigger umbrella project by importing Oozie into it. Let's do it in an incubator project. I have a proposal which I'll share; you can be part of it from day one. Makes sense? Thanks. Workflow Application Master in YARN --- Key: MAPREDUCE-4495 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495 Project: Hadoop Map/Reduce Issue Type: New Feature Affects Versions: 2.0.0-alpha Reporter: Bo Wang Assignee: Bo Wang Attachments: MAPREDUCE-4495-v1.1.patch, MAPREDUCE-4495-v1.patch, MapReduceWorkflowAM.pdf It is useful to have a workflow application master, which will be capable of running a DAG of jobs. The workflow client submits a DAG request to the AM and then the AM will manage the life cycle of this application in terms of requesting the needed resources from the RM, and starting, monitoring and retrying the application's individual tasks. Compared to running Oozie with the current MapReduce Application Master, these are some of the advantages: - Less number of consumed resources, since only one application master will be spawned for the whole workflow. - Reuse of resources, since the same resources can be used by multiple consecutive jobs in the workflow (no need to request/wait for resources for every individual job from the central RM). 
- More optimization opportunities in terms of collective resource requests. - Optimization opportunities in terms of rewriting and composing jobs in the workflow (e.g. pushing down Mappers). - This Application Master can be reused/extended by higher-level systems like Pig and Hive to provide an optimized way of running their workflows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN
[ https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473475#comment-13473475 ] Arun C Murthy commented on MAPREDUCE-4495: -- Tucu - I remember talking with you that we should discuss at the contributor meetup (Friday); maybe there was a misunderstanding. Anyway, doesn't matter, thanks.
[jira] [Commented] (MAPREDUCE-4451) fairscheduler fail to init job with kerberos authentication configured
[ https://issues.apache.org/jira/browse/MAPREDUCE-4451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473495#comment-13473495 ] Karthik Kambatla commented on MAPREDUCE-4451: - Thanks Erik. In my opinion, that should be good enough. fairscheduler fail to init job with kerberos authentication configured -- Key: MAPREDUCE-4451 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4451 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/fair-share Affects Versions: 1.0.3 Reporter: Erik.fang Attachments: MAPREDUCE-4451_branch-1.patch, MAPREDUCE-4451_branch-1.patch, MAPREDUCE-4451_branch-1.patch, MAPREDUCE-4451_branch-1.patch, MAPREDUCE-4451_branch-1.patch Using FairScheduler in Hadoop 1.0.3 with kerberos authentication configured. Job initialization fails:
{code}
2012-07-17 15:15:09,220 ERROR org.apache.hadoop.mapred.JobTracker: Job initialization failed:
java.io.IOException: Call to /192.168.7.80:8020 failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
	at org.apache.hadoop.ipc.Client.wrapException(Client.java:1129)
	at org.apache.hadoop.ipc.Client.call(Client.java:1097)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
	at $Proxy7.getProtocolVersion(Unknown Source)
	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:411)
	at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:125)
	at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:329)
	at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:294)
	at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:100)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1411)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1429)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
	at org.apache.hadoop.security.Credentials.writeTokenStorageFile(Credentials.java:169)
	at org.apache.hadoop.mapred.JobInProgress.generateAndStoreTokens(JobInProgress.java:3558)
	at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:696)
	at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:3911)
	at org.apache.hadoop.mapred.FairScheduler$JobInitializer$InitJob.run(FairScheduler.java:301)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
	at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:543)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
	at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:488)
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:590)
	at org.apache.hadoop.ipc.Client$Connection.access$2100(Client.java:187)
	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1228)
	at org.apache.hadoop.ipc.Client.call(Client.java:1072)
	... 20 more
Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
	at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:194)
	at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:134)
	at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:385)
	at org.apache.hadoop.ipc.Client$Connection.access$1200(Client.java:187)
	at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:583)
	at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:580)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at
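The trace above shows JobInProgress.initTasks running on a FairScheduler pool thread that lacks the JobTracker's Kerberos login context. The general pattern behind fixes of this kind, capturing the submitter's security context and re-establishing it inside the worker thread, can be sketched in plain Java. This is a hypothetical illustration of the idiom, not the actual patch, and the ThreadLocal context below is a stand-in for Hadoop's UserGroupInformation:

```java
import java.util.concurrent.*;

// Sketch: work handed to a pool thread must re-establish the submitter's
// security context before touching secured services. The String-valued
// CONTEXT is a hypothetical stand-in for a real Kerberos/UGI context.
public class ContextPropagation {
    static final ThreadLocal<String> CONTEXT = ThreadLocal.withInitial(() -> "nobody");

    /** Wraps a task so it runs with the caller's context, not the pool thread's. */
    static Runnable withCallerContext(Runnable task) {
        final String captured = CONTEXT.get();   // capture on the submitting thread
        return () -> {
            String old = CONTEXT.get();
            CONTEXT.set(captured);               // re-establish inside the pool thread
            try {
                task.run();
            } finally {
                CONTEXT.set(old);                // restore the worker's own context
            }
        };
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CONTEXT.set("jobtracker/host@REALM");    // hypothetical principal name
        Future<?> done = pool.submit(withCallerContext(
                () -> System.out.println("init runs as: " + CONTEXT.get())));
        done.get();
        pool.shutdown();
    }
}
```

Without the wrapper, the pool thread would see its own default context ("nobody"), which is the shape of the SASL failure in the trace: the RPC layer finds no credentials on the initializer thread.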
[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN
[ https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473513#comment-13473513 ] Robert Joseph Evans commented on MAPREDUCE-4495: I really do like the idea of having an AM that can run a workflow. I think that there is huge potential here and I want to see this move forward, but the size and scope of this change is a lot to take in. There are 11,734 lines in the patch. I realize that a lot of this was taken from Oozie itself, but then how are we going to keep the two in sync? What happens when Oozie finds a bug? How are we going to be sure that the fix is pulled into mapred? I really would prefer to see a more agile approach to these changes, and hopefully some of them can correspond to MR, YARN, and HDFS splitting apart after 2.0 has stabilized, so Arun's fears about Hadoop returning to being a project of projects can be alleviated. Can we look at moving the parts that can be common between Oozie and the workflow AM into a separate project? I would expect that project to eventually own the complete Workflow AM, but in the short term it would just provide a place for this workflow library. In parallel with that we can move forward and put in a simple AM that allows the existing JobControl API to run in an AM. This would allow us to validate that the MR AM is thread safe, and keep it that way. It would also offer a potentially huge benefit to Pig, which uses that API currently. I would expect most of the initial code for this JobControl workflow AM to be replaced as it moves to use the common workflow library. Doing this in an agile fashion would also allow us to work out a number of potential issues I see in moving this from Oozie, which uses a DB to store its state, to a workflow AM, where that is not possible. By doing an initial simple JobControl AM we can work out some of the issues with restarting the AM after it crashes.
What is more, by keeping the changes small, the work is much more likely to be something that can be merged into branch-2 so that the branches do not diverge nearly as much.
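Bobby's JobControl suggestion is concrete: the existing org.apache.hadoop.mapreduce.lib.jobcontrol API already models a workflow as jobs plus per-job dependency lists, submitting a job once all of its dependencies have succeeded. A self-contained sketch of that dependency-resolution loop in plain Java, where the job names are hypothetical placeholders for ControlledJob instances (this illustrates the idea, it is not JobControl's actual code):

```java
import java.util.*;

// Sketch of the bookkeeping JobControl performs: a job becomes runnable
// once every job it depends on has completed successfully.
public class WorkflowSketch {

    /** Returns one valid execution order for the DAG, or throws on a cycle. */
    public static List<String> executionOrder(Map<String, List<String>> deps) {
        List<String> order = new ArrayList<>();
        Set<String> done = new HashSet<>();
        while (done.size() < deps.size()) {
            boolean progressed = false;
            for (Map.Entry<String, List<String>> e : deps.entrySet()) {
                if (!done.contains(e.getKey()) && done.containsAll(e.getValue())) {
                    order.add(e.getKey());  // JobControl would submit the job here...
                    done.add(e.getKey());   // ...and mark it SUCCESS on completion
                    progressed = true;
                }
            }
            if (!progressed) throw new IllegalStateException("cycle in workflow DAG");
        }
        return order;
    }

    public static void main(String[] args) {
        // Hypothetical four-job workflow: report depends on clean and join,
        // both of which depend on load.
        Map<String, List<String>> deps = new LinkedHashMap<>();
        deps.put("load", List.of());
        deps.put("clean", List.of("load"));
        deps.put("join", List.of("load"));
        deps.put("report", List.of("clean", "join"));
        System.out.println(executionOrder(deps)); // prints: [load, clean, join, report]
    }
}
```

With the real API the same shape appears as ControlledJob objects wired together with addDependingJob and driven by a JobControl thread, which is why running it inside an AM is a small step rather than a new workflow engine.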
[jira] [Created] (MAPREDUCE-4721) Task startup time in JHS is same as job startup time.
Ravi Prakash created MAPREDUCE-4721: --- Summary: Task startup time in JHS is same as job startup time. Key: MAPREDUCE-4721 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4721 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver Affects Versions: 0.23.3 Reporter: Ravi Prakash As Bobby pointed out in https://issues.apache.org/jira/browse/MAPREDUCE-4711?focusedCommentId=13471696&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13471696 the Map and Reduce tasks pages should print the earliest task attempt launch time, as TaskImpl#getLaunchTime() does.
[jira] [Assigned] (MAPREDUCE-4721) Task startup time in JHS is same as job startup time.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash reassigned MAPREDUCE-4721: --- Assignee: Ravi Prakash
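The fix Ravi describes amounts to reporting the minimum launch time across a task's attempts, in the spirit of TaskImpl#getLaunchTime(), instead of echoing the job start time. A minimal sketch of that computation in plain Java, where the TaskAttempt record and the timestamps are hypothetical stand-ins rather than the real MapReduce types:

```java
import java.util.*;

// Sketch: a task's launch time is the earliest launch time among its
// attempts. "TaskAttempt" here is a hypothetical stand-in type.
public class TaskLaunchTime {
    record TaskAttempt(String id, long launchTime) {}

    /** Earliest attempt launch time, or -1 if no attempt has launched yet. */
    static long earliestLaunchTime(List<TaskAttempt> attempts) {
        return attempts.stream()
                .mapToLong(TaskAttempt::launchTime)
                .min()
                .orElse(-1L);
    }

    public static void main(String[] args) {
        // The original attempt launched first; a later speculative retry
        // must not shift the reported task launch time.
        List<TaskAttempt> attempts = List.of(
                new TaskAttempt("attempt_0", 1_350_000_100L),
                new TaskAttempt("attempt_1", 1_350_000_500L));
        System.out.println(earliestLaunchTime(attempts)); // prints: 1350000100
    }
}
```

The -1 sentinel for "not launched yet" is an assumption for this sketch; the JHS page would render such a task as not started rather than reusing the job's start time.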
[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN
[ https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473635#comment-13473635 ] Siddharth Seth commented on MAPREDUCE-4495: --- It'd be very nice to have an AM which is capable of processing workflows. I'm not sure that this belongs under the MR project, though. Along with bringing a lot of Oozie / workflow code into MR, it also seems to restrict the AM's scope to running Java / MapReduce actions. I like Bobby's suggestion of having the workflow library live in a separate project which can be used by Oozie as well as the WFAM. I'd be happy to contribute to this.
[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN
[ https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473727#comment-13473727 ] Mayank Bansal commented on MAPREDUCE-4495: -- From the beginning I have been in favor of adding a workflow DAG AM, and I have already shown interest in contributing to it. I was previously in favor of adding this work to MR and YARN, but Bobby correctly pointed out that it would be very difficult for us to keep Oozie and the WFAM in sync. I don't think we should add the WFAM to Oozie either, because then it would be a problem for every other project, like Pig and Hive, or any future project that wants workflow functionality. Ideally it should be a library/project that projects like Oozie, Pig and Hive depend on. One can argue that it should then be part of MR and YARN so that every other project can depend on it, but if we do that, every new piece of functionality that wants to implement the AM interface has to be part of YARN, which does not seem right to me at this point. I think the basic idea behind the AM abstraction is that any new application/project can implement it and use the YARN framework; that does not necessarily mean it should be part of the YARN framework itself. Based on my understanding, I think we should create a new project and move forward; I am very much willing to contribute to that. It would be easier for us to innovate and move forward in a separate project than as part of YARN. That's just my understanding. Thoughts?