[jira] [Resolved] (MAPREDUCE-5512) TaskTracker hung after failed reconnect to the JobTracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ivan Mitic resolved MAPREDUCE-5512.
----------------------------------
       Resolution: Fixed
    Fix Version/s: 1.3.0
                   1-win

Fix committed to branch-1 and branch-1-win.

> TaskTracker hung after failed reconnect to the JobTracker
> ---------------------------------------------------------
>
>                 Key: MAPREDUCE-5512
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5512
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>    Affects Versions: 1.3.0
>            Reporter: Ivan Mitic
>            Assignee: Ivan Mitic
>             Fix For: 1-win, 1.3.0
>
>         Attachments: hadoop-tasktracker-RD00155DD09100.log,
>                      MAPREDUCE-5512.branch-1.patch, tt_Hung.txt
>
>
> TaskTracker hung after failed reconnect to the JobTracker.
> This is the problematic piece of code:
> {code}
>     this.distributedCacheManager = new TrackerDistributedCacheManager(
>         this.fConf, taskController);
>     this.distributedCacheManager.startCleanupThread();
>
>     this.jobClient = (InterTrackerProtocol)
>         UserGroupInformation.getLoginUser().doAs(
>             new PrivilegedExceptionAction() {
>           public Object run() throws IOException {
>             return RPC.waitForProxy(InterTrackerProtocol.class,
>                 InterTrackerProtocol.versionID,
>                 jobTrackAddr, fConf);
>           }
>         });
> {code}
> In case RPC.waitForProxy() throws, TrackerDistributedCacheManager cleanup
> thread will never be stopped, and given that it is a non daemon thread it
> will keep TT up forever.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
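For readers of the archive, the failure mode described in the report can be reproduced in miniature. The following is only a sketch under stated assumptions, not the committed patch: `CacheManager`, `Connector` and `initialize` are hypothetical stand-ins for the TaskTracker pieces, and the guard in `finally` is just one way to make sure a non-daemon thread is stopped when the connect step throws.

```java
import java.io.IOException;

public class CleanupOnFailedInit {

    /** Hypothetical stand-in for a component that owns a non-daemon cleanup thread. */
    static class CacheManager {
        private volatile boolean running = true;
        private final Thread cleanupThread = new Thread(() -> {
            while (running) {
                try {
                    Thread.sleep(50); // placeholder for periodic cleanup work
                } catch (InterruptedException e) {
                    return; // interrupted by stopCleanupThread()
                }
            }
        }, "cache-cleanup");

        CacheManager() {
            // A non-daemon thread keeps the JVM alive until it exits.
            cleanupThread.setDaemon(false);
        }

        void startCleanupThread() {
            cleanupThread.start();
        }

        void stopCleanupThread() {
            running = false;
            cleanupThread.interrupt();
            try {
                cleanupThread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        boolean isCleanupThreadAlive() {
            return cleanupThread.isAlive();
        }
    }

    /** Hypothetical stand-in for the step that may throw, such as an RPC connect. */
    interface Connector {
        Object connect() throws IOException;
    }

    /** Starts the cleanup thread, then connects; stops the thread again if connecting fails. */
    static void initialize(CacheManager manager, Connector connector) throws IOException {
        manager.startCleanupThread();
        boolean connected = false;
        try {
            connector.connect();
            connected = true;
        } finally {
            if (!connected) {
                manager.stopCleanupThread();
            }
        }
    }

    public static void main(String[] args) {
        CacheManager manager = new CacheManager();
        try {
            initialize(manager, () -> {
                throw new IOException("simulated connect failure");
            });
        } catch (IOException e) {
            System.out.println("connect failed: " + e.getMessage());
        }
        System.out.println("cleanup thread alive: " + manager.isCleanupThreadAlive());
    }
}
```

Without the `finally` guard, `main` would print the failure and then never exit, which matches the hang described in the report.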
[jira] [Created] (MAPREDUCE-5577) Allow querying the JobHistoryServer by job arrival time
Sandy Ryza created MAPREDUCE-5577:
-------------------------------------

             Summary: Allow querying the JobHistoryServer by job arrival time
                 Key: MAPREDUCE-5577
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5577
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: jobhistoryserver
            Reporter: Sandy Ryza
            Assignee: Sandy Ryza


The JobHistoryServer REST APIs currently allow querying by job submit time and finish time. However, jobs don't necessarily arrive in order of their finish time, meaning that a client who wants to stay on top of all completed jobs needs to query large time intervals to make sure they're not missing anything. Exposing functionality to allow querying by the time a job lands at the JobHistoryServer would allow clients to set the start of their query interval to the time of their last query.

The arrival time of a job would be defined as the time that it lands in the done directory.
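The reasoning in this request can be illustrated with a small in-memory model. This is a sketch only: the `Job` class and the two filter methods are hypothetical and do not correspond to any JobHistoryServer API, and the timestamps are made-up numbers chosen so that one job finishes early but arrives late.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ArrivalTimeQuerySketch {

    /** Hypothetical job record: when the job finished and when it landed in the done directory. */
    static final class Job {
        final String id;
        final long finishTime;
        final long arrivalTime;

        Job(String id, long finishTime, long arrivalTime) {
            this.id = id;
            this.finishTime = finishTime;
            this.arrivalTime = arrivalTime;
        }
    }

    /** Jobs whose finish time is at or after the given start of the query interval. */
    static List<String> finishedSince(List<Job> jobs, long since) {
        List<String> ids = new ArrayList<>();
        for (Job job : jobs) {
            if (job.finishTime >= since) {
                ids.add(job.id);
            }
        }
        return ids;
    }

    /** Jobs whose arrival time is at or after the given start of the query interval. */
    static List<String> arrivedSince(List<Job> jobs, long since) {
        List<String> ids = new ArrayList<>();
        for (Job job : jobs) {
            if (job.arrivalTime >= since) {
                ids.add(job.id);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // Job B finished before job A but arrived after it.
        List<Job> jobs = Arrays.asList(
            new Job("A", 100L, 110L),
            new Job("B", 90L, 130L));

        long lastQueryTime = 120L; // the client already saw job A at this point

        System.out.println("by finish time:  " + finishedSince(jobs, lastQueryTime));
        System.out.println("by arrival time: " + arrivedSince(jobs, lastQueryTime));
    }
}
```

In this model a client that restarts its interval at the time of its last query misses job B when filtering by finish time, but picks it up when filtering by arrival time.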
pipes not working in MR2?
I'm unable to get a simple hadoop pipes job working in MR2, and got the sense it hasn't been working for a while. Does anybody have any insight into what's going on? Has anybody used them successfully recently?

thanks for any help,
Sandy
[jira] [Created] (MAPREDUCE-5576) MR AM unregistration should be failed due to UnknownHostException on getting history url
Zhijie Shen created MAPREDUCE-5576:
--------------------------------------

             Summary: MR AM unregistration should be failed due to UnknownHostException on getting history url
                 Key: MAPREDUCE-5576
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5576
             Project: Hadoop Map/Reduce
          Issue Type: Bug
            Reporter: Zhijie Shen
            Assignee: Zhijie Shen


Before RMCommunicator sends the request to the RM to finish the application, it tries to get the JHS URL, which may throw UnknownHostException. The current code path skips sending the request to the RM when the exception is raised. That does not seem reasonable, because the RM's unregistering of an AM is not affected by the tracking URL. The URL can be empty or null.

AFAIK, the impact of a null URL is that the link redirecting users from the RM web page to the JHS will be unavailable, and the job report will not show the URL either. However, that is much better than failing an application because of UnknownHostException here. In any case, users can go to the JHS directly to find the application history info.

Therefore, the reasonable code path here is to catch UnknownHostException and set historyUrl = null.
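The proposed code path can be sketched as follows. This is an illustration under assumptions, not the RMCommunicator code: `HistoryUrlSource`, `ResourceManagerClient` and `unregister` are hypothetical names, and only the shape of the handling (catch UnknownHostException, fall back to a null URL, still send the finish request) comes from the report.

```java
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

public class UnregisterWithOptionalHistoryUrl {

    /** Hypothetical lookup of the history URL; may fail on host resolution. */
    interface HistoryUrlSource {
        String getHistoryUrl() throws UnknownHostException;
    }

    /** Hypothetical stand-in for the call that asks the RM to finish the application. */
    interface ResourceManagerClient {
        void finishApplication(String trackingUrl);
    }

    /** Always sends the finish request; a failed URL lookup only degrades the tracking URL. */
    static void unregister(HistoryUrlSource source, ResourceManagerClient rm) {
        String historyUrl;
        try {
            historyUrl = source.getHistoryUrl();
        } catch (UnknownHostException e) {
            // Assumption from the report: a null URL is acceptable to the RM.
            historyUrl = null;
        }
        rm.finishApplication(historyUrl);
    }

    public static void main(String[] args) {
        List<String> sent = new ArrayList<>();
        ResourceManagerClient rm = url -> sent.add(String.valueOf(url));

        unregister(() -> {
            throw new UnknownHostException("history host not resolvable");
        }, rm);

        System.out.println("finish requests sent: " + sent);
    }
}
```

The design point is that the URL lookup and the unregistration are decoupled: the lookup can fail without preventing the request that actually matters to the RM.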
Re: [VOTE] Release Apache Hadoop 2.2.0
+1 non-binding

Built from source code, and ran a few sample jobs on single node cluster, tested RM and AM recovery.

Thanks,
Jian
Re: [VOTE] Release Apache Hadoop 2.2.0
+1 non-binding

I verified the checksum and signature. I deployed the tarball to a small cluster of Ubuntu VMs: 1 * NameNode, 1 * ResourceManager, 2 * DataNode, 2 * NodeManager, 1 * SecondaryNameNode. I ran a few HDFS commands and sample MapReduce jobs. I verified that the 2NN can take a checkpoint successfully. Everything worked as expected.

The outcome of the recent discussions on HDFS symlinks was that we need to disable the feature in this release. Just to be certain that this patch took, I wrote a small client to call FileSystem.createSymlink and tried to run it in my 2.2.0 cluster. It threw UnsupportedOperationException, which is the expected behavior.

Chris Nauroth
Hortonworks
http://hortonworks.com/
[jira] [Created] (MAPREDUCE-5575) History files deleted from the intermediate directory never get removed from the JobListCache
Sandy Ryza created MAPREDUCE-5575:
-------------------------------------

             Summary: History files deleted from the intermediate directory never get removed from the JobListCache
                 Key: MAPREDUCE-5575
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5575
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: jobhistoryserver
    Affects Versions: 2.2.0
            Reporter: Sandy Ryza


The JobHistoryServer periodically scans through the intermediate directory. It adds all files to the JobListCache. It deletes job files that are older than the max age and moves all other files to the done directory. Later, when files in the done directory become too old, they're deleted from the JobListCache. Jobs that were deleted in the intermediate directory (and thus never moved to the done directory) end up in the JobListCache but can never be deleted from it.
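The scan described above can be modeled in a few lines. This is a sketch under assumptions, not JobHistoryServer code: the maps and sets stand in for the intermediate directory, the done directory and the JobListCache, all names are hypothetical, and removing the cache entry at deletion time is only one possible way to close the gap the report describes.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

public class IntermediateScanSketch {

    /** Hypothetical cache: job id mapped to finish time. */
    final Map<String, Long> jobListCache = new HashMap<>();

    /** Hypothetical done directory: ids of the jobs that were moved there. */
    final Set<String> doneDir = new HashSet<>();

    /**
     * Scans the intermediate directory (job id mapped to finish time).
     * Every file is added to the cache; files older than maxAgeMs are deleted,
     * all others are moved to the done directory.
     */
    void scanIntermediate(Map<String, Long> intermediateDir, long now, long maxAgeMs) {
        Iterator<Map.Entry<String, Long>> it = intermediateDir.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> file = it.next();
            String jobId = file.getKey();
            long finishTime = file.getValue();

            jobListCache.put(jobId, finishTime);
            it.remove(); // the file leaves the intermediate directory either way

            if (now - finishTime > maxAgeMs) {
                // Deleted without ever reaching the done directory, so the
                // done-directory cleanup will never see it; drop the entry here.
                jobListCache.remove(jobId);
            } else {
                doneDir.add(jobId);
            }
        }
    }

    public static void main(String[] args) {
        IntermediateScanSketch server = new IntermediateScanSketch();
        Map<String, Long> intermediateDir = new HashMap<>();
        intermediateDir.put("job_old", 0L);
        intermediateDir.put("job_new", 900L);

        server.scanIntermediate(intermediateDir, 1000L, 500L);

        System.out.println("cache: " + server.jobListCache.keySet());
        System.out.println("done:  " + server.doneDir);
    }
}
```

Dropping the `jobListCache.remove(jobId)` line reproduces the leak: `job_old` stays in the cache with no later step that could evict it.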
RE: [VOTE] Release Apache Hadoop 2.2.0
+1 (non binding)
Re: [VOTE] Release Apache Hadoop 2.2.0
+1 (non binding)

Ran secure and non secure multi node clusters and tested HA and RM recovery tests.

--
Arpit Gupta
Hortonworks Inc.
http://hortonworks.com/

On Oct 7, 2013, at 12:00 AM, Arun C Murthy wrote:

> Folks,
>
> I've created a release candidate (rc0) for hadoop-2.2.0 that I would like to
> get released - this release fixes a small number of bugs and some
> protocol/api issues which should ensure they are now stable and will not
> change in hadoop-2.x.
>
> The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.2.0-rc0
> The RC tag in svn is here:
> http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.2.0-rc0
>
> The maven artifacts are available via repository.apache.org.
>
> Please try the release and vote; the vote will run for the usual 7 days.
>
> thanks,
> Arun
>
> P.S.: Thanks to Colin, Andrew, Daryn, Chris and others for helping nail down
> the symlinks-related issues. I'll release note the fact that we have disabled
> it in 2.2. Also, thanks to Vinod for some heavy-lifting on the YARN side in
> the last couple of weeks.
>
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/

--
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to
which it is addressed and may contain information that is confidential,
privileged and exempt from disclosure under applicable law. If the reader
of this message is not the intended recipient, you are hereby notified that
any printing, copying, dissemination, distribution, disclosure or
forwarding of this communication is strictly prohibited. If you have
received this communication in error, please contact the sender immediately
and delete it from your system. Thank You.
Hadoop-Mapreduce-trunk - Build # 1574 - Still Failing
See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1574/

### ## LAST 60 LINES OF THE CONSOLE ###
[...truncated 33695 lines...]
  TestReduceFetchFromPartialMem.testReduceFromPartialMem:93->runJob:300 » OutOfMemory
  TestJobSysDirWithDFS.testWithDFS:130 » YarnRuntime java.lang.OutOfMemoryError:...
  TestLazyOutput.testLazyOutput:146 » YarnRuntime java.lang.OutOfMemoryError: un...
  TestSpecialCharactersInOutputPath.testJobWithDFS:112 » YarnRuntime java.lang.O...
  TestMapReduceLazyOutput.testLazyOutput:136 » YarnRuntime java.lang.OutOfMemory...
  TestSpeculativeExecution.setup:122 » IO Cannot run program "stat": java.io.IOE...
  TestMRJobs.setup:130 » YarnRuntime java.lang.OutOfMemoryError: unable to creat...
  TestRMNMInfo.setup:84 » IO Cannot run program "stat": java.io.IOException: err...
  TestUberAM.setup:45->TestMRJobs.setup:130 » YarnRuntime java.lang.OutOfMemoryE...

Tests run: 455, Failures: 8, Errors: 11, Skipped: 11

[INFO]
[INFO] Reactor Summary:
[INFO]
[INFO] hadoop-mapreduce-client ... SUCCESS [2.543s]
[INFO] hadoop-mapreduce-client-core .. SUCCESS [43.385s]
[INFO] hadoop-mapreduce-client-common SUCCESS [24.790s]
[INFO] hadoop-mapreduce-client-shuffle ... SUCCESS [2.472s]
[INFO] hadoop-mapreduce-client-app ... SUCCESS [6:47.311s]
[INFO] hadoop-mapreduce-client-hs SUCCESS [2:02.866s]
[INFO] hadoop-mapreduce-client-jobclient . FAILURE [49:40.789s]
[INFO] hadoop-mapreduce-client-hs-plugins SKIPPED
[INFO] Apache Hadoop MapReduce Examples .. SKIPPED
[INFO] hadoop-mapreduce .. SKIPPED
[INFO]
[INFO] BUILD FAILURE
[INFO]
[INFO] Total time: 59:44.837s
[INFO] Finished at: Thu Oct 10 14:19:09 UTC 2013
[INFO] Final Memory: 22M/93M
[INFO]
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.16:test (default-test) on project hadoop-mapreduce-client-jobclient: ExecutionException; nested exception is java.util.concurrent.ExecutionException: java.lang.RuntimeException: The forked VM terminated without saying properly goodbye. VM crash or System.exit called ?
[ERROR] Command was/bin/sh -c cd /home/jenkins/jenkins-slave/workspace/Hadoop-Mapreduce-trunk/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient && /home/jenkins/tools/java/jdk1.6.0_26/jre/bin/java -Xmx1024m -XX:+HeapDumpOnOutOfMemoryError -jar /home/jenkins/jenkins-slave/workspace/Hadoop-Mapreduce-trunk/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/surefire/surefirebooter5605906304175332674.jar /home/jenkins/jenkins-slave/workspace/Hadoop-Mapreduce-trunk/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/surefire/surefire9153747844174506124tmp /home/jenkins/jenkins-slave/workspace/Hadoop-Mapreduce-trunk/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/surefire/surefire_1115428300805884185348tmp
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn -rf :hadoop-mapreduce-client-jobclient
Build step 'Execute shell' marked build as failure
[FINDBUGS] Skipping publisher since build result is FAILURE
Archiving artifacts
Updating MAPREDUCE-5569
Updating HDFS-5323
Updating HADOOP-9470
Updating HDFS-4510
Updating YARN-1284
Updating YARN-1283
Updating YARN-879
Updating HADOOP-10031
Updating MAPREDUCE-5102
Updating HDFS-5337
Email was triggered for: Failure
Sending email for trigger: Failure

### ## FAILED TESTS (if any) ##
No tests ran.