[jira] [Resolved] (MAPREDUCE-3750) ConcurrentModificationException in counter groups
[ https://issues.apache.org/jira/browse/MAPREDUCE-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins resolved MAPREDUCE-3750. Resolution: Duplicate Dupe of MAPREDUCE-3749 > ConcurrentModificationException in counter groups > - > > Key: MAPREDUCE-3750 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3750 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Tom White >Priority: Blocker > > Iterating over a counter's groups while adding more groups results in a > ConcurrentModificationException. This was discovered while running Hive unit > tests on a recent 0.23 version of Hadoop. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
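The failure mode described in this issue can be reproduced outside Hadoop. The following is a minimal sketch, assuming a plain `HashMap` as a stand-in for the counter groups; the class name, method names, and group names are hypothetical and do not come from the Hadoop `Counters` code, whose actual implementation is not shown in this message.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

public class CounterGroupIteration {

    // Builds a small map standing in for counter groups (names are illustrative only).
    static Map<String, Long> sampleGroups() {
        Map<String, Long> groups = new HashMap<>();
        groups.put("FileSystemCounters", 1L);
        groups.put("TaskCounters", 2L);
        return groups;
    }

    // Adds a group while iterating over the live key set. Returns true if the
    // fail-fast iterator detected the modification and threw.
    static boolean addWhileIterating(Map<String, Long> groups) {
        try {
            for (String name : groups.keySet()) {
                groups.put(name + ".derived", 0L);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Iterates over a snapshot of the keys, so adding to the map during the
    // loop does not disturb the iterator. Returns the resulting group count.
    static int addWhileIteratingSnapshot(Map<String, Long> groups) {
        for (String name : new ArrayList<>(groups.keySet())) {
            groups.put(name + ".derived", 0L);
        }
        return groups.size();
    }

    public static void main(String[] args) {
        System.out.println("live iteration threw: " + addWhileIterating(sampleGroups()));
        System.out.println("groups after snapshot iteration: " + addWhileIteratingSnapshot(sampleGroups()));
    }
}
```

Copying the keys before iterating is only one way to avoid the exception, and not necessarily the approach taken in MAPREDUCE-3749.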
[jira] [Resolved] (MAPREDUCE-3575) Streaming/tools Jar does not get included in the tarball.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins resolved MAPREDUCE-3575. Resolution: Duplicate > Streaming/tools Jar does not get included in the tarball. > - > > Key: MAPREDUCE-3575 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3575 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Reporter: Mahadev konar >Priority: Blocker > Fix For: 0.23.1 > > > The streaming jar used to be available in the mapreduce tarballs before we > created the hadoop-tools package. The streaming and tools jars are not being > shipped with any tars. Our mapreduce tarballs should include the streaming > and tools jar.
[jira] [Resolved] (MAPREDUCE-3424) Some LinuxTaskController cleanup
[ https://issues.apache.org/jira/browse/MAPREDUCE-3424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins resolved MAPREDUCE-3424. Resolution: Fixed Hadoop Flags: Reviewed Thanks Todd. I've committed this. > Some LinuxTaskController cleanup > > > Key: MAPREDUCE-3424 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3424 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Components: tasktracker >Affects Versions: 0.20.205.0 >Reporter: Eli Collins >Assignee: Eli Collins >Priority: Minor > Attachments: mapreduce-3424-1.patch, mapreduce-3424-2.patch, > mapreduce-3424-3.patch > > > MR-2415 had some tabs and weird indenting and spacing. Also would be more > clear if LTC explicitly overrides createLogDir. Let's clean this up.
[jira] [Resolved] (MAPREDUCE-1549) TestTrackerDistributedCacheManager failed on some machines
[ https://issues.apache.org/jira/browse/MAPREDUCE-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins resolved MAPREDUCE-1549. Resolution: Duplicate The issue isn't host specific, the fix is to chmod a+x the hadoop directory. See MAPREDUCE-2073. I'll commit it to 20x. > TestTrackerDistributedCacheManager failed on some machines > -- > > Key: MAPREDUCE-1549 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1549 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Reporter: Amar Kamat > Attachments: > TEST-org.apache.hadoop.mapreduce.filecache.TestTrackerDistributedCacheManager.txt > > > TestTrackerDistributedCacheManager.testPublicPrivateCache fails on some > machines.
[jira] [Resolved] (MAPREDUCE-3356) TestTrackerDistributedCacheManager fails on branch-20-security
[ https://issues.apache.org/jira/browse/MAPREDUCE-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins resolved MAPREDUCE-3356. Resolution: Duplicate Dupe of MAPREDUCE-1549. > TestTrackerDistributedCacheManager fails on branch-20-security > -- > > Key: MAPREDUCE-3356 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3356 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: distributed-cache, test >Affects Versions: 0.20.205.0 >Reporter: Eli Collins > > The testReferenceCount and testPublicPrivateCache tests fail reproducibly on > branch-20-security. Details in follow up comment.
[jira] [Resolved] (MAPREDUCE-3419) Don't mark exited TT threads as dead in MiniMRCluster
[ https://issues.apache.org/jira/browse/MAPREDUCE-3419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins resolved MAPREDUCE-3419. Resolution: Fixed Hadoop Flags: Reviewed All tests ran cleanly except for one, which is unrelated, filed HADOOP-7836 for it. I've committed this. > Don't mark exited TT threads as dead in MiniMRCluster > --- > > Key: MAPREDUCE-3419 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3419 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: tasktracker, test >Affects Versions: 0.20.206.0 >Reporter: Eli Collins >Assignee: Eli Collins > Attachments: mapreduce-2850-1.patch > > > MAPREDUCE-2850 flagged all TT threads that exited in the MiniMRCluster as > dead, this breaks a number of the other tests that use MiniMRCluster across > restart.
[jira] [Resolved] (MAPREDUCE-3015) Add local dir failure info to metrics and the web UI
[ https://issues.apache.org/jira/browse/MAPREDUCE-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins resolved MAPREDUCE-3015. Resolution: Fixed Fix Version/s: 0.20.206.0 Hadoop Flags: Reviewed Since the changes to the previous patch were trivial I went ahead and committed this. Thanks Todd. > Add local dir failure info to metrics and the web UI > > > Key: MAPREDUCE-3015 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3015 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Components: tasktracker >Affects Versions: 0.20.204.0 >Reporter: Eli Collins >Assignee: Eli Collins > Fix For: 0.20.206.0 > > Attachments: mapreduce-3015-1.patch, mapreduce-3015-2.patch > > > Like HDFS-811/HDFS-1850 but for the TT.
[jira] [Resolved] (MAPREDUCE-2921) TaskTracker won't start with failed local directory
[ https://issues.apache.org/jira/browse/MAPREDUCE-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins resolved MAPREDUCE-2921. Resolution: Duplicate It turns out the TT will successfully start if there's a failed local directory (it checks the dirs and removes any that fail) so it will start up fine with a failed or read-only directory etc. The failure I discovered in the first description is because checkDir doesn't fail if the directory exists and is not executable, once that's fixed the TT will start up in that case as well. > TaskTracker won't start with failed local directory > --- > > Key: MAPREDUCE-2921 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-2921 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Components: tasktracker >Affects Versions: 0.20.204.0 >Reporter: Eli Collins >Assignee: Eli Collins > > Chmod'ing one of the mapred local directories so it's not executable will > cause the TT to fail to start. Doing this after the TT has started will > result in a TT that is up but can not execute tasks.
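The kind of directory check discussed in this resolution can be illustrated with a short sketch. It assumes a hypothetical helper named `checkLocalDir` built on `java.io.File`; it is not Hadoop's `checkDir`, whose code is not shown in this message, and it only demonstrates rejecting a directory that exists but is not executable.

```java
import java.io.File;
import java.io.IOException;

public class LocalDirCheck {

    // Hypothetical helper (an assumption, not Hadoop's checkDir): rejects a
    // local dir unless it is a directory that is readable, writable, and
    // executable (searchable) by the current process.
    static void checkLocalDir(File dir) throws IOException {
        if (!dir.isDirectory()) {
            throw new IOException("not a directory: " + dir);
        }
        if (!dir.canRead()) {
            throw new IOException("directory is not readable: " + dir);
        }
        if (!dir.canWrite()) {
            throw new IOException("directory is not writable: " + dir);
        }
        // The case called out above: the directory exists but is not executable.
        if (!dir.canExecute()) {
            throw new IOException("directory is not executable: " + dir);
        }
    }

    // Convenience wrapper so callers can drop failed dirs instead of aborting.
    static boolean isUsable(File dir) {
        try {
            checkLocalDir(dir);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (String path : args) {
            System.out.println(path + " usable: " + isUsable(new File(path)));
        }
    }
}
```

A caller could filter its configured local dirs through `isUsable` at startup and keep only those that pass. Permission checks of this kind may still succeed for a privileged user regardless of the mode bits, so results can differ when run as root.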
[jira] [Resolved] (MAPREDUCE-2924) TaskTracker number of failed disks to tolerate should be configurable
[ https://issues.apache.org/jira/browse/MAPREDUCE-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins resolved MAPREDUCE-2924. Resolution: Won't Fix Thought about this some.. I think leaving the current behavior as is (TT keeps running regardless of # disk failures) but using a health script that shuts down the TT when the DN goes down makes more sense. The DN already has logic for shutting down given a sufficient # of disk failures, and it doesn't make sense for the TT to keep running if the DN isn't running. Do think we still need to fix MAPREDUCE-2657, otherwise restarting a cluster may result in a bunch of TTs that were running not coming up because they tolerated a disk failure while running but won't while starting. > TaskTracker number of failed disks to tolerate should be configurable > - > > Key: MAPREDUCE-2924 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-2924 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Components: tasktracker >Affects Versions: 0.20.204.0 >Reporter: Eli Collins > > Like HDFS-1161 but for the TT. The user should be able to configure how many > valid disks are needed for operation. Currently the TT will start and accept > tasks even if eg only 1 of its 12 disks is working, which leads to poor > performance of jobs with tasks that use this machine.
[jira] [Resolved] (MAPREDUCE-3016) Add TT local dir failure info to the JT web UI
[ https://issues.apache.org/jira/browse/MAPREDUCE-3016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins resolved MAPREDUCE-3016. Resolution: Duplicate Assignee: Eli Collins This is really simple, will do as part of MAPREDUCE-3015. > Add TT local dir failure info to the JT web UI > -- > > Key: MAPREDUCE-3016 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3016 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Components: jobtracker >Affects Versions: 0.20.204.0 >Reporter: Eli Collins >Assignee: Eli Collins > > Like HDFS-556 but for the JT. The machine list page should report local > directory failures per TT.
[jira] [Resolved] (MAPREDUCE-2920) Local log dir links in the JT web UI are broken
[ https://issues.apache.org/jira/browse/MAPREDUCE-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins resolved MAPREDUCE-2920. Resolution: Won't Fix The task logs that are no longer available via logs/userlogs are available via the job history server (which works because it uses TaskLog to read the files instead of Jetty directly) so I don't think this is worth the effort of fixing. > Local log dir links in the JT web UI are broken > - > > Key: MAPREDUCE-2920 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-2920 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Components: tasktracker >Affects Versions: 0.20.204.0 >Reporter: Eli Collins > > The task log servlet can no longer access user logs because MAPREDUCE-2415 > introduced symlinks to the logs and Jetty is not configured by default to > follow symlinks (for security reasons).
[jira] [Resolved] (MAPREDUCE-2850) TaskTracker disk failure handling (MR-2413) has no test coverage
[ https://issues.apache.org/jira/browse/MAPREDUCE-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins resolved MAPREDUCE-2850. Resolution: Fixed Hadoop Flags: Reviewed > TaskTracker disk failure handling (MR-2413) has no test coverage > > > Key: MAPREDUCE-2850 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-2850 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Components: tasktracker >Affects Versions: 0.20.204.0 >Reporter: Eli Collins >Assignee: Ravi Gummadi > Attachments: MR2850.v0.patch, MR2850.v1.1.patch, MR2850.v1.2.patch, > MR2850.v1.3.patch, MR2850.v1.patch > > > MR-2413 doesn't have any test coverage that eg tests that the TT can survive > disk failure.
[jira] [Resolved] (MAPREDUCE-3011) TT should remove bad local dirs from conf to prevent constant disk checking
[ https://issues.apache.org/jira/browse/MAPREDUCE-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins resolved MAPREDUCE-3011. Resolution: Not A Problem You're right, I missed that we're always updating the conf the job is launched with to the latest known good dirs in TT#launchTaskForJob. Thanks for the explanation. I verified we weren't missing other locations by failing a local dir, logging in the confChanged case and verifying that we only notice the change once. > TT should remove bad local dirs from conf to prevent constant disk checking > --- > > Key: MAPREDUCE-3011 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3011 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Components: tasktracker >Affects Versions: 0.20.204.0 >Reporter: Eli Collins > > Per HADOOP-7551 the TT does not remove bad mapred.local.dirs from the conf so > after a single disk failure *every* call to get a local path for reading or > writing results in a disk check of *all* configured local dirs. After > detecting that a local dir is bad we should remove it from the conf so that > we don't repeatedly perform this expensive operation.
[jira] [Resolved] (MAPREDUCE-2957) The TT should not re-init if it has no good local dirs
[ https://issues.apache.org/jira/browse/MAPREDUCE-2957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins resolved MAPREDUCE-2957. Resolution: Fixed Hadoop Flags: Reviewed Thanks for the review Ravi! I've committed this. > The TT should not re-init if it has no good local dirs > -- > > Key: MAPREDUCE-2957 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-2957 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Components: tasktracker >Affects Versions: 0.20.204.0 >Reporter: Eli Collins >Assignee: Eli Collins > Attachments: mapreduce-2957-1.patch, mapreduce-2957-2.patch, > mapreduce-2957.patch > > > The TT will currently try to re-init itself on disk failure even if it has no > good local dirs. It should shutdown instead.