[jira] [Resolved] (MAPREDUCE-3750) ConcurrentModificationException in counter groups

2012-01-27 Thread Eli Collins (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved MAPREDUCE-3750.


Resolution: Duplicate

Dupe of MAPREDUCE-3749

> ConcurrentModificationException in counter groups
> -
>
> Key: MAPREDUCE-3750
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3750
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Tom White
> Priority: Blocker
>
> Iterating over a counter's groups while adding more groups results in a 
> ConcurrentModificationException. This was discovered while running Hive unit 
> tests on a recent 0.23 version of Hadoop.
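
The failure mode described above can be reproduced with plain java.util
collections. The sketch below is only an illustration: the class and method
names are hypothetical, a List<String> stands in for the counter groups, and
iterating over a snapshot is one common way to avoid the exception, not
necessarily the fix applied in MAPREDUCE-3749.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical illustration; this is not Hadoop's counter code.
public class CounterGroupIteration {

    // Adds to an ArrayList while iterating it. The fail-fast iterator
    // detects the structural change on the following next() call.
    public static boolean throwsWhenAddingDuringIteration(List<String> groups) {
        try {
            for (String name : groups) {
                groups.add(name + ".derived");
            }
            return false; // only reached when the list is empty
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Iterates over a snapshot so the live list can grow safely.
    public static void addDuringSnapshotIteration(List<String> groups) {
        List<String> snapshot = new ArrayList<>(groups);
        for (String name : snapshot) {
            groups.add(name + ".derived");
        }
    }

    public static void main(String[] args) {
        List<String> live = new ArrayList<>(Arrays.asList("groupA", "groupB"));
        System.out.println(throwsWhenAddingDuringIteration(live));

        List<String> safe = new ArrayList<>(Arrays.asList("groupA", "groupB"));
        addDuringSnapshotIteration(safe);
        System.out.println(safe);
    }
}
```

The snapshot costs a copy per iteration; a concurrent collection would be
another option when several threads are involved.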

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (MAPREDUCE-3575) Streaming/tools Jar does not get included in the tarball.

2011-12-20 Thread Eli Collins (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved MAPREDUCE-3575.


Resolution: Duplicate

> Streaming/tools Jar does not get included in the tarball.
> -
>
> Key: MAPREDUCE-3575
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3575
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Reporter: Mahadev konar
> Priority: Blocker
> Fix For: 0.23.1
>
>
> The streaming jar used to be available in the mapreduce tarballs before we 
> created the hadoop-tools package. The streaming and tools jars are not being 
> shipped with any tars. Our mapreduce tarballs should include the streaming 
> and tools jar.





[jira] [Resolved] (MAPREDUCE-3424) Some LinuxTaskController cleanup

2011-11-22 Thread Eli Collins (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved MAPREDUCE-3424.


  Resolution: Fixed
Hadoop Flags: Reviewed

Thanks Todd. I've committed this. 

> Some LinuxTaskController cleanup
> 
>
> Key: MAPREDUCE-3424
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3424
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Components: tasktracker
> Affects Versions: 0.20.205.0
> Reporter: Eli Collins
> Assignee: Eli Collins
> Priority: Minor
> Attachments: mapreduce-3424-1.patch, mapreduce-3424-2.patch,
> mapreduce-3424-3.patch
>
>
> MR-2415 had some tabs and weird indenting and spacing. It would also be
> clearer if LTC explicitly overrode createLogDir. Let's clean this up.





[jira] [Resolved] (MAPREDUCE-1549) TestTrackerDistributedCacheManager failed on some machines

2011-11-18 Thread Eli Collins (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved MAPREDUCE-1549.


Resolution: Duplicate

The issue isn't host-specific; the fix is to chmod a+x the hadoop directory.
See MAPREDUCE-2073. I'll commit it to 20x.

> TestTrackerDistributedCacheManager failed on some machines
> --
>
> Key: MAPREDUCE-1549
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1549
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: test
> Reporter: Amar Kamat
> Attachments:
> TEST-org.apache.hadoop.mapreduce.filecache.TestTrackerDistributedCacheManager.txt
>
>
> TestTrackerDistributedCacheManager.testPublicPrivateCache fails on some 
> machines.





[jira] [Resolved] (MAPREDUCE-3356) TestTrackerDistributedCacheManager fails on branch-20-security

2011-11-18 Thread Eli Collins (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved MAPREDUCE-3356.


Resolution: Duplicate

Dupe of MAPREDUCE-1549.

> TestTrackerDistributedCacheManager fails on branch-20-security
> --
>
> Key: MAPREDUCE-3356
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3356
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: distributed-cache, test
> Affects Versions: 0.20.205.0
> Reporter: Eli Collins
>
> The testReferenceCount and testPublicPrivateCache tests fail reproducibly on 
> branch-20-security. Details in follow up comment.





[jira] [Resolved] (MAPREDUCE-3419) Don't mark exited TT threads as dead in MiniMRCluster

2011-11-17 Thread Eli Collins (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved MAPREDUCE-3419.


  Resolution: Fixed
Hadoop Flags: Reviewed

All tests ran cleanly except for one, which is unrelated; I filed HADOOP-7836
for it. I've committed this.


> Don't mark exited TT threads as dead in MiniMRCluster  
> ---
>
> Key: MAPREDUCE-3419
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3419
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: tasktracker, test
> Affects Versions: 0.20.206.0
> Reporter: Eli Collins
> Assignee: Eli Collins
> Attachments: mapreduce-2850-1.patch
>
>
> MAPREDUCE-2850 flagged all TT threads that exited in the MiniMRCluster as
> dead; this breaks a number of the other tests that use MiniMRCluster across
> restart.





[jira] [Resolved] (MAPREDUCE-3015) Add local dir failure info to metrics and the web UI

2011-11-14 Thread Eli Collins (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved MAPREDUCE-3015.


   Resolution: Fixed
Fix Version/s: 0.20.206.0
 Hadoop Flags: Reviewed

Since the changes to the previous patch were trivial I went ahead and committed 
this. Thanks Todd.

> Add local dir failure info to metrics and the web UI
> 
>
> Key: MAPREDUCE-3015
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3015
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Components: tasktracker
> Affects Versions: 0.20.204.0
> Reporter: Eli Collins
> Assignee: Eli Collins
> Fix For: 0.20.206.0
> Attachments: mapreduce-3015-1.patch, mapreduce-3015-2.patch
>
>
> Like HDFS-811/HDFS-1850 but for the TT.





[jira] [Resolved] (MAPREDUCE-2921) TaskTracker won't start with failed local directory

2011-11-11 Thread Eli Collins (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved MAPREDUCE-2921.


Resolution: Duplicate

It turns out the TT will successfully start if there's a failed local directory
(it checks the dirs and removes any that fail), so it will start up fine with a
failed or read-only directory, etc. The failure in the original description
occurs because checkDir doesn't fail if the directory exists and is not
executable. Once that's fixed, the TT will start up in that case as well.
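
As a rough illustration of the check-and-remove behavior described above, the
sketch below validates each configured directory, including the execute bit,
and keeps only the ones that pass. The class and method names are hypothetical
and the code uses only java.io.File; it is not the TaskTracker's or Hadoop's
actual checkDir implementation.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of a local-directory check that also tests execute access.
public class LocalDirCheck {

    // Throws if the path is not a directory the process can read, write and traverse.
    public static void checkDir(File dir) throws IOException {
        if (!dir.isDirectory()) {
            throw new IOException("Not a directory: " + dir);
        }
        if (!dir.canRead()) {
            throw new IOException("Directory is not readable: " + dir);
        }
        if (!dir.canWrite()) {
            throw new IOException("Directory is not writable: " + dir);
        }
        if (!dir.canExecute()) {
            throw new IOException("Directory is not executable: " + dir);
        }
    }

    // Returns the configured directories that pass checkDir, dropping the rest.
    public static List<File> goodDirs(List<File> configured) {
        List<File> good = new ArrayList<>();
        for (File dir : configured) {
            try {
                checkDir(dir);
                good.add(dir);
            } catch (IOException e) {
                System.err.println("Skipping local dir: " + e.getMessage());
            }
        }
        return good;
    }

    public static void main(String[] args) {
        File tmp = new File(System.getProperty("java.io.tmpdir"));
        File missing = new File(tmp, "no-such-local-dir");
        System.out.println(goodDirs(Arrays.asList(tmp, missing)));
    }
}
```

Note that a process running as root typically passes the execute check on any
directory, so a chmod-based reproduction needs an unprivileged user.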

> TaskTracker won't start with failed local directory
> ---
>
> Key: MAPREDUCE-2921
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2921
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Components: tasktracker
> Affects Versions: 0.20.204.0
> Reporter: Eli Collins
> Assignee: Eli Collins
>
> Chmod'ing one of the mapred local directories so it's not executable will 
> cause the TT to fail to start. Doing this after the TT has started will 
> result in a TT that is up but can not execute tasks. 





[jira] [Resolved] (MAPREDUCE-2924) TaskTracker number of failed disks to tolerate should be configurable

2011-11-04 Thread Eli Collins (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved MAPREDUCE-2924.


Resolution: Won't Fix

I thought about this some. I think leaving the current behavior as is (the TT
keeps running regardless of the number of disk failures) but using a health
script that shuts down the TT when the DN goes down makes more sense. The DN
already has logic for shutting down given a sufficient number of disk failures,
and it doesn't make sense for the TT to keep running if the DN isn't running. I
do think we still need to fix MAPREDUCE-2657; otherwise restarting a cluster
may result in a bunch of TTs that were running not coming up, because they
tolerated a disk failure while running but won't while starting.

> TaskTracker number of failed disks to tolerate should be configurable
> -
>
> Key: MAPREDUCE-2924
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2924
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Components: tasktracker
> Affects Versions: 0.20.204.0
> Reporter: Eli Collins
>
> Like HDFS-1161 but for the TT. The user should be able to configure how many
> valid disks are needed for operation. Currently the TT will start and accept
> tasks even if, e.g., only 1 of its 12 disks is working, which leads to poor
> performance of jobs with tasks that use this machine.
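
The proposal in this description amounts to a threshold check on the number of
working local directories. The sketch below shows one possible shape of such a
check; the class name, method name, and the minValidDirs parameter are all
hypothetical, and since the issue was resolved as Won't Fix nothing like this
was committed under this issue.

```java
// Hypothetical sketch of a "minimum valid disks" policy for a worker daemon.
public class LocalDirTolerance {

    // Returns true when at least minValidDirs local directories are usable.
    public static boolean hasEnoughValidDirs(int validDirs, int minValidDirs) {
        if (validDirs < 0 || minValidDirs < 0) {
            throw new IllegalArgumentException("counts must be non-negative");
        }
        return validDirs >= minValidDirs;
    }

    public static void main(String[] args) {
        // The example from the description: 1 of 12 disks working,
        // with an assumed requirement of at least 6 valid disks.
        System.out.println(hasEnoughValidDirs(1, 6));
        System.out.println(hasEnoughValidDirs(12, 6));
    }
}
```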





[jira] [Resolved] (MAPREDUCE-3016) Add TT local dir failure info to the JT web UI

2011-11-03 Thread Eli Collins (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved MAPREDUCE-3016.


Resolution: Duplicate
  Assignee: Eli Collins

This is really simple, will do as part of MAPREDUCE-3015.

> Add TT local dir failure info to the JT web UI
> --
>
> Key: MAPREDUCE-3016
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3016
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Components: jobtracker
> Affects Versions: 0.20.204.0
> Reporter: Eli Collins
> Assignee: Eli Collins
>
> Like HDFS-556 but for the JT. The machine list page should report local 
> directory failures per TT.





[jira] [Resolved] (MAPREDUCE-2920) Local log dir links in the JT web UI are broken

2011-11-03 Thread Eli Collins (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved MAPREDUCE-2920.


Resolution: Won't Fix

The task logs that are no longer available via logs/userlogs are available via 
the job history server (which works because it uses TaskLog to read the files 
instead of Jetty directly) so I don't think this is worth the effort of fixing.

> Local log dir links in the JT web UI are broken  
> -
>
> Key: MAPREDUCE-2920
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2920
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Components: tasktracker
> Affects Versions: 0.20.204.0
> Reporter: Eli Collins
>
> The task log servlet can no longer access user logs because MAPREDUCE-2415
> introduced symlinks to the logs and Jetty is not configured by default to
> follow symlinks (for security reasons).





[jira] [Resolved] (MAPREDUCE-2850) TaskTracker disk failure handling (MR-2413) has no test coverage

2011-11-03 Thread Eli Collins (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved MAPREDUCE-2850.


  Resolution: Fixed
Hadoop Flags: Reviewed

> TaskTracker disk failure handling (MR-2413) has no test coverage
> 
>
> Key: MAPREDUCE-2850
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2850
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Components: tasktracker
> Affects Versions: 0.20.204.0
> Reporter: Eli Collins
> Assignee: Ravi Gummadi
> Attachments: MR2850.v0.patch, MR2850.v1.1.patch, MR2850.v1.2.patch,
> MR2850.v1.3.patch, MR2850.v1.patch
>
>
> MR-2413 doesn't have any test coverage, e.g. a test that the TT can survive
> disk failure.





[jira] [Resolved] (MAPREDUCE-3011) TT should remove bad local dirs from conf to prevent constant disk checking

2011-11-03 Thread Eli Collins (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved MAPREDUCE-3011.


Resolution: Not A Problem

You're right, I missed that we always update the conf the job is launched with
to hold the latest known good dirs in TT#launchTaskForJob. Thanks for the
explanation. I verified we weren't missing other locations by failing a local
dir, logging in the confChanged case, and verifying that we only notice the
change once.

> TT should remove bad local dirs from conf to prevent constant disk checking
> ---
>
> Key: MAPREDUCE-3011
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3011
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Components: tasktracker
> Affects Versions: 0.20.204.0
> Reporter: Eli Collins
>
> Per HADOOP-7551 the TT does not remove bad mapred.local.dirs from the conf so 
> after a single disk failure *every* call to get a local path for reading or 
> writing results in a disk check of *all* configured local dirs. After 
> detecting that a local dir is bad we should remove it from the conf so that 
> we don't repeatedly perform this expensive operation.





[jira] [Resolved] (MAPREDUCE-2957) The TT should not re-init if it has no good local dirs

2011-10-24 Thread Eli Collins (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved MAPREDUCE-2957.


  Resolution: Fixed
Hadoop Flags: Reviewed

Thanks for the review Ravi! I've committed this.

> The TT should not re-init if it has no good local dirs
> --
>
> Key: MAPREDUCE-2957
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2957
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Components: tasktracker
> Affects Versions: 0.20.204.0
> Reporter: Eli Collins
> Assignee: Eli Collins
> Attachments: mapreduce-2957-1.patch, mapreduce-2957-2.patch,
> mapreduce-2957.patch
>
>
> The TT will currently try to re-init itself on disk failure even if it has no 
> good local dirs. It should shutdown instead.
