[jira] Updated: (MAPREDUCE-1342) Potential JT deadlock in faulty TT tracking

2010-01-19 Thread Hemanth Yamijala (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hemanth Yamijala updated MAPREDUCE-1342:


  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

 Potential JT deadlock in faulty TT tracking
 ---

 Key: MAPREDUCE-1342
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1342
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.1
Reporter: Todd Lipcon
Assignee: Amareshwari Sriramadasu
 Fix For: 0.21.0

 Attachments: cycle0.png, mapreduce-1342-1.patch, 
 mapreduce-1342-2.patch, patch-1342-0.21.txt, patch-1342-1.txt, 
 patch-1342-2-ydist.txt, patch-1342-2.txt, patch-1342-3-ydist.txt, 
 patch-1342-3.txt, patch-1342-ydist.txt, patch-1342.txt


 JT$FaultyTrackersInfo.incrementFaults first locks potentiallyFaultyTrackers, 
 and then calls blackListTracker, which calls removeHostCapacity, which locks 
 JT.taskTrackers.
 On the other hand, JT.blacklistedTaskTrackers() locks taskTrackers, then 
 calls faultyTrackers.isBlacklisted() which goes on to lock 
 potentiallyFaultyTrackers.
 I haven't produced such a deadlock, but the lock ordering here is inverted 
 and therefore could deadlock.
 Not sure if this goes back to 0.21 or just in trunk.
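
The inversion described above can be sketched with a small model. This is a simplified illustration, not the Hadoop source: the class name LockOrderSketch is hypothetical, the two monitor objects merely borrow the field names from the report, and each method only records the order in which it takes the locks. Calling the two methods one after another on a single thread is safe; the risk arises only when two threads run them at the same time.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the two call paths in the report; not the Hadoop source.
class LockOrderSketch {
    private final Object taskTrackers = new Object();
    private final Object potentiallyFaultyTrackers = new Object();

    // Path 1: potentiallyFaultyTrackers first, then taskTrackers.
    List<String> incrementFaults() {
        List<String> order = new ArrayList<>();
        synchronized (potentiallyFaultyTrackers) {
            order.add("potentiallyFaultyTrackers");
            // Stands in for blackListTracker -> removeHostCapacity.
            synchronized (taskTrackers) {
                order.add("taskTrackers");
            }
        }
        return order;
    }

    // Path 2: taskTrackers first, then potentiallyFaultyTrackers.
    List<String> blacklistedTaskTrackers() {
        List<String> order = new ArrayList<>();
        synchronized (taskTrackers) {
            order.add("taskTrackers");
            // Stands in for faultyTrackers.isBlacklisted().
            synchronized (potentiallyFaultyTrackers) {
                order.add("potentiallyFaultyTrackers");
            }
        }
        return order;
    }
}
```

If thread A is inside path 1 holding the first monitor while thread B is inside path 2 holding the other, each waits for the lock the other already owns.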

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1342) Potential JT deadlock in faulty TT tracking

2010-01-18 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1342:
---

Attachment: patch-1342-0.21.txt

Patch for branch 0.21.
Ran test-patch and ant test. All tests passed.

 Potential JT deadlock in faulty TT tracking
 ---

 Key: MAPREDUCE-1342
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1342
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.1
Reporter: Todd Lipcon
Assignee: Amareshwari Sriramadasu
 Fix For: 0.21.0

 Attachments: cycle0.png, mapreduce-1342-1.patch, 
 mapreduce-1342-2.patch, patch-1342-0.21.txt, patch-1342-1.txt, 
 patch-1342-2-ydist.txt, patch-1342-2.txt, patch-1342-3-ydist.txt, 
 patch-1342-3.txt, patch-1342-ydist.txt, patch-1342.txt





[jira] Updated: (MAPREDUCE-1342) Potential JT deadlock in faulty TT tracking

2010-01-14 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-1342:
-

Environment: Fixed a potential deadlock in the global blacklist of 
tasktrackers feature.

 Potential JT deadlock in faulty TT tracking
 ---

 Key: MAPREDUCE-1342
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1342
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.1
 Environment: Fixed a potential deadlock in the global blacklist of 
 tasktrackers feature.
Reporter: Todd Lipcon
Assignee: Amareshwari Sriramadasu
 Fix For: 0.21.0

 Attachments: cycle0.png, mapreduce-1342-1.patch, 
 mapreduce-1342-2.patch, patch-1342-1.txt, patch-1342-2-ydist.txt, 
 patch-1342-2.txt, patch-1342-3-ydist.txt, patch-1342-3.txt, 
 patch-1342-ydist.txt, patch-1342.txt





[jira] Updated: (MAPREDUCE-1342) Potential JT deadlock in faulty TT tracking

2010-01-14 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-1342:
-

 Environment: (was: Fixed a potential deadlock in the global blacklist 
of tasktrackers feature.)
Release Note: Fix for a potential deadlock in the global blacklist of 
tasktrackers feature.

 Potential JT deadlock in faulty TT tracking
 ---

 Key: MAPREDUCE-1342
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1342
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.1
Reporter: Todd Lipcon
Assignee: Amareshwari Sriramadasu
 Fix For: 0.21.0

 Attachments: cycle0.png, mapreduce-1342-1.patch, 
 mapreduce-1342-2.patch, patch-1342-1.txt, patch-1342-2-ydist.txt, 
 patch-1342-2.txt, patch-1342-3-ydist.txt, patch-1342-3.txt, 
 patch-1342-ydist.txt, patch-1342.txt





[jira] Updated: (MAPREDUCE-1342) Potential JT deadlock in faulty TT tracking

2010-01-13 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1342:
---

Status: Patch Available  (was: Open)

 Potential JT deadlock in faulty TT tracking
 ---

 Key: MAPREDUCE-1342
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1342
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.1
Reporter: Todd Lipcon
Assignee: Amareshwari Sriramadasu
 Fix For: 0.21.0

 Attachments: cycle0.png, mapreduce-1342-1.patch, 
 mapreduce-1342-2.patch, patch-1342-1.txt, patch-1342-2-ydist.txt, 
 patch-1342-2.txt, patch-1342-3-ydist.txt, patch-1342-3.txt, 
 patch-1342-ydist.txt, patch-1342.txt





[jira] Updated: (MAPREDUCE-1342) Potential JT deadlock in faulty TT tracking

2010-01-13 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1342:
---

Attachment: patch-1342-3.txt

Patch for trunk

 Potential JT deadlock in faulty TT tracking
 ---

 Key: MAPREDUCE-1342
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1342
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.1
Reporter: Todd Lipcon
Assignee: Amareshwari Sriramadasu
 Fix For: 0.21.0

 Attachments: cycle0.png, mapreduce-1342-1.patch, 
 mapreduce-1342-2.patch, patch-1342-1.txt, patch-1342-2-ydist.txt, 
 patch-1342-2.txt, patch-1342-3-ydist.txt, patch-1342-3.txt, 
 patch-1342-ydist.txt, patch-1342.txt





[jira] Updated: (MAPREDUCE-1342) Potential JT deadlock in faulty TT tracking

2010-01-13 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1342:
---

Attachment: patch-1342-3-ydist.txt

Added comments about locking order assumptions to methods 
JobTracker.addNewTracker and JobTracker.removeTracker.

 Potential JT deadlock in faulty TT tracking
 ---

 Key: MAPREDUCE-1342
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1342
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.1
Reporter: Todd Lipcon
Assignee: Amareshwari Sriramadasu
 Fix For: 0.21.0

 Attachments: cycle0.png, mapreduce-1342-1.patch, 
 mapreduce-1342-2.patch, patch-1342-1.txt, patch-1342-2-ydist.txt, 
 patch-1342-2.txt, patch-1342-3-ydist.txt, patch-1342-3.txt, 
 patch-1342-ydist.txt, patch-1342.txt





[jira] Updated: (MAPREDUCE-1342) Potential JT deadlock in faulty TT tracking

2010-01-12 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1342:
---

Attachment: patch-1342-2.txt

Patch with Arun's comments incorporated. Now, whenever taskTrackers or 
potentiallyFaultyTrackers is locked, the JobTracker lock is already held. The 
newly synchronized methods are called from test cases or from already 
synchronized methods.
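
The locking rule in this comment can be sketched as follows. This is a minimal model under stated assumptions, not the attached patch: OuterLockSketch, addTracker, trackersWithoutFaults and runConcurrently are hypothetical names, and the bodies are placeholders for the real bookkeeping. Every method takes the outer monitor (this, standing in for the JobTracker lock) before touching either inner monitor, so the two inner locks can still be taken in different orders without two threads ever holding one each.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the tracker bookkeeping; not the Hadoop source.
class OuterLockSketch {
    private final Set<String> taskTrackers = new HashSet<>();
    private final Map<String, Integer> potentiallyFaultyTrackers = new HashMap<>();

    synchronized void addTracker(String host) {
        synchronized (taskTrackers) {
            taskTrackers.add(host);
        }
    }

    // Outer lock, then potentiallyFaultyTrackers, then taskTrackers.
    synchronized void incrementFaults(String host) {
        synchronized (potentiallyFaultyTrackers) {
            potentiallyFaultyTrackers.merge(host, 1, Integer::sum);
            synchronized (taskTrackers) {
                taskTrackers.contains(host); // placeholder for capacity bookkeeping
            }
        }
    }

    // Outer lock, then taskTrackers, then potentiallyFaultyTrackers.
    synchronized int trackersWithoutFaults() {
        synchronized (taskTrackers) {
            int count = 0;
            for (String host : taskTrackers) {
                synchronized (potentiallyFaultyTrackers) {
                    if (!potentiallyFaultyTrackers.containsKey(host)) {
                        count++;
                    }
                }
            }
            return count;
        }
    }

    synchronized int getFaultCount(String host) {
        synchronized (potentiallyFaultyTrackers) {
            return potentiallyFaultyTrackers.getOrDefault(host, 0);
        }
    }

    // Runs both paths on separate threads. Returns the final fault count,
    // or -1 if either thread is still blocked after the timeout.
    static int runConcurrently(int iterations) {
        OuterLockSketch jt = new OuterLockSketch();
        jt.addTracker("host-a");
        Thread writer = new Thread(() -> {
            for (int i = 0; i < iterations; i++) jt.incrementFaults("host-a");
        });
        Thread reader = new Thread(() -> {
            for (int i = 0; i < iterations; i++) jt.trackersWithoutFaults();
        });
        writer.setDaemon(true);
        reader.setDaemon(true);
        writer.start();
        reader.start();
        try {
            writer.join(5000);
            reader.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
        if (writer.isAlive() || reader.isAlive()) return -1;
        return jt.getFaultCount("host-a");
    }
}
```

The cost of this design is coarser locking: callers that previously took only an inner lock now serialize on the outer one.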

 Potential JT deadlock in faulty TT tracking
 ---

 Key: MAPREDUCE-1342
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1342
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.1
Reporter: Todd Lipcon
Assignee: Amareshwari Sriramadasu
 Fix For: 0.21.0

 Attachments: cycle0.png, mapreduce-1342-1.patch, 
 mapreduce-1342-2.patch, patch-1342-1.txt, patch-1342-2.txt, 
 patch-1342-ydist.txt, patch-1342.txt





[jira] Updated: (MAPREDUCE-1342) Potential JT deadlock in faulty TT tracking

2010-01-12 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1342:
---

Status: Patch Available  (was: Open)

 Potential JT deadlock in faulty TT tracking
 ---

 Key: MAPREDUCE-1342
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1342
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.1
Reporter: Todd Lipcon
Assignee: Amareshwari Sriramadasu
 Fix For: 0.21.0

 Attachments: cycle0.png, mapreduce-1342-1.patch, 
 mapreduce-1342-2.patch, patch-1342-1.txt, patch-1342-2.txt, 
 patch-1342-ydist.txt, patch-1342.txt





[jira] Updated: (MAPREDUCE-1342) Potential JT deadlock in faulty TT tracking

2010-01-12 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1342:
---

Attachment: patch-1342-2-ydist.txt

Patch for Yahoo! distribution

 Potential JT deadlock in faulty TT tracking
 ---

 Key: MAPREDUCE-1342
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1342
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.1
Reporter: Todd Lipcon
Assignee: Amareshwari Sriramadasu
 Fix For: 0.21.0

 Attachments: cycle0.png, mapreduce-1342-1.patch, 
 mapreduce-1342-2.patch, patch-1342-1.txt, patch-1342-2-ydist.txt, 
 patch-1342-2.txt, patch-1342-ydist.txt, patch-1342.txt





[jira] Updated: (MAPREDUCE-1342) Potential JT deadlock in faulty TT tracking

2010-01-12 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1342:
---

Attachment: patch-1342-2.txt

Attaching the patch again, as Hudson picked up the wrong patch.

 Potential JT deadlock in faulty TT tracking
 ---

 Key: MAPREDUCE-1342
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1342
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.1
Reporter: Todd Lipcon
Assignee: Amareshwari Sriramadasu
 Fix For: 0.21.0

 Attachments: cycle0.png, mapreduce-1342-1.patch, 
 mapreduce-1342-2.patch, patch-1342-1.txt, patch-1342-2-ydist.txt, 
 patch-1342-2.txt, patch-1342-2.txt, patch-1342-ydist.txt, patch-1342.txt





[jira] Updated: (MAPREDUCE-1342) Potential JT deadlock in faulty TT tracking

2010-01-11 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1342:
---

Assignee: Amareshwari Sriramadasu
  Status: Open  (was: Patch Available)

 Potential JT deadlock in faulty TT tracking
 ---

 Key: MAPREDUCE-1342
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1342
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.1
Reporter: Todd Lipcon
Assignee: Amareshwari Sriramadasu
 Fix For: 0.21.0

 Attachments: cycle0.png, mapreduce-1342-1.patch, 
 mapreduce-1342-2.patch, patch-1342-ydist.txt, patch-1342.txt





[jira] Updated: (MAPREDUCE-1342) Potential JT deadlock in faulty TT tracking

2010-01-11 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1342:
---

Status: Patch Available  (was: Open)

 Potential JT deadlock in faulty TT tracking
 ---

 Key: MAPREDUCE-1342
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1342
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.1
Reporter: Todd Lipcon
Assignee: Amareshwari Sriramadasu
 Fix For: 0.21.0

 Attachments: cycle0.png, mapreduce-1342-1.patch, 
 mapreduce-1342-2.patch, patch-1342-1.txt, patch-1342-ydist.txt, patch-1342.txt





[jira] Updated: (MAPREDUCE-1342) Potential JT deadlock in faulty TT tracking

2010-01-11 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-1342:
-

Status: Open  (was: Patch Available)

Shouldn't we make JobTracker.getFaultCount and JobTracker.taskTrackers 
synchronized too?

Oh, and thanks for your help Todd!

 Potential JT deadlock in faulty TT tracking
 ---

 Key: MAPREDUCE-1342
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1342
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.1
Reporter: Todd Lipcon
Assignee: Amareshwari Sriramadasu
 Fix For: 0.21.0

 Attachments: cycle0.png, mapreduce-1342-1.patch, 
 mapreduce-1342-2.patch, patch-1342-1.txt, patch-1342-ydist.txt, patch-1342.txt





[jira] Updated: (MAPREDUCE-1342) Potential JT deadlock in faulty TT tracking

2010-01-07 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1342:
---

Attachment: patch-1342.txt

Patch making the methods activeTaskTrackers(), blacklistedTaskTrackers() and 
taskTrackerNames() synchronized. These are the methods that lock taskTrackers 
and then potentiallyFaultyTrackers without holding the JobTracker lock.

 Potential JT deadlock in faulty TT tracking
 ---

 Key: MAPREDUCE-1342
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1342
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.22.0
Reporter: Todd Lipcon
 Attachments: cycle0.png, mapreduce-1342-1.patch, 
 mapreduce-1342-2.patch, patch-1342.txt





[jira] Updated: (MAPREDUCE-1342) Potential JT deadlock in faulty TT tracking

2010-01-07 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1342:
---

Fix Version/s: 0.21.0
Affects Version/s: (was: 0.22.0)
   0.20.1
   Status: Patch Available  (was: Open)

 Potential JT deadlock in faulty TT tracking
 ---

 Key: MAPREDUCE-1342
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1342
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.1
Reporter: Todd Lipcon
 Fix For: 0.21.0

 Attachments: cycle0.png, mapreduce-1342-1.patch, 
 mapreduce-1342-2.patch, patch-1342.txt





[jira] Updated: (MAPREDUCE-1342) Potential JT deadlock in faulty TT tracking

2010-01-07 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1342:
---

Attachment: patch-1342-ydist.txt

Patch for Y! distribution.

Ran test-patch and ant test. All the tests passed except 
TestKillSubProcesses (due to MAPREDUCE-408).

 Potential JT deadlock in faulty TT tracking
 ---

 Key: MAPREDUCE-1342
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1342
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.1
Reporter: Todd Lipcon
 Fix For: 0.21.0

 Attachments: cycle0.png, mapreduce-1342-1.patch, 
 mapreduce-1342-2.patch, patch-1342-ydist.txt, patch-1342.txt





[jira] Updated: (MAPREDUCE-1342) Potential JT deadlock in faulty TT tracking

2010-01-03 Thread Sreekanth Ramakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreekanth Ramakrishnan updated MAPREDUCE-1342:
--

Attachment: mapreduce-1342-1.patch

Attaching a patch that removes the need to lock on faultyTrackerInfo by 
changing the field to a concurrent hash map and not locking on addition and 
removal.

 Potential JT deadlock in faulty TT tracking
 ---

 Key: MAPREDUCE-1342
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1342
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.22.0
Reporter: Todd Lipcon
 Attachments: cycle0.png, mapreduce-1342-1.patch





[jira] Updated: (MAPREDUCE-1342) Potential JT deadlock in faulty TT tracking

2010-01-03 Thread Sreekanth Ramakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreekanth Ramakrishnan updated MAPREDUCE-1342:
--

Attachment: mapreduce-1342-2.patch

Attaching a new patch after discussion with Amar. Made the map a concurrent map 
and changed the getters not to lock on the map. This removes the lock on the 
second resource for client APIs that don't lock on {{JobTracker}}.
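
The idea in this comment can be sketched roughly as below. It is an illustration only, not the attached patch: FaultyTrackersSketch is a hypothetical class, BLACKLIST_THRESHOLD is an assumed value chosen for the example, and the real blacklisting rules are more involved. The point is that getters read from a ConcurrentHashMap without synchronizing on it, so a caller that already holds another lock does not acquire a second monitor.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch of the lock-free-read idea; not the attached patch.
class FaultyTrackersSketch {
    // Assumed threshold, for illustration only.
    private static final int BLACKLIST_THRESHOLD = 4;

    private final ConcurrentMap<String, Integer> potentiallyFaultyTrackers =
        new ConcurrentHashMap<>();

    // Atomic per-key update; callers do not synchronize on the map.
    int incrementFaults(String host) {
        return potentiallyFaultyTrackers.merge(host, 1, Integer::sum);
    }

    // Getter reads without taking a monitor on the map.
    int getFaultCount(String host) {
        return potentiallyFaultyTrackers.getOrDefault(host, 0);
    }

    boolean isBlacklisted(String host) {
        return getFaultCount(host) >= BLACKLIST_THRESHOLD;
    }
}
```

A concurrent map only makes individual operations atomic; any step that must update the map together with other state would still need a lock of its own.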

 Potential JT deadlock in faulty TT tracking
 ---

 Key: MAPREDUCE-1342
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1342
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.22.0
Reporter: Todd Lipcon
 Attachments: cycle0.png, mapreduce-1342-1.patch, 
 mapreduce-1342-2.patch





[jira] Updated: (MAPREDUCE-1342) Potential JT deadlock in faulty TT tracking

2009-12-28 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated MAPREDUCE-1342:
---

Attachment: cycle0.png

Here's the output from jcarder that shows the cycle (this was detected while 
running TestLostTracker with jcarder instrumentation using the branch at 
http://github.com/toddlipcon/jcarder/tree/cloudera)

 Potential JT deadlock in faulty TT tracking
 ---

 Key: MAPREDUCE-1342
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1342
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.22.0
Reporter: Todd Lipcon
 Attachments: cycle0.png

