[ https://issues.apache.org/jira/browse/TEZ-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537054#comment-14537054 ]
Bikas Saha commented on TEZ-2421:
---------------------------------

I think T3 cannot proceed because the writelock waiting on T1 prevents other readlocks from being acquired (otherwise the writelock would starve in the presence of a continuous stream of overlapping readlocks).

I think recovery will be fine, since during recovery everything runs on the central dispatcher and vertex managers are not running (we don't support vertex manager recovery).

I have run TestDAGRecovery and TestAMRecovery many times and there were no further issues. Before the workaround, there were issues with them all the time.

Yes, TEZ-1019 would provide a better fix.

> Deadlock in AM because attempt and vertex locking each other out
> ----------------------------------------------------------------
>
>                 Key: TEZ-2421
>                 URL: https://issues.apache.org/jira/browse/TEZ-2421
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Bikas Saha
>            Assignee: Bikas Saha
>            Priority: Blocker
>         Attachments: TEZ-2421.1.patch, TEZ-2421.2.patch, TEZ-2421.3.patch, TEZ-2421.4.patch
>
>
> Ideally locks should be taken one way - either going down or up. Preferably
> not going up because most such data can be passed in during object
> construction.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
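
For illustration only, here is a minimal sketch of the behavior described in the comment: once a writer is queued on a read-write lock, a new reader cannot acquire the readlock even though another reader still holds it. The class and method names are hypothetical and this is not Tez code. It assumes a fair `java.util.concurrent.locks.ReentrantReadWriteLock` so the outcome is deterministic; the comment does not say how the locks in the AM are configured, and the default non-fair mode in OpenJDK applies a similar heuristic when a writer is at the head of the queue. The thread labels map loosely onto the comment: the queued writer plays T1 and the new reader plays T3, while the thread already holding the readlock is an assumption of this sketch.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class; not part of Tez.
class WriterBlocksReaders {

    /**
     * Returns true if a new reader managed to take the readlock while
     * another reader holds it and a writer is queued behind that reader.
     */
    static boolean newReaderGetsLock() {
        // Assumption: fair mode, chosen to make the outcome deterministic.
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
        try {
            // The calling thread holds the readlock for the whole experiment.
            lock.readLock().lock();

            // "T1": requests the writelock and has to queue behind the reader.
            Thread writer = new Thread(() -> {
                lock.writeLock().lock();
                lock.writeLock().unlock();
            });
            writer.start();
            while (!lock.hasQueuedThreads()) {
                Thread.sleep(10); // wait until the writer is really queued
            }

            // "T3": a different thread asks for the readlock with a timeout.
            boolean[] acquired = new boolean[1];
            Thread reader = new Thread(() -> {
                try {
                    if (lock.readLock().tryLock(200, TimeUnit.MILLISECONDS)) {
                        acquired[0] = true;
                        lock.readLock().unlock();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            reader.start();
            reader.join();

            // Release the original readlock so the writer can finish.
            lock.readLock().unlock();
            writer.join();
            return acquired[0];
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("new reader acquired lock: " + newReaderGetsLock());
    }
}
```

If the thread holding the readlock were itself waiting on something the new reader has to do first, none of the three could make progress, which is the kind of lock-ordering problem the issue description warns about.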