[ https://issues.apache.org/jira/browse/YARN-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17550303#comment-17550303 ]

fanshilun commented on YARN-11127:
----------------------------------

[~zhengchenyu], thanks for your contribution, but I don't think it is correct to 
change the write lock to a read lock in getResourceUsageReport. The reason for 
using the write lock is described in the original code: the method updates fields, 
so switching to a read lock may cause the app to be sorted inaccurately in the queue.


{code:java}
public ApplicationResourceUsageReport getResourceUsageReport() {
    writeLock.lock();
    try {
      // Use write lock here because
      // SchedulerApplicationAttempt#getResourceUsageReport updated fields
      // TODO: improve this
      ApplicationResourceUsageReport report = super.getResourceUsageReport();
      Resource cluster = rmContext.getScheduler().getClusterResource();
    ....
      }
      return report;
    } finally {
      writeLock.unlock();
    }
  }
{code}
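
To illustrate the hazard: super.getResourceUsageReport() mutates fields, and a read lock 
lets multiple callers run it concurrently. Below is a minimal standalone sketch (plain Java 
with a hypothetical field name, not YARN code) of the resulting lost-update race:

{code:java}
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReadLockWriteHazard {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  // Hypothetical field standing in for the usage fields updated while reporting.
  private long cachedUsage = 0;

  long getReport() {
    lock.readLock().lock();        // read lock: other readers may run concurrently
    try {
      long next = cachedUsage + 1; // read-modify-write is not atomic here
      cachedUsage = next;          // two readers can interleave and lose an update
      return next;
    } finally {
      lock.readLock().unlock();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    ReadLockWriteHazard h = new ReadLockWriteHazard();
    Runnable r = () -> { for (int i = 0; i < 100_000; i++) h.getReport(); };
    Thread t1 = new Thread(r), t2 = new Thread(r);
    t1.start(); t2.start();
    t1.join(); t2.join();
    // Frequently prints less than 200000: updates were lost under the read lock.
    System.out.println(h.cachedUsage);
  }
}
{code}

If the fields being updated feed the queue's ordering of applications, this kind of race is 
exactly what can leave an app inaccurately sorted.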


> Potential deadlock in AsyncDispatcher caused by RMNodeImpl, 
> SchedulerApplicationAttempt and RMAppImpl's lock contention.
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-11127
>                 URL: https://issues.apache.org/jira/browse/YARN-11127
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 3.2.1
>            Reporter: zhengchenyu
>            Assignee: zhengchenyu
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: rm-dead-lock.png
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> I found an RM deadlock in our cluster. It is a low-probability event. The 
> critical jstack information is below: 
> {code:java}
> "RM Event dispatcher" #63 prio=5 os_prio=0 tid=0x00007f9a73aaa800 nid=0x221e7 
> waiting on condition [0x00007f85dd00b000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00007f9389aab478> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>         at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>         at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>         at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>         at 
> java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppLogAggregation.aggregateLogReport(RMAppLogAggregation.java:120)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.aggregateLogReport(RMAppImpl.java:1740)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.handleLogAggregationStatus(RMNodeImpl.java:1481)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.access$500(RMNodeImpl.java:104)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl$StatusUpdateWhenHealthyTransition.transition(RMNodeImpl.java:1242)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl$StatusUpdateWhenHealthyTransition.transition(RMNodeImpl.java:1198)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
>         - locked <0x00007f88db78c5c8> (a 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.handle(RMNodeImpl.java:670)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.handle(RMNodeImpl.java:101)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$NodeEventDispatcher.handle(ResourceManager.java:1116)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$NodeEventDispatcher.handle(ResourceManager.java:1100)
>         at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:219)
>         at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:133)
>         at java.lang.Thread.run(Thread.java:748)
> "IPC Server handler 264 on default port 8032" #1717 daemon prio=5 os_prio=0 
> tid=0x000055b69acc2800 nid=0x229a5 waiting on condition [0x00007f8574ba2000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00007f938976e818> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>         at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>         at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>         at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>         at 
> java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.getResourceUsageReport(FiCaSchedulerApp.java:1115)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getAppResourceUsageReport(AbstractYarnScheduler.java:433)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics.getAggregateAppResourceUsage(RMAppAttemptMetrics.java:143)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.getRMAppMetrics(RMAppImpl.java:1693)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.createAndGetApplicationReport(RMAppImpl.java:742)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:428)
>         at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:247)
>         at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:609)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1040)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:958)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2972)
> "SchedulerEventDispatcher:Event Processor" #38 prio=5 os_prio=0 
> tid=0x000055b699c20000 nid=0x223be waiting on condition [0x00007f85d3686000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00007f88d6708f68> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>         at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>         at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
>         at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
>         at 
> java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.getState(RMNodeImpl.java:563)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.accept(FiCaSchedulerApp.java:464)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:3035)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.submitResourceCommitRequest(CapacityScheduler.java:2816)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1637)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1624)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1727)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1476)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:1312)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1785)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:171)
>         at 
> org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
>         at java.lang.Thread.run(Thread.java:748){code}
> Here we found a deadlock caused by three threads contending for the RMNodeImpl, 
> SchedulerApplicationAttempt and RMAppImpl read-write locks. 
> To make the lock contention easier to understand, see the image below.
>  
> !rm-dead-lock.png|width=578,height=214!
> Holding a write lock while acquiring further locks (multi-level locking) is very dangerous.
> I reproduced this bug in a unit test with only one NodeManager and one app with special 
> parameters; by keeping the NM reporting LogAggregationStatusForApps, the probability of 
> hitting this bug increases.
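> For illustration, a minimal standalone sketch (plain Java, not YARN code) of the same 
> hold-and-wait cycle: one thread holds the node's write lock and wants the app's write lock, 
> one holds the app's read lock and wants the attempt's write lock, and one holds the 
> attempt's write lock and wants the node's read lock. With this timing the three threads 
> deadlock, mirroring the cycle in the stack traces above.
> {code:java}
> import java.util.concurrent.locks.Lock;
> import java.util.concurrent.locks.ReentrantReadWriteLock;
> 
> public class ThreeLockCycle {
>   static final ReentrantReadWriteLock node = new ReentrantReadWriteLock();
>   static final ReentrantReadWriteLock app = new ReentrantReadWriteLock();
>   static final ReentrantReadWriteLock attempt = new ReentrantReadWriteLock();
> 
>   // Acquire 'held', pause so the other threads can take their first locks,
>   // then block trying to acquire 'wanted'.
>   static void holdThenWait(Lock held, Lock wanted, String name) {
>     held.lock();
>     try {
>       sleep(200);
>       System.out.println(name + " waiting...");
>       wanted.lock();   // never returns once the cycle is formed
>       wanted.unlock();
>     } finally {
>       held.unlock();
>     }
>   }
> 
>   static void sleep(long ms) {
>     try { Thread.sleep(ms); } catch (InterruptedException ignored) { }
>   }
> 
>   public static void main(String[] args) {
>     // "RM Event dispatcher": holds the node's write lock, wants the app's write lock.
>     new Thread(() -> holdThenWait(node.writeLock(), app.writeLock(), "dispatcher")).start();
>     // "IPC Server handler": holds the app's read lock, wants the attempt's write lock.
>     new Thread(() -> holdThenWait(app.readLock(), attempt.writeLock(), "ipc")).start();
>     // "SchedulerEventDispatcher": holds the attempt's write lock, wants the node's read lock.
>     new Thread(() -> holdThenWait(attempt.writeLock(), node.readLock(), "scheduler")).start();
>   }
> }
> {code}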
> How to fix?
>  * For the "RM Event dispatcher" thread, we currently acquire two write locks in a nested 
> way. We could update aggregateLogReport asynchronously, so that only one write lock is 
> held at a time (a rough sketch of this follows below).
>  * For the "IPC Server handler 264 on default port 8032" thread, I think 
> getResourceUsageReport could use the read lock instead.
> Note: Either proposal on its own would break this cycle, but I think we should 
> fix both. 
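> A rough sketch of the asynchronous idea (hypothetical executor and method shapes, not the 
> actual patch): defer the call that needs RMAppImpl's write lock to a worker thread, so the 
> node's write lock is never held while waiting for the app's lock.
> {code:java}
> import java.util.concurrent.ExecutorService;
> import java.util.concurrent.Executors;
> import java.util.concurrent.locks.ReentrantReadWriteLock;
> 
> public class AsyncLogAggregationUpdate {
>   private final ReentrantReadWriteLock nodeLock = new ReentrantReadWriteLock();
>   // A single worker keeps per-app log aggregation updates ordered.
>   private final ExecutorService logAggregationUpdater =
>       Executors.newSingleThreadExecutor();
> 
>   // Called on the node status update path; updateAppReport is the work that
>   // needs the app's write lock (e.g. aggregating the log report).
>   void handleLogAggregationStatus(Runnable updateAppReport) {
>     nodeLock.writeLock().lock();
>     try {
>       // Before: running updateAppReport here takes the app's write lock while
>       // the node's write lock is still held (the hold-and-wait edge).
>       // After: hand it off, so the app lock is acquired with no node lock held.
>       logAggregationUpdater.execute(updateAppReport);
>     } finally {
>       nodeLock.writeLock().unlock();
>     }
>   }
> }
> {code}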
>  


