William Montaz created YARN-11814:
-------------------------------------
Summary: Deadlock when use yarn rmadmin -refreshQueues
Key: YARN-11814
URL: https://issues.apache.org/jira/browse/YARN-11814
Project: Hadoop YARN
Issue Type: Bug
Components: yarn
Affects Versions: 3.3.6
Reporter: William Montaz
This ticket is a revival of https://issues.apache.org/jira/browse/YARN-9163, as
there is a clear bug in the YARN code and not a JDK issue as initially suspected (we
dug thoroughly into the ReentrantReadWriteLock code as well as its documented
locking guarantees, and the behavior stays the same even with newer versions
of Java).
I put a comment in YARN-9163 with an example of how the bug is triggered with
simple Java code.
Could you please reconsider the initial patch proposal?
I am also including the Java example showing how to create the deadlock here:
{code:java}
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class Main {
    ReentrantLock otherLock = new ReentrantLock();

    void log(String s) {
        System.out.printf("%s: %s%n", Thread.currentThread().getName(), s);
    }

    public static void main(String[] args) throws Exception {
        new Main().runTest();
    }

    public void runTest() throws Exception {
        ReentrantReadWriteLock rwlock = new ReentrantReadWriteLock(false);
        // obtain read lock
        log("get readlock");
        rwlock.readLock().lock(); // should succeed in getting the read lock
        // takes otherLock, then tries to get the read lock 2 seconds later
        new Thread(this.new ReadLockThread(rwlock), "TryRead").start();
        // tries to get rwlock's write lock and is queued before the read thread above
        new Thread(this.new WriteLockThread(rwlock), "TryWrite").start();
        log("try to get other lock");
        // should not succeed (assuming TryRead has already taken otherLock): the
        // read thread holds otherLock but is itself blocked behind the queued
        // write thread, even though the writer has not yet acquired the lock
        otherLock.lock();
        rwlock.readLock().unlock();
    }

    class WriteLockThread implements Runnable {
        private ReentrantReadWriteLock rwlock;

        public WriteLockThread(ReentrantReadWriteLock rwlock) {
            this.rwlock = rwlock;
        }

        public void run() {
            try {
                log("try get writelock");
                // should block: the read lock is already held by another thread
                rwlock.writeLock().lock();
                log("can get writelock");
            } finally {
                rwlock.writeLock().unlock();
            }
        }
    }

    class ReadLockThread implements Runnable {
        private ReentrantReadWriteLock rwlock;

        public ReadLockThread(ReentrantReadWriteLock rwlock) {
            this.rwlock = rwlock;
        }

        public void run() {
            try {
                log("try get other lock");
                otherLock.lock();
                log("try get read lock");
                // introduce latency so that a writer thread is placed in the
                // queue before this one
                Thread.sleep(2000);
                rwlock.readLock().lock();
                log("can get readlock");
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            } finally {
                log("unlock readlock");
                rwlock.readLock().unlock();
            }
        }
    }
}
{code}
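The core behavior behind the deadlock above can also be isolated in a check that terminates instead of hanging. The following is only a minimal sketch, not part of the original report or of any YARN patch: the class name QueuedWriterBlocksReader and the method secondReaderAcquires are hypothetical, and the 500 ms timeout is an arbitrary choice. It uses the timed tryLock(long, TimeUnit) rather than lock() so that the second reader gives up instead of blocking forever (the untimed tryLock() is documented to barge and would not show the effect).

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical helper class, for illustration only.
public class QueuedWriterBlocksReader {

    /** Returns true if a second reader obtained the read lock while a writer was queued. */
    static boolean secondReaderAcquires() throws InterruptedException {
        // Non-fair lock, as in the reproducer above.
        ReentrantReadWriteLock rwlock = new ReentrantReadWriteLock(false);
        rwlock.readLock().lock(); // the calling thread holds the read lock

        Thread writer = new Thread(() -> {
            try {
                // Queues behind the read lock held by the calling thread.
                rwlock.writeLock().lockInterruptibly();
                rwlock.writeLock().unlock();
            } catch (InterruptedException e) {
                // Interrupted during cleanup; the write lock was never acquired.
            }
        }, "Writer");
        writer.start();

        // Wait until the writer is in the lock's wait queue, plus a small margin.
        while (!rwlock.hasQueuedThread(writer)) {
            Thread.sleep(10);
        }
        Thread.sleep(50);

        boolean[] acquired = new boolean[1];
        Thread reader = new Thread(() -> {
            try {
                // Only read locks are held, yet a writer is waiting at the head of the queue.
                acquired[0] = rwlock.readLock().tryLock(500, TimeUnit.MILLISECONDS);
                if (acquired[0]) {
                    rwlock.readLock().unlock();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "SecondReader");
        reader.start();
        reader.join(); // join makes the write to acquired[0] visible here

        // Clean up: cancel the queued writer, then release our read lock.
        writer.interrupt();
        writer.join();
        rwlock.readLock().unlock();
        return acquired[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("second reader acquired: " + secondReaderAcquires());
    }
}
```

On JDKs where the reproducer above deadlocks, the second reader is expected to time out, which matches the report: a new reader waits behind a queued writer even though the write lock has not been acquired, so any code that holds another lock while requesting the read lock can end up in a cycle.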