Liang Xie created HDFS-6766:
-------------------------------

             Summary: optimize ack notify mechanism to avoid thundering herd issue
                 Key: HDFS-6766
                 URL: https://issues.apache.org/jira/browse/HDFS-6766
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: hdfs-client
    Affects Versions: 3.0.0
            Reporter: Liang Xie
            Assignee: Liang Xie

Currently, DFSOutputStream uses wait/notifyAll to coordinate ack receiving and ack waiting. Say there are 5 threads (t1, t2, t3, t4, t5) waiting for ack seq nos. 1, 2, 3, 4, 5. Once the ack for no. 1 arrives, notifyAll is called, so t2/t3/t4/t5 are woken up as well, find nothing to do, and go back to waiting.

We can rewrite this with the Condition class: with a fair (FIFO) policy, only t1 needs to be notified, which saves a number of context switches.

It is possible for more than one thread to be waiting on the same ack seq no. (e.g. no more data is written between two flush operations). When that happens we need to notify all of those threads, so I introduced a set to remember this seq no.

In a simple HBase YCSB test, the number of context switches per second was reduced by about 15%, and sys CPU% was reduced by about 6%. (My HBase has the new write model patch; I think the benefit would be higher without it.)
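
To make the idea concrete, here is a minimal sketch of targeted wake-ups with java.util.concurrent.locks.Condition. It is not the HDFS-6766 patch and not DFSOutputStream code: the class AckWaiters and its methods waitForAck and ackReceived are hypothetical names, and it swaps in a sorted map from seq no. to Condition where the description above mentions a fair policy plus a set. Threads waiting on the same seq no. share one Condition, so a single ack releases all of them, while threads waiting on higher seq nos. stay parked.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical helper (assumption, not HDFS code): wakes only waiters whose seq no. is acked. */
final class AckWaiters {
    // Fair lock: threads acquire it in roughly FIFO order.
    private final ReentrantLock lock = new ReentrantLock(true);
    // One Condition per seq no. that has waiters; shared by all waiters on that seq no.
    private final TreeMap<Long, Condition> waiters = new TreeMap<>();
    private long lastAckedSeqno = -1;

    /** Blocks until an ack with a seq no. >= seqno has been received. */
    void waitForAck(long seqno) throws InterruptedException {
        lock.lock();
        try {
            // Loop guards against spurious wake-ups.
            while (lastAckedSeqno < seqno) {
                Condition c = waiters.computeIfAbsent(seqno, k -> lock.newCondition());
                c.await();
            }
        } finally {
            lock.unlock();
        }
    }

    /** Records an ack and signals only the waiters at or below that seq no. */
    void ackReceived(long seqno) {
        lock.lock();
        try {
            if (seqno <= lastAckedSeqno) {
                return; // stale or duplicate ack, nobody new to wake
            }
            lastAckedSeqno = seqno;
            // View of all entries with key <= seqno; clearing it removes them from the map.
            Map<Long, Condition> ready = waiters.headMap(seqno, true);
            for (Condition c : ready.values()) {
                c.signalAll(); // releases every thread waiting on this seq no.
            }
            ready.clear();
        } finally {
            lock.unlock();
        }
    }
}
```

With this shape, the ack for seq no. 1 in the t1..t5 example signals only the Condition that t1 is parked on; t2 through t5 are not scheduled at all until their own acks arrive. Whether the measured savings above would carry over to this variant is untested.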