Thanks for finding the issue, Wei-Chiu.
I agree that hsync should handle DN failure the same way write-pipeline
recovery does, as you stated. If it doesn't, it should be fixed.
--Yongjun
On Mon, Sep 11, 2017 at 10:53 AM, Wei-Chiu Chuang
wrote:
Hello my dear HDFS dev colleagues,
It appears that when a DFS client writes and then calls hsync(), and the
primary replica (that is, the first DataNode in the write pipeline) is
unresponsive to the hsync() request, the hsync() call waits at
DataStreamer#waitForAckedSeqno().
In one scenario, we saw this