[ https://issues.apache.org/jira/browse/HDFS-9180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14942662#comment-14942662 ]
Walter Su commented on HDFS-9180:
---------------------------------

1. After calling replaceFailedStreamers(), does getExcludedNodes() always return empty? // allocateNewBlock()

bq. When replacing a StripedDataStreamer, should also reset the currentPacket to null if the streamer is the first one.
2. You assumed that replaceFailedStreamers() is only called when currentIdx==0. That's true. I just think that if we change
{code}
-        if (i == 0) {
+        if (i == getCurrentIndex()) {
{code}
it's more readable. And getCurrentIndex() can get the index from {{currentBlockGroup}} instead of {{streamer}}.

3. I don't know why the #7 streamer didn't bump its generation stamp (GS) in the failed test. I ran the test for 40 minutes and couldn't reproduce the failure. If we check blockGroup.numBytes() instead of DATA_STREAMING, and assert that the stage is DATA_STREAMING, would that help with debugging if the test fails again?
{code}
      if (streamer.isHealthy() &&
          // the streamer may not be in STREAMING stage if the block length is
          // less than a stripe
          streamer.getStage() == BlockConstructionStage.DATA_STREAMING) {
        streamer.setExternalError();
        healthySet.add(streamer);
      }
{code}

#2 and #3 are not related.

> Update excluded DataNodes in DFSStripedOutputStream based on failures in data streamers
> ---------------------------------------------------------------------------------------
>
>                 Key: HDFS-9180
>                 URL: https://issues.apache.org/jira/browse/HDFS-9180
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: erasure-coding
>    Affects Versions: 3.0.0
>            Reporter: Jing Zhao
>            Assignee: Jing Zhao
>         Attachments: HDFS-9180.000.patch, HDFS-9180.001.patch, HDFS-9180.002.patch
>
>
> This is a TODO in HDFS-9040: based on the failures all the striped data streamers hit, the DFSStripedOutputStream should keep a record of all the DataNodes that should be excluded.
> This jira will also fix several bugs in the DFSStripedOutputStream. Will provide more details in the comment.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)