shahrs87 commented on code in PR #6513: URL: https://github.com/apache/hadoop/pull/6513#discussion_r1479041865
########## hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java ##########

```diff
@@ -1607,8 +1607,11 @@ private void transfer(final DatanodeInfo src, final DatanodeInfo[] targets,
    * it can be written to.
    * This happens when a file is appended or data streaming fails
    * It keeps on trying until a pipeline is setup
+   *
+   * Returns boolean whether pipeline was setup successfully or not.
+   * This boolean is used upstream on whether to continue creating pipeline or throw exception
    */
-  private void setupPipelineForAppendOrRecovery() throws IOException {
+  private boolean setupPipelineForAppendOrRecovery() throws IOException {
```

Review Comment:
   We are changing the return type of the `setupPipelineForAppendOrRecovery` and `setupPipelineInternal` methods. IIRC this is the reason: `handleBadDatanode` can silently fail to handle a bad datanode [here](https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java#L1700-L1706), and `setupPipelineInternal` will silently return [here](https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java#L1637-L1638) without bubbling up the exception.
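For context, the control-flow change under discussion can be sketched as follows. All names here are illustrative stand-ins, not the actual `DataStreamer` code: a setup routine that previously returned `void` (and could silently give up) now reports success so its caller can turn a silent failure into an exception.

```java
import java.io.IOException;

// Hypothetical sketch of the return-type change: the setup method
// signals failure to the caller instead of silently returning.
public class PipelineSketch {

    // Stand-in for pipeline setup; 'healthy' represents the real
    // datanode-handling logic succeeding.
    static boolean setupPipeline(boolean healthy) {
        if (!healthy) {
            return false; // previously a bare 'return;' that lost the failure
        }
        return true;
    }

    // The caller can now bubble up an exception on failure.
    static void writeBlock(boolean healthy) throws IOException {
        if (!setupPipeline(healthy)) {
            throw new IOException("Failed to set up pipeline");
        }
    }

    public static void main(String[] args) throws IOException {
        writeBlock(true); // succeeds quietly
        try {
            writeBlock(false);
        } catch (IOException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```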
########## hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java ##########

```diff
@@ -1618,24 +1621,33 @@ private void setupPipelineForAppendOrRecovery() throws IOException {
       LOG.warn(msg);
       lastException.set(new IOException(msg));
       streamerClosed = true;
-      return;
+      return false;
     }
-    setupPipelineInternal(nodes, storageTypes, storageIDs);
+    return setupPipelineInternal(nodes, storageTypes, storageIDs);
   }
 
-  protected void setupPipelineInternal(DatanodeInfo[] datanodes,
+  protected boolean setupPipelineInternal(DatanodeInfo[] datanodes,
       StorageType[] nodeStorageTypes, String[] nodeStorageIDs)
       throws IOException {
     boolean success = false;
     long newGS = 0L;
+    boolean isCreateStage = BlockConstructionStage.PIPELINE_SETUP_CREATE == stage;
     while (!success && !streamerClosed && dfsClient.clientRunning) {
       if (!handleRestartingDatanode()) {
-        return;
+        return false;
+      }
+
+      final boolean isRecovery = errorState.hasInternalError() && !isCreateStage;
+
+      // During create stage, if we remove a node (nodes.length - 1)
+      // min replication should still be satisfied.
+      if (isCreateStage && !(dfsClient.dtpReplaceDatanodeOnFailureReplication > 0 &&
```

Review Comment:
   Reason behind adding this check here: we already perform this check in the catch block of the `addDatanode2ExistingPipeline` method [here](https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java#L1528-L1539). But when the `isAppend` flag is `false` and we are in the `PIPELINE_SETUP_CREATE` phase, we exit early from `addDatanode2ExistingPipeline` [here](https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java#L1489-L1492). Let's say the replication factor is 3, the config property `dfs.client.block.write.replace-datanode-on-failure.min-replication` is set to 3, and there is one bad node in the pipeline.
Even if we have set the config property to `ReplaceDatanodeOnFailure.CONDITION_TRUE`, the code will exit `addDatanode2ExistingPipeline` early, since `isAppend` is `false` and the stage is `PIPELINE_SETUP_CREATE`. Assuming there are no available nodes in the rack, the pipeline will succeed with 2 nodes, which violates the config property `dfs.client.block.write.replace-datanode-on-failure.min-replication`.

Having written all of this, I realize that even if there are some good nodes available in the rack, we will exit early after this patch. Should we move this check after the `handleDatanodeReplacement` call? @ritegarg
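The arithmetic behind the new check can be sketched in isolation. This is a hypothetical helper, not the actual `DataStreamer` code: during the create stage, dropping one bad node from the pipeline is only acceptable if the remaining nodes still satisfy `dfs.client.block.write.replace-datanode-on-failure.min-replication` (a value of 0 or less disables the check, mirroring the `> 0` guard in the diff above).

```java
// Hypothetical illustration of the min-replication check discussed above:
// removing one node leaves (pipelineSize - 1) replicas, which must still
// meet the configured minimum for the write to proceed.
public class MinReplicationCheck {

    // minReplication <= 0 means the check is disabled.
    static boolean canDropOneNode(int pipelineSize, int minReplication) {
        if (minReplication <= 0) {
            return true;
        }
        return pipelineSize - 1 >= minReplication;
    }

    public static void main(String[] args) {
        // Replication 3, min-replication 3: removing the bad node leaves
        // 2 < 3, so the client must replace the node or fail the write.
        System.out.println(canDropOneNode(3, 3)); // false
        // With min-replication 2, a 2-node pipeline is acceptable.
        System.out.println(canDropOneNode(3, 2)); // true
    }
}
```

This matches the reviewer's scenario: with replication factor 3 and min-replication 3, a pipeline that silently shrinks to 2 nodes violates the configured property.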