[ https://issues.apache.org/jira/browse/HDFS-17383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17834594#comment-17834594 ]
ASF GitHub Bot commented on HDFS-17383:
---------------------------------------

zhangshuyan0 commented on PR #6562:
URL: https://github.com/apache/hadoop/pull/6562#issuecomment-2041304944

   There is a problem with this fix. Assume the following situation:
   1. nn1 and nn2 are both standby, and after dn1 registers with them, its currentKey is null;
   2. nn1 is transitioned to active, dn1 sends a heartbeat, and nn1 replies with some DNA_TRANSFER commands and a DNA_ACCESSKEYUPDATE command;
   3. Because of the order of the commands, dn1 processes DNA_TRANSFER before DNA_ACCESSKEYUPDATE, so DNA_TRANSFER fails because currentKey is still null.


> Datanode current block token should come from active NameNode in HA mode
> ------------------------------------------------------------------------
>
>                 Key: HDFS-17383
>                 URL: https://issues.apache.org/jira/browse/HDFS-17383
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: lei w
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: reproduce.diff
>
>
> We found that block transfers failed during a NameNode upgrade. The specific
> error reported was a block token verification failure. The reason is that
> when a datanode transfers a block, the source datanode uses a block token it
> generates itself, and the keyid may come from either the ANN or the SBN.
> However, because the newly upgraded NN has only just started, the target
> datanode may not yet have the keyid held by the source datanode, so the
> write fails. The attachment shows how to reproduce this situation.
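
The ordering problem in steps 1-3 of the comment can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: CommandOrderSketch, DataNodeModel, handle and process are hypothetical names invented for illustration and are not Hadoop classes or the code in PR #6562. Only the command names DNA_TRANSFER and DNA_ACCESSKEYUPDATE and the null currentKey come from the comment.

```java
import java.util.ArrayList;
import java.util.List;

public class CommandOrderSketch {

    /** Hypothetical stand-in for the datanode's block token state; not a Hadoop class. */
    static final class DataNodeModel {
        // Null after registering with standby NameNodes only (step 1 in the comment).
        private String currentKey;

        String handle(String command) {
            switch (command) {
                case "DNA_ACCESSKEYUPDATE":
                    // Assumption: the key update from the active NameNode sets currentKey.
                    currentKey = "key-from-active-nn";
                    return "DNA_ACCESSKEYUPDATE: ok";
                case "DNA_TRANSFER":
                    // A transfer needs a block token, which cannot be built without a key.
                    return currentKey == null
                            ? "DNA_TRANSFER: failed (currentKey is null)"
                            : "DNA_TRANSFER: ok";
                default:
                    throw new IllegalArgumentException("unknown command: " + command);
            }
        }
    }

    /** Processes the commands of one heartbeat response strictly in the given order. */
    static List<String> process(List<String> commands) {
        DataNodeModel dn = new DataNodeModel();
        List<String> results = new ArrayList<>();
        for (String command : commands) {
            results.add(dn.handle(command));
        }
        return results;
    }

    public static void main(String[] args) {
        // Order described in the comment: the transfer is handled before the key update.
        System.out.println(process(List.of("DNA_TRANSFER", "DNA_ACCESSKEYUPDATE")));
        // Opposite order, shown only for contrast.
        System.out.println(process(List.of("DNA_ACCESSKEYUPDATE", "DNA_TRANSFER")));
    }
}
```

In this model the first call reports a failed transfer and the second reports a successful one, which is the only point the sketch is meant to make: with a null currentKey, the outcome depends on where the key update sits among the commands returned by the heartbeat.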