[ https://issues.apache.org/jira/browse/HDFS-15391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133910#comment-17133910 ]
huhaiyang edited comment on HDFS-15391 at 6/12/20, 4:23 AM:
------------------------------------------------------------
[~ayushtkn] Thank you for the reply! I will try to reproduce it. However, the problem has not yet been reproduced in the test environment; I will follow up and see if I can reproduce it.

{quote}
{quote}The block used by CloseOp twice is the same instance, which causes the first CloseOp to have the wrong block size.{quote}
didn't quite understand this.
{quote}

In the first CloseOp (TXID=126060942290), block_11382080753 has block size 63154347 and genstamp 10354157480, but in fact the first CloseOp should have recorded block size 108764672 and genstamp 10354071495 for block_11382080753. In the second CloseOp (TXID=126060943585), block_11382080753 has block size 63154347 and genstamp 10354157480. The block_11382080753 object used by both CloseOps is the same instance, so the first CloseOp carries the wrong block information.
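The aliasing problem described above can be sketched with a minimal, self-contained Java example (the `Block` and `CloseOp` classes here are simplified stand-ins, not the actual Hadoop `org.apache.hadoop.hdfs.protocol.Block` or edit-log op classes): if two ops hold a reference to the same mutable block object rather than a copy, a later mutation retroactively changes what the earlier op reports.

```java
// Simplified sketch of the shared-instance bug: both CloseOps reference
// the SAME mutable Block, so updating it for the second op silently
// corrupts the block information the first op should have recorded.
class Block {
    long numBytes;
    long genStamp;
    Block(long numBytes, long genStamp) {
        this.numBytes = numBytes;
        this.genStamp = genStamp;
    }
}

class CloseOp {
    final long txid;
    final Block block; // stores a reference, not a defensive copy
    CloseOp(long txid, Block block) {
        this.txid = txid;
        this.block = block;
    }
}

public class SharedBlockDemo {
    public static void main(String[] args) {
        // First close: the block's state at this point should be
        // size 108764672 / genstamp 10354071495 (the correct values).
        Block shared = new Block(108764672L, 10354071495L);
        CloseOp firstClose = new CloseOp(126060942290L, shared);

        // Truncate/append later mutate the SAME Block instance
        // before the first op's block data is ever serialized.
        shared.numBytes = 63154347L;
        shared.genStamp = 10354157480L;
        CloseOp secondClose = new CloseOp(126060943585L, shared);

        // Both ops now report the later values; the first op's
        // original size and genstamp are lost.
        System.out.println(firstClose.block.numBytes);  // 63154347
        System.out.println(secondClose.block.numBytes); // 63154347
    }
}
```

A defensive copy (`new Block(b.numBytes, b.genStamp)`) when constructing each op would make the first CloseOp keep its own snapshot of the block state.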
> Standby NameNode loads the corrupted edit log, the service exits and
> cannot be restarted
> ---------------------------------------------------------------------
>
>                 Key: HDFS-15391
>                 URL: https://issues.apache.org/jira/browse/HDFS-15391
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 3.2.0
>            Reporter: huhaiyang
>            Priority: Critical
>
> In the production environment running cluster version 3.2.0, we found that,
> due to edit log corruption, the Standby NameNode could not properly load the
> edit log, resulting in abnormal exit of the service and failure to restart.
> {noformat}
> The specific scenario is that Flink writes to HDFS (a replicated file), and
> when an exception occurs while writing the file, the following operations
> are performed:
> 1. close file
> 2. open file
> 3. truncate file
> 4. append file
> {noformat}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)