[ https://issues.apache.org/jira/browse/HDFS-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17135832#comment-17135832 ]
Hadoop QA commented on HDFS-15175: ---------------------------------- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 36s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 20m 10s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 4m 10s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 7s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 41s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 47s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}105m 4s{color} | {color:red} hadoop-hdfs in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 35s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}189m 16s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot | | | hadoop.hdfs.TestReconstructStripedFile | | | hadoop.hdfs.server.namenode.ha.TestPipelinesFailover | | | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy | | | hadoop.hdfs.server.datanode.TestBPOfferService | | | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics | | | hadoop.hdfs.TestStripedFileAppend | \\ \\ || Subsystem || Report/Notes || | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-HDFS-Build/29429/artifact/out/Dockerfile | | JIRA Issue | HDFS-15175 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13005703/HDFS-15175-trunk.1.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 6f00f4b79780 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | personality/hadoop.sh | | git revision | trunk / 81d8a887b04 | | Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/29429/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/29429/testReport/ | | Max. process+thread count | 3233 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/29429/console | | versions | git=2.17.1 maven=3.6.0 findbugs=3.1.0-RC1 | | Powered by | Apache Yetus 0.12.0 https://yetus.apache.org | This message was automatically generated. > Multiple CloseOp shared block instance causes the standby namenode to crash > when rolling editlog > ------------------------------------------------------------------------------------------------ > > Key: HDFS-15175 > URL: https://issues.apache.org/jira/browse/HDFS-15175 > Project: Hadoop HDFS > Issue Type: Bug > Affects Versions: 2.9.2 > Reporter: Yicong Cai > Assignee: Wan Chang > Priority: Critical > Labels: NameNode > Attachments: HDFS-15175-trunk.1.patch > > > > {panel:title=Crash exception} > 2020-02-16 09:24:46,426 [507844305] - ERROR [Edit log > tailer:FSEditLogLoader@245] - Encountered exception on operation CloseOp > [length=0, inodeId=0, path=..., replication=3, mtime=1581816138774, > atime=1581814760398, blockSize=536870912, blocks=[blk_5568434562_4495417845], > permissions=da_music:hdfs:rw-r-----, aclEntries=null, clientName=, > clientMachine=, overwrite=false, storagePolicyId=0, opCode=OP_CLOSE, > txid=32625024993] > java.io.IOException: File is not under construction: ...... > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:442) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:237) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:146) > at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:891) > at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:872) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:262) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:395) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$300(EditLogTailer.java:348) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:365) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:360) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1873) > at > org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:479) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:361) > {panel} > > {panel:title=Editlog} > <RECORD> > <OPCODE>OP_REASSIGN_LEASE</OPCODE> > <DATA> > <TXID>32625021150</TXID> > <LEASEHOLDER>DFSClient_NONMAPREDUCE_-969060727_197760</LEASEHOLDER> > <PATH>......</PATH> > <NEWHOLDER>DFSClient_NONMAPREDUCE_1000868229_201260</NEWHOLDER> > </DATA> > </RECORD> > ...... > <RECORD> > <OPCODE>OP_CLOSE</OPCODE> > <DATA> > <TXID>32625023743</TXID> > <LENGTH>0</LENGTH> > <INODEID>0</INODEID> > <PATH>......</PATH> > <REPLICATION>3</REPLICATION> > <MTIME>1581816135883</MTIME> > <ATIME>1581814760398</ATIME> > <BLOCKSIZE>536870912</BLOCKSIZE> > <CLIENT_NAME></CLIENT_NAME> > <CLIENT_MACHINE></CLIENT_MACHINE> > <OVERWRITE>false</OVERWRITE> > <BLOCK> > <BLOCK_ID>5568434562</BLOCK_ID> > <NUM_BYTES>185818644</NUM_BYTES> > <GENSTAMP>4495417845</GENSTAMP> > </BLOCK> > <PERMISSION_STATUS> > <USERNAME>da_music</USERNAME> > <GROUPNAME>hdfs</GROUPNAME> > <MODE>416</MODE> > </PERMISSION_STATUS> > </DATA> > </RECORD> > ...... > <RECORD> > <OPCODE>OP_TRUNCATE</OPCODE> > <DATA> > <TXID>32625024049</TXID> > <SRC>......</SRC> > <CLIENTNAME>DFSClient_NONMAPREDUCE_1000868229_201260</CLIENTNAME> > <CLIENTMACHINE>......</CLIENTMACHINE> > <NEWLENGTH>185818644</NEWLENGTH> > <TIMESTAMP>1581816136336</TIMESTAMP> > <BLOCK> > <BLOCK_ID>5568434562</BLOCK_ID> > <NUM_BYTES>185818648</NUM_BYTES> > <GENSTAMP>4495417845</GENSTAMP> > </BLOCK> > </DATA> > </RECORD> > ...... > <RECORD> > <OPCODE>OP_CLOSE</OPCODE> > <DATA> > <TXID>32625024993</TXID> > <LENGTH>0</LENGTH> > <INODEID>0</INODEID> > <PATH>......</PATH> > <REPLICATION>3</REPLICATION> > <MTIME>1581816138774</MTIME> > <ATIME>1581814760398</ATIME> > <BLOCKSIZE>536870912</BLOCKSIZE> > <CLIENT_NAME></CLIENT_NAME> > <CLIENT_MACHINE></CLIENT_MACHINE> > <OVERWRITE>false</OVERWRITE> > <BLOCK> > <BLOCK_ID>5568434562</BLOCK_ID> > <NUM_BYTES>185818644</NUM_BYTES> > <GENSTAMP>4495417845</GENSTAMP> > </BLOCK> > <PERMISSION_STATUS> > <USERNAME>da_music</USERNAME> > <GROUPNAME>hdfs</GROUPNAME> > <MODE>416</MODE> > </PERMISSION_STATUS> > </DATA> > </RECORD> > {panel} > > > The block size should be 185818648 in the first CloseOp. When truncate is > used, the block size becomes 185818644. The CloseOp/TruncateOp/CloseOp is > synchronized to the JournalNode in the same batch. The block used by CloseOp > twice is the same instance, which causes the first CloseOp has wrong block > size. When SNN rolling Editlog, TruncateOp does not make the file to the > UnderConstruction state. Then, when the second CloseOp is executed, the file > is not in the UnderConstruction state, and SNN crashes. -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org