[ 
https://issues.apache.org/jira/browse/HDFS-14529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373706#comment-17373706
 ] 

Wei-Chiu Chuang commented on HDFS-14529:
----------------------------------------

We encountered this bug again, and it is reproducible for this set of 
fsimage/edit logs.

We added debug logs and found that the IIP (INodesInPath) is missing some 
components. The path should have 8 components, but only 6 were resolved; the 
last two were null. This is likely caused by files that were already deleted 
from snapshots. Somehow the active NN still keeps the file in memory, so the 
standby NameNode crashes when it loads the edits.

Comparing this method with other similar methods, I think we should check 
whether iip.getLastINode() is null and, if so, throw FileNotFoundException. 
There are other places in the code where we could add the same null check. I 
also hit failures several times with other edit log ops (mkdir, rename, 
renameSnapshot).
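
To illustrate the kind of check I have in mind, here is a minimal, 
self-contained sketch. It does not use the real HDFS classes: ResolvedPath, 
countResolved, setTimes and tryApply are hypothetical stand-ins invented for 
this example, and the path string is shortened. Only the idea (test 
getLastINode() for null and throw FileNotFoundException) comes from the 
proposal above; the actual patch may look different.

```java
import java.io.FileNotFoundException;

public class LastINodeCheckSketch {

    /** Hypothetical stand-in for a resolved path: one slot per component, null if unresolved. */
    static final class ResolvedPath {
        final String path;
        final String[] inodes;

        ResolvedPath(String path, String[] inodes) {
            this.path = path;
            this.inodes = inodes;
        }

        /** Returns the last path component, or null if it could not be resolved. */
        String getLastINode() {
            return inodes.length == 0 ? null : inodes[inodes.length - 1];
        }

        /** Counts the components that were actually resolved (non-null). */
        int countResolved() {
            int resolved = 0;
            for (String inode : inodes) {
                if (inode != null) {
                    resolved++;
                }
            }
            return resolved;
        }
    }

    /** Fails with FileNotFoundException when the target is missing, instead of an AssertionError or NPE. */
    static void setTimes(ResolvedPath iip, long atime) throws FileNotFoundException {
        if (iip.getLastINode() == null) {
            throw new FileNotFoundException("File/Directory " + iip.path + " does not exist.");
        }
        // A real implementation would update mtime/atime on the inode here.
    }

    /** Returns true if the op was applied, false if the target did not exist. */
    static boolean tryApply(ResolvedPath iip, long atime) {
        try {
            setTimes(iip, atime);
            return true;
        } catch (FileNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Same shape as the inodes array in the log: root is the empty name, last two are null.
        String[] inodes = {"", "apps", "hive", "warehouse", "ea_common.db", "sls_blng_rw", null, null};
        ResolvedPath iip = new ResolvedPath("/apps/hive/warehouse/ea_common.db/sls_blng_rw/part/file", inodes);
        System.out.println("resolved " + iip.countResolved() + " of " + inodes.length);
        System.out.println("applied = " + tryApply(iip, 1624825911021L));
    }
}
```

Whether the edit log loader should then skip the op or abort loading is a 
separate decision that this sketch does not try to answer.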

{noformat}
21/07/02 11:39:39 ERROR namenode.FSEditLogLoader: AssertionError caught in unprotectedSetTimes: iip=INodesInPath: path = /apps/hive/warehouse/ea_common.db/sls_blng_rw/ins_gmt_dt=2021-06-22/part-00001-087de2ec-7888-4f2b-bea6-3702c69cf953.c000
  inodes = [, apps, hive, warehouse, ea_common.db, sls_blng_rw, null, null], length=8
  isSnapshot        = false
  snapshotId        = 8014, lastINode=null, mtime=-1, atime=1624825911021, force? true
java.lang.AssertionError: i = 6 != 8, this=INodesInPath: path = /apps/hive/warehouse/ea_common.db/sls_blng_rw/ins_gmt_dt=2021-06-22/part-00001-087de2ec-7888-4f2b-bea6-3702c69cf953.c000
  inodes = [, apps, hive, warehouse, ea_common.db, sls_blng_rw, null, null], length=8
  isSnapshot        = false
  snapshotId        = 8014
        at org.apache.hadoop.hdfs.server.namenode.INodesInPath.validate(INodesInPath.java:488)
        at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetTimes(FSDirAttrOp.java:355)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:631)
{noformat}

> NPE while Loading the Editlogs
> ------------------------------
>
>                 Key: HDFS-14529
>                 URL: https://issues.apache.org/jira/browse/HDFS-14529
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>    Affects Versions: 3.1.1
>            Reporter: Harshakiran Reddy
>            Assignee: Ayush Saxena
>            Priority: Major
>
> {noformat}
> 2019-05-31 15:15:42,397 ERROR namenode.FSEditLogLoader: Encountered exception on operation TimesOp [length=0, path=/testLoadSpace/dir0/dir0/dir0/dir2/_file_9096763, mtime=-1, atime=1559294343288, opCode=OP_TIMES, txid=18927893]
> java.lang.NullPointerException
> at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetTimes(FSDirAttrOp.java:490)
> at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:711)
> at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:286)
> at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:181)
> at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:924)
> at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:771)
> at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:331)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1105)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:726)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.doRecovery(NameNode.java:1558)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1640)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1725)
> {noformat}


