[ https://issues.apache.org/jira/browse/HDFS-15985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Work on HDFS-15985 started by JiangHua Zhu.
-------------------------------------------
> Incorrect sorting will cause failure to load an FsImage file
> ------------------------------------------------------------
>
>                 Key: HDFS-15985
>                 URL: https://issues.apache.org/jira/browse/HDFS-15985
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: JiangHua Zhu
>            Assignee: JiangHua Zhu
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> After HDFS-14617 or HDFS-14771 is applied, loading an fsimage file can fail
> with the following error:
>
> 2021-04-15 17:21:17,868 [293072]-INFO [main:FSImage@784]-Planning to load image: FSImageFile(file=/xxxx/hadoop/hdfs/namenode/current/fsimage_000000000xxxx, cpktTxId=000000000xxxx)
> 2021-04-15 17:25:53,288 [568492]-INFO [main:FSImageFormatPBINode$Loader@229]-Loading 725097952 INodes.
> 2021-04-15 17:25:53,289 [568493]-ERROR [main:FSImage@730]-Failed to load image from FSImageFile(file=/xxxx/hadoop/hdfs/namenode/current/fsimage_000000000xxxx, cpktTxId=000000000xxxx)
> java.lang.IllegalStateException: GLOBAL: serial number 3 does not exist
>     at org.apache.hadoop.hdfs.server.namenode.SerialNumberMap.get(SerialNumberMap.java:85)
>     at org.apache.hadoop.hdfs.server.namenode.SerialNumberManager.getString(SerialNumberManager.java:121)
>     at org.apache.hadoop.hdfs.server.namenode.SerialNumberManager.getString(SerialNumberManager.java:125)
>     at org.apache.hadoop.hdfs.server.namenode.INodeWithAdditionalFields$PermissionStatusFormat.toPermissionStatus(INodeWithAdditionalFields.java:86)
>     at org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadPermission(FSImageFormatPBINode.java:93)
>     at org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeFile(FSImageFormatPBINode.java:303)
>     at org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINode(FSImageFormatPBINode.java:280)
>     at org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeSection(FSImageFormatPBINode.java:237)
>     at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:237)
>     at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:176)
>     at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:226)
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:937)
>
> The failure was traced to the code that sorts the sections before loading:
>
>     ArrayList<FileSummary.Section> sections = Lists.newArrayList(summary
>         .getSectionsList());
>     Collections.sort(sections, new Comparator<FileSummary.Section>() {
>       @Override
>       public int compare(FileSummary.Section s1, FileSummary.Section s2) {
>         SectionName n1 = SectionName.fromString(s1.getName());
>         SectionName n2 = SectionName.fromString(s2.getName());
>         if (n1 == null) {
>           return n2 == null ? 0 : -1;
>         } else if (n2 == null) {
>           return -1;
>         } else {
>           return n1.ordinal() - n2.ordinal();
>         }
>       }
>     });
>
> When n1 != null and n2 == null, the comparator returns -1, which is the same
> value it returns when n1 == null and n2 != null. The result therefore depends
> on the argument order, and the sections can end up sorted incorrectly.
>
> The correct order for loading sections is:
>     NS_INFO -> STRING_TABLE -> INODE
> If the sorting is incorrect, the loading order becomes:
>     INODE -> NS_INFO -> STRING_TABLE
> This fails because loading INODE depends on STRING_TABLE having been loaded
> first.
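For illustration only, here is a minimal sketch of the asymmetry in the quoted comparator. It is not Hadoop code: the class SectionOrderSketch, the three-value SectionName enum, the use of plain strings in place of FileSummary.Section, and the section name "UNKNOWN_SECTION" are all hypothetical stand-ins. The SYMMETRIC comparator is one possible correction, assumed here for demonstration, and is not necessarily the change made for this issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class; not part of Hadoop.
final class SectionOrderSketch {

  // Trimmed, hypothetical stand-in for the SectionName enum in the quoted code.
  enum SectionName {
    NS_INFO, STRING_TABLE, INODE;

    // Returns null for a name that is not in the enum.
    static SectionName fromString(String name) {
      for (SectionName n : values()) {
        if (n.name().equals(name)) {
          return n;
        }
      }
      return null;
    }
  }

  // Same branch structure as the comparator quoted in the issue.
  static final Comparator<String> QUOTED = (s1, s2) -> {
    SectionName n1 = SectionName.fromString(s1);
    SectionName n2 = SectionName.fromString(s2);
    if (n1 == null) {
      return n2 == null ? 0 : -1;
    } else if (n2 == null) {
      return -1; // same sign as the mirrored case above
    } else {
      return n1.ordinal() - n2.ordinal();
    }
  };

  // Assumed correction: an unknown name sorts before known ones in both directions.
  static final Comparator<String> SYMMETRIC = (s1, s2) -> {
    SectionName n1 = SectionName.fromString(s1);
    SectionName n2 = SectionName.fromString(s2);
    if (n1 == null) {
      return n2 == null ? 0 : -1;
    } else if (n2 == null) {
      return 1; // mirror of the n1 == null case
    } else {
      return Integer.compare(n1.ordinal(), n2.ordinal());
    }
  };

  // Sorts a copy so the input list is left untouched.
  static List<String> sorted(List<String> names, Comparator<String> order) {
    List<String> copy = new ArrayList<>(names);
    copy.sort(order);
    return copy;
  }

  public static void main(String[] args) {
    // Both calls print -1: each argument is reported as "less than" the other.
    System.out.println(QUOTED.compare("INODE", "UNKNOWN_SECTION"));
    System.out.println(QUOTED.compare("UNKNOWN_SECTION", "INODE"));

    List<String> names =
        Arrays.asList("INODE", "UNKNOWN_SECTION", "STRING_TABLE", "NS_INFO");
    // Prints [UNKNOWN_SECTION, NS_INFO, STRING_TABLE, INODE]
    System.out.println(sorted(names, SYMMETRIC));
  }
}
```

With the symmetric version the known sections keep their enum order wherever the unrecognized name appears in the input. The sketch deliberately does not print a sort result for QUOTED, since the outcome of sorting with a comparator that breaks the Comparator contract depends on the input order and on the sort implementation.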