[ https://issues.apache.org/jira/browse/HDFS-14706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stephen O'Donnell updated HDFS-14706:
-------------------------------------
    Attachment: HDFS-14706.004.patch

> Checksums are not checked if block meta file is less than 7 bytes
> -----------------------------------------------------------------
>
>                 Key: HDFS-14706
>                 URL: https://issues.apache.org/jira/browse/HDFS-14706
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.3.0
>            Reporter: Stephen O'Donnell
>            Assignee: Stephen O'Donnell
>            Priority: Major
>         Attachments: HDFS-14706.001.patch, HDFS-14706.002.patch, HDFS-14706.003.patch, HDFS-14706.004.patch
>
>
> If a block and its meta file are corrupted in a certain way, the corruption can go unnoticed by a client, causing it to return invalid data.
>
> The meta file is expected to always have a 7-byte header followed by a series of checksums, the number of which depends on the length of the block.
>
> If the meta file is corrupted in such a way that it is less than 7 bytes in length, the header is incomplete. The logic in BlockSender.java checks whether the meta file length is at least the size of the header; if it is not, it does not raise an error, but instead returns a NULL checksum type to the client.
>
> https://github.com/apache/hadoop/blob/b77761b0e37703beb2c033029e4c0d5ad1dce794/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java#L327-L357
>
> If the client receives a NULL checksum type, it will not validate checksums at all, and even corrupted data will be returned to the reader. This means the corruption will go unnoticed and HDFS will never repair it. Even the Volume Scanner will not notice the corruption, as the checksums are silently ignored.
>
> Additionally, if the meta file does have enough bytes for the header to be loaded, but the header is corrupted such that it is not valid, it can cause the datanode Volume Scanner to exit with an exception like the following:
> {code}
> 2019-08-06 18:16:39,151 ERROR datanode.VolumeScanner: VolumeScanner(/tmp/hadoop-sodonnell/dfs/data, DS-7f103313-61ba-4d37-b63d-e8cf7d2ed5f7) exiting because of exception
> java.lang.IllegalArgumentException: id=51 out of range [0, 5)
>         at org.apache.hadoop.util.DataChecksum$Type.valueOf(DataChecksum.java:76)
>         at org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:167)
>         at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:173)
>         at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:139)
>         at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:153)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.loadLastPartialChunkChecksum(FsVolumeImpl.java:1140)
>         at org.apache.hadoop.hdfs.server.datanode.FinalizedReplica.loadLastPartialChunkChecksum(FinalizedReplica.java:157)
>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.getPartialChunkChecksumForFinalized(BlockSender.java:451)
>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:266)
>         at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.scanBlock(VolumeScanner.java:446)
>         at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:558)
>         at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:633)
> 2019-08-06 18:16:39,152 INFO datanode.VolumeScanner: VolumeScanner(/tmp/hadoop-sodonnell/dfs/data, DS-7f103313-61ba-4d37-b63d-e8cf7d2ed5f7) exiting.
> {code}
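For reference, below is a minimal sketch of the silent fallback described in the report, assuming the public BlockMetadataHeader and DataChecksum APIs. The method name chooseChecksum and the 512 bytes-per-checksum value are invented for illustration; the real logic is in the BlockSender constructor linked in the description.

{code}
import java.io.DataInputStream;
import java.io.IOException;

import org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader;
import org.apache.hadoop.util.DataChecksum;

class ChecksumFallbackSketch {
  // Illustrative only - a simplified stand-in for the BlockSender logic.
  static DataChecksum chooseChecksum(long metaFileLength,
      DataInputStream checksumIn) throws IOException {
    DataChecksum csum = null;
    // The meta header is 7 bytes: a 2-byte version, a 1-byte checksum
    // type id, and a 4-byte bytesPerChecksum.
    if (metaFileLength >= BlockMetadataHeader.getHeaderSize()) {
      csum = BlockMetadataHeader.readHeader(checksumIn).getChecksum();
    }
    if (csum == null) {
      // A meta file truncated to fewer than 7 bytes reaches this point
      // without any error: the reader is handed a NULL checksum, so
      // corrupt block data is served unverified and never repaired.
      csum = DataChecksum.newDataChecksum(DataChecksum.Type.NULL, 512);
    }
    return csum;
  }
}
{code}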
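The VolumeScanner exception above can likewise be illustrated in isolation. In the following sketch (a hypothetical demo class, not Hadoop code), the checksum type byte of a 5-byte DataChecksum header is set to 51, the out-of-range id from the stack trace, and parsing throws instead of flagging the replica as corrupt:

{code}
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;

import org.apache.hadoop.util.DataChecksum;

public class CorruptHeaderSketch {
  public static void main(String[] args) throws Exception {
    // DataChecksum header: a 1-byte type id, then a 4-byte (big-endian)
    // bytesPerChecksum - here 512. Valid type ids are 0-4; 51 is not.
    byte[] corruptHeader = {51, 0, 0, 2, 0};
    try {
      DataChecksum.newDataChecksum(
          new DataInputStream(new ByteArrayInputStream(corruptHeader)));
    } catch (IllegalArgumentException e) {
      // Prints: java.lang.IllegalArgumentException: id=51 out of range [0, 5)
      // In the datanode this propagates out of BlockMetadataHeader.readHeader
      // and terminates the VolumeScanner thread, as shown in the log above.
      System.out.println(e);
    }
  }
}
{code}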