[ https://issues.apache.org/jira/browse/HDFS-14706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16914736#comment-16914736 ]
Wei-Chiu Chuang commented on HDFS-14706:
----------------------------------------

I'm still reviewing the patch and I think it is good, but IntelliJ complains that InvalidChecksumSizeException is never thrown in {{BlockMetadataHeader#preadHeader()}}. Additionally, I think we should check the return value of {{fc.read()}} and throw {{CorruptMetaHeaderException}} if it is less than the size of the array.

{code}
  public static BlockMetadataHeader preadHeader(FileChannel fc)
      throws IOException {
    final byte arr[] = new byte[getHeaderSize()];
    ByteBuffer buf = ByteBuffer.wrap(arr);

    while (buf.hasRemaining()) {
      if (fc.read(buf, 0) <= 0) {
        throw new CorruptMetaHeaderException("EOF while reading header from "
            + "the metadata file. The meta file may be truncated or corrupt");
      }
    }
    short version = (short)((arr[0] << 8) | (arr[1] & 0xff));
    DataChecksum dataChecksum;
    try {
      dataChecksum = DataChecksum.newDataChecksum(arr, 2);
    } catch (InvalidChecksumSizeException e) {
      throw new CorruptMetaHeaderException("The block meta file header is "
          + "corrupt", e);
    }
    return new BlockMetadataHeader(version, dataChecksum);
  }
{code}

> Checksums are not checked if block meta file is less than 7 bytes
> -----------------------------------------------------------------
>
>                 Key: HDFS-14706
>                 URL: https://issues.apache.org/jira/browse/HDFS-14706
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.3.0
>            Reporter: Stephen O'Donnell
>            Assignee: Stephen O'Donnell
>            Priority: Major
>         Attachments: HDFS-14706.001.patch, HDFS-14706.002.patch, HDFS-14706.003.patch, HDFS-14706.004.patch, HDFS-14706.005.patch
>
>
> If a block and its meta file are corrupted in a certain way, the corruption can go unnoticed by a client, causing invalid data to be returned to the reader.
> The meta file is expected to always have a header of 7 bytes, followed by a series of checksums whose count depends on the length of the block.
> If the meta file gets corrupted in such a way that it is less than 7 bytes in length, then the header is incomplete. In BlockSender.java the logic checks whether the meta file length is at least the size of the header; if it is not, it does not raise an error, but instead returns a NULL checksum type to the client:
> https://github.com/apache/hadoop/blob/b77761b0e37703beb2c033029e4c0d5ad1dce794/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java#L327-L357
> If the client receives a NULL checksum type, it will not validate checksums at all, and even corrupted data will be returned to the reader. This means the corruption will go unnoticed and HDFS will never repair it. Even the Volume Scanner will not notice the corruption, as the checksums are silently ignored.
> Additionally, if the meta file does have enough bytes that the header is loaded, but the header is corrupted such that it is not valid, it can cause the datanode Volume Scanner to exit with an exception like the following:
> {code}
> 2019-08-06 18:16:39,151 ERROR datanode.VolumeScanner: VolumeScanner(/tmp/hadoop-sodonnell/dfs/data, DS-7f103313-61ba-4d37-b63d-e8cf7d2ed5f7) exiting because of exception
> java.lang.IllegalArgumentException: id=51 out of range [0, 5)
>         at org.apache.hadoop.util.DataChecksum$Type.valueOf(DataChecksum.java:76)
>         at org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:167)
>         at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:173)
>         at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:139)
>         at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:153)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.loadLastPartialChunkChecksum(FsVolumeImpl.java:1140)
>         at org.apache.hadoop.hdfs.server.datanode.FinalizedReplica.loadLastPartialChunkChecksum(FinalizedReplica.java:157)
>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.getPartialChunkChecksumForFinalized(BlockSender.java:451)
>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:266)
>         at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.scanBlock(VolumeScanner.java:446)
>         at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:558)
>         at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:633)
> 2019-08-06 18:16:39,152 INFO datanode.VolumeScanner: VolumeScanner(/tmp/hadoop-sodonnell/dfs/data, DS-7f103313-61ba-4d37-b63d-e8cf7d2ed5f7) exiting.
> {code}
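
As a rough illustration of the truncated-header case, a test along the following lines would exercise the {{preadHeader()}} behaviour shown in the comment above, using a zero-length meta file so that the first {{fc.read()}} hits EOF. This is only a sketch and not part of the attached patches; the test class name and the package of {{CorruptMetaHeaderException}} are assumptions.

{code}
import static org.junit.Assert.fail;

import java.io.File;
import java.nio.channels.FileChannel;
import java.nio.file.StandardOpenOption;

import org.junit.Test;

import org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader;
import org.apache.hadoop.hdfs.server.datanode.CorruptMetaHeaderException;

public class TestTruncatedMetaHeader {
  @Test
  public void testPreadHeaderRejectsEmptyMetaFile() throws Exception {
    // createTempFile produces a zero-length file, i.e. a meta file with no
    // header at all, which is one of the "less than 7 bytes" cases above.
    File meta = File.createTempFile("blk_1234_1001", ".meta");
    try (FileChannel fc =
        FileChannel.open(meta.toPath(), StandardOpenOption.READ)) {
      BlockMetadataHeader.preadHeader(fc);
      fail("Expected CorruptMetaHeaderException for an empty meta file");
    } catch (CorruptMetaHeaderException e) {
      // Expected: the missing header is reported instead of a NULL checksum
      // type being silently passed back to the reader.
    } finally {
      meta.delete();
    }
  }
}
{code}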
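
For context, the 7-byte header the description refers to is a 2-byte version followed by the 5-byte {{DataChecksum}} header: a 1-byte checksum type id and a 4-byte bytesPerChecksum. The header-corruption case behind the stack trace can be reproduced by hand by overwriting the type byte of an existing block meta file with an id outside the valid range [0, 5), e.g. 51. This is only a sketch for manual testing on a throwaway cluster; the class name is made up, and the meta file path is whatever block you point it at.

{code}
import java.io.RandomAccessFile;

// Usage: java CorruptMetaHeader /path/to/blk_<id>_<genstamp>.meta
public class CorruptMetaHeader {
  public static void main(String[] args) throws Exception {
    try (RandomAccessFile raf = new RandomAccessFile(args[0], "rw")) {
      // Meta header layout: bytes 0-1 = version, byte 2 = checksum type id,
      // bytes 3-6 = bytesPerChecksum. Overwrite the type id with 51, which is
      // outside [0, 5) and therefore invalid.
      raf.seek(2);
      raf.writeByte(51);
    }
  }
}
{code}

Scanning or reading that block then fails in {{DataChecksum.Type#valueOf}} with the IllegalArgumentException shown above, which is unchecked and therefore causes the Volume Scanner thread to exit as in the log.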