[ https://issues.apache.org/jira/browse/HDFS-14706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16914736#comment-16914736 ]

Wei-Chiu Chuang commented on HDFS-14706:
----------------------------------------

I'm still reviewing the patch, and I think it is good.
But my IntelliJ complains that {{InvalidChecksumSizeException}} is never 
thrown in {{BlockMetadataHeader#preadHeader()}}.

Additionally, I am thinking we should check the return value of {{fc.read()}}, 
and throw {{CorruptMetaHeaderException}} if it is less than the size of the 
array. Something like this:
{code}
  public static BlockMetadataHeader preadHeader(FileChannel fc)
      throws IOException {
    final byte[] arr = new byte[getHeaderSize()];
    ByteBuffer buf = ByteBuffer.wrap(arr);

    // Read the full 7 byte header from the start of the file, retrying on
    // short reads and continuing from wherever the previous read stopped.
    while (buf.hasRemaining()) {
      if (fc.read(buf, buf.position()) <= 0) {
        throw new CorruptMetaHeaderException("EOF while reading header from "+
            "the metadata file. The meta file may be truncated or corrupt");
      }
    }
    // Bytes 0-1 hold the big-endian header version.
    short version = (short)((arr[0] << 8) | (arr[1] & 0xff));
    DataChecksum dataChecksum;
    try {
      // Bytes 2-6 hold the DataChecksum header (type id + bytesPerChecksum).
      dataChecksum = DataChecksum.newDataChecksum(arr, 2);
    } catch (InvalidChecksumSizeException e) {
      throw new CorruptMetaHeaderException("The block meta file header is "+
          "corrupt", e);
    }
    return new BlockMetadataHeader(version, dataChecksum);
  }
{code}
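A quick way to cover that path in a test would be something like this 
(illustrative only; the helper and the truncated-file fixture are 
hypothetical, and the snippet assumes it sits next to the datanode classes so 
{{BlockMetadataHeader}} and {{CorruptMetaHeaderException}} resolve):
{code}
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

import org.apache.hadoop.test.LambdaTestUtils;

// Illustrative only: preadHeader() should reject a meta file truncated
// below the 7 byte header. "truncatedMeta" is a hypothetical fixture.
static void assertTruncatedHeaderRejected(Path truncatedMeta) throws Exception {
  try (FileChannel fc =
      FileChannel.open(truncatedMeta, StandardOpenOption.READ)) {
    LambdaTestUtils.intercept(CorruptMetaHeaderException.class,
        () -> BlockMetadataHeader.preadHeader(fc));
  }
}
{code}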


> Checksums are not checked if block meta file is less than 7 bytes
> -----------------------------------------------------------------
>
>                 Key: HDFS-14706
>                 URL: https://issues.apache.org/jira/browse/HDFS-14706
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.3.0
>            Reporter: Stephen O'Donnell
>            Assignee: Stephen O'Donnell
>            Priority: Major
>         Attachments: HDFS-14706.001.patch, HDFS-14706.002.patch, 
> HDFS-14706.003.patch, HDFS-14706.004.patch, HDFS-14706.005.patch
>
>
> If a block and its meta file are corrupted in a certain way, the corruption 
> can go unnoticed by a client, causing it to return invalid data.
> The meta file is expected to always have a header of 7 bytes and then a 
> series of checksums depending on the length of the block.
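> For reference, the 7 byte header is a 2 byte version followed by 
> DataChecksum's 5 byte header (a 1 byte checksum type id and a 4 byte 
> bytesPerChecksum). A minimal sketch of reading it, assuming the meta file 
> path is passed as the first argument:
> {code}
> import java.io.DataInputStream;
> import java.io.FileInputStream;
> import java.io.IOException;
> 
> // Sketch only: dump the 7 byte block meta file header.
> public class MetaHeaderDump {
>   public static void main(String[] args) throws IOException {
>     try (DataInputStream in =
>         new DataInputStream(new FileInputStream(args[0]))) {
>       short version = in.readShort();      // bytes 0-1: header version
>       byte typeId = in.readByte();         // byte 2: checksum type (e.g. 1=CRC32, 2=CRC32C)
>       int bytesPerChecksum = in.readInt(); // bytes 3-6: bytes covered by each checksum
>       System.out.println("version=" + version + " type=" + typeId
>           + " bytesPerChecksum=" + bytesPerChecksum);
>     }
>   }
> }
> {code}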
> If the meta file gets corrupted in such a way that it is more than zero but 
> less than 7 bytes in length, then the header is incomplete. In 
> BlockSender.java the logic checks if the meta file length is at least the 
> size of the header; if it is not, it does not raise an error, but instead 
> returns a NULL checksum type to the client.
> https://github.com/apache/hadoop/blob/b77761b0e37703beb2c033029e4c0d5ad1dce794/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java#L327-L357
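> Paraphrasing the logic at that link (a sketch, not the exact source; names 
> follow the linked file), the fallback looks like:
> {code}
> // Sketch: if the meta file is shorter than the header, csum stays null and
> // the code falls back to a NULL checksum, silently disabling verification.
> DataChecksum csum = null;
> if (metaIn.getLength() >= BlockMetadataHeader.getHeaderSize()) {
>   csum = BlockMetadataHeader.readDataChecksum(checksumIn, block);
> }
> if (csum == null) {
>   csum = DataChecksum.newDataChecksum(DataChecksum.Type.NULL,
>       (int) CHUNK_SIZE);
> }
> {code}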
> If the client receives a NULL checksum type, it will not validate checksums 
> at all, and even corrupted data will be returned to the reader. This means 
> the corruption will go unnoticed and HDFS will never repair it. Even the 
> Volume Scanner will not notice the corruption, as the checksums are silently 
> ignored.
> Additionally, if the meta file does have enough bytes so it attempts to load 
> the header, and the header is corrupted such that it is not valid, it can 
> cause the datanode Volume Scanner to exit, with an exception like the 
> following:
> {code}
> 2019-08-06 18:16:39,151 ERROR datanode.VolumeScanner: VolumeScanner(/tmp/hadoop-sodonnell/dfs/data, DS-7f103313-61ba-4d37-b63d-e8cf7d2ed5f7) exiting because of exception
> java.lang.IllegalArgumentException: id=51 out of range [0, 5)
>       at org.apache.hadoop.util.DataChecksum$Type.valueOf(DataChecksum.java:76)
>       at org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:167)
>       at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:173)
>       at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:139)
>       at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:153)
>       at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.loadLastPartialChunkChecksum(FsVolumeImpl.java:1140)
>       at org.apache.hadoop.hdfs.server.datanode.FinalizedReplica.loadLastPartialChunkChecksum(FinalizedReplica.java:157)
>       at org.apache.hadoop.hdfs.server.datanode.BlockSender.getPartialChunkChecksumForFinalized(BlockSender.java:451)
>       at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:266)
>       at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.scanBlock(VolumeScanner.java:446)
>       at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:558)
>       at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:633)
> 2019-08-06 18:16:39,152 INFO datanode.VolumeScanner: VolumeScanner(/tmp/hadoop-sodonnell/dfs/data, DS-7f103313-61ba-4d37-b63d-e8cf7d2ed5f7) exiting.
> {code}


