[ 
https://issues.apache.org/jira/browse/HDFS-14706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-14706:
-----------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Pushed to trunk, branch-3.1 and branch-3.2
Thanks [~sodonnell]!

> Checksums are not checked if block meta file is less than 7 bytes
> -----------------------------------------------------------------
>
>                 Key: HDFS-14706
>                 URL: https://issues.apache.org/jira/browse/HDFS-14706
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.3.0
>            Reporter: Stephen O'Donnell
>            Assignee: Stephen O'Donnell
>            Priority: Major
>             Fix For: 3.3.0, 3.2.1, 3.1.3
>
>         Attachments: HDFS-14706.001.patch, HDFS-14706.002.patch, 
> HDFS-14706.003.patch, HDFS-14706.004.patch, HDFS-14706.005.patch, 
> HDFS-14706.006.patch, HDFS-14706.007.patch, HDFS-14706.008.patch
>
>
> If a block and its meta file are corrupted in a certain way, the corruption 
> can go unnoticed by a client, causing it to return invalid data.
> The meta file is expected to always have a header of 7 bytes and then a 
> series of checksums depending on the length of the block.
> If the metafile gets corrupted in such a way, that it is between zero and 
> less than 7 bytes in length, then the header is incomplete. In 
> BlockSender.java the logic checks if the metafile length is at least the size 
> of the header and if it is not, it does not error, but instead returns a NULL 
> checksum type to the client.
> https://github.com/apache/hadoop/blob/b77761b0e37703beb2c033029e4c0d5ad1dce794/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java#L327-L357
> If the client receives a NULL checksum client, it will not validate checksums 
> at all, and even corrupted data will be returned to the reader. This means 
> this corrupt will go unnoticed and HDFS will never repair it. Even the Volume 
> Scanner will not notice the corruption as the checksums are silently ignored.
> Additionally, if the meta file does have enough bytes so it attempts to load 
> the header, and the header is corrupted such that it is not valid, it can 
> cause the datanode Volume Scanner to exit, which an exception like the 
> following:
> {code}
> 2019-08-06 18:16:39,151 ERROR datanode.VolumeScanner: 
> VolumeScanner(/tmp/hadoop-sodonnell/dfs/data, 
> DS-7f103313-61ba-4d37-b63d-e8cf7d2ed5f7) exiting because of exception 
> java.lang.IllegalArgumentException: id=51 out of range [0, 5)
>       at 
> org.apache.hadoop.util.DataChecksum$Type.valueOf(DataChecksum.java:76)
>       at 
> org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:167)
>       at 
> org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:173)
>       at 
> org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:139)
>       at 
> org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:153)
>       at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.loadLastPartialChunkChecksum(FsVolumeImpl.java:1140)
>       at 
> org.apache.hadoop.hdfs.server.datanode.FinalizedReplica.loadLastPartialChunkChecksum(FinalizedReplica.java:157)
>       at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.getPartialChunkChecksumForFinalized(BlockSender.java:451)
>       at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:266)
>       at 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner.scanBlock(VolumeScanner.java:446)
>       at 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:558)
>       at 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:633)
> 2019-08-06 18:16:39,152 INFO datanode.VolumeScanner: 
> VolumeScanner(/tmp/hadoop-sodonnell/dfs/data, 
> DS-7f103313-61ba-4d37-b63d-e8cf7d2ed5f7) exiting.
> {code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to