[ https://issues.apache.org/jira/browse/HDFS-14706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917662#comment-16917662 ]
Stephen O'Donnell commented on HDFS-14706:
------------------------------------------
{quote}
Additionally, I am thinking we should check the return value of {{fc.read()}},
and throw {{CorruptMetaHeaderException}} if it's less than the size of the array.
{quote}
I think this existing code block will already do that? It keeps reading until
the buffer is full, and if {{fc.read}} returns zero or -1 (EOF) the exception
is thrown. It does seem a little strange that the read is wrapped in a loop
and always starts from position zero, though. I need to think about this a
little more.
{code:java}
while (buf.hasRemaining()) {
  if (fc.read(buf, 0) <= 0) {
    throw new CorruptMetaHeaderException("EOF while reading header from "+
        "the metadata file. The meta file may be truncated or corrupt");
  }
}{code}
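For reference, here is a minimal sketch of how the loop could advance the file
offset instead of always re-reading from zero. This is only an illustration of
the concern above, not the patch code, and the {{pos}}/{{n}} names are mine:
{code:java}
// Sketch only: track the file position so a short read continues where
// the previous one stopped, instead of re-reading from offset 0.
long pos = 0;
while (buf.hasRemaining()) {
  int n = fc.read(buf, pos);
  if (n <= 0) {
    throw new CorruptMetaHeaderException("EOF while reading header from "
        + "the metadata file. The meta file may be truncated or corrupt");
  }
  pos += n;
}{code}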
{quote}
But my IntelliJ complains InvalidChecksumSizeException is never thrown in
{{BlockMetadataHeader#preadHeader()}}.
{quote}
Interestingly, my IntelliJ does not complain about this, but I think yours is
correct. If the bytes passed into {{newDataChecksum()}} are not long enough,
I believe it returns null rather than throwing an exception, so I need to
change this a little. Well spotted.
{code:java}
DataChecksum dataChecksum;
try {
  dataChecksum = DataChecksum.newDataChecksum(arr, 2);
} catch (InvalidChecksumSizeException e) {
  throw new CorruptMetaHeaderException("The block meta file header is "+
      "corrupt", e);
}{code}
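If {{newDataChecksum(byte[], int)}} really does return null for a short or
unrecognised header rather than throwing, then a null check would be the more
direct guard. A rough sketch of what I mean, assuming that null-return
behaviour:
{code:java}
// Sketch only, assuming newDataChecksum(arr, 2) returns null when the
// header bytes are too short or the checksum type id is unknown.
DataChecksum dataChecksum = DataChecksum.newDataChecksum(arr, 2);
if (dataChecksum == null) {
  throw new CorruptMetaHeaderException("The block meta file header is "
      + "corrupt");
}{code}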
> Checksums are not checked if block meta file is less than 7 bytes
> -----------------------------------------------------------------
>
> Key: HDFS-14706
> URL: https://issues.apache.org/jira/browse/HDFS-14706
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 3.3.0
> Reporter: Stephen O'Donnell
> Assignee: Stephen O'Donnell
> Priority: Major
> Attachments: HDFS-14706.001.patch, HDFS-14706.002.patch,
> HDFS-14706.003.patch, HDFS-14706.004.patch, HDFS-14706.005.patch
>
>
> If a block and its meta file are corrupted in a certain way, the corruption
> can go unnoticed by a client, causing it to return invalid data.
> The meta file is expected to always have a header of 7 bytes and then a
> series of checksums depending on the length of the block.
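> As an illustration, the 7-byte header breaks down as follows (a sketch
> based on BlockMetadataHeader and DataChecksum; {{in}} is assumed to be a
> DataInputStream positioned at the start of the meta file):
> {code:java}
> // Sketch of the 7-byte meta file header layout:
> short version = in.readShort();       // bytes 0-1: header version
> byte checksumTypeId = in.readByte();  // byte  2:   checksum type id
> int bytesPerChecksum = in.readInt();  // bytes 3-6: bytes per checksum
> // Total: 2 + 1 + 4 = 7 bytes; a shorter file cannot hold a full header.
> {code}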
> If the meta file gets corrupted in such a way that its length is anywhere
> from zero up to (but not including) 7 bytes, then the header is incomplete.
> In BlockSender.java the logic checks whether the meta file length is at
> least the size of the header; if it is not, it does not raise an error, but
> instead returns a NULL checksum type to the client, as sketched below.
> https://github.com/apache/hadoop/blob/b77761b0e37703beb2c033029e4c0d5ad1dce794/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java#L327-L357
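> A rough paraphrase of that logic (not the actual code at the link above;
> {{metaFileLength}} and {{metaIn}} are simplified stand-ins):
> {code:java}
> // Sketch: when the meta file is shorter than the 7-byte header, the
> // datanode falls back to a NULL checksum instead of reporting corruption.
> DataChecksum csum;
> if (metaFileLength >= BlockMetadataHeader.getHeaderSize()) {
>   csum = BlockMetadataHeader.readHeader(metaIn).getChecksum();
> } else {
>   csum = DataChecksum.newDataChecksum(DataChecksum.Type.NULL, 512);
> }
> {code}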
> If the client receives a NULL checksum type, it will not validate checksums
> at all, and even corrupted data will be returned to the reader. This means
> the corruption will go unnoticed and HDFS will never repair it. Even the
> Volume Scanner will not notice the corruption, as the checksums are
> silently ignored.
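> To illustrate the client-side effect, a NULL checksum performs no
> verification at all; the buffers below are made up, but the no-op
> behaviour matches the description above:
> {code:java}
> // A NULL DataChecksum carries zero bytes per checksum value, so
> // verifyChunkedSums() has nothing to compare and never throws.
> DataChecksum nullSum =
>     DataChecksum.newDataChecksum(DataChecksum.Type.NULL, 512);
> ByteBuffer data = ByteBuffer.wrap(new byte[]{1, 2, 3}); // "corrupt" bytes
> ByteBuffer sums = ByteBuffer.allocate(0);               // no checksums
> nullSum.verifyChunkedSums(data, sums, "blk_example", 0); // passes silently
> {code}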
> Additionally, if the meta file does have enough bytes for the header to be
> loaded, but the header is corrupted such that it is not valid, it can cause
> the datanode Volume Scanner to exit with an exception like the following:
> {code}
> 2019-08-06 18:16:39,151 ERROR datanode.VolumeScanner:
> VolumeScanner(/tmp/hadoop-sodonnell/dfs/data,
> DS-7f103313-61ba-4d37-b63d-e8cf7d2ed5f7) exiting because of exception
> java.lang.IllegalArgumentException: id=51 out of range [0, 5)
> at
> org.apache.hadoop.util.DataChecksum$Type.valueOf(DataChecksum.java:76)
> at
> org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:167)
> at
> org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:173)
> at
> org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:139)
> at
> org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:153)
> at
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.loadLastPartialChunkChecksum(FsVolumeImpl.java:1140)
> at
> org.apache.hadoop.hdfs.server.datanode.FinalizedReplica.loadLastPartialChunkChecksum(FinalizedReplica.java:157)
> at
> org.apache.hadoop.hdfs.server.datanode.BlockSender.getPartialChunkChecksumForFinalized(BlockSender.java:451)
> at
> org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:266)
> at
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner.scanBlock(VolumeScanner.java:446)
> at
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:558)
> at
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:633)
> 2019-08-06 18:16:39,152 INFO datanode.VolumeScanner:
> VolumeScanner(/tmp/hadoop-sodonnell/dfs/data,
> DS-7f103313-61ba-4d37-b63d-e8cf7d2ed5f7) exiting.
> {code}