Hairong Kuang wrote:
If end-to-end is a concern, we could let the client generate the checksums
and send them to the datanode following the block data.
I created a Jira issue related to this:
https://issues.apache.org/jira/browse/HADOOP-928
The idea there is to first make it possible for FileSystems to opt in or
out of the current checksum mechanism. Then, if a filesystem wishes to
implement end-to-end checksums more efficiently than the provided
generic implementation, it can do so. In HDFS it would be best to keep
checksums aligned and stored with blocks, so that they can be validated
on datanodes, and it would also be better if they didn't consume names
and blockids. These goals would be difficult to meet with a generic
implementation.
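
For illustration, here is a rough sketch of what client-generated, chunk-aligned
checksums might look like. The 512-byte chunk size and the stream layout are just
assumptions for the example, not the actual HDFS wire format:

  import java.io.DataOutputStream;
  import java.io.IOException;
  import java.util.zip.CRC32;

  public class ChunkedChecksumWriter {
      private static final int CHUNK_SIZE = 512;   // assumed chunk size

      /** Writes the block data followed by one CRC32 value per chunk,
       *  so a datanode could verify and store the checksums next to the block. */
      public static void writeBlockWithChecksums(byte[] block, DataOutputStream out)
              throws IOException {
          int chunks = (block.length + CHUNK_SIZE - 1) / CHUNK_SIZE;
          out.writeInt(block.length);
          out.write(block);
          CRC32 crc = new CRC32();
          for (int i = 0; i < chunks; i++) {
              int off = i * CHUNK_SIZE;
              int len = Math.min(CHUNK_SIZE, block.length - off);
              crc.reset();
              crc.update(block, off, len);
              out.writeInt((int) crc.getValue());  // checksum aligned to its chunk
          }
          out.flush();
      }
  }

Because each checksum covers a fixed chunk of the block, the datanode can verify
and store them alongside the block file without creating any extra names or blockids.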
Doug