tanmay created KAFKA-20528:
------------------------------
Summary: Implement missing LZ4 Content Checksum in
Lz4BlockOutputStream and Lz4BlockInputStream
Key: KAFKA-20528
URL: https://issues.apache.org/jira/browse/KAFKA-20528
Project: Kafka
Issue Type: Improvement
Components: clients, compression
Reporter: tanmay
While reviewing the LZ4 compression implementation in the Kafka clients module,
I noticed that we are not implementing the Content Checksum feature defined in
the LZ4 Frame Format specification (v1.5.1).
Currently, Lz4BlockOutputStream correctly handles block-level checksums (if
configured), but the two stream classes leave three explicit TODO comments
regarding the content size and the final content checksum. Specifically:
1. Lz4BlockOutputStream.java does not write the contentSize in the header or
validate the flg for it.
2. Lz4BlockOutputStream.java writes the EndMark but does not compute and
append the final 4-byte XXHash32 content checksum over the entire
uncompressed payload.
3. Lz4BlockInputStream.java reads the 4-byte checksum if the
ContentChecksumSet flag is true, but merely discards it without actually
verifying it against the decompressed stream.
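
A minimal sketch of what items 2 and 3 would involve, not a proposed patch.
It assumes a streaming hash exposed through java.util.zip.Checksum; the JDK
does not ship XXHash32, so CRC32 is used below purely as a stand-in, and a
real implementation would plug in XXHash32 with seed 0 as the frame format
requires. The class and method names (ContentChecksumSketch, update, trailer,
matches) are hypothetical and do not exist in Kafka.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

// Hypothetical helper for illustration only; not part of the Kafka code base.
final class ContentChecksumSketch {
    // Stand-in for a streaming XXHash32 (seed 0); CRC32 is NOT what LZ4 uses.
    private final Checksum hash;

    ContentChecksumSketch(Checksum hash) {
        this.hash = hash;
    }

    // Feed every chunk of UNCOMPRESSED data: before compression on the write
    // path, after decompression on the read path.
    void update(byte[] buf, int off, int len) {
        hash.update(buf, off, len);
    }

    // Writer side: the EndMark (four zero bytes) followed by the 4-byte
    // little-endian content checksum.
    byte[] trailer() {
        return ByteBuffer.allocate(8)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putInt(0)                       // EndMark
                .putInt((int) hash.getValue())   // low 32 bits of the hash
                .array();
    }

    // Reader side: compare the checksum read from the frame with the one
    // accumulated over the decompressed data.
    boolean matches(int storedChecksum) {
        return (int) hash.getValue() == storedChecksum;
    }
}
```

On the read path, the input stream would presumably throw an IOException
when matches() returns false instead of silently discarding the stored value.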
--
This message was sent by Atlassian Jira
(v8.20.10#820010)