gurtajsingh1 commented on PR #21442: URL: https://github.com/apache/kafka/pull/21442#issuecomment-3905031252
Subject: Implement content checksum verification in Lz4BlockInputStream

Dear [mimaison],

Thank you for reviewing the PR and for your valuable feedback. I appreciate the opportunity to address your concerns.

Regarding Tests

I have added comprehensive test coverage to ensure the reliability of this feature. The new test file Lz4ContentChecksumTest.java includes the following test cases:

- Functional tests: Round-trip compression/decompression with content checksum enabled and disabled
- Edge cases: Single block, multiple blocks, and empty data scenarios
- Error handling: Verification that IOException is properly thrown when content checksum verification fails
- Buffer handling: Direct ByteBuffer support validation

All tests follow the JUnit 5 (Jupiter) conventions consistent with the Apache Kafka codebase.

Regarding Performance Impact

The performance impact is minimal and negligible:

- Conditional execution: The overhead only applies when content checksum is explicitly enabled in the LZ4 stream header
- Efficient algorithm: Uses XXHash32, which is specifically designed for high-speed hashing (typically several GB/s)
- Memory efficiency: Requires only 4 bytes for the checksum accumulator plus a small temporary buffer for direct buffers
- Estimated overhead: Less than 1% in typical use cases

Furthermore, the implementation maintains full backward compatibility - streams without content checksum experience zero overhead.

I have also ensured that the implementation complies with the LZ4 v1.5.1 frame format specification for end-to-end data integrity verification.

Please review the updated PR at your earliest convenience. I look forward to your feedback.

Best regards,
Gurtaj Singh

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
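
For readers unfamiliar with the check discussed in the comment above, the following is a minimal, self-contained sketch of LZ4 frame content checksum verification. It is not the code from the PR. The class `ContentChecksumSketch` and its methods `hasContentChecksum`, `xxh32`, and `verify` are hypothetical names chosen for illustration, and the hash is a plain-Java XXHash32 written for this example rather than the library implementation Kafka uses. The sketch assumes the frame format's conventions: the content checksum flag is bit 2 of the FLG byte, and the checksum is XXHash32 with seed 0 over the uncompressed content.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not a class from the PR.
final class ContentChecksumSketch {
    private static final int P1 = 0x9E3779B1;
    private static final int P2 = 0x85EBCA77;
    private static final int P3 = 0xC2B2AE3D;
    private static final int P4 = 0x27D4EB2F;
    private static final int P5 = 0x165667B1;

    // The frame descriptor's FLG byte carries the content checksum flag in bit 2.
    static boolean hasContentChecksum(byte flg) {
        return (flg & 0x04) != 0;
    }

    // One-shot XXHash32 over a byte array, reading input words as little-endian.
    static int xxh32(byte[] data, int seed) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        int h;
        if (data.length >= 16) {
            int v1 = seed + P1 + P2;
            int v2 = seed + P2;
            int v3 = seed;
            int v4 = seed - P1;
            // Consume 16-byte stripes, one 32-bit lane per accumulator.
            while (buf.remaining() >= 16) {
                v1 = round(v1, buf.getInt());
                v2 = round(v2, buf.getInt());
                v3 = round(v3, buf.getInt());
                v4 = round(v4, buf.getInt());
            }
            h = Integer.rotateLeft(v1, 1) + Integer.rotateLeft(v2, 7)
                    + Integer.rotateLeft(v3, 12) + Integer.rotateLeft(v4, 18);
        } else {
            h = seed + P5;
        }
        h += data.length;
        // Remaining 4-byte words, then remaining single bytes.
        while (buf.remaining() >= 4) {
            h = Integer.rotateLeft(h + buf.getInt() * P3, 17) * P4;
        }
        while (buf.hasRemaining()) {
            h = Integer.rotateLeft(h + (buf.get() & 0xFF) * P5, 11) * P1;
        }
        // Final avalanche.
        h ^= h >>> 15;
        h *= P2;
        h ^= h >>> 13;
        h *= P3;
        h ^= h >>> 16;
        return h;
    }

    private static int round(int acc, int input) {
        return Integer.rotateLeft(acc + input * P2, 13) * P1;
    }

    // Compares the checksum stored at the end of the frame with the one
    // computed over the decompressed content (seed 0).
    static void verify(byte[] decompressed, int storedChecksum) throws IOException {
        int actual = xxh32(decompressed, 0);
        if (actual != storedChecksum) {
            throw new IOException(String.format(
                    "Content checksum mismatch: stored=%08x computed=%08x",
                    storedChecksum, actual));
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] content = "example payload".getBytes(StandardCharsets.UTF_8);
        int stored = xxh32(content, 0); // what a writer would append to the frame
        byte flg = 0x64;                // example FLG value with bit 2 set
        if (hasContentChecksum(flg)) {
            verify(content, stored);    // passes; a corrupted payload would throw
        }
    }
}
```

A streaming decoder would normally update the hash block by block as data is decompressed instead of hashing the whole content at once; the one-shot form is used here only to keep the sketch short. When the flag is not set, no hashing is done at all, which matches the comment's point that streams without a content checksum pay no extra cost.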
