Copilot commented on code in PR #10350:
URL: https://github.com/apache/ozone/pull/10350#discussion_r3401548152
##########
hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/common/ChecksumCache.java:
##########
@@ -65,58 +66,102 @@ public List<ByteString> getChecksums() {
return checksums;
}
- public List<ByteString> computeChecksum(ChunkBuffer data,
Function<ByteBuffer, ByteString> function) {
- // Indicates how much data the current chunk buffer holds
+ /**
+ * Recompute checksums for the windows that have changed since the last
+ * call: bytes {@code [ciStart * bytesPerChecksum, currChunkLength)} where
+ * {@code ciStart = prevChunkLength / bytesPerChecksum} (the index of the
+ * first window whose result may have changed - either the previously-
+ * partial last window now has more bytes, or new full windows have been
+ * appended).
+ *
+ * <p>Walks {@code data}'s underlying buffer list directly (no
+ * {@code iterate()} byte[] linearization) and feeds slices to {@code algo}
+ * incrementally; the cached prefix is skipped via index arithmetic so
+ * those bytes are never re-fed.
+ */
+ public List<ByteString> computeChecksum(ChunkBuffer data,
+ StreamingChecksum algo, int chksumSize) {
Review Comment:
`ChecksumCache.computeChecksum` is `public` but its signature now depends on
`Checksum.StreamingChecksum`, which is package-private. That effectively makes
this public method unusable to callers outside `org.apache.hadoop.ozone.common`
and is a breaking/awkward API surface. Either make the method package-private
(if it’s internal) or make `StreamingChecksum` a public type.
##########
hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/common/Checksum.java:
##########
@@ -218,65 +265,71 @@ public ChecksumData computeChecksum(ChunkBuffer data)
return computeChecksum(data, false);
}
+ /**
+ * @implNote The position of {@code data}'s underlying buffers is not
+ * advanced by this method - both the no-cache and cache paths slice via
+ * {@link ByteBuffer#duplicate()}.
+ */
public ChecksumData computeChecksum(ChunkBuffer data, boolean useCache)
throws OzoneChecksumException {
if (checksumType == ChecksumType.NONE) {
- // Since type is set to NONE, we do not need to compute the checksums
return new ChecksumData(checksumType, bytesPerChecksum);
}
- final Function<ByteBuffer, ByteString> function;
+ final StreamingChecksum algo;
try {
- function = Algorithm.valueOf(checksumType).newChecksumFunction();
+ algo = Algorithm.valueOf(checksumType).newStreamingChecksum();
} catch (Exception e) {
- throw new OzoneChecksumException("Failed to get the checksum function
for " + checksumType, e);
+ throw new OzoneChecksumException(
+ "Failed to get the checksum function for " + checksumType, e);
Review Comment:
The error message still refers to a "checksum function", but this code now
constructs a `StreamingChecksum`. Updating the message will make failures
easier to interpret during debugging (eg, missing algorithm/provider vs.
function creation).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]