Siyao Meng created HDDS-15356:
---------------------------------

             Summary: Make multi-buffer chunk checksum allocation-free
                 Key: HDDS-15356
                 URL: https://issues.apache.org/jira/browse/HDDS-15356
             Project: Apache Ozone
          Issue Type: Improvement
          Components: common
            Reporter: Siyao Meng
            Assignee: Siyao Meng


  A **multi-buffer** `ChunkBuffer` is one whose data is stored as more than one
  underlying `ByteBuffer` instead of a single contiguous block. In production
  this shape occurs when the data was assembled from a list of Netty `ByteBuf`s
  (Ratis state-machine-data read via `ChunkedNioFile`), when the verifier
  received a `List<ByteString>` spanning more than one checksum window (client
  read-verify), or when an operator opted into `IncrementalChunkBuffer` via
  `ozone.client.bytebuffer.increment > 0`.

  `Checksum.computeChecksum(ChunkBuffer)` previously delegated to
  `ChunkBuffer.iterate(bytesPerChecksum)`, which allocates a fresh
  `byte[bytesPerChecksum]` (1 MB by default) and copies data into it whenever a
  checksum window straddles two of those underlying `ByteBuffer`s.
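
  To get a feel for how often the copying path could be hit, the following is a
  minimal sketch that counts the checksum windows straddling at least one
  buffer boundary. The class and method names are hypothetical and not part of
  Ozone; the sketch assumes every underlying buffer is non-empty and only
  models the offset arithmetic, not `ChunkBuffer.iterate` itself.

```java
class StraddleCountSketch {
  /**
   * Counts checksum windows that span more than one underlying buffer.
   * Hypothetical helper for illustration; assumes all sizes are positive.
   */
  static int straddlingWindows(int[] bufferSizes, int bytesPerChecksum) {
    int count = 0;
    long boundary = 0;      // absolute offset where the next buffer starts
    long lastCounted = -1;  // index of the last window already counted
    for (int i = 0; i < bufferSizes.length - 1; i++) {
      boundary += bufferSizes[i];
      // A boundary aligned to the window size splits no window.
      if (boundary % bytesPerChecksum != 0) {
        long window = boundary / bytesPerChecksum;
        // Several small buffers may fall inside one window; count it once.
        if (window != lastCounted) {
          count++;
          lastCounted = window;
        }
      }
    }
    return count;
  }

  public static void main(String[] args) {
    // Buffers of 3, 4 and 3 bytes with 4-byte windows.
    System.out.println(straddlingWindows(new int[] {3, 4, 3}, 4));
    // Buffers aligned to the window size.
    System.out.println(straddlingWindows(new int[] {4, 4, 2}, 4));
  }
}
```

  With aligned buffers no window straddles a boundary, which is why the extra
  allocation only shows up for the multi-buffer shapes described above.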

  This change rewrites both the no-cache and cache compute paths to walk
  `data.asByteBufferList()` directly, slicing each window with a non-copying
  helper based on `ByteBuffer.duplicate()` (`BufferUtils.slice`) and feeding
  the slices to a new `Checksum.StreamingChecksum` strategy that wraps
  `ChecksumByteBuffer` (CRC32, CRC32C) or `MessageDigest` (SHA-256, MD5). Both
  `update(ByteBuffer)` contracts define a sequence of incremental updates as
  equivalent to a single update over the concatenated bytes, so the resulting
  checksums are bit-identical to those of the previous implementation.
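
  The idea can be sketched with JDK classes only. The example below is an
  illustration under stated assumptions, not the Ozone implementation: it uses
  `java.util.zip.CRC32` in place of `ChecksumByteBuffer`, the class and method
  names are hypothetical, and the slicing is inlined rather than calling
  `BufferUtils.slice`. It feeds non-copying views of each buffer to the
  checksum and emits one value per `bytesPerChecksum` window.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.CRC32;

class WindowedChecksumSketch {
  /** One CRC32 per window, computed over buffer views without copying bytes. */
  static List<Long> crcPerWindow(List<ByteBuffer> buffers, int bytesPerChecksum) {
    List<Long> out = new ArrayList<>();
    CRC32 crc = new CRC32();
    int inWindow = 0; // bytes already fed into the current window
    for (ByteBuffer b : buffers) {
      ByteBuffer view = b.duplicate(); // shares content, own position/limit
      while (view.hasRemaining()) {
        int n = Math.min(view.remaining(), bytesPerChecksum - inWindow);
        ByteBuffer slice = view.duplicate();
        slice.limit(slice.position() + n);
        crc.update(slice); // incremental update, no intermediate byte[]
        view.position(view.position() + n);
        inWindow += n;
        if (inWindow == bytesPerChecksum) { // window complete
          out.add(crc.getValue());
          crc.reset();
          inWindow = 0;
        }
      }
    }
    if (inWindow > 0) {
      out.add(crc.getValue()); // trailing partial window
    }
    return out;
  }

  /** Reference: one CRC32 per window over a single contiguous array. */
  static List<Long> crcPerWindowContiguous(byte[] all, int bytesPerChecksum) {
    List<Long> out = new ArrayList<>();
    for (int off = 0; off < all.length; off += bytesPerChecksum) {
      CRC32 crc = new CRC32();
      crc.update(all, off, Math.min(bytesPerChecksum, all.length - off));
      out.add(crc.getValue());
    }
    return out;
  }

  /** SHA-256 over all buffers, fed one buffer at a time. */
  static byte[] sha256Streaming(List<ByteBuffer> buffers) {
    MessageDigest md = newSha256();
    for (ByteBuffer b : buffers) {
      md.update(b.duplicate());
    }
    return md.digest();
  }

  /** Reference: SHA-256 over the contiguous array in one call. */
  static byte[] sha256Contiguous(byte[] all) {
    return newSha256().digest(all);
  }

  private static MessageDigest newSha256() {
    try {
      return MessageDigest.getInstance("SHA-256");
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e); // SHA-256 is required on every JDK
    }
  }

  public static void main(String[] args) {
    byte[] all = "123456789".getBytes(StandardCharsets.US_ASCII);
    // The same nine bytes exposed as three buffers of 3, 4 and 2 bytes.
    List<ByteBuffer> parts = Arrays.asList(
        ByteBuffer.wrap(all, 0, 3), ByteBuffer.wrap(all, 3, 4), ByteBuffer.wrap(all, 7, 2));
    System.out.println(crcPerWindow(parts, 4).equals(crcPerWindowContiguous(all, 4)));
    System.out.println(Arrays.equals(sha256Streaming(parts), sha256Contiguous(all)));
  }
}
```

  Working on `duplicate()` views leaves the position and limit of the caller's
  buffers untouched, so the same buffers can still be read or written after
  the checksum is computed.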



