xiangfu0 commented on code in PR #18772:
URL: https://github.com/apache/pinot/pull/18772#discussion_r3439628194
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/io/writer/impl/VarByteChunkForwardIndexWriterV6.java:
##########
@@ -24,41 +24,69 @@
import org.apache.pinot.segment.spi.compression.ChunkCompressionType;
-/**
- * Forward index writer that extends {@link VarByteChunkForwardIndexWriterV5}
and delta-encodes the chunk header
- * when compression is enabled, writing individual entry sizes instead of
cumulative byte offsets.
- *
- * - **V4 chunk header**: `[numDocs][offset0][offset1]...[offsetN-1]` —
cumulative byte offsets.
- * - **V6 chunk header** (compressed): `[numDocs][size0][size1]...[sizeN-1]` —
individual entry sizes.
- * Sizes compress dramatically better (e.g. 11x with ZSTD) because they are
small, repetitive values.
- * The reader converts sizes back to offsets at read time with a single
forward pass.
- * - **PASS_THROUGH**: delta encoding provides no benefit, so V6 falls back to
V4's offset-based header.
- *
- * @see VarByteChunkForwardIndexWriterV4
- * @see VarByteChunkForwardIndexWriterV5
- */
+/// Forward index writer that extends [VarByteChunkForwardIndexWriterV5] and
delta-encodes the chunk header
+/// when compression is enabled, writing individual entry sizes instead of
cumulative byte offsets.
+///
+/// - **V4 chunk header**: `[numDocs][offset0][offset1]...[offsetN-1]` —
cumulative byte offsets.
+/// - **V6 chunk header** (compressed): `[numDocs][size0][size1]...[sizeN-1]`
— individual entry sizes.
+/// Sizes compress dramatically better (e.g. 11x with ZSTD) because they are
small, repetitive values.
+/// The reader converts sizes back to offsets at read time with a single
forward pass.
+/// - **PASS_THROUGH**: delta encoding provides no benefit, so V6 falls back
to V4's offset-based header.
+///
+/// ## Chunk boundaries
+/// A chunk is flushed whenever the next entry would overflow the
`chunkSize`-byte chunk buffer
+/// (inherited V4 behavior). V6 additionally accepts a `targetDocsPerChunk`
cap: when positive, a
+/// chunk is also flushed once it reaches that many documents, even if the
byte buffer is not yet full.
+/// This lets callers bound a chunk by both byte size and document count. The
cap defaults to `-1`
+/// (disabled), which preserves the original purely byte-driven behavior.
+///
+/// @see VarByteChunkForwardIndexWriterV4
+/// @see VarByteChunkForwardIndexWriterV5
@NotThreadSafe
public class VarByteChunkForwardIndexWriterV6 extends
VarByteChunkForwardIndexWriterV5 {
public static final int VERSION = 6;
+ /// Sentinel meaning "no docs-per-chunk cap": chunks are bounded only by
`chunkSize` bytes.
+ public static final int DISABLE_DOCS_PER_CHUNK = -1;
+
private final boolean _deltaEncoding;
+ private final int _targetDocsPerChunk;
public VarByteChunkForwardIndexWriterV6(File file, ChunkCompressionType
compressionType, int chunkSize)
throws IOException {
+ this(file, compressionType, chunkSize, DISABLE_DOCS_PER_CHUNK);
+ }
+
+ /// @param file output index file
+ /// @param compressionType chunk compression codec
+ /// @param chunkSize target uncompressed chunk size in bytes (the chunk
buffer capacity)
+ /// @param targetDocsPerChunk flush a chunk once it holds this many
documents, in addition to the
+ /// byte-size limit; `-1` ([#DISABLE_DOCS_PER_CHUNK]) disables the cap
and keeps the purely
+ /// byte-driven behavior
+ public VarByteChunkForwardIndexWriterV6(File file, ChunkCompressionType
compressionType, int chunkSize,
+ int targetDocsPerChunk)
+ throws IOException {
super(file, compressionType, chunkSize);
_deltaEncoding = compressionType != ChunkCompressionType.PASS_THROUGH;
+ _targetDocsPerChunk = targetDocsPerChunk;
Review Comment:
Added a `checkArgument` in the 4-arg constructor: `targetDocsPerChunk` must
be `-1` (disabled) or positive, otherwise it fails fast instead of silently
treating `0`/other negatives as disabled. Fixed in 99a96e1.
##########
pinot-segment-local/src/test/java/org/apache/pinot/segment/local/segment/index/creator/VarByteChunkV6Test.java:
##########
@@ -101,4 +101,46 @@ public void validateCompressionRatioIncrease()
FileUtils.deleteQuietly(v4FwdIndexFile);
FileUtils.deleteQuietly(v6FwdIndexFile);
}
+
+ /// A positive `targetDocsPerChunk` must cap each chunk at that many
documents even when the byte
+ /// buffer is far from full, and values must still round-trip. The default
(`-1`) keeps the original
+ /// byte-driven behavior, which is exercised by all the inherited read/write
tests.
+ @Test
+ public void testTargetDocsPerChunkCapsChunk()
+ throws IOException {
+ int numDocs = 1000;
+ int docsPerChunk = 50;
+ // Large byte budget so flushing is driven purely by the docs-per-chunk
cap, not the buffer size.
+ int chunkSize = 1 << 20;
+ File file = new File(FileUtils.getTempDirectory(), "v6test_docs_cap");
+ FileUtils.deleteQuietly(file);
+
+ String[] values = new String[numDocs];
+ try (VarByteChunkForwardIndexWriterV6 writer =
+ new VarByteChunkForwardIndexWriterV6(file,
ChunkCompressionType.ZSTANDARD, chunkSize, docsPerChunk)) {
+ for (int i = 0; i < numDocs; i++) {
+ values[i] = "value_" + i;
+ writer.putString(values[i]);
+ }
+ }
+
+ try (PinotDataBuffer buffer =
PinotDataBuffer.mapReadOnlyBigEndianFile(file)) {
+ // Chunk metadata starts at byte 16; each entry is 8 bytes. The metadata
region ends at the offset
+ // stored at byte 12 (start of chunk data).
+ int numChunks = (buffer.getInt(12) - 16) / 8;
+ Assert.assertEquals(numChunks, (numDocs + docsPerChunk - 1) /
docsPerChunk,
+ "Each chunk should hold exactly targetDocsPerChunk documents");
+ }
Review Comment:
Strengthened the test to verify per-chunk doc counts (not just the chunk
count): it now reads each chunk's first docId from the 8-byte metadata entries
and asserts `firstDocId == chunk * docsPerChunk`. Also switched to 1000 docs
with a cap of 30 so the last chunk holds the remainder (10). Fixed in 99a96e1.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]