Re: [PR] Add optional targetDocsPerChunk cap to VarByteChunkForwardIndexWriterV6 [pinot]

via GitHub Thu, 18 Jun 2026 17:27:59 -0700


xiangfu0 commented on code in PR #18772:
URL: https://github.com/apache/pinot/pull/18772#discussion_r3439628194



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/io/writer/impl/VarByteChunkForwardIndexWriterV6.java:
##########
@@ -24,41 +24,69 @@
 import org.apache.pinot.segment.spi.compression.ChunkCompressionType;
 
 
-/**
- * Forward index writer that extends {@link VarByteChunkForwardIndexWriterV5} 
and delta-encodes the chunk header
- * when compression is enabled, writing individual entry sizes instead of 
cumulative byte offsets.
- *
- * - **V4 chunk header**: `[numDocs][offset0][offset1]...[offsetN-1]` — 
cumulative byte offsets.
- * - **V6 chunk header** (compressed): `[numDocs][size0][size1]...[sizeN-1]` — 
individual entry sizes.
- *   Sizes compress dramatically better (e.g. 11x with ZSTD) because they are 
small, repetitive values.
- *   The reader converts sizes back to offsets at read time with a single 
forward pass.
- * - **PASS_THROUGH**: delta encoding provides no benefit, so V6 falls back to 
V4's offset-based header.
- *
- * @see VarByteChunkForwardIndexWriterV4
- * @see VarByteChunkForwardIndexWriterV5
- */
+/// Forward index writer that extends [VarByteChunkForwardIndexWriterV5] and 
delta-encodes the chunk header
+/// when compression is enabled, writing individual entry sizes instead of 
cumulative byte offsets.
+///
+/// - **V4 chunk header**: `[numDocs][offset0][offset1]...[offsetN-1]` — 
cumulative byte offsets.
+/// - **V6 chunk header** (compressed): `[numDocs][size0][size1]...[sizeN-1]` 
— individual entry sizes.
+///   Sizes compress dramatically better (e.g. 11x with ZSTD) because they are 
small, repetitive values.
+///   The reader converts sizes back to offsets at read time with a single 
forward pass.
+/// - **PASS_THROUGH**: delta encoding provides no benefit, so V6 falls back 
to V4's offset-based header.
+///
+/// ## Chunk boundaries
+/// A chunk is flushed whenever the next entry would overflow the 
`chunkSize`-byte chunk buffer
+/// (inherited V4 behavior). V6 additionally accepts a `targetDocsPerChunk` 
cap: when positive, a
+/// chunk is also flushed once it reaches that many documents, even if the 
byte buffer is not yet full.
+/// This lets callers bound a chunk by both byte size and document count. The 
cap defaults to `-1`
+/// (disabled), which preserves the original purely byte-driven behavior.
+///
+/// @see VarByteChunkForwardIndexWriterV4
+/// @see VarByteChunkForwardIndexWriterV5
 @NotThreadSafe
 public class VarByteChunkForwardIndexWriterV6 extends 
VarByteChunkForwardIndexWriterV5 {
   public static final int VERSION = 6;
 
+  /// Sentinel meaning "no docs-per-chunk cap": chunks are bounded only by 
`chunkSize` bytes.
+  public static final int DISABLE_DOCS_PER_CHUNK = -1;
+
   private final boolean _deltaEncoding;
+  private final int _targetDocsPerChunk;
 
   public VarByteChunkForwardIndexWriterV6(File file, ChunkCompressionType 
compressionType, int chunkSize)
       throws IOException {
+    this(file, compressionType, chunkSize, DISABLE_DOCS_PER_CHUNK);
+  }
+
+  /// @param file output index file
+  /// @param compressionType chunk compression codec
+  /// @param chunkSize target uncompressed chunk size in bytes (the chunk 
buffer capacity)
+  /// @param targetDocsPerChunk flush a chunk once it holds this many 
documents, in addition to the
+  ///     byte-size limit; `-1` ([#DISABLE_DOCS_PER_CHUNK]) disables the cap 
and keeps the purely
+  ///     byte-driven behavior
+  public VarByteChunkForwardIndexWriterV6(File file, ChunkCompressionType 
compressionType, int chunkSize,
+      int targetDocsPerChunk)
+      throws IOException {
     super(file, compressionType, chunkSize);
     _deltaEncoding = compressionType != ChunkCompressionType.PASS_THROUGH;
+    _targetDocsPerChunk = targetDocsPerChunk;

Review Comment:
   Added a `checkArgument` in the 4-arg constructor: `targetDocsPerChunk` must 
be `-1` (disabled) or positive, otherwise it fails fast instead of silently 
treating `0`/other negatives as disabled. Fixed in 99a96e1.



##########
pinot-segment-local/src/test/java/org/apache/pinot/segment/local/segment/index/creator/VarByteChunkV6Test.java:
##########
@@ -101,4 +101,46 @@ public void validateCompressionRatioIncrease()
     FileUtils.deleteQuietly(v4FwdIndexFile);
     FileUtils.deleteQuietly(v6FwdIndexFile);
   }
+
+  /// A positive `targetDocsPerChunk` must cap each chunk at that many 
documents even when the byte
+  /// buffer is far from full, and values must still round-trip. The default 
(`-1`) keeps the original
+  /// byte-driven behavior, which is exercised by all the inherited read/write 
tests.
+  @Test
+  public void testTargetDocsPerChunkCapsChunk()
+      throws IOException {
+    int numDocs = 1000;
+    int docsPerChunk = 50;
+    // Large byte budget so flushing is driven purely by the docs-per-chunk 
cap, not the buffer size.
+    int chunkSize = 1 << 20;
+    File file = new File(FileUtils.getTempDirectory(), "v6test_docs_cap");
+    FileUtils.deleteQuietly(file);
+
+    String[] values = new String[numDocs];
+    try (VarByteChunkForwardIndexWriterV6 writer =
+        new VarByteChunkForwardIndexWriterV6(file, 
ChunkCompressionType.ZSTANDARD, chunkSize, docsPerChunk)) {
+      for (int i = 0; i < numDocs; i++) {
+        values[i] = "value_" + i;
+        writer.putString(values[i]);
+      }
+    }
+
+    try (PinotDataBuffer buffer = 
PinotDataBuffer.mapReadOnlyBigEndianFile(file)) {
+      // Chunk metadata starts at byte 16; each entry is 8 bytes. The metadata 
region ends at the offset
+      // stored at byte 12 (start of chunk data).
+      int numChunks = (buffer.getInt(12) - 16) / 8;
+      Assert.assertEquals(numChunks, (numDocs + docsPerChunk - 1) / 
docsPerChunk,
+          "Each chunk should hold exactly targetDocsPerChunk documents");
+    }

Review Comment:
   Strengthened the test to verify per-chunk doc counts (not just the chunk 
count): it now reads each chunk's first docId from the 8-byte metadata entries 
and asserts `firstDocId == chunk * docsPerChunk`. Also switched to 1000 docs 
with a cap of 30 so the last chunk holds the remainder (10). Fixed in 99a96e1.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] Add optional targetDocsPerChunk cap to VarByteChunkForwardIndexWriterV6 [pinot]

Reply via email to