itschrispeck commented on code in PR #12945:
URL: https://github.com/apache/pinot/pull/12945#discussion_r1591797449


##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/fwd/SingleValueVarByteRawIndexCreator.java:
##########
@@ -70,23 +68,32 @@ public SingleValueVarByteRawIndexCreator(File baseIndexDir, ChunkCompressionType
    * @param maxLength length of longest entry (in bytes)
    * @param deriveNumDocsPerChunk true if writer should auto-derive the number of rows per chunk
    * @param writerVersion writer format version
+   * @param targetMaxChunkSizeBytes target max chunk size in bytes, applicable only for V4 or when
+   *                                deriveNumDocsPerChunk is true
+   * @param targetDocsPerChunk target number of docs per chunk
    * @throws IOException
    */
   public SingleValueVarByteRawIndexCreator(File baseIndexDir, ChunkCompressionType compressionType, String column,
-      int totalDocs, DataType valueType, int maxLength, boolean deriveNumDocsPerChunk, int writerVersion)
+      int totalDocs, DataType valueType, int maxLength, boolean deriveNumDocsPerChunk, int writerVersion,
+      int targetMaxChunkSizeBytes, int targetDocsPerChunk)
       throws IOException {
     File file = new File(baseIndexDir, column + V1Constants.Indexes.RAW_SV_FORWARD_INDEX_FILE_EXTENSION);
-    int numDocsPerChunk = deriveNumDocsPerChunk ? getNumDocsPerChunk(maxLength) : DEFAULT_NUM_DOCS_PER_CHUNK;
+    int numDocsPerChunk =
+        deriveNumDocsPerChunk ? getNumDocsPerChunk(maxLength, targetMaxChunkSizeBytes) : targetDocsPerChunk;
+
+    // For columns with very small max value, target chunk size should also be capped to reduce memory during read
+    int dynamicTargetChunkSize =
+        ForwardIndexUtils.getDynamicTargetChunkSize(maxLength, targetDocsPerChunk, targetMaxChunkSizeBytes);

Review Comment:
   That is clearer 🙂 
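
For readers following the hunk, here is a minimal sketch of the capping idea described in the code comment above (short values should not force a full-size chunk buffer at read time). It is an illustration only, not the body of `ForwardIndexUtils.getDynamicTargetChunkSize`, which is not shown in this diff. The class name `ChunkSizeSketch`, the method `dynamicTargetChunkSize`, and the floor `MIN_CHUNK_SIZE_BYTES` are all assumptions made for the example.

```java
public final class ChunkSizeSketch {
  // Hypothetical lower bound, chosen for illustration only.
  static final int MIN_CHUNK_SIZE_BYTES = 4 * 1024;

  /**
   * Caps the chunk size at the worst-case bytes needed for targetDocsPerChunk
   * entries of maxLength bytes, never exceeding targetMaxChunkSizeBytes and
   * never dropping below the assumed floor.
   */
  static int dynamicTargetChunkSize(int maxLength, int targetDocsPerChunk, int targetMaxChunkSizeBytes) {
    // No usable docs-per-chunk target: fall back to the configured byte cap.
    if (targetDocsPerChunk <= 0) {
      return Math.max(targetMaxChunkSizeBytes, MIN_CHUNK_SIZE_BYTES);
    }
    // Use long arithmetic so maxLength * targetDocsPerChunk cannot overflow int.
    long worstCaseBytes = (long) maxLength * targetDocsPerChunk;
    long capped = Math.min(worstCaseBytes, (long) targetMaxChunkSizeBytes);
    return (int) Math.max(capped, MIN_CHUNK_SIZE_BYTES);
  }

  public static void main(String[] args) {
    // Short values: the chunk shrinks to what 1000 docs can actually need.
    System.out.println(dynamicTargetChunkSize(10, 1000, 1024 * 1024));
    // Long values: the configured byte cap wins.
    System.out.println(dynamicTargetChunkSize(100_000, 1000, 1024 * 1024));
  }
}
```

Under these assumptions, a column whose longest entry is 10 bytes gets a chunk far smaller than the 1 MiB cap, while a column with 100,000-byte entries stays at the cap.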



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

