satishkotha commented on a change in pull request #1355: [HUDI-633] limit archive file block size by number of bytes
URL: https://github.com/apache/incubator-hudi/pull/1355#discussion_r384242297
 
 

 ##########
 File path: hudi-client/src/main/java/org/apache/hudi/config/HoodieCompactionConfig.java
 ##########
 @@ -103,6 +104,8 @@
   private static final String DEFAULT_MAX_COMMITS_TO_KEEP = "30";
   private static final String DEFAULT_MIN_COMMITS_TO_KEEP = "20";
   private static final String DEFAULT_COMMITS_ARCHIVAL_BATCH_SIZE = String.valueOf(10);
+  // Do not read more than 6MB at a time (6MB observed p95 in prod from 2 datasets)
+  private static final String DEFAULT_COMMITS_ARCHIVAL_MEM_SIZE = String.valueOf(6 * 1024 * 1024);
 
 Review comment:
   Any suggestions on how to pick this number? I just looked at 2 production datasets and measured the size of 10 records in the archived files. The max I noticed was 10MB; p95 was 6MB. Let me know if you have suggestions.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services