wombatu-kun commented on code in PR #19052:
URL: https://github.com/apache/hudi/pull/19052#discussion_r3463915239


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieArchivalConfig.java:
##########
@@ -88,6 +88,15 @@ public class HoodieArchivalConfig extends HoodieConfig {
       .withDocumentation("Archiving of instants is batched in best-effort manner, to pack more instants into a single"
           + " archive log. This config controls such archival batch size.");
 
+  public static final ConfigProperty<Integer> MIGRATION_COMMITS_ARCHIVAL_BATCH_SIZE = ConfigProperty
+      .key("hoodie.migration.commits.archival.batch")
+      .defaultValue(500)

Review Comment:
   ActiveAction holds only HoodieInstant references (requested, inflight, 
completed), not commit metadata. The metadata bytes are read lazily per instant 
inside LSMTimelineWriter.write (via ActiveAction.getCommitMetadata -> 
getInstantDetails) and converted one at a time, never retained in 
activeActionsBatch. A 500-action batch therefore buffers a few thousand small 
instant descriptors, not 500 commits' worth of metadata. The steady-state 
archival path in TimelineArchiverV2 already passes the entire instantsToArchive 
list to a single write() with no size cap, so 500 is more conservative than 
normal archival, not riskier.
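
   To illustrate the memory argument above, here is a minimal, self-contained
   sketch. It is not Hudi code: `Action`, `MetadataLoader`, `writeBatch`, and
   `buildBatch` are hypothetical stand-ins for ActiveAction, the lazy
   getCommitMetadata lookup, and LSMTimelineWriter.write, and the 1 KiB payload
   size is an arbitrary assumption. It only shows the pattern being described:
   the batch holds small descriptors, and each metadata payload is loaded,
   consumed, and dropped one at a time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class LazyBatchSketch {

  /** Hypothetical stand-in for ActiveAction: holds only small instant descriptors. */
  static final class Action {
    final String requested;
    final String inflight;
    final String completed;

    Action(String instantTime) {
      // Descriptor names are illustrative, not Hudi's real file naming.
      this.requested = instantTime + ".requested";
      this.inflight = instantTime + ".inflight";
      this.completed = instantTime + ".completed";
    }
  }

  /** Hypothetical loader that counts how often metadata bytes are read. */
  static final class MetadataLoader implements Function<Action, byte[]> {
    int loads = 0;

    @Override
    public byte[] apply(Action action) {
      loads++;
      return new byte[1024]; // assumed payload size, for illustration only
    }
  }

  /** Builds a batch of descriptors; no metadata bytes are read here. */
  static List<Action> buildBatch(int size) {
    List<Action> batch = new ArrayList<>(size);
    for (int i = 0; i < size; i++) {
      batch.add(new Action(String.format("%017d", i)));
    }
    return batch;
  }

  /** Reads metadata lazily, one action at a time, and never retains it. */
  static int writeBatch(List<Action> batch, Function<Action, byte[]> loader) {
    int bytesWritten = 0;
    for (Action action : batch) {
      byte[] metadata = loader.apply(action); // loaded only when needed
      bytesWritten += metadata.length;        // stand-in for convert + append
      // 'metadata' becomes unreachable after this iteration
    }
    return bytesWritten;
  }

  public static void main(String[] args) {
    List<Action> batch = buildBatch(500);
    MetadataLoader loader = new MetadataLoader();
    int written = writeBatch(batch, loader);
    System.out.println("actions buffered: " + batch.size()
        + ", lazy loads: " + loader.loads
        + ", bytes written: " + written);
  }
}
```

   In this sketch the peak footprint is the descriptor list plus one payload at
   a time, regardless of the batch size, which is the property the batch-size
   default relies on.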



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

