wombatu-kun commented on code in PR #19052:
URL: https://github.com/apache/hudi/pull/19052#discussion_r3463915239
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieArchivalConfig.java:
##########
@@ -88,6 +88,15 @@ public class HoodieArchivalConfig extends HoodieConfig {
.withDocumentation("Archiving of instants is batched in best-effort manner, to pack more instants into a single"
+ " archive log. This config controls such archival batch size.");
+ public static final ConfigProperty<Integer> MIGRATION_COMMITS_ARCHIVAL_BATCH_SIZE = ConfigProperty
+ .key("hoodie.migration.commits.archival.batch")
+ .defaultValue(500)
Review Comment:
ActiveAction holds only HoodieInstant references (requested, inflight,
completed), not commit metadata. The metadata bytes are read lazily per instant
inside LSMTimelineWriter.write (via ActiveAction.getCommitMetadata ->
getInstantDetails) and converted one at a time, never retained in
activeActionsBatch. A 500-action batch therefore buffers at most about 1,500 small
instant descriptors (up to three per action), not 500 commits' worth of metadata. The steady-state
archival path in TimelineArchiverV2 already passes the entire instantsToArchive
list to a single write() with no size cap, so 500 is more conservative than
normal archival, not riskier.
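
To make the memory argument concrete, here is a minimal sketch of the pattern described above. It is not the Hudi code: `LazyBatchSketch`, `InstantRef`, `ActionRef`, `buildBatch`, and `writeBatch` are hypothetical stand-ins, and the three-descriptors-per-action layout and 1 KiB payload are assumptions for illustration only. It shows a batch that holds only lightweight descriptors while each metadata payload is loaded inside the write loop and dropped before the next iteration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-ins for illustration; these are NOT the Hudi classes.
public class LazyBatchSketch {

  /** Lightweight descriptor: a timestamp and a state, with no metadata payload. */
  static final class InstantRef {
    final String timestamp;
    final String state;

    InstantRef(String timestamp, String state) {
      this.timestamp = timestamp;
      this.state = state;
    }
  }

  /** One action groups three instant descriptors (assumed layout). */
  static final class ActionRef {
    final InstantRef requested;
    final InstantRef inflight;
    final InstantRef completed;

    ActionRef(InstantRef requested, InstantRef inflight, InstantRef completed) {
      this.requested = requested;
      this.inflight = inflight;
      this.completed = completed;
    }
  }

  /** Builds a batch of descriptors only; no metadata is loaded here. */
  static List<ActionRef> buildBatch(int size) {
    List<ActionRef> batch = new ArrayList<>(size);
    for (int i = 0; i < size; i++) {
      String ts = String.format("%017d", i);
      batch.add(new ActionRef(
          new InstantRef(ts, "REQUESTED"),
          new InstantRef(ts, "INFLIGHT"),
          new InstantRef(ts, "COMPLETED")));
    }
    return batch;
  }

  /**
   * Walks the batch and loads each action's metadata only when it is needed.
   * Returns the total number of metadata bytes processed.
   */
  static long writeBatch(List<ActionRef> batch, Function<InstantRef, byte[]> loader) {
    long totalBytes = 0;
    for (ActionRef action : batch) {
      byte[] metadata = loader.apply(action.completed); // lazy, per-instant read
      totalBytes += metadata.length;                    // stand-in for convert + append
      // 'metadata' is not stored anywhere, so it does not accumulate across iterations.
    }
    return totalBytes;
  }

  public static void main(String[] args) {
    List<ActionRef> batch = buildBatch(500);
    // Pretend every commit's metadata is 1 KiB (an assumption for this sketch).
    long processed = writeBatch(batch, ref -> new byte[1024]);
    System.out.println("descriptors buffered: " + batch.size() * 3);
    System.out.println("metadata bytes processed: " + processed);
  }
}
```

In this sketch the batch size bounds only the number of descriptors held in memory; at most one metadata payload is referenced by the loop at any time, regardless of how large the batch is.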