hudi-agent commented on code in PR #18793:
URL: https://github.com/apache/hudi/pull/18793#discussion_r3274303077
##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteOperatorCoordinator.java:
##########
@@ -565,7 +565,7 @@ private void handleBootstrapEvent(WriteMetadataEvent event)
{
if (eventBuffer.allBootstrapEventsReceived()) {
// start to recommit the instant.
boolean committed = recommitInstant(event.getCheckpointId(),
event.getInstantTime(), eventBuffer);
- if (committed && tableState.isRLIWithBootstrap) {
+ if (committed && tableState.isRecordLevelIndex) {
Review Comment:
π€ @danny0405 The old `isRLIWithBootstrap` also required
`INDEX_BOOTSTRAP_ENABLED=true`, but the new `isRecordLevelIndex` drops that
check. Looking at `Pipelines.streamBootstrap`, `RLIBootstrapOperator` is only
added when `INDEX_BOOTSTRAP_ENABLED || bounded`. For RLI tables without
bootstrap enabled (and for partitioned RLI, which never gets
`RLIBootstrapOperator`), this now triggers a `context.failJob(...)` even though
there's no RLI bootstrap operator in the pipeline to benefit from the reload.
Is the broader failover intended (e.g. to refresh `IndexWriteOperator` state)?
If so, could you also update the exception message β it still says "so that RLI
bootstrap operator can load the record level index completely", which is
misleading for the non-bootstrap streaming case.
<sub><i>- AI-generated; verify before applying. React π/π to flag
quality.</i></sub>
##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteOperatorCoordinator.java:
##########
@@ -773,7 +773,7 @@ private TableState(Configuration conf) {
this.syncMetadata = conf.get(FlinkOptions.METADATA_ENABLED);
this.isDeltaTimeCompaction = OptionsResolver.isDeltaTimeCompaction(conf);
this.isStreamingIndexWriteEnabled =
OptionsResolver.isStreamingIndexWriteEnabled(conf);
- this.isRLIWithBootstrap = OptionsResolver.isRLIWithBootstrap(conf);
+ this.isRecordLevelIndex = OptionsResolver.isGlobalRecordLevelIndex(conf)
|| OptionsResolver.isRecordLevelIndex(conf);
Review Comment:
π€ Minor β `isRecordLevelIndex` here is computed as the disjunction
`isGlobalRecordLevelIndex(conf) || isRecordLevelIndex(conf)`. Would it be
cleaner to add a dedicated helper like
`OptionsResolver.isAnyRecordLevelIndex(conf)` (or similar) so future call sites
can't accidentally use just one of the two? Right now the static method
`OptionsResolver.isRecordLevelIndex` only covers the partitioned variant, which
makes the naming a bit easy to confuse.
<sub><i>- AI-generated; verify before applying. React π/π to flag
quality.</i></sub>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]