hudi-agent commented on code in PR #18690:
URL: https://github.com/apache/hudi/pull/18690#discussion_r3185184956
##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/bootstrap/RLIBootstrapOperator.java:
##########
@@ -62,12 +63,17 @@ public RLIBootstrapOperator(Configuration conf) {
@Override
public void initializeState(StateInitializationContext context) throws
Exception {
loadedCnt = 0;
- HoodieTableMetaClient metaClient = StreamerUtil.createMetaClient(conf);
- this.metadataTable = (HoodieBackedTableMetadata)
metaClient.getTableFormat().getMetadataFactory().create(
- HoodieFlinkEngineContext.DEFAULT,
- metaClient.getStorage(),
- StreamerUtil.metadataConfig(conf),
- conf.get(FlinkOptions.PATH));
+ HoodieTableMetaClient metaClient = createMetaClient();
+ this.metadataTable = createMetadataTable(metaClient);
+
+ if (metadataTable.getMetadataConfig().isMetricsEnabled()) {
+ // Bridge metadata table Dropwizard metrics to Flink's metric system so
they are
+ // visible in Flink dashboards and reporters.
+ MetricGroup metricGroup = getRuntimeContext().getMetricGroup();
+ metadataTable.getMetrics().ifPresent(m ->
Review Comment:
🤖 Could you double-check the timing here? `HoodieMetadataMetrics` registers
gauges lazily — `registerGauge` is only called from
`incrementMetric`/`setMetric`, which fire as metadata-table operations run. At
this point the metadata table was just constructed and `preLoadRLIRecords()`
hasn't executed yet, so `registry().getGauges()` is likely empty/near-empty and
the `forEach` registers nothing with Flink. Since this is a one-shot snapshot,
any gauges added later by the read also won't surface. Would it work to either
move this block after `preLoadRLIRecords()` (so the gauges that the bootstrap
actually populates are visible), or register a Flink gauge that delegates to
`registry().getGauges().get(name)` looked up lazily? cc @danny0405 @yihua
<sub><i>- AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/bootstrap/RLIBootstrapOperator.java:
##########
@@ -62,12 +63,17 @@ public RLIBootstrapOperator(Configuration conf) {
@Override
public void initializeState(StateInitializationContext context) throws
Exception {
loadedCnt = 0;
- HoodieTableMetaClient metaClient = StreamerUtil.createMetaClient(conf);
- this.metadataTable = (HoodieBackedTableMetadata)
metaClient.getTableFormat().getMetadataFactory().create(
- HoodieFlinkEngineContext.DEFAULT,
- metaClient.getStorage(),
- StreamerUtil.metadataConfig(conf),
- conf.get(FlinkOptions.PATH));
+ HoodieTableMetaClient metaClient = createMetaClient();
+ this.metadataTable = createMetadataTable(metaClient);
+
+ if (metadataTable.getMetadataConfig().isMetricsEnabled()) {
+ // Bridge metadata table Dropwizard metrics to Flink's metric system so
they are
+ // visible in Flink dashboards and reporters.
+ MetricGroup metricGroup = getRuntimeContext().getMetricGroup();
+ metadataTable.getMetrics().ifPresent(m ->
+ m.registry().getGauges().forEach((name, gauge) ->
+ metricGroup.gauge(name, gauge::getValue)));
+ }
Review Comment:
🤖 Just `getGauges()` is bridged here. Today `HoodieMetadataMetrics` only
registers Gauges, but if Counters/Timers/Histograms get added later they will
be silently dropped by this bridge. Worth either documenting that limitation or
also iterating `getCounters()/getTimers()/getHistograms()/getMeters()` so
future metric types don't go missing.
<sub><i>- AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]