prashantwason commented on code in PR #10641: URL: https://github.com/apache/hudi/pull/10641#discussion_r1556617105
########## hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataMetrics.java: ########## @@ -136,7 +144,7 @@ public void updateMetrics(String action, long durationInMs) { String countKey = action + ".count"; String durationKey = action + ".totalDuration"; incrementMetric(countKey, 1); - incrementMetric(durationKey, durationInMs); + setMetric(durationKey, durationInMs); Review Comment: You are assuming that code calling these functions would only call once. That may not be a correct assumption for all cases - opening the MDT is costly so multiple lookups etc can be called on open MDT readers. ########## hudi-common/src/main/java/org/apache/hudi/metadata/BaseTableMetadata.java: ########## @@ -302,8 +303,8 @@ public Map<String, HoodieRecordGlobalLocation> readRecordIndex(List<String> reco }); metrics.ifPresent(m -> m.updateMetrics(HoodieMetadataMetrics.LOOKUP_RECORD_INDEX_TIME_STR, timer.endTimer())); - metrics.ifPresent(m -> m.updateMetrics(HoodieMetadataMetrics.LOOKUP_RECORD_INDEX_KEYS_COUNT_STR, recordKeys.size())); - metrics.ifPresent(m -> m.updateMetrics(HoodieMetadataMetrics.LOOKUP_RECORD_INDEX_KEYS_HITS_COUNT_STR, recordKeyToLocation.size())); + metrics.ifPresent(m -> m.setMetric(HoodieMetadataMetrics.LOOKUP_RECORD_INDEX_KEYS_COUNT_STR, recordKeys.size())); Review Comment: The same HoodieTableMetadata object can be used to lookup keys from MDT multiple times. In that case, update is more accurate. ########## hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataMetrics.java: ########## @@ -73,10 +79,12 @@ public class HoodieMetadataMetrics implements Serializable { private static final Logger LOG = LoggerFactory.getLogger(HoodieMetadataMetrics.class); - private final Registry metricsRegistry; + private final transient MetricRegistry metricsRegistry; + private final transient Metrics metrics; - public HoodieMetadataMetrics(Registry metricsRegistry) { - this.metricsRegistry = metricsRegistry; + public HoodieMetadataMetrics(HoodieMetricsConfig metricsConfig) { Review Comment: If you do not use Registry then no metrics can be collected from the executors where most of the operations on the MDT readers take place (for indexes other than files index). Eg. RI lookup -> since there are multiple file groups in record_index, when looking up keys from the record index, each executor opens one file group of the record index and reads the keys that belong to that file group. When HoodieTableMetadata is serialized by Spark and send to the executors, the executors end up updating a local copy of the metadata metrics. since the publishing of the metrics is only done on the driver side, the metrics updated on the executor side never make it to the driver side and hence never published. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org