prashantwason commented on code in PR #10641:
URL: https://github.com/apache/hudi/pull/10641#discussion_r1556617105


##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataMetrics.java:
##########
@@ -136,7 +144,7 @@ public void updateMetrics(String action, long durationInMs) 
{
     String countKey = action + ".count";
     String durationKey = action + ".totalDuration";
     incrementMetric(countKey, 1);
-    incrementMetric(durationKey, durationInMs);
+    setMetric(durationKey, durationInMs);

Review Comment:
   You are assuming that code calling these functions would only call once. 
That may not be a correct assumption for all cases - opening the MDT is costly 
so multiple lookups etc can be called on open MDT readers. 



##########
hudi-common/src/main/java/org/apache/hudi/metadata/BaseTableMetadata.java:
##########
@@ -302,8 +303,8 @@ public Map<String, HoodieRecordGlobalLocation> 
readRecordIndex(List<String> reco
     });
 
     metrics.ifPresent(m -> 
m.updateMetrics(HoodieMetadataMetrics.LOOKUP_RECORD_INDEX_TIME_STR, 
timer.endTimer()));
-    metrics.ifPresent(m -> 
m.updateMetrics(HoodieMetadataMetrics.LOOKUP_RECORD_INDEX_KEYS_COUNT_STR, 
recordKeys.size()));
-    metrics.ifPresent(m -> 
m.updateMetrics(HoodieMetadataMetrics.LOOKUP_RECORD_INDEX_KEYS_HITS_COUNT_STR, 
recordKeyToLocation.size()));
+    metrics.ifPresent(m -> 
m.setMetric(HoodieMetadataMetrics.LOOKUP_RECORD_INDEX_KEYS_COUNT_STR, 
recordKeys.size()));

Review Comment:
   The same HoodieTableMetadata object can be used to lookup keys from MDT 
multiple times. In that case, update is more accurate.
   



##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataMetrics.java:
##########
@@ -73,10 +79,12 @@ public class HoodieMetadataMetrics implements Serializable {
 
   private static final Logger LOG = 
LoggerFactory.getLogger(HoodieMetadataMetrics.class);
 
-  private final Registry metricsRegistry;
+  private final transient MetricRegistry metricsRegistry;
+  private final transient Metrics metrics;
 
-  public HoodieMetadataMetrics(Registry metricsRegistry) {
-    this.metricsRegistry = metricsRegistry;
+  public HoodieMetadataMetrics(HoodieMetricsConfig metricsConfig) {

Review Comment:
   If you do not use Registry then no metrics can be collected from the 
executors where most of the operations on the MDT readers take place (for 
indexes other than files index).
   
   Eg. RI lookup -> since there are multiple file groups in record_index, when 
looking up keys from the record index, each executor opens one file group of 
the record index and reads the keys that belong to that file group. When 
HoodieTableMetadata is serialized by Spark and send to the executors, the 
executors end up updating a local copy of the metadata metrics. since the 
publishing of the metrics is only done on the driver side, the metrics updated 
on the executor side never make it to the driver side and hence never published.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to