[ https://issues.apache.org/jira/browse/HDFS-16959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704972#comment-17704972 ]
ASF GitHub Bot commented on HDFS-16959:
---------------------------------------

virajjasani commented on code in PR #5497:
URL: https://github.com/apache/hadoop/pull/5497#discussion_r1148404105


##########
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/store/driver/TestStateStoreDriverBase.java:
##########
@@ -574,6 +580,38 @@ private static Map<String, Class<?>> getFields(BaseRecord record) {
     return getters;
   }
 
+  public long getMountTableCacheLoadSamples(StateStoreDriver driver) throws IOException {
+    final MutableRate mountTableCache = getMountTableCache(driver);
+    return mountTableCache.lastStat().numSamples();
+  }
+
+  private static MutableRate getMountTableCache(StateStoreDriver driver) throws IOException {
+    StateStoreMetrics metrics = stateStore.getMetrics();
+    final Query<MountTable> query = new Query<>(MountTable.newInstance());
+    driver.getMultiple(MountTable.class, query);
+    final Map<String, MutableRate> cacheLoadMetrics = metrics.getCacheLoadMetrics();
+    final MutableRate mountTableCache = cacheLoadMetrics.get("CacheMountTableLoad");
+    assertNotNull("CacheMountTableLoad should be present in the state store metrics",
+        mountTableCache);
+    return mountTableCache;
+  }
+
+  public void testCacheLoadMetrics(StateStoreDriver driver, long numRefresh)
+      throws IOException, IllegalArgumentException {
+    final MutableRate mountTableCache = getMountTableCache(driver);
+    // CacheMountTableLoadNumOps
+    final long mountTableCacheLoadNumOps = getMountTableCacheLoadSamples(driver);
+    assertEquals("Num of samples collected should match", numRefresh, mountTableCacheLoadNumOps);
+    // CacheMountTableLoadAvgTime ms
+    final double mountTableCacheLoadAvgTimeMs = mountTableCache.lastStat().mean();
+    // 2 seconds is a high enough value for the test, hence we expect mount table cache
+    // with very few entries to be loaded by this time duration, hence not have this test result
+    // show flaky behavior.
+    assertTrue(
+        "Mean time duration for cache load is expected to be less than 2000 ms. Actual value: "
+            + mountTableCacheLoadAvgTimeMs, mountTableCacheLoadAvgTimeMs < 2000d);
+  }

Review Comment:
   Oh, for prod I think we are better off staying with ms: even with few entries I always see some non-zero double value, so we should be good there.
   For the test, I did more than 10 runs of each state store impl test (fs, dfs, zk). We definitely get values greater than 0, but one time I saw just 0.01d, so I wonder whether Jenkins might get 0.0 too. Hence, let me go with your suggestion for the test:

   > If only in test: maybe have the default value as -1 and we can assert it isn't -1 and we should be sorted; if it is -1 then things didn't work the way we wanted.

   Done


> RBF: State store cache loading metrics
> --------------------------------------
>
>                 Key: HDFS-16959
>                 URL: https://issues.apache.org/jira/browse/HDFS-16959
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Viraj Jasani
>            Assignee: Viraj Jasani
>            Priority: Major
>              Labels: pull-request-available
>
> As the number of state store records (such as mount points) grows, it would be
> good to expose cache loading metrics, such as the average cache load time
> during refresh and the number of times the cache is loaded.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
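
For readers following the review thread, the sentinel idea agreed on in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions: `CacheLoadStatSketch` and its methods are hypothetical stand-ins, not Hadoop's `MutableRate`, and this message does not show how the PR actually wires the -1 default. The point is that a mean of 0.0 from a very fast load stays distinguishable from "nothing was recorded".

```java
/**
 * Hypothetical stand-in for a rate metric. It is NOT Hadoop's MutableRate;
 * it only illustrates the "-1 means nothing was recorded" idea.
 */
final class CacheLoadStatSketch {
  /** Sentinel returned by mean() when no cache load has been recorded. */
  static final double UNSET = -1d;

  private long numSamples;
  private double totalMs;

  /** Records the duration of one cache load, in milliseconds. */
  void add(double elapsedMs) {
    if (elapsedMs < 0) {
      throw new IllegalArgumentException("elapsedMs must be >= 0");
    }
    numSamples++;
    totalMs += elapsedMs;
  }

  /** Number of cache loads recorded so far. */
  long numSamples() {
    return numSamples;
  }

  /** Mean load time in ms, or UNSET if no load has been recorded. */
  double mean() {
    return numSamples == 0 ? UNSET : totalMs / numSamples;
  }

  public static void main(String[] args) {
    CacheLoadStatSketch stat = new CacheLoadStatSketch();
    // A load so fast that the measured time rounds down to 0.0 ms.
    stat.add(0.0);
    // The test then asserts "not the sentinel" instead of "greater than 0".
    if (stat.mean() == UNSET) {
      throw new AssertionError("cache load was never recorded");
    }
    System.out.println("samples=" + stat.numSamples() + ", meanMs=" + stat.mean());
  }
}
```

Asserting against the sentinel rather than `> 0` avoids the flakiness concern raised above, where a run on Jenkins might legitimately report 0.0.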