[ https://issues.apache.org/jira/browse/HDFS-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16052519#comment-16052519 ]
Arpit Agarwal commented on HDFS-11789:
--------------------------------------

Thanks for updating the patch [~hanishakoneru]. A few more comments:
# In the BlockReaderLocal constructor, metrics is passed to BlockReaderIoProvider before it is potentially initialized.
{code}
    this.blockReaderIoProvider = new BlockReaderIoProvider(
        builder.shortCircuitConf, metrics, timer);
    if (builder.shortCircuitConf.isScrMetricsEnabled()) {
      metricsInitializationLock.lock();
      if (!isMetricsEnabled) {
        metrics = BlockReaderLocalMetrics.create();
        isMetricsEnabled = true;
      }
{code}
# Typo - {{SHORT_CIRCUIT_READ_LATECNY_METRIC_REGISTERD_NAME}}.
# Looks like we can eliminate {{BlockReaderLocal#isMetricsEnabled}} and instead just check whether metrics is null.
# Thanks for adding TestBlockReaderIoProvider. One more related test - ensure {{addShortCircuitReadLatency}} is not invoked when the read takes less than the threshold.
# Also I think we can make BlockReaderIoProvider a static utility class since it doesn't maintain any mutable state. But this approach is also fine since the allocated object is lightweight.

TestBlockReaderLocalMetrics is a nicely written unit test!

> Maintain Short-Circuit Read Statistics
> --------------------------------------
>
>                 Key: HDFS-11789
>                 URL: https://issues.apache.org/jira/browse/HDFS-11789
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>            Reporter: Hanisha Koneru
>            Assignee: Hanisha Koneru
>         Attachments: HDFS-11789.001.patch, HDFS-11789.002.patch, HDFS-11789.003.patch
>
>
> If disk or controller hardware is faulty, then short-circuit read requests
> can stall indefinitely while reading from the file descriptor. Currently
> there is no way to detect when short-circuit read requests are slow or
> blocked.
> This Jira proposes that each BlockReaderLocal maintain read statistics while
> it is active by measuring the time taken for a pre-determined fraction of
> read requests.
> These per-reader stats can be aggregated into global stats
> when the reader is closed. The aggregate statistics can be exposed via JMX.
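
For illustration only, the approach in the description (time a fixed fraction of reads per reader, then fold the per-reader totals into global stats on close) could be sketched roughly as below. This is not the code from the attached patches: the names {{GlobalReadStats}}, {{ReaderStats}} and {{sampleEvery}} are hypothetical, the clock is injected only to make the sketch testable, and each reader is assumed to be used by a single thread.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical global sink; stands in for the aggregate stats exposed via JMX. */
final class GlobalReadStats {
    private final AtomicLong sampledReads = new AtomicLong();
    private final AtomicLong totalLatencyNanos = new AtomicLong();

    /** Folds one reader's totals into the global counters. */
    void merge(long reads, long latencyNanos) {
        sampledReads.addAndGet(reads);
        totalLatencyNanos.addAndGet(latencyNanos);
    }

    long sampledReads() { return sampledReads.get(); }
    long totalLatencyNanos() { return totalLatencyNanos.get(); }
}

/** Hypothetical per-reader stats: times every sampleEvery-th read, merges on close. */
final class ReaderStats implements AutoCloseable {
    private final GlobalReadStats global;
    private final int sampleEvery;
    private final LongSupplier nanoClock;
    // Plain fields: the sketch assumes a reader is not shared across threads.
    private long readCount;
    private long sampledReads;
    private long latencyNanos;

    ReaderStats(GlobalReadStats global, int sampleEvery, LongSupplier nanoClock) {
        if (sampleEvery <= 0) {
            throw new IllegalArgumentException("sampleEvery must be positive");
        }
        this.global = global;
        this.sampleEvery = sampleEvery;
        this.nanoClock = nanoClock;
    }

    /** Runs the read; only one in every sampleEvery calls is timed. */
    void read(Runnable io) {
        boolean sample = (readCount++ % sampleEvery) == 0;
        if (!sample) {
            io.run();
            return;
        }
        long start = nanoClock.getAsLong();
        try {
            io.run();
        } finally {
            sampledReads++;
            latencyNanos += nanoClock.getAsLong() - start;
        }
    }

    /** Aggregates this reader's totals into the global stats, then resets them. */
    @Override
    public void close() {
        global.merge(sampledReads, latencyNanos);
        sampledReads = 0;
        latencyNanos = 0;
    }
}

final class ScrStatsDemo {
    public static void main(String[] args) {
        GlobalReadStats global = new GlobalReadStats();
        // Time one read in ten; the empty Runnable stands in for the actual read.
        try (ReaderStats stats = new ReaderStats(global, 10, System::nanoTime)) {
            for (int i = 0; i < 100; i++) {
                stats.read(() -> { });
            }
        }
        System.out.println("sampled reads: " + global.sampledReads());
    }
}
```

Merging only on close keeps the per-read path free of shared-counter updates, at the cost that a stalled reader contributes nothing to the global stats until it is closed.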