[ https://issues.apache.org/jira/browse/HDFS-16090?focusedWorklogId=615504&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-615504 ]
ASF GitHub Bot logged work on HDFS-16090:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 28/Jun/21 08:47
            Start Date: 28/Jun/21 08:47
    Worklog Time Spent: 10m
      Work Description:
virajjasani commented on a change in pull request #3148:
URL: https://github.com/apache/hadoop/pull/3148#discussion_r659596959

##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
##########
@@ -2272,19 +2274,11 @@ public int getActiveTransferThreadCount() {
   void incrDatanodeNetworkErrors(String host) {
     metrics.incrDatanodeNetworkErrors();

-    /*
-     * Synchronizing on the whole cache is a big hammer, but since it's only
-     * accumulating errors, it should be ok. If this is ever expanded to include
-     * non-error stats, then finer-grained concurrency should be applied.
-     */
-    synchronized (datanodeNetworkCounts) {
-      try {
-        final Map<String, Long> curCount = datanodeNetworkCounts.get(host);
-        curCount.put("networkErrors", curCount.get("networkErrors") + 1L);
-        datanodeNetworkCounts.put(host, curCount);
-      } catch (ExecutionException e) {
-        LOG.warn("failed to increment network error counts for host: {}", host);
-      }
+    try {
+      datanodeNetworkCounts.get(host).compute(NETWORK_ERRORS,
+          (key, errors) -> errors == null ? null : errors + 1L);

Review comment:
So every time we have a network error, instead of locking the entire
`LoadingCache`, ConcurrentHashMap's compute() takes a lock only on the bucket
of the map where the key resides, and then the error count is incremented.
This is fine-grained locking and much more performant than taking a lock on
the entire `LoadingCache`.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
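To illustrate the point made in the review comment, here is a minimal, self-contained sketch of a per-key atomic increment. It is not the DataNode code: the class `NetworkErrorCounts` and its method names are hypothetical, a plain `ConcurrentHashMap` with `computeIfAbsent` stands in for the Guava `LoadingCache`, and it assumes the per-host map is created with the `networkErrors` counter already set to zero.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the per-host network counters; not Hadoop code. */
class NetworkErrorCounts {
  static final String NETWORK_ERRORS = "networkErrors";

  // Plays the role of the LoadingCache: host -> counters for that host.
  private final Map<String, Map<String, Long>> countsByHost =
      new ConcurrentHashMap<>();

  /** Creates the counters for a host on first access (assumed to start at 0). */
  private Map<String, Long> countersFor(String host) {
    return countsByHost.computeIfAbsent(host, h -> {
      Map<String, Long> counters = new ConcurrentHashMap<>();
      counters.put(NETWORK_ERRORS, 0L);
      return counters;
    });
  }

  /** Atomically increments the error count for one host, with no global lock. */
  void incrNetworkErrors(String host) {
    // compute() runs the remapping function atomically for this key only.
    // Returning null for an absent key leaves the key absent, as in the diff.
    countersFor(host).compute(NETWORK_ERRORS,
        (key, errors) -> errors == null ? null : errors + 1L);
  }

  long getNetworkErrors(String host) {
    return countersFor(host).getOrDefault(NETWORK_ERRORS, 0L);
  }

  public static void main(String[] args) {
    NetworkErrorCounts counts = new NetworkErrorCounts();
    counts.incrNetworkErrors("host-a");
    counts.incrNetworkErrors("host-a");
    counts.incrNetworkErrors("host-b");
    System.out.println("host-a: " + counts.getNetworkErrors("host-a"));
    System.out.println("host-b: " + counts.getNetworkErrors("host-b"));
  }
}
```

Because each host has its own inner map, increments for different hosts should not contend with each other, and concurrent increments for the same host are serialized only for the duration of the remapping function.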
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 615504)
    Time Spent: 1h 10m  (was: 1h)

> Fine grained locking for datanodeNetworkCounts
> ----------------------------------------------
>
>                 Key: HDFS-16090
>                 URL: https://issues.apache.org/jira/browse/HDFS-16090
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Viraj Jasani
>            Assignee: Viraj Jasani
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> While incrementing the DataNode network error count, we lock the entire
> LoadingCache in order to increment the network count of a specific host. We
> should provide fine-grained concurrency for this update because locking the
> entire cache is redundant and could impact performance while incrementing
> network counts for multiple hosts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org