[GitHub] [hadoop] virajjasani commented on a change in pull request #3148: HDFS-16090. Fine grained lock for datanodeNetworkCounts

2021-06-28 Thread GitBox


virajjasani commented on a change in pull request #3148:
URL: https://github.com/apache/hadoop/pull/3148#discussion_r659601136



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
##
@@ -2272,19 +2274,11 @@ public int getActiveTransferThreadCount() {
   void incrDatanodeNetworkErrors(String host) {
 metrics.incrDatanodeNetworkErrors();
 
-/*
- * Synchronizing on the whole cache is a big hammer, but since it's only
- * accumulating errors, it should be ok. If this is ever expanded to include
- * non-error stats, then finer-grained concurrency should be applied.
- */
-synchronized (datanodeNetworkCounts) {
-  try {
-    final Map<String, Long> curCount = datanodeNetworkCounts.get(host);
-    curCount.put("networkErrors", curCount.get("networkErrors") + 1L);
-    datanodeNetworkCounts.put(host, curCount);
-  } catch (ExecutionException e) {
-    LOG.warn("failed to increment network error counts for host: {}", host);
-  }
+try {
+  datanodeNetworkCounts.get(host).compute(NETWORK_ERRORS,
+      (key, errors) -> errors == null ? null : errors + 1L);

Review comment:
   > i mean, shouldn't it be made 1 when errors is null (meaning the key 
didn't exist before)?
   
   I see. Based on the LoadingCache creation, we will always find the value `0L` 
at the beginning; the only reason I handled the `errors == null` case was so 
that findbugs doesn't complain about it. But I think your suggestion is better: 
we should return `1L` when errors is null. Let me change this.
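   The suggested fix can be sketched with a plain ConcurrentHashMap standing in for the per-host map returned by the LoadingCache (class and key names here are illustrative, not from the patch):
   ```java
import java.util.concurrent.ConcurrentHashMap;

public class NullSafeIncrement {
  public static void main(String[] args) {
    // Illustrative stand-in for the per-host counter map.
    ConcurrentHashMap<String, Long> counts = new ConcurrentHashMap<>();
    // Returning 1L (not null) when the key is absent means the first
    // error observed on a fresh map is still counted.
    counts.compute("networkErrors",
        (key, errors) -> errors == null ? 1L : errors + 1L);
    counts.compute("networkErrors",
        (key, errors) -> errors == null ? 1L : errors + 1L);
    System.out.println(counts.get("networkErrors")); // prints 2
  }
}
   ```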




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] virajjasani commented on a change in pull request #3148: HDFS-16090. Fine grained lock for datanodeNetworkCounts

2021-06-28 Thread GitBox


virajjasani commented on a change in pull request #3148:
URL: https://github.com/apache/hadoop/pull/3148#discussion_r659596959



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
##
@@ -2272,19 +2274,11 @@ public int getActiveTransferThreadCount() {
   void incrDatanodeNetworkErrors(String host) {
 metrics.incrDatanodeNetworkErrors();
 
-/*
- * Synchronizing on the whole cache is a big hammer, but since it's only
- * accumulating errors, it should be ok. If this is ever expanded to include
- * non-error stats, then finer-grained concurrency should be applied.
- */
-synchronized (datanodeNetworkCounts) {
-  try {
-    final Map<String, Long> curCount = datanodeNetworkCounts.get(host);
-    curCount.put("networkErrors", curCount.get("networkErrors") + 1L);
-    datanodeNetworkCounts.put(host, curCount);
-  } catch (ExecutionException e) {
-    LOG.warn("failed to increment network error counts for host: {}", host);
-  }
+try {
+  datanodeNetworkCounts.get(host).compute(NETWORK_ERRORS,
+      (key, errors) -> errors == null ? null : errors + 1L);

Review comment:
   So every time we have a network error, instead of locking the entire 
LoadingCache, CHM.compute() just takes the lock on the bucket of the Map where 
the key resides and then increments the error count. So this is fine-grained 
locking and much more performant than taking a lock on the entire 
`LoadingCache`.







[GitHub] [hadoop] virajjasani commented on a change in pull request #3148: HDFS-16090. Fine grained lock for datanodeNetworkCounts

2021-06-28 Thread GitBox


virajjasani commented on a change in pull request #3148:
URL: https://github.com/apache/hadoop/pull/3148#discussion_r659595214



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
##
@@ -2272,19 +2274,11 @@ public int getActiveTransferThreadCount() {
   void incrDatanodeNetworkErrors(String host) {
 metrics.incrDatanodeNetworkErrors();
 
-/*
- * Synchronizing on the whole cache is a big hammer, but since it's only
- * accumulating errors, it should be ok. If this is ever expanded to include
- * non-error stats, then finer-grained concurrency should be applied.
- */
-synchronized (datanodeNetworkCounts) {
-  try {
-    final Map<String, Long> curCount = datanodeNetworkCounts.get(host);
-    curCount.put("networkErrors", curCount.get("networkErrors") + 1L);
-    datanodeNetworkCounts.put(host, curCount);
-  } catch (ExecutionException e) {
-    LOG.warn("failed to increment network error counts for host: {}", host);
-  }
+try {
+  datanodeNetworkCounts.get(host).compute(NETWORK_ERRORS,
+      (key, errors) -> errors == null ? null : errors + 1L);

Review comment:
   Map.compute() is just a replacement for the code below (and ConcurrentHashMap 
does it atomically):
   ```
V oldValue = map.get(key);
V newValue = remappingFunction.apply(key, oldValue);
if (oldValue != null) {
   if (newValue != null)
      map.put(key, newValue);
   else
      map.remove(key);
} else {
   if (newValue != null)
      map.put(key, newValue);
   else
      return null;
}
   ```
   
   errors will ideally never be null because it is seeded as `0L` here:
   ```
   datanodeNetworkCounts =
       CacheBuilder.newBuilder()
           .maximumSize(dncCacheMaxSize)
           .build(new CacheLoader<String, Map<String, Long>>() {
             @Override
             public Map<String, Long> load(String key) throws Exception {
               final Map<String, Long> ret = new HashMap<String, Long>();
               ret.put("networkErrors", 0L);
               return ret;
             }
           });
   ```



