[GitHub] [hadoop] liangxs commented on a change in pull request #3080: HADOOP-17749. Remove lock contention in SelectorPool of SocketIOWithTimeout

2021-07-04 Thread GitBox


liangxs commented on a change in pull request #3080:
URL: https://github.com/apache/hadoop/pull/3080#discussion_r663608673



##########
File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/SocketIOWithTimeout.java
##########
@@ -426,34 +408,41 @@ private synchronized SelectorInfo get(SelectableChannel channel)
      * 
      * @param info
      */
-    private synchronized void release(SelectorInfo info) {
+    private static void release(SelectorInfo info) {
       long now = Time.now();
       trimIdleSelectors(now);
       info.lastActivityTime = now;
-      info.queue.addLast(info);
+      // SelectorInfos in queue are sorted by lastActivityTime
+      providerMap.get(info.provider).addLast(info);
     }
 
+    private static AtomicBoolean trimming = new AtomicBoolean(false);
+    private static volatile long lastTrimTime = Time.now();
+
     /**
      * Closes selectors that are idle for IDLE_TIMEOUT (10 sec). It does not
      * traverse the whole list, just over the one that have crossed 
      * the timeout.
      */
-    private void trimIdleSelectors(long now) {
+    private static void trimIdleSelectors(long now) {
+      if (!trimming.compareAndSet(false, true)) {
+        return;
+      }
+
       long cutoff = now - IDLE_TIMEOUT;
-
-      for(ProviderInfo pList=providerList; pList != null; pList=pList.next) {
-        if (pList.queue.isEmpty()) {
-          continue;
-        }
-        for(Iterator<SelectorInfo> it = pList.queue.iterator(); it.hasNext();) {
-          SelectorInfo info = it.next();
-          if (info.lastActivityTime > cutoff) {
+      for (ConcurrentLinkedDeque<SelectorInfo> infoQ : providerMap.values()) {
+        SelectorInfo oldest;
+        while ((oldest = infoQ.peekFirst()) != null) {
+          if (oldest.lastActivityTime <= cutoff && infoQ.remove(oldest)) {

Review comment:
   @aajisaka, thanks for your review, but I think the first element of `infoQ` may change under concurrency. For example (a minimal sketch of this interleaving follows below):
   
   1. Precondition: `infoQ.size() == 1`
   1. Thread t1: Line 435: `oldest = infoQ.peekFirst()`
   1. Thread t2: Line 394: `infoQ.pollLast()` takes the oldest selector out of the queue and starts using it
   1. Thread t3: Line 416: `providerMap.get(info.provider).addLast(info)` adds the newest selector
   1. Thread t1: Line 438: `infoQ.removeFirst()` removes the newest selector
   1. Thread t1: `oldest.close()` closes the oldest selector, which thread t2 is still using.
   1. GC thread: the newest selector is collected because nothing references it, but it was never closed.
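   Below is a minimal, sequential re-enactment of that interleaving (my own illustration, not code from the PR); `String` elements stand in for the `SelectorInfo` objects, and the thread steps are replayed in order on a single thread:

```java
import java.util.concurrent.ConcurrentLinkedDeque;

// Sequential re-enactment of the interleaving above (illustration only, not PR code).
// String elements stand in for SelectorInfo objects.
public class DequeRaceSketch {
  public static void main(String[] args) {
    ConcurrentLinkedDeque<String> infoQ = new ConcurrentLinkedDeque<>();
    infoQ.addLast("oldestSelector");            // precondition: one idle selector in the pool

    String oldest = infoQ.peekFirst();          // t1 peeks "oldestSelector"
    String takenByT2 = infoQ.pollLast();        // t2 takes it out and starts using it
    infoQ.addLast("newestSelector");            // t3 releases a fresh selector into the pool

    // If t1 called infoQ.removeFirst() here, it would remove "newestSelector":
    // the wrong element would be dropped unclosed, while "oldestSelector" gets
    // closed even though t2 is still using it.
    // remove(oldest) only removes the element t1 actually peeked, so it fails safely:
    boolean removed = infoQ.remove(oldest);
    System.out.println("t2 holds " + takenByT2
        + ", remove(oldest) returned " + removed
        + ", queue = " + infoQ);                // prints: ... false ... [newestSelector]
  }
}
```

   With `removeFirst()` at the last step, `newestSelector` would be dropped instead; the guarded `remove(oldest)` simply returns `false` once the peeked element has already left the queue.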
   
   
   
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] liangxs commented on a change in pull request #3080: HADOOP-17749. Remove lock contention in SelectorPool of SocketIOWithTimeout

2021-07-04 Thread GitBox


liangxs commented on a change in pull request #3080:
URL: https://github.com/apache/hadoop/pull/3080#discussion_r663608644



##########
File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/SocketIOWithTimeout.java
##########
@@ -426,34 +408,41 @@ private synchronized SelectorInfo get(SelectableChannel channel)
      * 
      * @param info
      */
-    private synchronized void release(SelectorInfo info) {
+    private static void release(SelectorInfo info) {
       long now = Time.now();
       trimIdleSelectors(now);
       info.lastActivityTime = now;
-      info.queue.addLast(info);
+      // SelectorInfos in queue are sorted by lastActivityTime
+      providerMap.get(info.provider).addLast(info);
     }
 
+    private static AtomicBoolean trimming = new AtomicBoolean(false);
+    private static volatile long lastTrimTime = Time.now();

Review comment:
   Thanks for your suggestion, I added a new commit.





-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] liangxs commented on a change in pull request #3080: HADOOP-17749. Remove lock contention in SelectorPool of SocketIOWithTimeout

2021-06-23 Thread GitBox


liangxs commented on a change in pull request #3080:
URL: https://github.com/apache/hadoop/pull/3080#discussion_r656829433



##########
File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/SocketIOWithTimeout.java
##########
@@ -426,34 +407,44 @@ private synchronized SelectorInfo get(SelectableChannel channel)
      * 
      * @param info
      */
-    private synchronized void release(SelectorInfo info) {
+    private static void release(SelectorInfo info) {
       long now = Time.now();
       trimIdleSelectors(now);
       info.lastActivityTime = now;
-      info.queue.addLast(info);
+      // SelectorInfos in queue are sorted by lastActivityTime
+      providerMap.get(info.provider).addLast(info);
     }
 
+    private static AtomicBoolean trimming = new AtomicBoolean(false);
+    private static volatile long lastTrimTime = Time.now();
+
     /**
      * Closes selectors that are idle for IDLE_TIMEOUT (10 sec). It does not
      * traverse the whole list, just over the one that have crossed 
      * the timeout.
      */
-    private void trimIdleSelectors(long now) {
+    private static void trimIdleSelectors(long now) {
+      if (!trimming.compareAndSet(false, true)) {
+        return;
+      }
+      if (now - lastTrimTime < IDLE_TIMEOUT / 2) {

Review comment:
   @ferhui I removed this check in the new commit.
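   As a hedged sketch (my own, not the PR code) of what remains once that time-based check is gone: the `AtomicBoolean` gate alone decides which caller runs the trim loop, and the try/finally reset shown here is an assumption about how the flag would be cleared.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hedged sketch (not the PR code): only one caller at a time runs the trim
// loop; every other caller returns immediately. The try/finally reset of the
// flag is an assumption, since the quoted hunk does not show where it is cleared.
class TrimGateSketch {
  private static final long IDLE_TIMEOUT = 10_000L;           // 10 sec, as in SocketIOWithTimeout
  private static final AtomicBoolean trimming = new AtomicBoolean(false);

  static void trimIdleSelectors(long now) {
    if (!trimming.compareAndSet(false, true)) {
      return;                                                  // another thread is already trimming
    }
    try {
      long cutoff = now - IDLE_TIMEOUT;
      // ... walk each provider's deque and close selectors idle since 'cutoff' ...
    } finally {
      trimming.set(false);                                     // release the gate
    }
  }

  public static void main(String[] args) {
    trimIdleSelectors(System.currentTimeMillis());             // enters the trim section
    trimIdleSelectors(System.currentTimeMillis());             // enters again, gate was released
  }
}
```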




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org