[GitHub] [hadoop] liangxs commented on pull request #3080: HADOOP-17749. Remove lock contention in SelectorPool of SocketIOWithTimeout

2021-07-06 Thread GitBox


liangxs commented on pull request #3080:
URL: https://github.com/apache/hadoop/pull/3080#issuecomment-874400802


   @ferhui @aajisaka Thanks very much for your reviews!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] liangxs commented on pull request #3080: HADOOP-17749. Remove lock contention in SelectorPool of SocketIOWithTimeout

2021-06-22 Thread GitBox


liangxs commented on pull request #3080:
URL: https://github.com/apache/hadoop/pull/3080#issuecomment-866478672


   @ferhui Thanks for your suggestion. I added a unit test.





[GitHub] [hadoop] liangxs commented on pull request #3080: HADOOP-17749. Remove lock contention in SelectorPool of SocketIOWithTimeout

2021-06-22 Thread GitBox


liangxs commented on pull request #3080:
URL: https://github.com/apache/hadoop/pull/3080#issuecomment-864899799


   I ran this command in my production environment:
   ```
   for i in `seq 1 20`; do jstack 23730 > d.jstack.`date +%H%M%S` ; sleep 1; done &> /dev/null
   ```
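The dumps written by this loop can be summarised by counting, per monitor class, how many thread frames are blocked on it. The following is a minimal sketch, not the tooling used for the screenshots: the class name `LockContentionTally` and the method `tally` are illustrative, and it assumes the usual jstack line format `- waiting to lock <0x...> (a some.Class)`.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LockContentionTally {
  // Matches jstack lines such as:
  //   - waiting to lock <0x00000000e0a1b2c8> (a java.lang.Object)
  private static final Pattern WAITING =
      Pattern.compile("- waiting to lock <0x[0-9a-fA-F]+> \\(a ([^)]+)\\)");

  /** Counts "waiting to lock" frames per monitor class across all lines. */
  static Map<String, Integer> tally(List<String> lines) {
    Map<String, Integer> counts = new TreeMap<>();
    for (String line : lines) {
      Matcher m = WAITING.matcher(line);
      if (m.find()) {
        counts.merge(m.group(1), 1, Integer::sum);
      }
    }
    return counts;
  }

  public static void main(String[] args) throws Exception {
    // Each argument is one jstack dump file.
    List<String> all = new ArrayList<>();
    for (String file : args) {
      all.addAll(Files.readAllLines(Paths.get(file)));
    }
    tally(all).forEach((cls, n) -> System.out.println(n + "\t" + cls));
  }
}
```

With Java 11 or later this can be run directly as `java LockContentionTally.java d.jstack.*`; the class with the highest count is the most contended monitor in the sampled dumps.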
   
   
   20 jstack files are generated:
   
![image](https://user-images.githubusercontent.com/4401756/122742178-44803380-d2b8-11eb-9531-f82c72ceab73.png)
   
   
   Most of the lock contention is on SelectorPool:
   
![image](https://user-images.githubusercontent.com/4401756/122742317-6679b600-d2b8-11eb-9644-1dae1633508a.png)
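For readers unfamiliar with this kind of contention: when many threads borrow from and return to a pool through `synchronized` methods, they all queue on one monitor. One common way to avoid that is to keep idle items in a non-blocking concurrent collection. The sketch below is a generic, hypothetical pool (`LockFreePool` and its methods are invented for illustration); it makes no claim about how this PR actually changes `SelectorPool`.

```java
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.function.Supplier;

/** Hypothetical generic pool; not the SelectorPool code from the PR. */
public class LockFreePool<T> {
  private final ConcurrentLinkedDeque<T> idle = new ConcurrentLinkedDeque<>();
  private final Supplier<T> factory;

  public LockFreePool(Supplier<T> factory) {
    this.factory = factory;
  }

  /** Takes the most recently released item, or creates one if none is idle. */
  public T get() {
    T item = idle.pollFirst();  // non-blocking; no monitor is held
    return item != null ? item : factory.get();
  }

  /** Returns an item so that a later get() can reuse it. */
  public void release(T item) {
    idle.addFirst(item);
  }

  /** Number of idle items currently held (a snapshot under concurrency). */
  public int idleCount() {
    return idle.size();
  }
}
```

Because `pollFirst` and `addFirst` on `ConcurrentLinkedDeque` never block, threads that borrow and return items at the same time do not wait on each other for a shared lock.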
   





[GitHub] [hadoop] liangxs commented on pull request #3080: HADOOP-17749. Remove lock contention in SelectorPool of SocketIOWithTimeout

2021-06-15 Thread GitBox


liangxs commented on pull request #3080:
URL: https://github.com/apache/hadoop/pull/3080#issuecomment-861476259


   
   I tested the performance of the trunk version and the optimized version.
   
   ### Test Case
   
   The test steps are as follows:
   
   1. Start a netty-echo-server with 100 service ports.
   ```
   private static void startServer() throws Exception {
     // The same handler instance is passed to every bootstrap.
     ChannelHandler serverHandler = new EchoHandler();
     for (int i = 0; i < 100; ++i) {
       ServerBootstrap b = new ServerBootstrap();
       b.group(new NioEventLoopGroup(1), new NioEventLoopGroup(2))
           .channel(NioServerSocketChannel.class)
           .option(ChannelOption.SO_BACKLOG, 512)
           .childOption(ChannelOption.SO_TIMEOUT, timeout)
           .childOption(ChannelOption.TCP_NODELAY, true)
           .childHandler(serverHandler);
       // Bind ports port, port + 1, ..., port + 99.
       ChannelFuture f = b.bind(host, port + i).sync();
     }
     // Keep the server process alive.
     Thread.sleep(Integer.MAX_VALUE);
   }
   ```
   
   2. Start a Hadoop socket client with multiple threads.
   The threads connect to the netty-echo-server's ports in round-robin order.
   ```
   private static void startClient(int threadCnt) throws Exception {
     SocketFactory factory = new StandardSocketFactory();
     Thread[] tArray = new Thread[threadCnt];
     CountDownLatch latch = new CountDownLatch(threadCnt);
     for (int i = 0; i < threadCnt; ++i) {
       final int curPort = port + (i % 100);  // round robin
       Thread t = new Thread(() -> {
         try {
           Socket socket = factory.createSocket();
           socket.setTcpNoDelay(true);
           socket.setKeepAlive(false);
           NetUtils.connect(socket,
               new java.net.InetSocketAddress(host, curPort), timeout);
           socket.setSoTimeout(timeout);
           ...
           ...
   ```
   
   3. Each client thread sends to and receives from its corresponding netty-server port 1024 times, 256 bytes each time.
   ```
           InputStream inStream = NetUtils.getInputStream(socket);
           OutputStream outStream = NetUtils.getOutputStream(socket, timeout);
           DataInputStream in = new DataInputStream(
               new BufferedInputStream(inStream));
           DataOutputStream out = new DataOutputStream(
               new BufferedOutputStream(outStream));

           byte[] buf = new byte[256];
           for (int j = 0; j < 1024; ++j) {
             out.write(buf);
             out.flush();
             in.readFully(buf);
           }
   ```
   
   4. Print the total elapsed time.
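The measurement in step 4 is not shown in the excerpts above. A minimal sketch of one way to time such a run follows, assuming the total is the wall-clock time from releasing all worker threads until the last one finishes; `ElapsedTimer` and `timeAll` are hypothetical names, and the linked project may measure differently.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class ElapsedTimer {
  /** Runs task on threadCnt threads; returns the wall-clock time in millis. */
  static long timeAll(int threadCnt, Runnable task) throws InterruptedException {
    CountDownLatch start = new CountDownLatch(1);         // releases all workers at once
    CountDownLatch done = new CountDownLatch(threadCnt);  // counts finished workers
    for (int i = 0; i < threadCnt; ++i) {
      Thread t = new Thread(() -> {
        try {
          start.await();
          task.run();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        } finally {
          done.countDown();
        }
      });
      t.start();
    }
    long begin = System.nanoTime();
    start.countDown();
    done.await();
    return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - begin);
  }

  public static void main(String[] args) throws Exception {
    // Placeholder workload; a real test would run the send/receive loop here.
    long millis = timeAll(4, () -> {
      try {
        Thread.sleep(50);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    System.out.println("total cost: " + millis + " ms");
  }
}
```

`System.nanoTime()` is used instead of `System.currentTimeMillis()` because it is monotonic and therefore unaffected by wall-clock adjustments during the run.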
   
   
   Code project: [https://github.com/liangxs/test-HADOOP-17749](https://github.com/liangxs/test-HADOOP-17749)
   
   
   
   ### Test Result
   
   The test results are as follows:
   
   ```
   | client thread count | 100 | 200 | 400  | 800  | 1200 | 1600 | 2000 | 2400 | 2800 |
   |---------------------|-----|-----|------|------|------|------|------|------|------|
   | trunk (millis)      | 351 | 609 | 1058 | 2024 | 2907 | 3882 | 5062 | 5675 | 7117 |
   | optimized (millis)  | 253 | 438 |  799 | 1167 | 1561 | 2422 | 2784 | 2813 | 3371 |
   | improved            | 38% | 39% |  32% |  70% |  86% |  60% |  82% | 102% | 111% |
   ```
   
   
   
   
   P.S. I used two test machines on the same rack, with the same hardware and software configuration:
   
   ```
   $ lscpu
   Architecture:          x86_64
   CPU(s):                56
   On-line CPU(s) list:   0-55
   Thread(s) per core:    2
   Core(s) per socket:    14
   Socket(s):             2
   NUMA node(s):          2
   Model name:            Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
   CPU MHz:               2401.000

   $ free -g
                 total        used        free      shared  buff/cache   available
   Mem:             62           7          37           0          17          54

   $ lspci | grep Ethernet
   06:00.0 Ethernet controller: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01)

   $ uname -r
   3.10.107-1
   ```
   
   





[GitHub] [hadoop] liangxs commented on pull request #3080: HADOOP-17749. Remove lock contention in SelectorPool of SocketIOWithTimeout

2021-06-09 Thread GitBox


liangxs commented on pull request #3080:
URL: https://github.com/apache/hadoop/pull/3080#issuecomment-857590472


   @jojochuang Can you kindly review this PR?
   Thanks.





[GitHub] [hadoop] liangxs commented on pull request #3080: HADOOP-17749. Remove lock contention in SelectorPool of SocketIOWithTimeout

2021-06-08 Thread GitBox


liangxs commented on pull request #3080:
URL: https://github.com/apache/hadoop/pull/3080#issuecomment-856703041


   Could someone please review this PR?

