nobodyiam opened a new issue #1744: Client side response time is slower than 
actual when client side is in tcp delayed ack mode
URL: https://github.com/apache/incubator-dubbo/issues/1744
 
 
   ## Issue Description
   
   Dubbo's [netty 3 server 
implementation](https://github.com/apache/incubator-dubbo/blob/master/dubbo-remoting/dubbo-remoting-netty/src/main/java/com/alibaba/dubbo/remoting/transport/netty/NettyServer.java#L70)
 does not enable the `TCP_NODELAY` option, so the server side does not respond in time when the client side is in delayed ack mode and the response size is less than the MSS.
   
   However, the [netty 4 server 
implementation](https://github.com/apache/incubator-dubbo/blob/master/dubbo-remoting/dubbo-remoting-netty4/src/main/java/com/alibaba/dubbo/remoting/transport/netty4/NettyServer.java#L81)
 does enable this option.
   
   ```java
   bootstrap.group(bossGroup, workerGroup)
           .channel(NioServerSocketChannel.class)
           .childOption(ChannelOption.TCP_NODELAY, Boolean.TRUE) // <--- enabled here
           .childOption(ChannelOption.SO_REUSEADDR, Boolean.TRUE)
           .childOption(ChannelOption.ALLOCATOR, PooledByteBufAllocator.DEFAULT)
   ```
   
   Considering that Netty 4 enables this option by default ([Enable TCP_NODELAY and SCTP_NODELAY by default](https://github.com/netty/netty/commit/39357f3835f971e6cc1a0e41a805fa1293e7005e)), Dubbo's [netty 3 server implementation](https://github.com/apache/incubator-dubbo/blob/master/dubbo-remoting/dubbo-remoting-netty/src/main/java/com/alibaba/dubbo/remoting/transport/netty/NettyServer.java#L70) should also enable this option by default.
   
   ## Solution
   
   Simply set this option when constructing ServerBootstrap:
   
   ```java
   bootstrap.setOption("child.tcpNoDelay", true);
   ```
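   
   For reference, in Netty 3 the `child.` prefix applies the option to each accepted connection, where it ends up as the JDK's `Socket.setTcpNoDelay`. Below is a minimal stdlib-only sketch of that underlying call, independent of Dubbo and Netty; the class and method names (`TcpNoDelayDemo`, `acceptWithNoDelay`) are made up for illustration and are not part of either project.
   
   ```java
   import java.io.IOException;
   import java.net.InetAddress;
   import java.net.ServerSocket;
   import java.net.Socket;
   
   // Hypothetical demo class, not part of Dubbo or Netty.
   final class TcpNoDelayDemo {
   
       /**
        * Accepts one loopback connection, disables Nagle's algorithm on the
        * accepted socket, and returns the flag as reported back by the socket.
        */
       static boolean acceptWithNoDelay() throws IOException {
           try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
                Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
                Socket accepted = server.accept()) {
               // Same effect as child.tcpNoDelay=true on a Netty 3 ServerBootstrap:
               // small writes are sent immediately instead of waiting for an ack.
               accepted.setTcpNoDelay(true);
               return accepted.getTcpNoDelay();
           }
       }
   
       public static void main(String[] args) throws IOException {
           System.out.println("TCP_NODELAY on accepted socket: " + acceptWithNoDelay());
       }
   }
   ```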
   
   ## Issue Screenshots
   
   Here is an actual example captured in our environment, with 10.5.160.181 as 
the client side and 10.5.169.180 as the server side.
   
   ### Case with normal ack
   
   The demo server side logic takes 10-11 ms, so the client side response time is normally around 12 ms.
   
   Frame 9, highlighted below, is a normal request whose request id is 0x02.
   
   
![image](https://user-images.githubusercontent.com/837658/39660725-87c0676c-5077-11e8-9059-be353ceebf97.png)
   
   Frame 11, highlighted below, is the response, whose response id is 0x02.
   
![image](https://user-images.githubusercontent.com/837658/39660733-b84b750c-5077-11e8-9c72-bdb78da0f299.png)
   
   The actual response time is 12 ms.
   
   The screenshots above also show that both the client side and the server side send `ack` packets normally.
   
   ### Case with delayed ack
   
   Now let's look at what happens when the client side is in delayed ack mode.
   
   In the screenshots below there are no standalone `ack` packets, which means the connection is in delayed ack mode - each `ack` is returned with a data packet.
   
   Frame 239, highlighted below, is a request whose request id is 0x7f.
   Frame 240 arrives immediately after frame 239 is sent to the server. Since the server logic takes 10-11 ms, this packet cannot be the response to frame 239.
   
   
![image](https://user-images.githubusercontent.com/837658/39660790-700dc08c-5078-11e8-9934-61c630420ee0.png)
   
   As we can see, frame 240's response id is 0x7e, so it is the response to the previous request.
   
![image](https://user-images.githubusercontent.com/837658/39660802-02df6a82-5079-11e8-8964-e8c3c5054ea1.png)
   
   Next, frame 241 is sent, with request id 0x80.
   
   
![image](https://user-images.githubusercontent.com/837658/39660818-3cb482f6-5079-11e8-8563-c1e1b2480905.png)
   
   Only then is the response to request 0x7f returned (frame 242).
   
   
![image](https://user-images.githubusercontent.com/837658/39660824-5a960dc6-5079-11e8-9feb-3f6f154a2b49.png)
   
   Request 0x7f was sent at 14:37:53.267, so we know that the response was ready around 14:37:53.277 on the server side. However, since the server side does not enable `TCP_NODELAY`, Nagle's algorithm holds the response back until an `ack` for the previously sent data is received or a timeout (40ms) occurs.
   
   So the response is held on the server side and is only sent when the client side sends another request (which carries the `ack`), which happened at 14:37:53.292730.
   
   In this case, the client side response time (25ms) is much longer than the actual one (12ms).
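   
   The timing above can be checked directly from the capture timestamps. The sketch below only redoes that arithmetic with `java.time`; the class name `CaptureTimeline` is hypothetical, and the "response ready" time is the approximate 14:37:53.277 estimated above, not a captured value.
   
   ```java
   import java.time.Duration;
   import java.time.LocalTime;
   
   // Hypothetical helper that redoes the timestamp arithmetic from the capture.
   final class CaptureTimeline {
   
       /** Whole milliseconds between two HH:mm:ss.fraction timestamps (fraction is truncated). */
       static long millisBetween(String from, String to) {
           return Duration.between(LocalTime.parse(from), LocalTime.parse(to)).toMillis();
       }
   
       public static void main(String[] args) {
           String requestSent = "14:37:53.267";      // frame 239, request 0x7f
           String responseReady = "14:37:53.277";    // approximate: request time + ~10 ms of server logic
           String nextRequest = "14:37:53.292730";   // frame 241, request 0x80, carries the ack
   
           System.out.println("observed response time (ms): " + millisBetween(requestSent, nextRequest));
           System.out.println("time held on the server (ms): " + millisBetween(responseReady, nextRequest));
       }
   }
   ```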
   
   Even worse, in this situation the response time is determined by the interval at which the client side sends requests.
   
   We've done some experiments, and the results show that in this situation, if the client side sends requests at 50 qps, the client side response time is 20ms, because the client side request sending interval is 20ms (1000/50). If the client side qps is 40, the response time is 25ms. If the client side qps is 30, the response time is 33ms.
   
   The response time increases as the qps decreases, until the qps reaches 25, because when the interval is larger than 40ms (1000/25), tcp cancels the delayed ack mode.
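   
   The relationship described above can be written down as a small model. This is only a sketch of the observed behavior, not a measurement tool: the class name `DelayedAckModel`, the 12ms baseline and the 40ms cutoff are taken from the numbers in this issue, and the assumption that the response time never drops below the baseline at high qps is mine.
   
   ```java
   // Hypothetical model of the numbers reported in this issue.
   final class DelayedAckModel {
   
       static final double ACTUAL_RESPONSE_MS = 12.0;      // response time with normal ack (from the capture)
       static final double DELAYED_ACK_CUTOFF_MS = 40.0;   // interval above which delayed ack mode is cancelled
   
       /** Expected client side response time when the server lacks TCP_NODELAY. */
       static double observedResponseMs(int qps) {
           double intervalMs = 1000.0 / qps;
           if (intervalMs > DELAYED_ACK_CUTOFF_MS) {
               // Requests are too sparse: delayed ack mode is cancelled, acks flow normally.
               return ACTUAL_RESPONSE_MS;
           }
           // The held response is released by the ack carried on the next request.
           // Assumption: it can never be faster than the actual response time.
           return Math.max(ACTUAL_RESPONSE_MS, intervalMs);
       }
   
       public static void main(String[] args) {
           for (int qps : new int[] {50, 40, 30, 25, 20}) {
               System.out.printf("qps=%d -> ~%.0f ms%n", qps, observedResponseMs(qps));
           }
       }
   }
   ```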
   
   So I hope this quick fix can be applied so that we can always expect stable performance.
   
   BTW, I could submit a PR if necessary.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
