Re: [PATCH RESEND v2] vhost-net: set packet weight of tx polling to 2 * vq size

2018-04-09 Thread David Miller
From: haibinzhang(张海斌) 
Date: Mon, 9 Apr 2018 07:22:17 +

> handle_tx will delay rx for tens or even hundreds of milliseconds when tx busy
> polling udp packets with small length(e.g. 1byte udp payload), because setting
> VHOST_NET_WEIGHT takes into account only sent-bytes but no single packet 
> length.
> 
> Ping-Latencies shown below were tested between two Virtual Machines using
> netperf (UDP_STREAM, len=1), and then another machine pinged the client:
 ...
> Acked-by: Michael S. Tsirkin 
> Signed-off-by: Haibin Zhang 
> Signed-off-by: Yunfang Tai 
> Signed-off-by: Lidong Chen 

Applied, thank you.


[PATCH RESEND v2] vhost-net: set packet weight of tx polling to 2 * vq size

2018-04-09 Thread 张海斌
handle_tx will delay rx for tens or even hundreds of milliseconds when tx is
busy polling small UDP packets (e.g. 1-byte UDP payload), because
VHOST_NET_WEIGHT takes into account only the bytes sent, not the number of
packets.

Ping-Latencies shown below were tested between two Virtual Machines using
netperf (UDP_STREAM, len=1), and then another machine pinged the client:

vq size=256
Packet-Weight   Ping-Latencies (milliseconds)
                   min      avg      max
Origin           3.319   18.489   57.303
64               1.643    2.021    2.552
128              1.825    2.600    3.224
256              1.997    2.710    4.295
512              1.860    3.171    4.631
1024             2.002    4.173    9.056
2048             2.257    5.650    9.688
4096             2.093    8.508   15.943

vq size=512
Packet-Weight   Ping-Latencies (milliseconds)
                   min      avg      max
Origin           6.537   29.177   66.245
64               2.798    3.614    4.403
128              2.861    3.820    4.775
256              3.008    4.018    4.807
512              3.254    4.523    5.824
1024             3.079    5.335    7.747
2048             3.944    8.201   12.762
4096             4.158   11.057   19.985

Seems pretty consistent, a small dip at 2 VQ sizes.
Ring size is a hint from device about a burst size it can tolerate. Based on
benchmarks, set the weight to 2 * vq size.
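
The effect of adding a packet budget next to the byte budget can be sketched in
userspace C. This is a minimal illustration, not the code from the patch:
BYTE_WEIGHT, pkt_weight() and tx_poll() are hypothetical names, the 0x80000
byte budget is an assumed value, and the 64-byte packet length used in the
note below is illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed byte budget per tx poll; illustrative value, not quoted from the patch. */
#define BYTE_WEIGHT 0x80000u

/* Hypothetical helper: packet budget of 2 * vq size, as the commit message proposes. */
static size_t pkt_weight(size_t vq_size)
{
	return 2 * vq_size;
}

/*
 * Simulate one tx poll and return how many packets are sent before the
 * loop yields. pkt_limit == 0 models the original byte-only behaviour.
 */
static size_t tx_poll(size_t pkt_len, size_t pkt_limit)
{
	size_t total_len = 0;
	size_t sent = 0;

	/* A zero-length packet would never exhaust the byte budget. */
	assert(pkt_len > 0);

	for (;;) {
		total_len += pkt_len;
		sent++;

		/* Byte budget exhausted: yield so rx can run. */
		if (total_len >= BYTE_WEIGHT)
			break;

		/* Packet budget exhausted: yield even if few bytes were sent. */
		if (pkt_limit && sent >= pkt_limit)
			break;
	}
	return sent;
}
```

Under these assumptions, tx_poll(64, 0) lets 8192 packets through before
yielding, while tx_poll(64, pkt_weight(256)) stops after 512, which is why
small packets no longer hold off rx for as long.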

To evaluate this change, further tests were run using netperf (RR, TX) between
two machines with Intel(R) Xeon(R) Gold 6133 CPU @ 2.50GHz, with the vq size
tweaked through qemu. The results below show no obvious change.

vq size=256 TCP_RR                  vq size=512 TCP_RR
size/sessions/+thu%/+normalize%     size/sessions/+thu%/+normalize%
   1/   1/  -7%/-2%                    1/   1/   0%/-2%
   1/   4/  +1%/ 0%                    1/   4/  +1%/ 0%
   1/   8/  +1%/-2%                    1/   8/   0%/+1%
  64/   1/  -6%/ 0%                   64/   1/  +7%/+3%
  64/   4/   0%/+2%                   64/   4/  -1%/+1%
  64/   8/   0%/ 0%                   64/   8/  -1%/-2%
 256/   1/  -3%/-4%                  256/   1/  -4%/-2%
 256/   4/  +3%/+4%                  256/   4/  +1%/+2%
 256/   8/  +2%/ 0%                  256/   8/  +1%/-1%

vq size=256 UDP_RR                  vq size=512 UDP_RR
size/sessions/+thu%/+normalize%     size/sessions/+thu%/+normalize%
   1/   1/  -5%/+1%                    1/   1/  -3%/-2%
   1/   4/  +4%/+1%                    1/   4/  -2%/+2%
   1/   8/  -1%/-1%                    1/   8/  -1%/ 0%
  64/   1/  -2%/-3%                   64/   1/  +1%/+1%
  64/   4/  -5%/-1%                   64/   4/  +2%/ 0%
  64/   8/   0%/-1%                   64/   8/  -2%/+1%
 256/   1/  +7%/+1%                  256/   1/  -7%/ 0%
 256/   4/  +1%/+1%                  256/   4/  -3%/-4%
 256/   8/  +2%/+2%                  256/   8/  +1%/+1%

vq size=256 TCP_STREAM              vq size=512 TCP_STREAM
size/sessions/+thu%/+normalize%     size/sessions/+thu%/+normalize%
  64/   1/   0%/-3%                   64/   1/   0%/ 0%
  64/   4/  +3%/-1%                   64/   4/  -2%/+4%
  64/   8/  +9%/-4%                   64/   8/  -1%/+2%
 256/   1/  +1%/-4%                  256/   1/  +1%/+1%
 256/   4/  -1%/-1%                  256/   4/  -3%/ 0%
 256/   8/  +7%/+5%                  256/   8/  -3%/ 0%
 512/   1/  +1%/ 0%                  512/   1/  -1%/-1%
 512/   4/  +1%/-1%                  512/   4/   0%/ 0%
 512/   8/  +7%/-5%                  512/   8/  +6%/-1%
1024/   1/   0%/-1%                 1024/   1/   0%/+1%
1024/   4/  +3%/ 0%                 1024/   4/  +1%/ 0%
1024/   8/  +8%/+5%                 1024/   8/  -1%/ 0%
2048/   1/  +2%/+2%                 2048/   1/  -1%/ 0%
2048/   4/  +1%/ 0%                 2048/   4/   0%/-1%
2048/   8/  -2%/ 0%                 2048/   8/   5%/-1%
4096/   1/  -2%/ 0%                 4096/   1/  -2%/ 0%
4096/   4/  +2%/ 0%                 4096/   4/   0%/ 0%
4096/   8/  +9%/-2%                 4096/   8/  -5%/-1%

Acked-by: Michael S. Tsirkin 
Signed-off-by: Haibin Zhang 
Signed-off-by: Yunfang Tai 
Signed-off-by: Lidong Chen 
---
 drivers/vhost/net.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 8139bc70ad7d..3563a305cc0a 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -44,6 +44,10 @@ MODULE_PARM_DESC(experimental_zcopytx, "Enable Zero Copy TX;"
  * Using this limit prevents one virtqueue from starving others. */
 #define