Please write up what you’ve done, and provide a pointer to your code. Thanks… Dave
From: Pragash Vijayaragavan [mailto:pxv3...@rit.edu]
Sent: Wednesday, November 8, 2017 1:19 AM
To: Dave Barach (dbarach) <dbar...@cisco.com>
Cc: vpp-dev@lists.fd.io; John Marshall (jwm) <j...@cisco.com>; Neale Ranns (nranns) <nra...@cisco.com>; Minseok Kwon <mxk...@rit.edu>
Subject: Re: multi-core multi-threading performance

Hi all,

Any help or ideas on how we can get better performance using multiple cores would be appreciated.

Thanks,
Pragash Vijayaragavan
Grad Student at Rochester Institute of Technology
email : pxv3...@rit.edu
ph : 585 764 4662

On Mon, Nov 6, 2017 at 8:10 AM, Pragash Vijayaragavan <pxv3...@g.rit.edu> wrote:

OK, I have now provisioned 4 RX queues for 4 worker threads, and yes, all workers are processing traffic. But the lookup rate has dropped: I am receiving fewer packets than I did with 2 workers.

I tried configuring 4 TX queues as well, and the problem is the same (fewer packets received than with 2 workers).

Thanks,
Pragash Vijayaragavan

On Mon, Nov 6, 2017 at 8:00 AM, Pragash Vijayaragavan <pxv3...@g.rit.edu> wrote:

Just 1. Let me change it to 2, maybe 3, and get back to you.

Thanks,
Pragash Vijayaragavan

On Mon, Nov 6, 2017 at 7:48 AM, Dave Barach (dbarach) <dbar...@cisco.com> wrote:

How many RX queues did you provision? One per worker, or no supper...

Thanks… Dave

From: Pragash Vijayaragavan [mailto:pxv3...@rit.edu]
Sent: Monday, November 6, 2017 7:36 AM
To: Dave Barach (dbarach) <dbar...@cisco.com>
Cc: vpp-dev@lists.fd.io; John Marshall (jwm) <j...@cisco.com>; Neale Ranns (nranns) <nra...@cisco.com>; Minseok Kwon <mxk...@rit.edu>
Subject: Re: multi-core multi-threading performance

Hi Dave,

As you suggested, I tried sending different traffic, and I noticed that only 1 worker acts per port (hardware NIC).

Is it true that multiple workers cannot work on the same port at the same time?

Thanks,
Pragash Vijayaragavan

On Mon, Nov 6, 2017 at 7:13 AM, Pragash Vijayaragavan <pxv3...@g.rit.edu> wrote:

Thanks Dave, let me try it out real quick and get back to you.

Thanks,
Pragash Vijayaragavan

On Mon, Nov 6, 2017 at 7:11 AM, Dave Barach (dbarach) <dbar...@cisco.com> wrote:

Incrementing / random src/dst addr/port....

Thanks… Dave

From: Pragash Vijayaragavan [mailto:pxv3...@rit.edu]
Sent: Monday, November 6, 2017 7:06 AM
To: Dave Barach (dbarach) <dbar...@cisco.com>
Cc: vpp-dev@lists.fd.io; John Marshall (jwm) <j...@cisco.com>; Neale Ranns (nranns) <nra...@cisco.com>; Minseok Kwon <mxk...@rit.edu>
Subject: Re: multi-core multi-threading performance

Hi Dave,

Thanks for the mail. A "show run" command shows the dpdk-input process on 2 of the workers, but the ip6-lookup process is running on only 1 worker.
What configuration is needed to make all threads process traffic? This is with 4 workers and 1 main core.

Pasted output:

vpp# sh run
Thread 0 vpp_main (lcore 1)
Time 7.5, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00
  vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
Name                            State          Calls   Vectors  Suspends    Clocks  Vectors/Call
acl-plugin-fa-cleaner-process   any wait           0         0        15    4.97e3          0.00
api-rx-from-ring                active             0         0        79    1.07e5          0.00
cdp-process                     any wait           0         0         3    2.65e3          0.00
dpdk-process                    any wait           0         0         2    6.77e7          0.00
fib-walk                        any wait           0         0      7474    6.74e2          0.00
gmon-process                    time wait          0         0         1    4.24e3          0.00
ikev2-manager-process           any wait           0         0         7    7.04e3          0.00
ip6-icmp-neighbor-discovery-ev  any wait           0         0         7    4.67e3          0.00
lisp-retry-service              any wait           0         0         3    7.21e3          0.00
unix-epoll-input                polling     21655148         0         0    5.43e2          0.00
vpe-oam-process                 any wait           0         0         4    5.28e3          0.00
---------------
Thread 1 vpp_wk_0 (lcore 2)
Time 7.5, average vectors/node 255.99, last 128 main loops 14.00 per node 256.00
  vector rates in 4.1903e6, out 4.1903e6, drop 0.0000e0, punt 0.0000e0
Name                            State          Calls   Vectors  Suspends    Clocks  Vectors/Call
FortyGigabitEthernet4/0/0-outp  active        123334  31572992         0    6.58e0        255.99
FortyGigabitEthernet4/0/0-tx    active        123334  31572992         0    7.20e1        255.99
dpdk-input                      polling       124347  31572992         0    5.49e1        253.91
ip6-input                       active        123334  31572992         0    2.28e1        255.99
ip6-load-balance                active        123334  31572992         0    1.61e1        255.99
ip6-lookup                      active        123334  31572992         0    3.77e2        255.99
ip6-rewrite                     active        123334  31572992         0    2.02e1        255.99
---------------
Thread 2 vpp_wk_1 (lcore 3)
Time 7.5, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00
  vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
Name                            State          Calls   Vectors  Suspends    Clocks  Vectors/Call
dpdk-input                      polling     83188682         0         0    1.11e2          0.00
---------------
Thread 3 vpp_wk_2 (lcore 18)
Time 7.5, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00
  vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
Name                            State          Calls   Vectors  Suspends    Clocks  Vectors/Call
---------------
Thread 4 vpp_wk_3 (lcore 19)
Time 7.5, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00
  vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
Name                            State          Calls   Vectors  Suspends    Clocks  Vectors/Call

Thanks,
Pragash Vijayaragavan

On Mon, Nov 6, 2017 at 6:47 AM, Dave Barach (dbarach) <dbar...@cisco.com> wrote:

Have you verified that all of the worker threads are processing traffic? Sufficiently poor RSS statistics could mean - in the limit - that only one worker thread is processing traffic.

Thanks… Dave

From: Pragash Vijayaragavan [mailto:pxv3...@rit.edu]
Sent: Sunday, November 5, 2017 10:03 PM
To: vpp-dev@lists.fd.io
Cc: John Marshall (jwm) <j...@cisco.com>; Neale Ranns (nranns) <nra...@cisco.com>; Dave Barach (dbarach) <dbar...@cisco.com>; Minseok Kwon <mxk...@rit.edu>
Subject: multi-core multi-threading performance

Hi,

We are measuring the performance of ip6 lookup in multi-core, multi-worker environments, and we don't see performance scale well as we increase the number of cores/workers.

We are only changing the startup.conf file to create more workers, rx-queues, socket-mem, etc. Should we do anything else to see an increase in performance?

Is there a limit on performance even if we increase the number of workers? Does it depend on the number of hardware NICs? We have only 1 NIC to receive the traffic.

TIA,

Thanks,
Pragash Vijayaragavan
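
Dave's remarks about RSS and about incrementing or random src/dst addr/port can be illustrated with a toy model. The sketch below is not VPP or DPDK code. It assumes a Toeplitz-style hash over the IPv6 source address, destination address, source port and destination port, a made-up 40-byte key (TOY_KEY), and a simple "hash modulo number of queues" mapping, where a real NIC would use its own key and a redirection table. The function names and the 2001:db8:: addresses are hypothetical too. It only shows why identical packets always land on one RX queue (and so on one worker), while varying the source port can spread traffic over several queues.

```python
"""Toy model of receive-side scaling (RSS): hash a flow, pick an RX queue."""
import ipaddress
import struct

# Hypothetical 40-byte hash key, chosen only for illustration.
# Real NICs and drivers use their own keys.
TOY_KEY = bytes(range(1, 41))


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """Return a 32-bit Toeplitz hash of data under key."""
    key_bits = len(key) * 8
    if len(data) * 8 + 31 > key_bits:
        raise ValueError("key too short for this input")
    key_int = int.from_bytes(key, "big")
    result = 0
    for i, byte in enumerate(data):
        for b in range(8):
            if byte & (0x80 >> b):
                # XOR in the 32-bit window of the key that starts at this
                # input bit position (counting bits from the left).
                shift = key_bits - 32 - (i * 8 + b)
                result ^= (key_int >> shift) & 0xFFFFFFFF
    return result


def rx_queue(src: str, dst: str, sport: int, dport: int, num_queues: int) -> int:
    """Map one flow to an RX queue index in this toy model."""
    data = (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        + struct.pack("!HH", sport, dport)
    )
    # Simplification: real hardware indexes a redirection table with the
    # low bits of the hash instead of taking a plain modulo.
    return toeplitz_hash(TOY_KEY, data) % num_queues


def queues_used(flows, num_queues: int) -> set:
    """Return the set of RX queues hit by the given flows."""
    return {rx_queue(*flow, num_queues) for flow in flows}


if __name__ == "__main__":
    # 1000 identical packets: one flow, so one queue.
    one_flow = [("2001:db8::1", "2001:db8::2", 1024, 80)] * 1000
    # 256 flows that differ only in source port.
    many_flows = [("2001:db8::1", "2001:db8::2", 1024 + i, 80) for i in range(256)]
    print("identical packets ->", sorted(queues_used(one_flow, 4)))
    print("varying src port  ->", sorted(queues_used(many_flows, 4)))
```

In this model, each RX queue stands in for one worker, so a traffic generator that sends a single flow keeps one worker busy no matter how many queues are provisioned.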
_______________________________________________
vpp-dev mailing list
vpp-dev@lists.fd.io
https://lists.fd.io/mailman/listinfo/vpp-dev