Hi all,

Any help or ideas on how we can get better performance from multiple cores would be appreciated.

Thanks,

Pragash Vijayaragavan
Grad Student at Rochester Institute of Technology
email : pxv3...@rit.edu
ph : 585 764 4662

On Mon, Nov 6, 2017 at 8:10 AM, Pragash Vijayaragavan <pxv3...@g.rit.edu> wrote:

> OK, now I provisioned 4 rx queues for 4 worker threads and yes, all workers
> are processing traffic, but the lookup rate has dropped; I am getting fewer
> packets than with 2 workers.
>
> I tried configuring 4 tx queues as well; still the same problem (fewer
> packets received compared to 2 workers).
>
> On Mon, Nov 6, 2017 at 8:00 AM, Pragash Vijayaragavan <pxv3...@g.rit.edu>
> wrote:
>
>> Just 1. Let me change it to 2, maybe 3, and get back to you.
>>
>> On Mon, Nov 6, 2017 at 7:48 AM, Dave Barach (dbarach) <dbar...@cisco.com>
>> wrote:
>>
>>> How many RX queues did you provision? One per worker, or no supper...
>>>
>>> Thanks… Dave
>>>
>>> *From:* Pragash Vijayaragavan [mailto:pxv3...@rit.edu]
>>> *Sent:* Monday, November 6, 2017 7:36 AM
>>> *To:* Dave Barach (dbarach) <dbar...@cisco.com>
>>> *Cc:* vpp-dev@lists.fd.io; John Marshall (jwm) <j...@cisco.com>; Neale
>>> Ranns (nranns) <nra...@cisco.com>; Minseok Kwon <mxk...@rit.edu>
>>> *Subject:* Re: multi-core multi-threading performance
>>>
>>> Hi Dave,
>>>
>>> As per your suggestion, I tried sending different traffic and noticed
>>> that 1 worker acts per port (hardware NIC).
>>>
>>> Is it true that multiple workers cannot work on the same port at the
>>> same time?
>>>
>>> On Mon, Nov 6, 2017 at 7:13 AM, Pragash Vijayaragavan <pxv3...@g.rit.edu>
>>> wrote:
>>>
>>> Thanks Dave,
>>>
>>> Let me try it out real quick and get back to you.
>>>
>>> On Mon, Nov 6, 2017 at 7:11 AM, Dave Barach (dbarach) <dbar...@cisco.com>
>>> wrote:
>>>
>>> Incrementing / random src/dst addr/port....
>>>
>>> Thanks… Dave
>>>
>>> *From:* Pragash Vijayaragavan [mailto:pxv3...@rit.edu]
>>> *Sent:* Monday, November 6, 2017 7:06 AM
>>> *To:* Dave Barach (dbarach) <dbar...@cisco.com>
>>> *Cc:* vpp-dev@lists.fd.io; John Marshall (jwm) <j...@cisco.com>; Neale
>>> Ranns (nranns) <nra...@cisco.com>; Minseok Kwon <mxk...@rit.edu>
>>> *Subject:* Re: multi-core multi-threading performance
>>>
>>> Hi Dave,
>>>
>>> Thanks for the mail.
>>>
>>> A "show run" command shows the dpdk-input process on 2 of the workers,
>>> but the ip6-lookup process is running on only 1 worker.
>>>
>>> What config should be done to make all threads process traffic?
>>>
>>> This is for 4 workers and 1 main core.
>>>
>>> Pasted output:
>>>
>>> vpp# sh run
>>>
>>> Thread 0 vpp_main (lcore 1)
>>> Time 7.5, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00
>>> vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
>>> Name                           State          Calls   Vectors  Suspends   Clocks  Vectors/Call
>>> acl-plugin-fa-cleaner-process  any wait           0         0        15   4.97e3          0.00
>>> api-rx-from-ring               active             0         0        79   1.07e5          0.00
>>> cdp-process                    any wait           0         0         3   2.65e3          0.00
>>> dpdk-process                   any wait           0         0         2   6.77e7          0.00
>>> fib-walk                       any wait           0         0      7474   6.74e2          0.00
>>> gmon-process                   time wait          0         0         1   4.24e3          0.00
>>> ikev2-manager-process          any wait           0         0         7   7.04e3          0.00
>>> ip6-icmp-neighbor-discovery-ev any wait           0         0         7   4.67e3          0.00
>>> lisp-retry-service             any wait           0         0         3   7.21e3          0.00
>>> unix-epoll-input               polling     21655148         0         0   5.43e2          0.00
>>> vpe-oam-process                any wait           0         0         4   5.28e3          0.00
>>> ---------------
>>> Thread 1 vpp_wk_0 (lcore 2)
>>> Time 7.5, average vectors/node 255.99, last 128 main loops 14.00 per node 256.00
>>> vector rates in 4.1903e6, out 4.1903e6, drop 0.0000e0, punt 0.0000e0
>>> Name                           State          Calls   Vectors  Suspends   Clocks  Vectors/Call
>>> FortyGigabitEthernet4/0/0-outp active        123334  31572992         0   6.58e0        255.99
>>> FortyGigabitEthernet4/0/0-tx   active        123334  31572992         0   7.20e1        255.99
>>> dpdk-input                     polling       124347  31572992         0   5.49e1        253.91
>>> ip6-input                      active        123334  31572992         0   2.28e1        255.99
>>> ip6-load-balance               active        123334  31572992         0   1.61e1        255.99
>>> ip6-lookup                     active        123334  31572992         0   3.77e2        255.99
>>> ip6-rewrite                    active        123334  31572992         0   2.02e1        255.99
>>> ---------------
>>> Thread 2 vpp_wk_1 (lcore 3)
>>> Time 7.5, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00
>>> vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
>>> Name                           State          Calls   Vectors  Suspends   Clocks  Vectors/Call
>>> dpdk-input                     polling     83188682         0         0   1.11e2          0.00
>>> ---------------
>>> Thread 3 vpp_wk_2 (lcore 18)
>>> Time 7.5, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00
>>> vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
>>> Name                           State          Calls   Vectors  Suspends   Clocks  Vectors/Call
>>> ---------------
>>> Thread 4 vpp_wk_3 (lcore 19)
>>> Time 7.5, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00
>>> vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
>>> Name                           State          Calls   Vectors  Suspends   Clocks  Vectors/Call
>>>
>>> On Mon, Nov 6, 2017 at 6:47 AM, Dave Barach (dbarach) <dbar...@cisco.com>
>>> wrote:
>>>
>>> Have you verified that all of the worker threads are processing traffic?
>>> Sufficiently poor RSS statistics could mean - in the limit - that only one
>>> worker thread is processing traffic.
>>>
>>> Thanks… Dave
>>>
>>> *From:* Pragash Vijayaragavan [mailto:pxv3...@rit.edu]
>>> *Sent:* Sunday, November 5, 2017 10:03 PM
>>> *To:* vpp-dev@lists.fd.io
>>> *Cc:* John Marshall (jwm) <j...@cisco.com>; Neale Ranns (nranns)
>>> <nra...@cisco.com>; Dave Barach (dbarach) <dbar...@cisco.com>; Minseok
>>> Kwon <mxk...@rit.edu>
>>> *Subject:* multi-core multi-threading performance
>>>
>>> Hi,
>>>
>>> We are measuring the performance of ip6 lookup in multi-core, multi-worker
>>> environments, and we don't see good scaling of performance as we keep
>>> increasing the number of cores/workers.
>>>
>>> We are just changing the startup.conf file to create more workers,
>>> rx-queues, sock-mem etc. Should we do anything else to see an increase in
>>> performance?
>>>
>>> Is there a limit on the performance even if we increase the number of
>>> workers?
>>>
>>> Does it depend on the number of hardware NICs we have? We only have 1 NIC
>>> to receive the traffic.
>>>
>>> TIA,
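
For readers following the RSS discussion above (one busy worker until the traffic used incrementing / random src/dst addr/port), here is a minimal sketch of why flow diversity matters. It is not VPP or DPDK code: `rss_queue` and `queue_histogram` are hypothetical helper names, the addresses and ports are made up, and CRC32 is only a stand-in for the hash the NIC really computes (typically a Toeplitz hash with a configured key). The point it illustrates is that a hash over the flow tuple always sends one flow to the same RX queue, so a single-flow test stream can only ever keep one worker busy.

```python
import ipaddress
import zlib
from collections import Counter


def rss_queue(src, dst, sport, dport, num_queues):
    """Map one flow to an RX queue index.

    CRC32 is an illustrative stand-in for the NIC's real RSS hash.
    """
    key = (
        ipaddress.ip_address(src).packed
        + ipaddress.ip_address(dst).packed
        + sport.to_bytes(2, "big")
        + dport.to_bytes(2, "big")
    )
    return zlib.crc32(key) % num_queues


def queue_histogram(flows, num_queues):
    """Count how many packets land on each RX queue."""
    counts = Counter(rss_queue(*flow, num_queues) for flow in flows)
    return [counts.get(q, 0) for q in range(num_queues)]


NUM_QUEUES = 4  # one RX queue per worker, as in the 4-worker setup above

# 1000 packets that all belong to the same flow (made-up addresses).
single_flow = [("2001:db8::1", "2001:db8::2", 1000, 2000)] * 1000

# 1000 packets with incrementing source address and source port.
varied_flows = [
    (f"2001:db8::{i:x}", "2001:db8:ffff::1", 1000 + i, 2000)
    for i in range(1, 1001)
]

print("single flow :", queue_histogram(single_flow, NUM_QUEUES))
print("varied flows:", queue_histogram(varied_flows, NUM_QUEUES))
```

With the single flow, every packet falls into one queue; with varied tuples the packets spread over several queues. This only models the distribution step and says nothing about why the aggregate rate dropped with 4 workers.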
_______________________________________________
vpp-dev mailing list
vpp-dev@lists.fd.io
https://lists.fd.io/mailman/listinfo/vpp-dev