Hi all,

Any help or ideas on how we can get better performance from multiple cores
would be appreciated.
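
For anyone following the RSS point raised further down the thread, here is a minimal sketch of why traffic variety matters when one NIC feeds several rx queues. It is not VPP or DPDK code: `rss_queue` and `queue_histogram` are hypothetical helpers, CRC32 is only a stand-in for the hash a real NIC would use (commonly Toeplitz), and the addresses and ports are made-up documentation values. It only illustrates that a single repeated flow always maps to one queue, while varied source ports spread across queues.

```python
import zlib
from collections import Counter

NUM_RX_QUEUES = 4  # assumption: one rx queue per worker, as in the 4-worker test


def rss_queue(src, dst, sport, dport, num_queues=NUM_RX_QUEUES):
    """Map a flow tuple to an rx queue index.

    CRC32 is a stand-in hash for illustration only; it is not what the NIC runs.
    """
    key = f"{src}|{dst}|{sport}|{dport}".encode()
    return zlib.crc32(key) % num_queues


def queue_histogram(flows, num_queues=NUM_RX_QUEUES):
    """Count how many packets land on each rx queue."""
    counts = Counter(rss_queue(*flow, num_queues=num_queues) for flow in flows)
    return [counts.get(q, 0) for q in range(num_queues)]


# 1000 packets of one identical flow: every packet hashes to the same queue.
single_flow = [("2001:db8::1", "2001:db8::2", 1024, 80)] * 1000

# 1000 packets with an incrementing source port: the hash input changes per packet.
varied_flows = [("2001:db8::1", "2001:db8::2", 1024 + i, 80) for i in range(1000)]

print("single flow :", queue_histogram(single_flow))
print("varied flows:", queue_histogram(varied_flows))
```

If the traffic generator sends one fixed flow, only one queue (and so one worker) sees packets no matter how many queues are configured, which would be consistent with the "show run" output quoted below.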

Thanks,

Pragash Vijayaragavan
Grad Student at Rochester Institute of Technology
email : pxv3...@rit.edu
ph : 585 764 4662


On Mon, Nov 6, 2017 at 8:10 AM, Pragash Vijayaragavan <pxv3...@g.rit.edu>
wrote:

> OK, I have now provisioned 4 rx queues for 4 worker threads, and yes, all
> workers are processing traffic, but the lookup rate has dropped: I am
> getting fewer packets than with 2 workers.
>
> I also tried configuring 4 tx queues, with the same result (fewer packets
> received than with 2 workers).
>
>
>
> Thanks,
>
> Pragash Vijayaragavan
> Grad Student at Rochester Institute of Technology
> email : pxv3...@rit.edu
> ph : 585 764 4662
>
>
> On Mon, Nov 6, 2017 at 8:00 AM, Pragash Vijayaragavan <pxv3...@g.rit.edu>
> wrote:
>
>> Just 1. Let me change it to 2, maybe 3, and get back to you.
>>
>> Thanks,
>>
>> Pragash Vijayaragavan
>> Grad Student at Rochester Institute of Technology
>> email : pxv3...@rit.edu
>> ph : 585 764 4662
>>
>>
>> On Mon, Nov 6, 2017 at 7:48 AM, Dave Barach (dbarach) <dbar...@cisco.com>
>> wrote:
>>
>>> How many RX queues did you provision? One per worker, or no supper...
>>>
>>>
>>>
>>> Thanks… Dave
>>>
>>>
>>>
>>> *From:* Pragash Vijayaragavan [mailto:pxv3...@rit.edu]
>>> *Sent:* Monday, November 6, 2017 7:36 AM
>>>
>>> *To:* Dave Barach (dbarach) <dbar...@cisco.com>
>>> *Cc:* vpp-dev@lists.fd.io; John Marshall (jwm) <j...@cisco.com>; Neale
>>> Ranns (nranns) <nra...@cisco.com>; Minseok Kwon <mxk...@rit.edu>
>>> *Subject:* Re: multi-core multi-threading performance
>>>
>>>
>>>
>>> Hi Dave,
>>>
>>>
>>>
>>> As per your suggestion, I tried sending different traffic, and I noticed
>>> that only 1 worker acts per port (hardware NIC).
>>>
>>>
>>>
>>> Is it true that multiple workers cannot work on the same port at the same
>>> time?
>>>
>>> Thanks,
>>>
>>>
>>>
>>> Pragash Vijayaragavan
>>>
>>> Grad Student at Rochester Institute of Technology
>>>
>>> email : pxv3...@rit.edu
>>>
>>> ph : 585 764 4662
>>>
>>>
>>>
>>>
>>>
>>> On Mon, Nov 6, 2017 at 7:13 AM, Pragash Vijayaragavan <pxv3...@g.rit.edu>
>>> wrote:
>>>
>>> Thanks Dave,
>>>
>>>
>>>
>>> Let me try it out real quick and get back to you.
>>>
>>>
>>> Thanks,
>>>
>>>
>>>
>>> Pragash Vijayaragavan
>>>
>>> Grad Student at Rochester Institute of Technology
>>>
>>> email : pxv3...@rit.edu
>>>
>>> ph : 585 764 4662
>>>
>>>
>>>
>>>
>>>
>>> On Mon, Nov 6, 2017 at 7:11 AM, Dave Barach (dbarach) <dbar...@cisco.com>
>>> wrote:
>>>
>>> Incrementing / random src/dst addr/port....
>>>
>>>
>>>
>>> Thanks… Dave
>>>
>>>
>>>
>>> *From:* Pragash Vijayaragavan [mailto:pxv3...@rit.edu]
>>> *Sent:* Monday, November 6, 2017 7:06 AM
>>> *To:* Dave Barach (dbarach) <dbar...@cisco.com>
>>> *Cc:* vpp-dev@lists.fd.io; John Marshall (jwm) <j...@cisco.com>; Neale
>>> Ranns (nranns) <nra...@cisco.com>; Minseok Kwon <mxk...@rit.edu>
>>> *Subject:* Re: multi-core multi-threading performance
>>>
>>>
>>>
>>> Hi Dave,
>>>
>>>
>>>
>>> Thanks for the mail.
>>>
>>>
>>>
>>> The "show run" command shows the dpdk-input node on 2 of the workers, but
>>> the ip6-lookup node is running on only 1 worker.
>>>
>>>
>>>
>>> What configuration is needed to make all threads process traffic?
>>>
>>>
>>>
>>> This is for 4 workers and 1 main core.
>>>
>>>
>>>
>>> Pasted output :
>>>
>>>
>>>
>>>
>>>
>>> vpp# sh run
>>>
>>> Thread 0 vpp_main (lcore 1)
>>> Time 7.5, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00
>>>   vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
>>> Name                            State            Calls     Vectors  Suspends    Clocks  Vectors/Call
>>> acl-plugin-fa-cleaner-process   any wait             0           0        15    4.97e3          0.00
>>> api-rx-from-ring                active               0           0        79    1.07e5          0.00
>>> cdp-process                     any wait             0           0         3    2.65e3          0.00
>>> dpdk-process                    any wait             0           0         2    6.77e7          0.00
>>> fib-walk                        any wait             0           0      7474    6.74e2          0.00
>>> gmon-process                    time wait            0           0         1    4.24e3          0.00
>>> ikev2-manager-process           any wait             0           0         7    7.04e3          0.00
>>> ip6-icmp-neighbor-discovery-ev  any wait             0           0         7    4.67e3          0.00
>>> lisp-retry-service              any wait             0           0         3    7.21e3          0.00
>>> unix-epoll-input                polling       21655148           0         0    5.43e2          0.00
>>> vpe-oam-process                 any wait             0           0         4    5.28e3          0.00
>>> ---------------
>>> Thread 1 vpp_wk_0 (lcore 2)
>>> Time 7.5, average vectors/node 255.99, last 128 main loops 14.00 per node 256.00
>>>   vector rates in 4.1903e6, out 4.1903e6, drop 0.0000e0, punt 0.0000e0
>>> Name                            State            Calls     Vectors  Suspends    Clocks  Vectors/Call
>>> FortyGigabitEthernet4/0/0-outp  active          123334    31572992         0    6.58e0        255.99
>>> FortyGigabitEthernet4/0/0-tx    active          123334    31572992         0    7.20e1        255.99
>>> dpdk-input                      polling         124347    31572992         0    5.49e1        253.91
>>> ip6-input                       active          123334    31572992         0    2.28e1        255.99
>>> ip6-load-balance                active          123334    31572992         0    1.61e1        255.99
>>> ip6-lookup                      active          123334    31572992         0    3.77e2        255.99
>>> ip6-rewrite                     active          123334    31572992         0    2.02e1        255.99
>>> ---------------
>>> Thread 2 vpp_wk_1 (lcore 3)
>>> Time 7.5, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00
>>>   vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
>>> Name                            State            Calls     Vectors  Suspends    Clocks  Vectors/Call
>>> dpdk-input                      polling       83188682           0         0    1.11e2          0.00
>>> ---------------
>>> Thread 3 vpp_wk_2 (lcore 18)
>>> Time 7.5, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00
>>>   vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
>>> Name                            State            Calls     Vectors  Suspends    Clocks  Vectors/Call
>>> ---------------
>>> Thread 4 vpp_wk_3 (lcore 19)
>>> Time 7.5, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00
>>>   vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
>>> Name                            State            Calls     Vectors  Suspends    Clocks  Vectors/Call
>>>
>>>
>>>
>>>
>>> Thanks,
>>>
>>>
>>>
>>> Pragash Vijayaragavan
>>>
>>> Grad Student at Rochester Institute of Technology
>>>
>>> email : pxv3...@rit.edu
>>>
>>> ph : 585 764 4662
>>>
>>>
>>>
>>>
>>>
>>> On Mon, Nov 6, 2017 at 6:47 AM, Dave Barach (dbarach) <dbar...@cisco.com>
>>> wrote:
>>>
>>> Have you verified that all of the worker threads are processing traffic?
>>> Sufficiently poor RSS statistics could mean - in the limit - that only one
>>> worker thread is processing traffic.
>>>
>>>
>>>
>>> Thanks… Dave
>>>
>>>
>>>
>>> *From:* Pragash Vijayaragavan [mailto:pxv3...@rit.edu]
>>> *Sent:* Sunday, November 5, 2017 10:03 PM
>>> *To:* vpp-dev@lists.fd.io
>>> *Cc:* John Marshall (jwm) <j...@cisco.com>; Neale Ranns (nranns) <
>>> nra...@cisco.com>; Dave Barach (dbarach) <dbar...@cisco.com>; Minseok
>>> Kwon <mxk...@rit.edu>
>>> *Subject:* multi-core multi-threading performance
>>>
>>>
>>>
>>> Hi,
>>>
>>>
>>>
>>> We are measuring the performance of ip6 lookup in multi-core,
>>> multi-worker environments, and we don't see good scaling of performance
>>> as we keep increasing the number of cores/workers.
>>>
>>>
>>>
>>> We are just changing the startup.conf file to create more workers,
>>> rx-queues, socket-mem, etc. Should we do anything else to see an increase
>>> in performance?
>>>
>>>
>>>
>>> Is there a limit on the performance even if we increase the number of
>>> workers?
>>>
>>>
>>>
>>> Does it depend on the number of hardware NICs we have? We have only 1 NIC
>>> to receive the traffic.
>>>
>>>
>>>
>>>
>>>
>>> TIA,
>>>
>>>
>>> Thanks,
>>>
>>>
>>>
>>> Pragash Vijayaragavan
>>>
>>> Grad Student at Rochester Institute of Technology
>>>
>>> email : pxv3...@rit.edu
>>>
>>> ph : 585 764 4662
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>
_______________________________________________
vpp-dev mailing list
vpp-dev@lists.fd.io
https://lists.fd.io/mailman/listinfo/vpp-dev