Re: [vpp-dev] tx-errors on VPP controlled dpdk device
[Edited Message Follows]

Hi Dave,

It looks like there are significant drops on the receive side (rx-miss errors) when I use one core. I do not see rx issues with two cores. But the issue is occurring on the transmit side.

Thanks
Chakri

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#10018): https://lists.fd.io/g/vpp-dev/message/10018
Mute This Topic: https://lists.fd.io/mt/23982730/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] tx-errors on VPP controlled dpdk device
Hi Damjan,

> You likely need to utilise RSS on rx side to equally load your cores,

Could you please let me know how to configure RSS? I do not know how to configure it because this device is given to vpp/dpdk and is bound to the igb driver. Is it specified in startup.conf or through a CLI command?

Thanks
Chakri

View/Reply Online (#10017): https://lists.fd.io/g/vpp-dev/message/10017
Re: [vpp-dev] tx-errors on VPP controlled dpdk device
Hi Yichen,

Thanks for the response. The issue I have is on the transmit path. I think the cores are assigned properly in the configuration.

Thanks
Chakri

View/Reply Online (#10016): https://lists.fd.io/g/vpp-dev/message/10016
Re: [vpp-dev] tx-errors on VPP controlled dpdk device
+1. The aggregate RX rate seems to be around 12 KPPS, and the vector rate is small. Absent I/O silliness, one core should handle this load with no problem.

D.

From: vpp-dev@lists.fd.io On Behalf Of Damjan Marion via Lists.Fd.Io
Sent: Wednesday, August 1, 2018 4:27 PM
To: chakravarthy.arise...@viasat.com
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] tx-errors on VPP controlled dpdk device

In VPP packet stays on the same core where it is received in majority of cases. Handing over packet to different core is performance expensive process and we are trying to avoid it. You likely need to utilise RSS on rx side to equally load your cores, but in this specific case VPP is not overloaded, your vector rate is ~2

--
Damjan
Re: [vpp-dev] tx-errors on VPP controlled dpdk device
In VPP, a packet stays on the same core where it is received in the majority of cases. Handing a packet over to a different core is an expensive operation, and we try to avoid it. You likely need to utilise RSS on the rx side to equally load your cores, but in this specific case VPP is not overloaded: your vector rate is ~2.

--
Damjan

> On 1 Aug 2018, at 20:22, chakravarthy.arise...@viasat.com wrote:
>
> Hi Damjan,
>
> Thanks for your feedback. I'm running the test in AWS instances. Thus, I have got only VFs. I do not have access to PF. So, I'm trying to get help from AWS to find out.
> Once I get the info, I'll post it over here. In the mean time, I looked at the counters that you suggested me to focus on. It looks like the packets are scheduled on only one core in transmit direction. Is there a way to change?
>
> I have 3 dedicated cores (1 main core thread for stats/mgmt and 2 cores for the worker threads). All the Tx queues are pinned to worker thread 1. So, worker thread 2 is not used for transmit path at all. Is there way to spread the transmit queues across the threads?
>
> Thanks
> Chakri
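Damjan's point about RSS can be illustrated with a small sketch. It only shows the general idea (each flow is hashed to one rx queue, and each rx queue is polled by one worker), not how any particular NIC or VPP implements it. The CRC32 hash is a stand-in for the NIC's real RSS hash, the round-robin queue-to-worker placement is hypothetical, and the flow tuples are made-up illustration data.

```python
import zlib
from collections import Counter


def rx_queue_for_flow(src_ip, dst_ip, src_port, dst_port, proto, num_queues):
    """Pick an rx queue for a flow.

    CRC32 is only a stand-in for the hash a real NIC computes for RSS.
    """
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    return zlib.crc32(key) % num_queues


def worker_for_queue(queue, num_workers):
    """Hypothetical round-robin placement of rx queues onto worker threads."""
    return queue % num_workers


NUM_QUEUES = 2
NUM_WORKERS = 2

# Made-up UDP flows that differ only in source port.
flows = [("10.0.0.1", "10.0.0.2", 49152 + i, 4789, 17) for i in range(1000)]

per_worker = Counter(
    worker_for_queue(rx_queue_for_flow(*flow, num_queues=NUM_QUEUES), NUM_WORKERS)
    for flow in flows
)
print(dict(per_worker))
```

The property that matters is that a given flow always hashes to the same queue, so its packets stay on one worker; load only spreads across workers when there are several distinct flows and more than one rx queue.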
Re: [vpp-dev] tx-errors on VPP controlled dpdk device
Hi Chakri,

You can change the VPP worker assignments with:

vpp# show interface rx-placement
vpp# set interface rx-placement ?
  set interface rx-placement    set interface rx-placement <interface> [queue <n>] [worker <n> | main]

Thanks very much!

Regards,
Yichen

From: on behalf of "chakravarthy.arise...@viasat.com"
Date: Wednesday, August 1, 2018 at 11:22 AM
To: "vpp-dev@lists.fd.io"
Subject: Re: [vpp-dev] tx-errors on VPP controlled dpdk device

Hi Damjan,

Thanks for your feedback. I'm running the test in AWS instances. Thus, I have got only VFs. I do not have access to PF. So, I'm trying to get help from AWS to find out. Once I get the info, I'll post it over here. In the mean time, I looked at the counters that you suggested me to focus on. It looks like the packets are scheduled on only one core in transmit direction. Is there a way to change?

I have 3 dedicated cores (1 main core thread for stats/mgmt and 2 cores for the worker threads). All the Tx queues are pinned to worker thread 1. So, worker thread 2 is not used for transmit path at all. Is there way to spread the transmit queues across the threads?

Thanks
Chakri
Re: [vpp-dev] tx-errors on VPP controlled dpdk device
Hi Damjan,

Thanks for your feedback. I'm running the test in AWS instances, so I have only VFs and no access to the PF. I'm trying to get help from AWS to find out. Once I get the info, I'll post it here.

In the meantime, I looked at the counters that you suggested I focus on. It looks like the packets are scheduled on only one core in the transmit direction. Is there a way to change that?

I have 3 dedicated cores (1 main core thread for stats/mgmt and 2 cores for the worker threads). All the Tx queues are pinned to worker thread 1, so worker thread 2 is not used for the transmit path at all. Is there a way to spread the transmit queues across the threads?

Thanks
Chakri

vpp# sh threads
ID     Name        Type       LWP     Sched Policy (Priority)  lcore  Core   Socket State
0      vpp_main               1733    other (0)                1      1      0
1      vpp_wk_0    workers    1745    other (0)                2      2      0
2      vpp_wk_1    workers    1746    other (0)                3      3      0
3                  stats      1747    other (0)                0      0      0

vpp# sh run
Thread 0 vpp_main (lcore 1)
Time 5125.9, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00
  vector rates in 0.e0, out 0.e0, drop 0.e0, punt 0.e0
Name                            State             Calls     Vectors   Suspends   Clocks  Vectors/Call
api-rx-from-ring                any wait              0           0        364   1.19e4          0.00
cdp-process                     any wait              0           0        992   1.98e3          0.00
dhcp-client-process             any wait              0           0         51   3.41e3          0.00
dns-resolver-process            any wait              0           0          5   4.06e3          0.00
dpdk-process                    any wait              0           0       1709   5.13e4          0.00
fib-walk                        any wait              0           0       2563   1.37e3          0.00
ikev2-manager-process           any wait              0           0       5124   7.25e2          0.00
ip-route-resolver-process       any wait              0           0         51   2.64e3          0.00
ip4-reassembly-expire-walk      any wait              0           0        513   3.85e3          0.00
ip6-icmp-neighbor-discovery-ev  any wait              0           0       5124   6.92e2          0.00
ip6-reassembly-expire-walk      any wait              0           0        513   3.84e3          0.00
lisp-retry-service              any wait              0           0       2563   1.57e3          0.00
memif-process                   any wait              0           0       1709   2.10e3          0.00
rd-cp-process                   any wait              0           0  237212380   3.21e2          0.00
unix-cli-local:17               active                0           0        580   2.05e5          0.00
unix-epoll-input                polling        96172305           0          0   1.19e4          0.00
vpe-oam-process                 any wait              0           0       2513   1.23e3          0.00
---
Thread 1 vpp_wk_0 (lcore 2)
Time 5125.9, average vectors/node 4.82, last 128 main loops 0.00 per node 0.00
  vector rates in 9.5578e3, out 8.4052e3, drop 0.e0, punt 0.e0
Name                            State             Calls     Vectors   Suspends   Clocks  Vectors/Call
VirtualFunctionEthernet0/6/0-o  active               91          91          0   8.59e2          1.00
VirtualFunctionEthernet0/6/0-t  active               91          91          0   2.82e3          1.00
VirtualFunctionEthernet0/7/0-o  active          5334164    32661561          0   4.33e1          6.12
VirtualFunctionEthernet0/7/0-t  active          5334164    26753703          0   3.83e2          5.02
arp-input                       active              182         182          0   7.25e3          1.00
dpdk-input                      polling     16550217513    16330917          0   4.05e5          0.00
ethernet-input                  active          5334255    32661652          0   7.97e1          6.12
interface-output                active              182         182          0   6.58e2          1.00
ip4-input                       active          4685453    16330735
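As a rough cross-check of the "sh run" output above, the Vectors/Call column can be recomputed as Vectors divided by Calls. The sketch below uses only the two VirtualFunctionEthernet0/7/0 rows for vpp_wk_0; the function and variable names are made up for illustration. Reading the "-o" and "-t" rows as the interface's output and tx nodes, and reading the gap between them as packets that never made it to the wire, are both assumptions rather than something the output states.

```python
def vectors_per_call(vectors: int, calls: int) -> float:
    """Average number of packets handled per node invocation."""
    return vectors / calls if calls else 0.0


# Counters copied from the Thread 1 (vpp_wk_0) rows of "sh run".
CALLS = 5334164
OUTPUT_VECTORS = 32661561  # VirtualFunctionEthernet0/7/0-o
TX_VECTORS = 26753703      # VirtualFunctionEthernet0/7/0-t

output_vpc = vectors_per_call(OUTPUT_VECTORS, CALLS)
tx_vpc = vectors_per_call(TX_VECTORS, CALLS)

# Assumption: the difference corresponds to packets lost between the two nodes.
gap = OUTPUT_VECTORS - TX_VECTORS
gap_fraction = gap / OUTPUT_VECTORS

print(f"output vectors/call: {output_vpc:.2f}")
print(f"tx vectors/call:     {tx_vpc:.2f}")
print(f"gap: {gap} packets ({gap_fraction:.1%} of output vectors)")
```

Both ratios are small compared with VPP's maximum vector size, which is consistent with the remarks elsewhere in the thread that the worker is not overloaded.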
Re: [vpp-dev] tx-errors on VPP controlled dpdk device
Try to dump the hardware counters with "show hardware". That may give you more information about what's wrong.

As you are using a VF, my wild guess is that you may have promisc mode disabled. Try to dump the PF details with "ip link show dev XXX".

--
Damjan

> On 1 Aug 2018, at 00:52, chakravarthy.arise...@viasat.com wrote:
>
> Hi,
>
> When VPP is sending out the traffic through DPDK device, it encounters transmit errors? Can someone shed some light what might be happening?
>
> Thanks
> Chakri
[vpp-dev] tx-errors on VPP controlled dpdk device
Hi,

When VPP sends traffic out through a DPDK device, it encounters transmit errors. Can someone shed some light on what might be happening?

Thanks
Chakri

vpp# show int
              Name               Idx       State          Counter          Count
VirtualFunctionEthernet0/6/0      1         up       rx packets              21141847
                                                     rx bytes             33657724324
                                                     tx packets                    62
                                                     tx bytes                    2604
                                                     ip4                     21141785
VirtualFunctionEthernet0/7/0      2         up       rx packets                    62
                                                     rx bytes                    2604
                                                     tx packets              21141847
                                                     tx bytes             33657724324
                                                     tx-error                 3675066
local0                            0         up
loop1                             3         up
loop2                             5         up
memif1/1                          7         up       tx packets              21141785
                                                     tx bytes             32600632470
memif2/2                          8         up       rx packets              21141785
                                                     rx bytes             32600632470
vxlan_tunnel1                     4         up       rx packets              21141785
                                                     rx bytes             32600632470
vxlan_tunnel2                     6         up       tx packets              21141785
                                                     tx bytes             33361736730

vpp# sh error
     Count  Node                             Reason
  10570865  vxlan4-input                     good packets decapsulated
  21141785  vxlan4-encap                     good packets encapsulated
  31712650  l2-output                        L2 output packets
  31712650  l2-learn                         L2 learn packets
  31712650  l2-input                         L2 input packets
       126  arp-input                        ARP replies sent
   3675066  VirtualFunctionEthernet0/7/0-tx  Tx packet drops (dpdk tx failure)
  10570920  vxlan4-input                     good packets decapsulated
  10570920  l2-output                        L2 output packets
  10570920  l2-learn                         L2 learn packets
  10570920  l2-input                         L2 input packets

Thread 0 vpp_main (lcore 1)
Time 957448.6, average vectors/node 1.00, last 128 main loops 0.00 per node 0.00
  vector rates in 0.e0, out 2.0889e-6, drop 0.e0, punt 0.e0
Name                            State             Calls     Vectors   Suspends   Clocks  Vectors/Call
VirtualFunctionEthernet0/6/0-o  active                1           1          0   9.65e3          1.00
VirtualFunctionEthernet0/6/0-t  active                1           1          0   2.03e4          1.00
VirtualFunctionEthernet0/7/0-o  active                1           1          0   6.08e3          1.00
VirtualFunctionEthernet0/7/0-t  active                1           1          0   1.11e4          1.00
acl-plugin-fa-cleaner-process   event wait            0           0          1   2.51e4          0.00
admin-up-down-process           event wait            0           0          1   1.12e3          0.00
api-rx-from-ring                any wait              0           0      68101   1.24e4          0.00
avf-process                     event wait            0           0          1   6.97e3          0.00
bfd-process                     event wait            0           0          1   1.64e4          0.00
cdp-process                     any wait              0           0     127877   2.62e3          0.00
dhcp-client-process             any wait              0           0       9575   3.55e3          0.00
dns-resolver-process            any wait              0           0        958   4.02e3          0.00
dpdk-ipsec-process              done                  1           0          0   1.29e5          0.00
dpdk-process                    any wait              0           0     319126   5.14e4          0.00
fib-walk                        any wait              0           0     478648   1.65e3          0.00
flow-report
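To put the tx-error counter from "show int" into proportion, it can be compared with the tx packet counter on the same interface. This is a minimal sketch with made-up variable names. Whether "tx packets" already includes the packets counted under "tx-error" is not stated in the output, so the ratio is computed both ways and neither should be taken as the definitive drop rate.

```python
# Counters copied from the VirtualFunctionEthernet0/7/0 rows of "show int".
TX_PACKETS = 21141847
TX_ERRORS = 3675066

# Assumption A: "tx packets" counts every packet handed to the interface,
# including the ones that later failed.
ratio_if_included = TX_ERRORS / TX_PACKETS

# Assumption B: "tx packets" counts only successful transmissions.
ratio_if_excluded = TX_ERRORS / (TX_PACKETS + TX_ERRORS)

print(f"tx-error / tx packets:              {ratio_if_included:.2%}")
print(f"tx-error / (tx packets + tx-error): {ratio_if_excluded:.2%}")
```

Under either reading, roughly one packet in six or seven is affected, which is far too many to dismiss as noise.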