Hi,

In the log below, we can clearly see that packets are dropped between the rx_unicast_packets and rx_good_packets counters (rx_unicast_packets 617496152 vs rx_good_packets 617475731), but no error/miss counter tells why or where the packets are dropped. Is this a known bug/limitation of the Mellanox card? Any suggestion?
Counters in the test center (traffic generator):
Tx count: 617496152
Rx count: 617475672
Drop: 20480

testpmd was started with:
dpdk-testpmd -l "2,3" --legacy-mem --socket-mem "5000,0" -a 0000:03:07.0 -- -i --nb-cores=1 --portmask=0x1 --rxd=512 --txd=512

testpmd> port stop 0
testpmd> vlan set filter on 0
testpmd> rx_vlan add 767 0
testpmd> port start 0
testpmd> set fwd 5tswap
testpmd> start
testpmd> show fwd stats all

  ---------------------- Forward statistics for port 0  ----------------------
  RX-packets: 617475727      RX-dropped: 0             RX-total: 617475727
  TX-packets: 617475727      TX-dropped: 0             TX-total: 617475727
  ----------------------------------------------------------------------------

  +++++++++++++++ Accumulated forward statistics for all ports +++++++++++++++
  RX-packets: 617475727      RX-dropped: 0             RX-total: 617475727
  TX-packets: 617475727      TX-dropped: 0             TX-total: 617475727
  ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

testpmd> show port xstats 0
###### NIC extended statistics for port 0
rx_good_packets: 617475731
tx_good_packets: 617475730
rx_good_bytes: 45693207378
tx_good_bytes: 45693207036
rx_missed_errors: 0
rx_errors: 0
tx_errors: 0
rx_mbuf_allocation_errors: 0
rx_q0_packets: 617475731
rx_q0_bytes: 45693207378
rx_q0_errors: 0
tx_q0_packets: 617475730
tx_q0_bytes: 45693207036
rx_wqe_errors: 0
rx_unicast_packets: 617496152
rx_unicast_bytes: 45694715248
tx_unicast_packets: 617475730
tx_unicast_bytes: 45693207036
rx_multicast_packets: 3
rx_multicast_bytes: 342
tx_multicast_packets: 0
tx_multicast_bytes: 0
rx_broadcast_packets: 56
rx_broadcast_bytes: 7308
tx_broadcast_packets: 0
tx_broadcast_bytes: 0
tx_phy_packets: 0
rx_phy_packets: 0
rx_phy_crc_errors: 0
tx_phy_bytes: 0
rx_phy_bytes: 0
rx_phy_in_range_len_errors: 0
rx_phy_symbol_errors: 0
rx_phy_discard_packets: 0
tx_phy_discard_packets: 0
tx_phy_errors: 0
rx_out_of_buffer: 0
tx_pp_missed_interrupt_errors: 0
tx_pp_rearm_queue_errors: 0
tx_pp_clock_queue_errors: 0
tx_pp_timestamp_past_errors: 0
tx_pp_timestamp_future_errors: 0
tx_pp_jitter: 0
tx_pp_wander: 0
tx_pp_sync_lost: 0

Best regards
Yan Xiaoping

From: Yan, Xiaoping (NSB - CN/Hangzhou)
Sent: September 29, 2021 16:26
To: 'Asaf Penso' <as...@nvidia.com>
Cc: 'Slava Ovsiienko' <viachesl...@nvidia.com>; 'Matan Azrad' <ma...@nvidia.com>; 'Raslan Darawsheh' <rasl...@nvidia.com>; Xu, Meng-Maggie (NSB - CN/Hangzhou) <meng-maggie...@nokia-sbell.com>
Subject: RE: mlx5 VF packet lost between rx_port_unicast_packets and rx_good_packets

Hi,

We also replaced the NIC (originally it was a CX-4, now it is a CX-5), but the result is the same.
Do you know why packets are dropped between rx_port_unicast_packets and rx_good_packets while no error/miss counter increases?
Also, do you know about the mlx5_xxx kernel threads? Their CPU affinity covers all CPU cores, including the core used by fastpath/testpmd. Could that affect the result?
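The affinity information I collected is pasted below. As a rough check (a sketch only: it assumes those are ordinary kernel threads whose affinity the scheduler lets us change, and a 28-CPU host as the taskset output below shows), one could try narrowing the mlx5_* threads away from the forwarding core (core 3 here, since testpmd was started with -l "2,3"):

    # Illustrative sketch: move mlx5_* kernel threads off cores 2-3 (the testpmd lcores).
    # taskset may report "Invalid argument" for kthreads whose affinity cannot be changed.
    for pid in $(ps -eo pid,comm | awk '$2 ~ /^mlx5/ {print $1}'); do
        taskset -cp 0-1,4-27 "$pid" || true
    done

If the drop disappears afterwards, that would point at those threads preempting the polling core. The affinity info collected on the node follows: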
[cranuser1@hztt24f-rm17-ocp-sno-1 ~]$ taskset -cp 74548
pid 74548's current affinity list: 0-27
[cranuser1@hztt24f-rm17-ocp-sno-1 ~]$ ps -emo pid,tid,psr,comm | grep mlx5
903 - - mlx5_health0000
904 - - mlx5_page_alloc
907 - - mlx5_cmd_0000:0
916 - - mlx5_events
917 - - mlx5_esw_wq
918 - - mlx5_fw_tracer
919 - - mlx5_hv_vhca
921 - - mlx5_fc
924 - - mlx5_health0000
925 - - mlx5_page_alloc
927 - - mlx5_cmd_0000:0
935 - - mlx5_events
936 - - mlx5_esw_wq
937 - - mlx5_fw_tracer
938 - - mlx5_hv_vhca
939 - - mlx5_fc
941 - - mlx5_health0000
942 - - mlx5_page_alloc

Best regards
Yan Xiaoping

From: Yan, Xiaoping (NSB - CN/Hangzhou)
Sent: September 29, 2021 15:03
To: 'Asaf Penso' <as...@nvidia.com>
Cc: Slava Ovsiienko <viachesl...@nvidia.com>; Matan Azrad <ma...@nvidia.com>; Raslan Darawsheh <rasl...@nvidia.com>; Xu, Meng-Maggie (NSB - CN/Hangzhou) <meng-maggie...@nokia-sbell.com>
Subject: RE: mlx5 VF packet lost between rx_port_unicast_packets and rx_good_packets

Hi,

It is 20.11 (we upgraded to 20.11 recently).

Best regards
Yan Xiaoping

From: Asaf Penso <as...@nvidia.com>
Sent: September 29, 2021 14:47
To: Yan, Xiaoping (NSB - CN/Hangzhou) <xiaoping....@nokia-sbell.com>
Cc: Slava Ovsiienko <viachesl...@nvidia.com>; Matan Azrad <ma...@nvidia.com>; Raslan Darawsheh <rasl...@nvidia.com>; Xu, Meng-Maggie (NSB - CN/Hangzhou) <meng-maggie...@nokia-sbell.com>
Subject: Re: mlx5 VF packet lost between rx_port_unicast_packets and rx_good_packets

What DPDK version are you using? 19.11 doesn't support 5tswap mode in testpmd.

Regards,
Asaf Penso

________________________________
From: Yan, Xiaoping (NSB - CN/Hangzhou) <xiaoping....@nokia-sbell.com>
Sent: Monday, September 27, 2021 5:55:21 AM
To: Asaf Penso <as...@nvidia.com>
Cc: Slava Ovsiienko <viachesl...@nvidia.com>; Matan Azrad <ma...@nvidia.com>; Raslan Darawsheh <rasl...@nvidia.com>; Xu, Meng-Maggie (NSB - CN/Hangzhou) <meng-maggie...@nokia-sbell.com>
Subject: RE: mlx5 VF packet lost between rx_port_unicast_packets and rx_good_packets

Hi,

I also tried testpmd with the following command and configuration:
dpdk-testpmd -l "4,5" --legacy-mem --socket-mem "5000,0" -a 0000:03:02.0 -- -i --nb-cores=1 --portmask=0x1 --rxd=512 --txd=512
testpmd> port stop 0
testpmd> vlan set filter on 0
testpmd> rx_vlan add 767 0
testpmd> port start 0
testpmd> set fwd 5tswap
testpmd> start

It only reaches 1.4 Mpps; at 1.5 Mpps it starts to drop packets occasionally.

Best regards
Yan Xiaoping

From: Yan, Xiaoping (NSB - CN/Hangzhou)
Sent: September 26, 2021 13:19
To: 'Asaf Penso' <as...@nvidia.com>
Cc: Slava Ovsiienko <viachesl...@nvidia.com>; Matan Azrad <ma...@nvidia.com>; Raslan Darawsheh <rasl...@nvidia.com>; Xu, Meng-Maggie (NSB - CN/Hangzhou) <meng-maggie...@nokia-sbell.com>
Subject: RE: mlx5 VF packet lost between rx_port_unicast_packets and rx_good_packets

Hi,

I was using 6WIND fastpath instead of testpmd.

>> Do you configure any flow?
I think not, but is there a command to check?

>> Do you work in isolate mode?
Do you mean the CPU? The DPDK application (6WIND fastpath) runs inside a container and uses a CPU core from an exclusive pool (https://github.com/nokia/CPU-Pooler). On the other hand, the CPU isolation is done by the host infrastructure and is a bit complicated; I'm not sure whether there really is no other task running on this core.
BTW, we recently switched the host infrastructure to Red Hat OpenShift Container Platform, and the same problem is there. We can get 1.6 Mpps with an Intel 810 NIC, but only 1 Mpps with the Mellanox NIC.
I have also raised a ticket to Mellanox support: https://support.mellanox.com/s/case/5001T00001ZC0jzQAD
There is a log about CPU affinity there, and some of the mlx5_xxx threads seem strange to me. Can you please also check the ticket?

Best regards
Yan Xiaoping

From: Asaf Penso <as...@nvidia.com>
Sent: September 26, 2021 12:57
To: Yan, Xiaoping (NSB - CN/Hangzhou) <xiaoping....@nokia-sbell.com>
Cc: Slava Ovsiienko <viachesl...@nvidia.com>; Matan Azrad <ma...@nvidia.com>; Raslan Darawsheh <rasl...@nvidia.com>
Subject: RE: mlx5 VF packet lost between rx_port_unicast_packets and rx_good_packets

Hi,

Could you please share the testpmd command line you are using? Do you configure any flow? Do you work in isolate mode?

Regards,
Asaf Penso

From: Yan, Xiaoping (NSB - CN/Hangzhou) <xiaoping....@nokia-sbell.com>
Sent: Monday, July 26, 2021 7:52 AM
To: Asaf Penso <as...@nvidia.com>; users@dpdk.org
Cc: Slava Ovsiienko <viachesl...@nvidia.com>; Matan Azrad <ma...@nvidia.com>; Raslan Darawsheh <rasl...@nvidia.com>
Subject: RE: mlx5 VF packet lost between rx_port_unicast_packets and rx_good_packets

Hi,

The DPDK version in use is 19.11. I have not tried the latest upstream version.

It seems the performance is affected by IPv6 neighbor advertisement packets coming to this interface:

05:20:04.025290 IP6 fe80::6cf1:9fff:fe4e:8a01 > ff02::1: ICMP6, neighbor advertisement, tgt is fe80::6cf1:9fff:fe4e:8a01, length 32
        0x0000:  3333 0000 0001 6ef1 9f4e 8a01 86dd 6008
        0x0010:  fe44 0020 3aff fe80 0000 0000 0000 6cf1
        0x0020:  9fff fe4e 8a01 ff02 0000 0000 0000 0000
        0x0030:  0000 0000 0001 8800 96d9 2000 0000 fe80
        0x0040:  0000 0000 0000 6cf1 9fff fe4e 8a01 0201
        0x0050:  6ef1 9f4e 8a01

Somehow there are about 100 such packets per second coming to the interface, and packet loss happens. When we change the default VLAN in the switch so that no such packets reach the interface (the mlx5 VF under test), there is no packet loss anymore.
In both cases, all packets have arrived in rx_vport_unicast_packets. In the packet-loss case, we see fewer packets in rx_good_packets (rx_vport_unicast_packets = rx_good_packets + lost packets).
If the DPDK application is too slow to receive all packets from the VF, is there any counter to indicate this? Any suggestion? Thank you.
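Counters that would normally indicate the application is not polling fast enough are imissed (shown as RX-missed in testpmd's port stats) and rx_nombuf; on mlx5 the device-level equivalent is the rx_out_of_buffer xstat. A quick way to watch them during a run (a sketch only: _f1 is the VF netdev name used in the attached logs, and whether ethtool -S on the VF exposes rx_out_of_buffer depends on the kernel driver version):

    # From the testpmd prompt: RX-missed and RX-nombuf are printed per port
    testpmd> show port stats 0
    testpmd> show port xstats 0

    # From the host, via the VF's kernel netdev (mlx5 is a bifurcated driver,
    # so the netdev stays visible while DPDK owns the queues):
    watch -n 1 "ethtool -S _f1 | egrep 'out_of_buffer|discard|drop'"

If those stay at zero while the gap between rx_vport_unicast_packets and rx_good_packets grows, the drop is happening somewhere the PMD does not currently count.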
Best regards
Yan Xiaoping

-----Original Message-----
From: Asaf Penso <as...@nvidia.com>
Sent: July 13, 2021 20:36
To: Yan, Xiaoping (NSB - CN/Hangzhou) <xiaoping....@nokia-sbell.com>; users@dpdk.org
Cc: Slava Ovsiienko <viachesl...@nvidia.com>; Matan Azrad <ma...@nvidia.com>; Raslan Darawsheh <rasl...@nvidia.com>
Subject: RE: mlx5 VF packet lost between rx_port_unicast_packets and rx_good_packets

Hello Yan,

Can you please mention which DPDK version you use and whether you see this issue also with the latest upstream version?

Regards,
Asaf Penso

>-----Original Message-----
>From: users <users-boun...@dpdk.org> On Behalf Of Yan, Xiaoping (NSB - CN/Hangzhou)
>Sent: Monday, July 5, 2021 1:08 PM
>To: users@dpdk.org
>Subject: [dpdk-users] mlx5 VF packet lost between rx_port_unicast_packets and rx_good_packets
>
>Hi,
>
>When doing a traffic loopback test on a mlx5 VF, we found there is some packet loss (not all packets are received back).
>
>From the xstats counters, I found that all packets have been received in rx_port_unicast_packets, but rx_good_packets shows a lower count, and rx_port_unicast_packets - rx_good_packets = lost packets, i.e. packets are lost between rx_port_unicast_packets and rx_good_packets.
>But I cannot find any other counter indicating where exactly those packets are lost.
>
>Any idea?
>
>Attached are the counter logs (bf is before the test, af is after the test, fp-cli dpdk-port-stats is the command used to get xstats, and ethtool -S _f1 (the VF used) is also printed).
>The test equipment reports that it sends 2911176 packets, receives 2909474, dropped 1702.
>The xstats (after - before) show rx_port_unicast_packets 2911177 and rx_good_packets 2909475, so the drop (2911177 - rx_good_packets) is 1702.
>
>BTW, I also noticed the discussion "packet loss between phy and good counter"
>http://mails.dpdk.org/archives/users/2018-July/003271.html
>but my case seems to be different, as the packets are also received in rx_port_unicast_packets, and I checked the counters from the PF (ethtool -S ens1f0 in the attached log): rx_discards_phy is not increasing.
>
>Thank you.
>
>Best regards
>Yan Xiaoping
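For completeness, this is a minimal sketch of how the before/after delta mentioned above can be computed from two counter snapshots (the file names bf.txt and af.txt are hypothetical stand-ins for the attached "bf"/"af" ethtool -S logs):

    # Hypothetical file names: bf.txt and af.txt hold "ethtool -S <vf>" output
    # taken before and after the test; print every counter whose value changed.
    awk -F: 'NR==FNR {before[$1]=$2; next}
             $1 in before && $2+0 != before[$1]+0 {
                 printf "%-40s %15d\n", $1, $2 - before[$1]
             }' bf.txt af.txt

Run against the attached logs, the only counters expected to differ by the lost-packet count are rx_port_unicast_packets and rx_good_packets, which is exactly the gap discussed in this thread.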