Hello Gerry and Tom,
We are aware of this issue and have already submitted a fix for 21.05, with
stable CCed.
Please check this series from Matan Azrad and let me know the results for your
cases:
[PATCH 0/4] net/mlx5: fix imissed statistic
The imissed port statistic counts packets that were dropped by the device Rx
queues.
In mlx5, the imissed counter is the sum of 2 counters:
- packets dropped by the SW queue handling, counted by SW.
- packets dropped by the HW queues due to "out of buffer" events
detected when no SW buffer is available for the incoming
packets.
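As a minimal sketch of the sum described above (the struct and function names
are hypothetical and are not taken from the mlx5 driver), imissed can be viewed
as:

```c
#include <stdint.h>

/* Hypothetical container for the two drop sources listed above. */
struct imissed_parts {
    uint64_t sw_dropped;        /* dropped by the SW queue handling, counted by SW */
    uint64_t hw_out_of_buffer;  /* dropped by HW on "out of buffer" events */
};

/* imissed as reported to the application: the sum of both parts. */
uint64_t imissed_total(const struct imissed_parts *parts)
{
    return parts->sw_dropped + parts->hw_out_of_buffer;
}
```

If the HW part is never updated, imissed reflects only the SW drops, which is
the symptom reported in this thread.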
There is a HW counter object that should be created per device, and all the Rx
queues should be assigned to this counter at configuration time.
This part was missed when the Rx queues were created by DevX, which left the
"out of buffer" counter at zero forever in this case.
Add 2 options to assign the DevX Rx queues to a queue counter:
- Create a queue counter per device by DevX and assign all the
queues to it.
- Query the kernel counter and assign all the queues to it.
Use the first option by default and, if it fails, fall back to the second
option.
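The fallback order above can be sketched as follows. This is only an
illustration of the selection logic, not the code in the series: the callback
type, the enum, and the stub functions are all hypothetical stand-ins for the
real DevX and kernel operations.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: returns 0 and stores a counter ID on success,
 * or a negative value on failure. */
typedef int (*queue_counter_fn)(uint32_t *counter_id);

enum queue_counter_origin {
    QCNT_NONE,        /* neither option worked */
    QCNT_FROM_DEVX,   /* option 1: counter created by DevX */
    QCNT_FROM_KERNEL  /* option 2: counter queried from the kernel */
};

/* Try the DevX option first; only if it fails, try the kernel option. */
enum queue_counter_origin
pick_queue_counter(queue_counter_fn create_by_devx,
                   queue_counter_fn query_from_kernel,
                   uint32_t *counter_id)
{
    if (create_by_devx != NULL && create_by_devx(counter_id) == 0)
        return QCNT_FROM_DEVX;
    if (query_from_kernel != NULL && query_from_kernel(counter_id) == 0)
        return QCNT_FROM_KERNEL;
    return QCNT_NONE;
}

/* Example stand-ins for the two real operations (not driver code). */
int stub_devx_ok(uint32_t *counter_id)   { *counter_id = 1; return 0; }
int stub_devx_fail(uint32_t *counter_id) { (void)counter_id; return -1; }
int stub_kernel_ok(uint32_t *counter_id) { *counter_id = 2; return 0; }
```

Whichever option succeeds, all the Rx queues would then be assigned to the
returned counter ID.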
Matan Azrad (4):
common/mlx5/linux: add glue function to query WQ
common/mlx5: add DevX command to query WQ
common/mlx5: add DevX commands for queue counters
net/mlx5: fix imissed statistics
Regards,
Asaf Penso
>-----Original Message-----
>From: users <[email protected]> On Behalf Of Tom Barbette
>Sent: Tuesday, April 13, 2021 4:40 PM
>To: Gerry Wan <[email protected]>; [email protected]
>Cc: Matan Azrad <[email protected]>; Shahaf Shuler <[email protected]>;
>Slava Ovsiienko <[email protected]>
>Subject: Re: [dpdk-users] mlx5: packets lost between good+discard and phy
>counters
>
>CC-ing maintainers.
>
>I observed that too. rx_out_of_buffer has always been 0 for a few months (I
>did not personally try to revert versions as Gerry did; I assume it was indeed
>caused by a DPDK update, as Gerry verified).
>
>
>Tom
>
>On 11-04-21 at 03:31, Gerry Wan wrote:
>> After further investigation, I think this may be a bug introduced in
>> DPDK v20.11, where these "lost" packets should be counted as
>> "rx_out_of_buffer" and "rx_missed_errors". On v20.08 both of these
>> counters increment, but on v20.11 and v21.02 these counters always
>> remain 0.
>>
>> Any workarounds for this? This is an important statistic for my use case.
>>
>> On Fri, Apr 2, 2021 at 5:03 PM Gerry Wan <[email protected]> wrote:
>>
>>> I have a simple forwarding experiment using a mlx5 NIC directly
>>> connected to a generator. I am noticing that at high enough
>>> throughput, rx_good_packets + rx_phy_discard_packets may not equal
>>> rx_phy_packets. Where are these packets being dropped?
>>>
>>> Below is an example xstats where I receive at almost the limit of
>>> what my application can handle with no loss. It shows
>>> rx_phy_discard_packets is 0 but the number actually received by the
>>> CPU is less than rx_phy_packets. rx_out_of_buffer and other errors
>>> are also 0.
>>>
>>> I have disabled Ethernet flow control via rte_eth_dev_flow_ctrl_set
>>> with mode = RTE_FC_NONE, if that matters.
>>>
>>> {
>>> "rx_good_packets": 319992439,
>>> "tx_good_packets": 0,
>>> "rx_good_bytes": 19199546340,
>>> "tx_good_bytes": 0,
>>> "rx_missed_errors": 0,
>>> "rx_errors": 0,
>>> "tx_errors": 0,
>>> "rx_mbuf_allocation_errors": 0,
>>> "rx_q0_packets": 319992439,
>>> "rx_q0_bytes": 19199546340,
>>> "rx_q0_errors": 0,
>>> "rx_wqe_errors": 0,
>>> "rx_unicast_packets": 319999892,
>>> "rx_unicast_bytes": 19199993520,
>>> "tx_unicast_packets": 0,
>>> "tx_unicast_bytes": 0,
>>> "rx_multicast_packets": 0,
>>> "rx_multicast_bytes": 0,
>>> "tx_multicast_packets": 0,
>>> "tx_multicast_bytes": 0,
>>> "rx_broadcast_packets": 0,
>>> "rx_broadcast_bytes": 0,
>>> "tx_broadcast_packets": 0,
>>> "tx_broadcast_bytes": 0,
>>> "tx_phy_packets": 0,
>>> "rx_phy_packets": 319999892,
>>> "rx_phy_crc_errors": 0,
>>> "tx_phy_bytes": 0,
>>> "rx_phy_bytes": 20479993088,
>>> "rx_phy_in_range_len_errors": 0,
>>> "rx_phy_symbol_errors": 0,
>>> "rx_phy_discard_packets": 0,
>>> "tx_phy_discard_packets": 0,
>>> "tx_phy_errors": 0,
>>> "rx_out_of_buffer": 0,
>>> "tx_pp_missed_interrupt_errors": 0,
>>> "tx_pp_rearm_queue_errors": 0,
>>> "tx_pp_clock_queue_errors": 0,
>>> "tx_pp_timestamp_past_errors": 0,
>>> "tx_pp_timestamp_future_errors": 0,
>>> "tx_pp_jitter": 0,
>>> "tx_pp_wander": 0,
>>> "tx_pp_sync_lost": 0,
>>> }
>>>
>>>
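
For reference, the gap in the xstats quoted above can be computed directly from
the three counters named in the original question. This is a minimal sketch
that assumes all three values come from one consistent snapshot; the function
name is hypothetical.

```c
#include <stdint.h>

/* Packets seen at the physical port that are neither delivered to the
 * application nor reported as discarded at the physical layer.
 * Returns 0 if the accounted packets already cover rx_phy_packets. */
uint64_t unaccounted_rx_packets(uint64_t rx_phy_packets,
                                uint64_t rx_good_packets,
                                uint64_t rx_phy_discard_packets)
{
    uint64_t accounted = rx_good_packets + rx_phy_discard_packets;

    return rx_phy_packets > accounted ? rx_phy_packets - accounted : 0;
}
```

With the quoted values (rx_phy_packets = 319999892, rx_good_packets =
319992439, rx_phy_discard_packets = 0) the result is 7453 packets, which is the
amount that rx_out_of_buffer and rx_missed_errors fail to report.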