> On Feb 13, 2024, at 7:12 AM, Slava Ovsiienko <[email protected]> wrote:
> 
> Hi,
> 
> Regarding "dev_out_of_buffer": it is a global counter that relates to the
> whole device port, including queues not managed by the DPDK application.
> Mellanox/Nvidia NICs operate in "bifurcated mode", so there might be queues
> managed by the kernel or by another DPDK application. I am not sure exposing
> it makes a lot of sense, but I have no strong objections.
These are still helpful for debugging in a lab environment. However, it would
be good to document them.

> 
> The PCI related counters are also global ones and reflect statistics
> impacted by the PCI activity of the whole physical device, including all the
> network ports located on the same NIC board (and, sometimes, by internal
> activity in BlueField).
> 
> As I said, no objections from my side:
> 
> Acked-by: Viacheslav Ovsiienko <[email protected]>
> 
> With best regards,
> Slava
> 
>> -----Original Message-----
>> From: Wathsala Vithanage <[email protected]>
>> Sent: Friday, February 9, 2024 10:42 PM
>> To: NBU-Contact-Thomas Monjalon (EXTERNAL) <[email protected]>;
>> Dariusz Sosnowski <[email protected]>; Slava Ovsiienko
>> <[email protected]>; Ori Kam <[email protected]>; Suanming Mou
>> <[email protected]>; Matan Azrad <[email protected]>
>> Cc: [email protected]; [email protected]; Wathsala Vithanage
>> <[email protected]>; Honnappa Nagarahalli
>> <[email protected]>
>> Subject: [PATCH] net/mlx5: enable PCI related counters
>> 
>> Versions of Mellanox NICs starting from CX5 have device counters related to PCI.
>> These counters are helpful in debugging IO bottlenecks. For instance, the
>> outbound_pci_stalled_rd and outbound_pci_stalled_wr counters can help with
>> identifying NIC stalls due to insufficient PCI credits, which would otherwise
>> have required a PCI analyzer or a sophisticated PCI root port with a PMU.
>> Currently none of these are available in the MLX5 PMD even though ethtool is
>> capable of reading some of them.
>> Since the PMD uses the same ioctl as ethtool (SIOCETHTOOL) and reads via the
>> kernel driver, it is possible to add support with ease.
>> There is one more PCI related counter and a device counter that aren't
>> implemented in the Linux driver at the moment. These two are named
>> outbound_pci_buffer_overflow and dev_out_of_buffer respectively. As per
>> Nvidia's documentation, these two counters report the number of packets
>> dropped due to PCI buffer overflow and the number of times the device-owned
>> queue did not have enough buffers allocated.
>> 
>> Signed-off-by: Wathsala Vithanage <[email protected]>
>> Reviewed-by: Honnappa Nagarahalli <[email protected]>
>> ---
>> .mailmap                                |  1 +
>> drivers/net/mlx5/linux/mlx5_ethdev_os.c | 33 +++++++++++++++++++++++++
>> 2 files changed, 34 insertions(+)
>> 
>> diff --git a/.mailmap b/.mailmap
>> index aa569ff456..f57415f7a1 100644
>> --- a/.mailmap
>> +++ b/.mailmap
>> @@ -1510,6 +1510,7 @@ Walter Heymans <[email protected]>
>> Wang Sheng-Hui <[email protected]>
>> Wangyu (Eric) <[email protected]>
>> Waterman Cao <[email protected]>
>> +Wathsala Vithanage <[email protected]>
>> Weichun Chen <[email protected]>
>> Wei Dai <[email protected]>
>> Weifeng Li <[email protected]>
>> diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
>> index dd5a0c546d..8f1567f6a7 100644
>> --- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
>> +++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
>> @@ -1574,6 +1574,39 @@ static const struct mlx5_counter_ctrl mlx5_counters_init[] = {
>>  		.dpdk_name = "tx_vport_bytes",
>>  		.ctr_name = "vport_tx_bytes",
>>  	},
>> +	/* Device counters */
>> +	{
>> +		.dpdk_name = "rx_pci_signal_integrity",
>> +		.ctr_name = "rx_pci_signal_integrity",
>> +	},
>> +	{
>> +		.dpdk_name = "tx_pci_signal_integrity",
>> +		.ctr_name = "tx_pci_signal_integrity",
>> +	},
>> +	{
>> +		.dpdk_name = "outbound_pci_buffer_overflow",
>> +		.ctr_name = "outbound_pci_buffer_overflow",
>> +	},
>> +	{
>> +		.dpdk_name = "outbound_pci_stalled_rd",
>> +		.ctr_name = "outbound_pci_stalled_rd",
>> +	},
>> +	{
>> +		.dpdk_name = "outbound_pci_stalled_wr",
>> +		.ctr_name = "outbound_pci_stalled_wr",
>> +	},
>> +	{
>> +		.dpdk_name = "outbound_pci_stalled_rd_events",
>> +		.ctr_name = "outbound_pci_stalled_rd_events",
>> +	},
>> +	{
>> +		.dpdk_name = "outbound_pci_stalled_wr_events",
>> +		.ctr_name = "outbound_pci_stalled_wr_events",
>> +	},
>> +	{
>> +		.dpdk_name = "dev_out_of_buffer",
>> +		.ctr_name = "dev_out_of_buffer",
>> +	},
>>  };
>> 
>> static const unsigned int xstats_n = RTE_DIM(mlx5_counters_init);
>> --
>> 2.25.1
> 
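
For readers following the patch: each table entry pairs a DPDK-facing name
with the name the kernel driver reports. The sketch below illustrates one way
such a table could be matched against a list of kernel counter names. It is
not the PMD's code. The struct counter_ctrl type, the map_counters() helper
and the three-entry table are hypothetical and only mirror the two fields
visible in the diff. A counter the kernel does not report (as the commit
message says of outbound_pci_buffer_overflow and dev_out_of_buffer) is
assumed here to simply stay unmatched.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the two fields of mlx5_counter_ctrl shown in the patch. */
struct counter_ctrl {
	const char *dpdk_name; /* name exposed to the application */
	const char *ctr_name;  /* name reported by the kernel driver */
};

/* A short excerpt of the entries added by the patch. */
static const struct counter_ctrl counters[] = {
	{ .dpdk_name = "outbound_pci_stalled_rd", .ctr_name = "outbound_pci_stalled_rd" },
	{ .dpdk_name = "outbound_pci_stalled_wr", .ctr_name = "outbound_pci_stalled_wr" },
	{ .dpdk_name = "dev_out_of_buffer", .ctr_name = "dev_out_of_buffer" },
};

#define N_COUNTERS (sizeof(counters) / sizeof(counters[0]))

/*
 * For every table entry, store the position of the matching kernel counter
 * name in idx_out[], or -1 when the kernel does not report that counter.
 * idx_out must have room for N_COUNTERS elements.
 */
static void
map_counters(const char *const kernel_names[], size_t n_kernel, int idx_out[])
{
	assert(kernel_names != NULL && idx_out != NULL);

	for (size_t i = 0; i < N_COUNTERS; i++) {
		idx_out[i] = -1;
		for (size_t j = 0; j < n_kernel; j++) {
			if (strcmp(counters[i].ctr_name, kernel_names[j]) == 0) {
				idx_out[i] = (int)j;
				break;
			}
		}
	}
}
```

With this layout, the value of a matched counter would be read from the
kernel's statistics array at the recorded index, and an entry left at -1
would be skipped.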
