Re: Question: DPDK thread model

2022-01-16 Thread Stephen Hemminger
On Sun, 16 Jan 2022 16:27:11 -0500
fwefew 4t4tg <7532ya...@gmail.com> wrote:

> I completed code to initialize an AWS ENA adapter with RX, TX queues. With
> this work in hand, DPDK creates one thread pinned to the right core as per
> the --lcores argument. So far so good.
> The DPDK documentation and example code is fairly clear here.

Look at examples/l3fwd

> 
> What's not as clear is how RX packets are handled. As far as I can tell the
> canonical way to deal with RX packets is running 'rte_eth_add_rx_callback'
> for each RXQ. This allows one to process each received packet (for a given
> RXQ) via a provided callback in the same lcore/hardware-thread that DPDK
> created for me. As such, there is no need to create additional threads.
> Correct?

rte_eth_add_rx_callback is not the droid you are looking for!
Applications use rte_eth_rx_burst to get packets. Most applications
use RSS and/or multiple queues so that each thread reads one or more
receive queues.
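A minimal sketch of that pattern (not code from this thread; the port id, the lcore-to-queue mapping, and the burst size are illustrative assumptions): each lcore polls its own receive queue with rte_eth_rx_burst() and processes the mbufs in place.

```c
#include <rte_ethdev.h>
#include <rte_lcore.h>
#include <rte_mbuf.h>

#define BURST_SIZE 32

/* Poll loop launched on each worker lcore (e.g. via
 * rte_eal_remote_launch()). One RX queue per lcore; the mapping
 * scheme below is an assumption for illustration only. */
static int
rx_loop(void *arg)
{
    uint16_t port_id = 0; /* illustrative */
    uint16_t queue_id = (uint16_t)rte_lcore_index(rte_lcore_id());
    struct rte_mbuf *bufs[BURST_SIZE];

    (void)arg;
    for (;;) {
        uint16_t nb = rte_eth_rx_burst(port_id, queue_id, bufs, BURST_SIZE);
        for (uint16_t i = 0; i < nb; i++) {
            /* Process bufs[i] in place: the mbuf data area is the
             * buffer the NIC DMA'd into, so no copy is involved. */
            rte_pktmbuf_free(bufs[i]);
        }
    }
    return 0;
}
```

With RSS configured, the NIC spreads flows across the queues, so each lcore sees a disjoint share of the traffic without any locking.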


> Furthermore, I hope the mbufs the callback gets somehow correspond to mbufs
> associated with the RX descriptors provided to the RXQs so there's no need
> for copying packets after the NIC receives them before the callback acts on
> it. As far as I can tell, this hope is ill-founded. A lot of DPDK code I've seen
> allocates more mbufs per RXQ than the number of RX descriptors. To me this
> seems to imply that DPDK's RXQ threads copy the packets received off the
> wire for delivery to app code.

You are digging in the wrong place.


> TX is less clear to me.
> 
> For TX there seems to be no way to transmit packets (burst or otherwise)
> without creating another thread --- that is, another thread beyond what
> DPDK makes for me. This other thread must at the appropriate time
> prepare mbufs and call 'rte_eth_tx_burst' on the correct TXQ. DPDK seems to
> want to keep its thread for its own work. Yes, DPDK provides
> 'rte_eth_add_tx_callback', but that only runs after the mbufs have been
> created and queued for transmission, i.e., after the fact. Putting this
> together, DPDK requires me to create new threads, unlike RX. Correct?
> 
> While creating additional threads for TX is not the end of the world, I do
> not want the DPDK TX thread to copy mbufs; I want zero-copy. Here, then, I
> gather DPDK's TXQ thread takes the mbufs the helper TX thread provides in
> the 'rte_eth_tx_burst' call and attaches them to the TXQ's descriptors so
> they go out on the wire without copying. Is this correct?
> 
> Now, it's worth pointing out here that 'rte_eth_tx_queue_setup' unlike the
> RX equivalent does not accept a mempool.  So in addition to the above
> points, those additional TX helper threads (those which call
> rte_eth_tx_burst) will need to arrange for their own mempools. That's not
> hard to do, but I just want confirmation.

The common way to handle transmit is to have one transmit queue per thread.
You don't want to be creating your own threads; those threads would not end up
being part of DPDK, and that creates lots of issues.
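As a hedged illustration of that model (not code from this thread; the port and queue ids are assumptions), each lcore owns one TX queue exclusively and calls rte_eth_tx_burst() itself, so no extra non-EAL threads and no locking are needed:

```c
#include <rte_ethdev.h>
#include <rte_mbuf.h>

/* Each lcore owns `queue_id` exclusively. rte_eth_tx_burst() may
 * accept fewer packets than offered when the descriptor ring is
 * full; the unsent mbufs remain the caller's responsibility and
 * must be freed (or retried). */
static void
send_burst(uint16_t port_id, uint16_t queue_id,
           struct rte_mbuf **pkts, uint16_t n)
{
    uint16_t sent = rte_eth_tx_burst(port_id, queue_id, pkts, n);
    while (sent < n)
        rte_pktmbuf_free(pkts[sent++]);
}
```

The same lcore that polls its RX queue typically also drives its TX queue, interleaving the two in one run-to-completion loop.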



Question: DPDK thread model

2022-01-16 Thread fwefew 4t4tg
I completed code to initialize an AWS ENA adapter with RX, TX queues. With
this work in hand, DPDK creates one thread pinned to the right core as per
the --lcores argument. So far so good.
The DPDK documentation and example code is fairly clear here.

What's not as clear is how RX packets are handled. As far as I can tell the
canonical way to deal with RX packets is running 'rte_eth_add_rx_callback'
for each RXQ. This allows one to process each received packet (for a given
RXQ) via a provided callback in the same lcore/hardware-thread that DPDK
created for me. As such, there is no need to create additional threads.
Correct?

Furthermore, I hope the mbufs the callback gets somehow correspond to mbufs
associated with the RX descriptors provided to the RXQs so there's no need
for copying packets after the NIC receives them before the callback acts on
it. As far as I can tell, this hope is ill-founded. A lot of DPDK code I've seen
allocates more mbufs per RXQ than the number of RX descriptors. To me this
seems to imply that DPDK's RXQ threads copy the packets received off the
wire for delivery to app code.

TX is less clear to me.

For TX there seems to be no way to transmit packets (burst or otherwise)
without creating another thread --- that is, another thread beyond what
DPDK makes for me. This other thread must at the appropriate time
prepare mbufs and call 'rte_eth_tx_burst' on the correct TXQ. DPDK seems to
want to keep its thread for its own work. Yes, DPDK provides
'rte_eth_add_tx_callback', but that only runs after the mbufs have been
created and queued for transmission, i.e., after the fact. Putting this
together, DPDK requires me to create new threads, unlike RX. Correct?

While creating additional threads for TX is not the end of the world, I do
not want the DPDK TX thread to copy mbufs; I want zero-copy. Here, then, I
gather DPDK's TXQ thread takes the mbufs the helper TX thread provides in
the 'rte_eth_tx_burst' call and attaches them to the TXQ's descriptors so
they go out on the wire without copying. Is this correct?

Now, it's worth pointing out here that 'rte_eth_tx_queue_setup' unlike the
RX equivalent does not accept a mempool.  So in addition to the above
points, those additional TX helper threads (those which call
rte_eth_tx_burst) will need to arrange for their own mempools. That's not
hard to do, but I just want confirmation.
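For illustration, a sketch of that arrangement (the pool name, sizes, and port/queue ids are assumptions, not values from this thread): the transmitting code creates its own mempool, allocates mbufs from it, and the driver hands each mbuf back to the pool after DMA completes, with no copy.

```c
#include <rte_ethdev.h>
#include <rte_lcore.h>
#include <rte_mbuf.h>
#include <rte_mempool.h>

static struct rte_mempool *tx_pool;

static void
tx_pool_init(void)
{
    /* Pool owned by the transmitting code; rte_eth_tx_queue_setup()
     * itself never sees it. Sizes are illustrative. */
    tx_pool = rte_pktmbuf_pool_create("tx_pool", 8191, 256, 0,
                                      RTE_MBUF_DEFAULT_BUF_SIZE,
                                      rte_socket_id());
}

static void
send_one(uint16_t port_id, uint16_t queue_id)
{
    struct rte_mbuf *m = rte_pktmbuf_alloc(tx_pool);
    if (m == NULL)
        return;
    /* Reserve room for a 64-byte frame and build it in place. */
    char *frame = rte_pktmbuf_append(m, 64);
    (void)frame; /* ... fill in the packet here ... */
    if (rte_eth_tx_burst(port_id, queue_id, &m, 1) == 0)
        rte_pktmbuf_free(m); /* ring full: ownership stays with us */
}
```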

Thanks.


RE: net_mlx5: unable to recognize master/representors on the multiple IB devices

2022-01-16 Thread Asaf Penso
Hello Rocio,
IIRC, there was a fix in a recent stable version.
Would you please try the latest 19.11 LTS and tell us whether you still see
the issue?

Regards,
Asaf Penso

>-----Original Message-----
>From: Thomas Monjalon 
>Sent: Sunday, January 16, 2022 3:24 PM
>To: Rocio Dominguez 
>Cc: users@dpdk.org; Matan Azrad ; Slava Ovsiienko
>; Raslan Darawsheh 
>Subject: Re: net_mlx5: unable to recognize master/representors on the
>multiple IB devices
>
>+Cc mlx5 experts
>
>
>14/01/2022 11:10, Rocio Dominguez:
>> [...]

Re: net_mlx5: unable to recognize master/representors on the multiple IB devices

2022-01-16 Thread Thomas Monjalon
+Cc mlx5 experts


14/01/2022 11:10, Rocio Dominguez:
> [...]

net_mlx5: unable to recognize master/representors on the multiple IB devices

2022-01-16 Thread Rocio Dominguez
Hi,

I'm doing a setup with Mellanox ConnectX-4 (MCX416A-CCA) NICs.

I'm using:

OS SLES 15 SP2
DPDK 19.11.4 (the official supported version for SLES 15 SP2)
MLNX_OFED_LINUX-5.5-1.0.3.2-sles15sp2-x86_64 (the latest one)
Mellanox adapters firmware 12.28.2006 (corresponding to this MLNX_OFED version)
kernel 5.3.18-24.34-default


This is my SRIOV configuration for DPDK capable PCI slots:

{
"resourceName": "mlnx_sriov_netdevice",
"resourcePrefix": "mellanox.com",
"isRdma": true,
"selectors": {
"vendors": ["15b3"],
"devices": ["1014"],
"drivers": ["mlx5_core"],
"pciAddresses": [":d8:00.2", ":d8:00.3", 
":d8:00.4", ":d8:00.5"],
"isRdma": true
}

The sriov device plugin starts without problems, the devices are correctly 
allocated:

{
  "cpu": "92",
  "ephemeral-storage": "419533922385",
  "hugepages-1Gi": "8Gi",
  "hugepages-2Mi": "4Gi",
  "intel.com/intel_sriov_dpdk": "0",
  "intel.com/sriov_cre": "3",
  "mellanox.com/mlnx_sriov_netdevice": "4",
  "mellanox.com/sriov_dp": "0",
  "memory": "183870336Ki",
  "pods": "110"
}

The Mellanox NICs are bound to the kernel driver mlx5_core:

pcgwpod009-c04:~ # dpdk-devbind --status

Network devices using kernel driver
===
:18:00.0 'Ethernet Controller 10G X550T 1563' if=em1 drv=ixgbe 
unused=vfio-pci
:18:00.1 'Ethernet Controller 10G X550T 1563' if=em2 drv=ixgbe 
unused=vfio-pci
:19:00.0 'Ethernet Controller 10G X550T 1563' if=em3 drv=ixgbe 
unused=vfio-pci
:19:00.1 'Ethernet Controller 10G X550T 1563' if=em4 drv=ixgbe 
unused=vfio-pci
:3b:00.0 'MT27700 Family [ConnectX-4] 1013' if=enp59s0f0 drv=mlx5_core 
unused=vfio-pci
:3b:00.1 'MT27700 Family [ConnectX-4] 1013' if=enp59s0f1 drv=mlx5_core 
unused=vfio-pci
:5e:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection 10fb' if=p3p1 
drv=ixgbe unused=vfio-pci
:5e:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection 10fb' if=p3p2 
drv=ixgbe unused=vfio-pci
:5e:10.0 '82599 Ethernet Controller Virtual Function 10ed' if= drv=ixgbevf 
unused=vfio-pci
:5e:10.2 '82599 Ethernet Controller Virtual Function 10ed' if=p3p1_1 
drv=ixgbevf unused=vfio-pci
:5e:10.4 '82599 Ethernet Controller Virtual Function 10ed' if= drv=ixgbevf 
unused=vfio-pci
:5e:10.6 '82599 Ethernet Controller Virtual Function 10ed' if=p3p1_3 
drv=ixgbevf unused=vfio-pci
:af:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection 10fb' if=p4p1 
drv=ixgbe unused=vfio-pci
:af:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection 10fb' if=p4p2 
drv=ixgbe unused=vfio-pci
:d8:00.0 'MT27700 Family [ConnectX-4] 1013' if=enp216s0f0 drv=mlx5_core 
unused=vfio-pci
:d8:00.1 'MT27700 Family [ConnectX-4] 1013' if=enp216s0f1 drv=mlx5_core 
unused=vfio-pci
:d8:00.2 'MT27700 Family [ConnectX-4 Virtual Function] 1014' if=enp216s0f2 
drv=mlx5_core unused=vfio-pci
:d8:00.3 'MT27700 Family [ConnectX-4 Virtual Function] 1014' if=enp216s0f3 
drv=mlx5_core unused=vfio-pci
:d8:00.4 'MT27700 Family [ConnectX-4 Virtual Function] 1014' if=enp216s0f4 
drv=mlx5_core unused=vfio-pci
:d8:00.5 'MT27700 Family [ConnectX-4 Virtual Function] 1014' if=enp216s0f5 
drv=mlx5_core unused=vfio-pci

The interfaces are up:

pcgwpod009-c04:~ # ibdev2netdev -v
:3b:00.0 mlx5_0 (MT4115 - MT1646K01301) CX416A - ConnectX-4 QSFP28 fw 
12.28.2006 port 1 (ACTIVE) ==> enp59s0f0 (Up)
:3b:00.1 mlx5_1 (MT4115 - MT1646K01301) CX416A - ConnectX-4 QSFP28 fw 
12.28.2006 port 1 (ACTIVE) ==> enp59s0f1 (Up)
:d8:00.0 mlx5_2 (MT4115 - MT1646K00538) CX416A - ConnectX-4 QSFP28 fw 
12.28.2006 port 1 (ACTIVE) ==> enp216s0f0 (Up)
:d8:00.1 mlx5_3 (MT4115 - MT1646K00538) CX416A - ConnectX-4 QSFP28 fw 
12.28.2006 port 1 (ACTIVE) ==> enp216s0f1 (Up)
:d8:00.2 mlx5_4 (MT4116 - NA)  fw 12.28.2006 port 1 (ACTIVE) ==> enp216s0f2 
(Up)
:d8:00.3 mlx5_5 (MT4116 - NA)  fw 12.28.2006 port 1 (ACTIVE) ==> enp216s0f3 
(Up)
:d8:00.4 mlx5_6 (MT4116 - NA)  fw 12.28.2006 port 1 (ACTIVE) ==> enp216s0f4 
(Up)
:d8:00.5 mlx5_7 (MT4116 - NA)  fw 12.28.2006 port 1 (ACTIVE) ==> enp216s0f5 
(Up)
pcgwpod009-c04:~ #


But when I run my application the Mellanox adapters are probed and I obtain the 
following error:

{"proc_id":"6"},"message":"[pio] EAL: Probe PCI driver: mlx5_pci (15b3:1014) 
device: :d8:00.4 (socket 1)"}
{"version":"0.2.0","timestamp":"2022-01-14T09:51:39.826+00:00","severity":"info","service_id":"eric-pc-up-data-plane","metadata":{"proc_id":"6"},"message":"[pio]
 net_mlx5: unable to recognize master/representors on the multiple IB devices"}
{"version":"0.2.0","timestamp":"2022-01-14T09:51:39.826+00:00","severity":"info","service_id":"eric-pc-up-data-plane","metadata":{"proc_id":"6"},"message":"[pio]
 common_mlx5: Failed to load driver = net_mlx5."}

Trouble bringing up dpdk testpmd with Mellanox ports

2022-01-16 Thread Sindhura Bandi
Hi,


I'm trying to bring up the dpdk-testpmd application using Mellanox ConnectX-5
ports. With a custom-built DPDK, testpmd is not able to detect the ports.


OS & Kernel:

Linux debian-10 4.19.0-17-amd64 #1 SMP Debian 4.19.194-2 (2021-06-21) x86_64 
GNU/Linux

The steps followed:

  *   Installed MLNX_OFED_LINUX-4.9-4.0.8.0-debian10.0-x86_64 
(./mlnxofedinstall --skip-distro-check --upstream-libs --dpdk)
  *   Downloaded the dpdk-18.11 source, and built it after making the following
changes in config

   CONFIG_RTE_LIBRTE_MLX5_PMD=y
   CONFIG_RTE_TEST_PMD_RECORD_CORE_CYCLES=y
   CONFIG_RTE_BUILD_SHARED_LIB=y

  *   When I run testpmd, it is not recognizing any Mellanox ports


#
root@debian-10:~/dpdk-18.11/myinstall# ./bin/testpmd -l 1-3  -w 82:00.0 
--no-pci -- --total-num-mbufs 1025
EAL: Detected 24 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
testpmd: No probed ethernet devices
testpmd: create a new mbuf pool : n=1025, size=2176, 
socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
testpmd: create a new mbuf pool : n=1025, size=2176, 
socket=1
testpmd: preferred mempool ops selected: ring_mp_mc
Done
No commandline core given, start packet forwarding
io packet forwarding - ports=0 - cores=0 - streams=0 - NUMA support enabled, MP 
allocation mode: native

  io packet forwarding packets/burst=32
  nb forwarding cores=1 - nb forwarding ports=0
Press enter to exit
##

root@debian-10:~# lspci | grep Mellanox
82:00.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 
Ex]
82:00.1 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 
Ex]
root@debian-10:~# ibv_devinfo
hca_id:mlx5_0
transport:InfiniBand (0)
fw_ver:16.28.4512
node_guid:b8ce:f603:00f2:7952
sys_image_guid:b8ce:f603:00f2:7952
vendor_id:0x02c9
vendor_part_id:4121
hw_ver:0x0
board_id:DEL04
phys_port_cnt:1
port:1
state:PORT_ACTIVE (4)
max_mtu:4096 (5)
active_mtu:1024 (3)
sm_lid:0
port_lid:0
port_lmc:0x00
link_layer:Ethernet

hca_id:mlx5_1
transport:InfiniBand (0)
fw_ver:16.28.4512
node_guid:b8ce:f603:00f2:7953
sys_image_guid:b8ce:f603:00f2:7952
vendor_id:0x02c9
vendor_part_id:4121
hw_ver:0x0
board_id:DEL04
phys_port_cnt:1
port:1
state:PORT_ACTIVE (4)
max_mtu:4096 (5)
active_mtu:1024 (3)
sm_lid:0
port_lid:0
port_lmc:0x00
link_layer:Ethernet


I'm not sure where I'm going wrong. Any hints will be much appreciated.

Thanks,
Sindhu


Potential bug with the X540-AT2 driver (upon missed packets), DPDK 21.11

2022-01-16 Thread Vinay Purohit
Summary: We’ve noticed that the PMD for the X540-AT2 does not recover
properly after it reports dropped packets (resulting in increments of
*rte_eth_stats.imissed*) due to CPU overload. The driver then continues to
drop packets even when the CPU load subsides. A restart of just the driver
(with no other changes) fixes the issue. Other drivers (such as the I350
1GE PMD) do not exhibit this behavior under identical traffic conditions
and everything else being the same, causing us to suspect a bug in the
X540 PMD.



Here are more details:



We have a very simple application using DPDK 21.11 running on an x86_64
platform running CentOS 8. Application receives 64byte packets at 1Gbps
from port 0 of the X540-AT2 card, does some processing, and transmits
packets over port 1 of the same card. Everything’s ok when the CPU load is
moderate. When the processing load saturates the CPU core, the *imissed*
count increments (as expected) as the PMD cannot keep up with the received
packets. The real issue is that the driver continues to miss packets and
increments *imissed* *even after the CPU load subsides to levels where
previously it reported no dropped packets.* A restart of the X540 driver
using *rte_eth_dev_stop()* and *rte_eth_dev_start()* fixes the issue.
Here’s the sequence:



   1. CPU core moderately loaded, X540-AT2 PMD reports no missed packets
   (all good).
   2. CPU core saturated, PMD reports missed packets (as expected)
   3. CPU core load subsides and is about the same as the level in item 1
   above, but the PMD continues to drop packets and increments *imissed*
   (strange)
   4. Issue of dropped packets gets fixed after restarting the port driver
   by calling rte_eth_dev_stop() and then rte_eth_dev_start() with no other
   changes and no restart of the overall process/thread/application.
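The workaround in step 4 can be sketched as follows (a hedged example, not the poster's actual code; note that in DPDK 21.11 rte_eth_dev_stop() returns an int, unlike older releases where it returned void):

```c
#include <rte_ethdev.h>

/* Restart only the affected port; the rest of the application keeps
 * running and no other configuration is changed. */
static int
restart_port(uint16_t port_id)
{
    int ret = rte_eth_dev_stop(port_id);
    if (ret != 0)
        return ret;
    return rte_eth_dev_start(port_id);
}
```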



The above behavior is not seen with other drivers, i.e., packet drops stop
upon mitigation of the CPU load level.



Has anyone else seen the above issue with the X540-AT2 card?



Thanks,

Vinay Purohit

CloudJuncxion, Inc.

