Hi Fabio,

How about running
find /sys/kernel/iommu_groups/ -type l
to identify devices that are in the same IOMMU group as 0000:09:00.0 ?

Thank you.

On Fri, 2 Aug 2024, Fabio Fernandes wrote:

Hi Ivan,

I'm using igb_uio because it's the one recommended for my target network card 
net/ena.

I've tried both vfio-pci and uio_pci_generic, but they fail for different 
reasons.

With vfio-pci, EAL tells me:

```
EAL: PCI device 0000:09:00.0 on NUMA socket -1
EAL:   probe driver: 8086:15f3 net_igc
EAL: 0000:09:00.0 VFIO group is not viable! Not all devices in IOMMU group 
bound to VFIO or unbound
EAL: Requested device 0000:09:00.0 cannot be used
```

I tried adding kernel boot parameter `iommu=on` with no luck.
I also tried unbinding my other cards:

```
Network devices using DPDK-compatible driver
============================================
0000:09:00.0 'Ethernet Controller I225-V 15f3' drv=vfio-pci 
unused=igc,uio_pci_generic

Other Network devices
=====================
0000:08:00.0 'MT7922 802.11ax PCI Express Wireless Network Adapter 0616' 
unused=mt7921e,vfio-pci,uio_pci_generic
0000:0a:00.0 'AQtion AQC113CS NBase-T/IEEE 802.3an Ethernet Controller [Antigua 
10G] 94c0' unused=atlantic,vfio-pci,uio_pci_generic
```

The result is rte_eth_dev_count_total() == 0, so nothing starts.


Finally, I also tried `uio_pci_generic`:

```
Network devices using DPDK-compatible driver
============================================
0000:09:00.0 'Ethernet Controller I225-V 15f3' drv=uio_pci_generic 
unused=igc,vfio-pci
```

This time DPDK accepts the device; however, the same dmesg error as before 
appears again:

```
[ 1449.570184] uio_pci_generic 0000:09:00.0: AMD-Vi: Event logged 
[IO_PAGE_FAULT domain=0x0011 address=0x13397ff80 flags=0x0000]
```

If you do have any further suggestions, please let me know.
In any case, thank you for your feedback so far!

Regards,
Fabio



On Thursday, August 1st, 2024 at 10:16 PM, Ivan Malov 
<ivan.ma...@arknetworks.am> wrote:

Hi Fabio,

With regard to endianness conversion, I'd rather expect that line
to be something like rte_le_to_cpu_32 as the source value is
declared __le32. But, as I noted before, this is likely a
don't care as your machine is probably little-endian, and
rte_cpu_to_le_32 thus might simply do nothing.
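
To illustrate why the choice between the two conversions makes no practical
difference here, below is a minimal, portable sketch. The sketch_* helpers are
hypothetical stand-ins written for illustration; they are not the DPDK macros.
On any given host both directions perform the same operation (identity on
little-endian, byte swap on big-endian).

```c
#include <stdint.h>
#include <string.h>

/* Decode a 32-bit little-endian field into host byte order. */
static uint32_t
sketch_le32_to_cpu(uint32_t le)
{
	uint8_t b[4];

	memcpy(b, &le, sizeof(b));
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Encode a host-order value as a 32-bit little-endian field. */
static uint32_t
sketch_cpu_to_le32(uint32_t host)
{
	uint8_t b[4] = {
		(uint8_t)host, (uint8_t)(host >> 8),
		(uint8_t)(host >> 16), (uint8_t)(host >> 24)
	};
	uint32_t le;

	memcpy(&le, b, sizeof(le));
	return le;
}
```

Because the two helpers return the same value for the same input on a given
host, using the "wrong" direction reads oddly but does not change the result.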

Your observation of the error in dmesg, however, is indeed a
valuable clue. Since it comes from igb_uio, my question is:
why use igb_uio at all? It is said to be an outdated driver.
Have you considered using vfio-pci or uio_pci_generic
instead? I suggest you try binding to vfio-pci and
re-check with unmodified PMD source first.

Thank you.

On Thu, 1 Aug 2024, Fabio Fernandes wrote:

Hi Ivan,

Thank you for your response.

I ran it with the flags you suggested and attached the log it produced.

{ sudo ./dpdk-testpmd --log-level=pmd.net.igc,debug 2>&1; } > testpmd_with_debug_and_rx_print.log;

testpmd_with_debug_and_rx_print.log.zip

However, the driver never reaches point [1] (nor [2]), and the debug line is 
never logged. I've placed breakpoints to confirm that the loop always exits 
just before [1], at this check:
`if (!(staterr & IGC_RXD_STAT_DD)) break;`
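
To make the effect of that check concrete, here is a simplified sketch of a
receive loop that stops at the first descriptor whose "descriptor done" bit is
clear. SKETCH_RXD_STAT_DD is assumed to mirror IGC_RXD_STAT_DD, and the
function is hypothetical; it is not the driver's code. If the NIC never
writes a completed descriptor back to host memory, every call returns 0.

```c
#include <stdint.h>

/* Assumed to mirror IGC_RXD_STAT_DD; verify against the driver headers. */
#define SKETCH_RXD_STAT_DD 0x01u

/*
 * Count completed descriptors starting at 'tail', wrapping around the ring,
 * and stop at the first one whose DD bit is not set. 'staterr' holds the
 * status words already converted to host byte order.
 */
static uint16_t
sketch_count_done(const uint32_t *staterr, uint16_t nb_desc,
		  uint16_t tail, uint16_t max_pkts)
{
	uint16_t nb_rx = 0;

	while (nb_rx < max_pkts && nb_rx < nb_desc) {
		uint16_t idx = (uint16_t)((tail + nb_rx) % nb_desc);

		if (!(staterr[idx] & SKETCH_RXD_STAT_DD))
			break; /* hardware has not completed this descriptor */
		nb_rx++;
	}
	return nb_rx;
}
```

With a ring whose status words are all zero, the function returns 0 on the
first iteration, which matches the behaviour seen in the debugger.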

I've also instrumented testpmd.h as below, to confirm in the log file that RX 
is called many times and never returns anything but zeros:
```
static inline uint16_t
common_fwd_stream_receive(struct fwd_stream *fs, struct rte_mbuf **burst,
	unsigned int nb_pkts)
{
	uint16_t nb_rx;

	nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue, burst, nb_pkts);

	// Instrumentation Begin
	{
		/* Running totals across all calls to this function. */
		static uint64_t g_call_count = 0;
		static uint64_t g_rx_sum = 0;
		g_rx_sum += nb_rx;
		++g_call_count;
		/* Log every non-empty burst. */
		if (nb_rx)
			fprintf(stderr, "rte_eth_rx_burst: %u\n", nb_rx);
		/* Log a periodic summary so the file shows RX is being polled. */
		if ((g_call_count % 100000000UL) == 0)
			fprintf(stderr, "g_rx_sum: %lu, g_call_count: %lu\n",
				g_rx_sum, g_call_count);
	}
	// Instrumentation End

	if (record_burst_stats)
		fs->rx_burst_stats.pkt_burst_spread[nb_rx]++;
	fs->rx_packets += nb_rx;
	return nb_rx;
}
```

Regarding [3], I changed it to use rte_cpu_to_be_32() instead and rebuilt 
DPDK, but got the same results: the loop still always exits there.

I did, however, notice something strange, and this is probably a clue:

Every time I step over this line of `igc_rx_init()` in the debugger:
https://github.com/DPDK/dpdk/blob/v24.03/drivers/net/igc/igc_txrx.c#L1204

`IGC_WRITE_REG(hw, IGC_RDT(rxq->reg_idx), rxq->nb_rx_desc - 1);`

I get this in the kernel log (`dmesg`), coming from the igb_uio kernel module 
that I've bound to the device I'm testing:

`[26185.005945] igb_uio 0000:09:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0011 address=0x116141a00 flags=0x0000]`

The address matches this in the debugger:

```
rxq->rx_ring_phys_addr
$1 = 0x116141a00

rxq->reg_idx
0

rxq->nb_rx_desc
1024
```
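
The comparison above can also be done mechanically. Below is a minimal sketch
that pulls the address out of an AMD-Vi event line and checks whether it falls
inside the RX descriptor ring. The 16-byte descriptor size is an assumption
about the advanced RX descriptor layout, and both helper names are
hypothetical.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed size in bytes of one advanced RX descriptor. */
#define SKETCH_RX_DESC_SIZE 16u

/* Parse the hex value after "address=" in a dmesg line; 0 on success. */
static int
sketch_parse_fault_addr(const char *line, uint64_t *addr)
{
	const char *key = "address=";
	const char *p = strstr(line, key);
	unsigned long long v;

	if (p == NULL || sscanf(p + strlen(key), "%llx", &v) != 1)
		return -1;
	*addr = (uint64_t)v;
	return 0;
}

/* Return 1 if 'addr' lies in [base, base + nb_desc * descriptor size). */
static int
sketch_addr_in_ring(uint64_t addr, uint64_t base, uint16_t nb_desc)
{
	return addr >= base &&
	       addr - base < (uint64_t)nb_desc * SKETCH_RX_DESC_SIZE;
}
```

For the values above, the fault address equals the ring base itself, so one
plausible reading is that the device's very first descriptor access is the
DMA that the IOMMU rejects.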

What do you think?

For more info, I'm on this exact DPDK commit:
commit eeb0605f118dae66e80faa44f7b3e88748032353 (HEAD -> v23.11, tag: v23.11)

Thanks,
Fabio


On Thursday, August 1st, 2024 at 3:24 PM, Ivan Malov ivan.ma...@arknetworks.am 
wrote:

Hi Fabio,

Have you tried specifying the EAL option --log-level="pmd.net.igc,debug"
or --log-level='.*',8 when running the application? Doing so
may trigger printouts [1], [2]. See whether you can observe those.

Perhaps consider posting a brief excerpt of your code where
rte_eth_rx_burst() is invoked and return value is verified.

Also, albeit unrelated, it's rather peculiar that the code
does CPU-to-LE conversion [3] of descriptor status, but
the field itself is declared as __le32 already: [4].

[1] https://github.com/DPDK/dpdk/blob/v24.03/drivers/net/igc/igc_txrx.c#L296
[2] https://github.com/DPDK/dpdk/blob/v24.03/drivers/net/igc/igc_txrx.c#L455
[3] https://github.com/DPDK/dpdk/blob/v24.03/drivers/net/igc/igc_txrx.c#L264
[4] https://github.com/DPDK/dpdk/blob/v24.03/drivers/net/igc/base/igc_base.h#L109

Thank you.

On Thu, 1 Aug 2024, Fabio Fernandes wrote:

Hi,

I have an issue with rte_eth_rx_burst() for IGC poll mode driver never 
returning any packets and need some advice.
I have this network port:
09:00.0 Ethernet controller: Intel Corporation Ethernet Controller I225-V (rev 03)

Bound to igb_uio:
Network devices using DPDK-compatible driver
============================================
0000:09:00.0 'Ethernet Controller I225-V 15f3' drv=igb_uio unused=igc

I'm testing this both with testpmd and my own app, which works fine with other 
drivers such as net/ena and net/i40e. I'm using a single RX/TX queue pair with 
default configs, with promiscuous and all-multicast modes enabled via 
rte_eth_promiscuous_enable() and rte_eth_allmulticast_enable().

The device seems to start fine via rte_eth_dev_start(), and rte_eth_stats_get() 
seems to detect inbound packets. Below is the output from testpmd:

Press enter to exiteth_igc_interrupt_action(): Port 0: Link Up - speed 1000 
Mbps - full-duplex

Port 0: link state change event
^CTelling cores to stop...
Waiting for lcores to finish...

---------------------- Forward statistics for port 0 ----------------------
RX-packets: 129 RX-dropped: 800 RX-total: 929
TX-packets: 0 TX-dropped: 0 TX-total: 0
----------------------------------------------------------------------------

+++++++++++++++ Accumulated forward statistics for all ports+++++++++++++++
RX-packets: 129 RX-dropped: 800 RX-total: 929
TX-packets: 0 TX-dropped: 0 TX-total: 0
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Done.

However, rte_eth_rx_burst() never returns anything, neither in testpmd nor in 
my own app.

In my own app, I log both rte_eth_stats_get() and non-zero xstats from 
rte_eth_xstats_get_by_id():

07:02:13.406873186 INF stats.rx : 0
07:02:13.406892616 INF dev_stats.ipackets : 78
07:02:13.406903636 INF dev_stats.opackets : 0
07:02:13.406914166 INF dev_stats.imissed : 0
07:02:13.406924536 INF dev_stats.ierrors : 0
07:02:13.406934116 INF dev_stats.oerrors : 0
07:02:13.406943956 INF dev_stats.rx_nombuf : 0
07:02:13.407247777 INF xstats rx_good_packets : 78
07:02:13.407257147 INF xstats rx_good_bytes : 17205
07:02:13.407265267 INF xstats rx_size_64_packets : 6
07:02:13.407274627 INF xstats rx_size_65_to_127_packets : 31
07:02:13.407285757 INF xstats rx_size_128_to_255_packets : 22
07:02:13.407297537 INF xstats rx_size_256_to_511_packets : 16
07:02:13.407309127 INF xstats rx_size_512_to_1023_packets : 3
07:02:13.407321327 INF xstats rx_broadcast_packets : 8
07:02:13.407331597 INF xstats rx_multicast_packets : 64
07:02:13.407346357 INF xstats rx_total_packets : 78
07:02:13.407355547 INF xstats rx_total_bytes : 17205
07:02:13.407364127 INF xstats rx_sent_to_host_packets : 78
07:02:13.407375347 INF xstats interrupt_assert_count : 1

Still, rte_eth_rx_burst() never returns anything.

It's worthwhile to note that rte_eth_rx_burst() works fine when I, instead of 
net/igc, use net/ena (with ENA card) or net/i40e (Intel x710 card).

The debug log from EAL and net/igc is attached, in case that helps.
There's a warning "igc_rx_init(): forcing scatter mode", but I've already tried 
changing my mbuf sizes so that the warning goes away, and that didn't help either.

Any advice?

Thanks,
Fabio
