I'm using the vfio-pci module with Intel X550-T2 NICs. I believe this means it uses the ixgbe driver? To be honest, I am a bit confused about how drivers are used in DPDK. I am using the first setup that I got working to send and receive packets. Additional tips would be greatly appreciated. After loading the vfio-pci module I run dpdk-devbind.py --bind vfio-pci 65:00.1 and then I just use the standard DPDK API calls in my app. I had been meaning to revisit this once my app was more complete.
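For reference, the bind workflow described above can be sketched as below (the PCI address 65:00.1 is taken from the message; this assumes the IOMMU/VT-d is enabled so vfio-pci can attach). As I understand it, vfio-pci only gives userspace access to the device; the packet-handling code is then DPDK's ixgbe poll-mode driver, not the kernel ixgbe module:

```shell
# Load the vfio-pci module, then hand the X550 port to it.
modprobe vfio-pci
dpdk-devbind.py --bind=vfio-pci 65:00.1

# Confirm which devices are now bound to DPDK-compatible drivers.
dpdk-devbind.py --status
```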
On Sun, Nov 10, 2024 at 12:12 PM Stephen Hemminger
<step...@networkplumber.org> wrote:
>
> On Sun, 10 Nov 2024 11:23:29 -0500
> Alan Beadle <ab.bea...@gmail.com> wrote:
>
> > Hi everyone,
> >
> > I am using DPDK to send two-way traffic between a pair of machines. My
> > application has local readers, remote acknowledgments, as well as
> > automatic retries when a packet is lost. For these reasons I am using
> > rte_mbuf_refcnt_update() to prevent the NIC from freeing the packet
> > and recycling the mbuf before my local readers are done and the remote
> > reader has acknowledged the message. I was advised to do this in an
> > earlier thread on this mailing list.
> >
> > However, this does not seem to be working. After running my app for
> > a while and exchanging about 1000 messages in this way, my queue of
> > unacknowledged mbufs is getting corrupted. The mbufs attached to my
> > queue seem to contain data for newer messages than what is supposed to
> > be in them, and in some cases contain a totally different type of
> > packet (an acknack, for instance). Obviously this results in retries of
> > those messages failing to send the correct data, and my application
> > gets stuck.
> >
> > I have ensured that the refcount is not reaching 0. Each new mbuf
> > immediately has the refcnt incremented by 1. I was concerned that
> > retries might need the refcnt bumped again, but if I bump the refcount
> > every time I resend a specific mbuf to the NIC, the refcounts just
> > keep getting higher. So it looks like re-bumping it on a resend is not
> > necessary.
> >
> > I have ruled out other possible explanations. The mbufs are being
> > reused by rte_pktmbuf_alloc. I even tried playing with the EAL
> > settings related to the number of mbuf descriptors and saw my changes
> > directly correlate with how long it takes this problem to occur. How
> > do I really prevent the driver from reusing packets that I still might
> > need to resend?
> >
> > Thanks in advance,
> > -Alan
>
> Which driver? Could be a driver bug.
>
> Also, you should be able to trace mbuf functions, either with rte_trace
> or by another facility.