Is there anything in the usage I described in my previous email which
might explain this problem? Is there anything else wrong with what I'm
doing driver-wise?

On Sun, Nov 10, 2024 at 12:31 PM Alan Beadle <ab.bea...@gmail.com> wrote:
>
> I'm using the vfio-pci module with Intel X550-T2 NICs. I believe this
> means it will use the ixgbe driver? To be honest, I am a bit confused
> about how drivers are used in DPDK. I am using the first setup I got
> working that could send and receive packets, so additional tips would
> be greatly appreciated. After loading the vfio-pci module I run
> dpdk-devbind.py --bind vfio-pci 65:00.1, and then I just use the
> standard DPDK API calls in my app. I had been meaning to revisit this
> once my app was more complete.
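>
> (One way I could double-check which PMD actually claimed the device is
> to query it at runtime; a minimal sketch, assuming port 0 and that EAL
> is already initialized:)
>
>     #include <stdio.h>
>     #include <rte_ethdev.h>
>
>     struct rte_eth_dev_info dev_info;
>     if (rte_eth_dev_info_get(0, &dev_info) == 0)
>         printf("port 0 driver: %s\n", dev_info.driver_name);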
>
> On Sun, Nov 10, 2024 at 12:12 PM Stephen Hemminger
> <step...@networkplumber.org> wrote:
> >
> > On Sun, 10 Nov 2024 11:23:29 -0500
> > Alan Beadle <ab.bea...@gmail.com> wrote:
> >
> > > Hi everyone,
> > >
> > > I am using DPDK to send two-way traffic between a pair of machines. My
> > > application has local readers and remote acknowledgments, as well as
> > > automatic retries when a packet is lost. For these reasons I am using
> > > rte_mbuf_refcnt_update() to prevent the NIC from freeing the packet
> > > and recycling the mbuf before my local readers are done and the remote
> > > reader has acknowledged the message. I was advised to do this in an
> > > earlier thread on this mailing list.
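> > >
> > > A simplified sketch of the lifecycle I intend (real code has error
> > > handling; "pool", "port", and the ACK logic are placeholders from my
> > > app):
> > >
> > >     #include <rte_mbuf.h>
> > >     #include <rte_ethdev.h>
> > >
> > >     struct rte_mbuf *m = rte_pktmbuf_alloc(pool); /* refcnt == 1 */
> > >     rte_mbuf_refcnt_update(m, 1);             /* refcnt == 2: my hold */
> > >     /* ...fill in the payload... */
> > >     rte_eth_tx_burst(port, 0, &m, 1);  /* driver frees once after TX */
> > >
> > >     /* later, once local readers and the remote ACK are both done: */
> > >     rte_pktmbuf_free(m);               /* drop my hold */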
> > >
> > > However, this does not seem to be working. After running my app for a
> > > while and exchanging about 1000 messages in this way, my queue of
> > > unacknowledged mbufs gets corrupted. The mbufs attached to my queue
> > > seem to contain data for newer messages than they should, and in some
> > > cases contain a completely different type of packet (an acknack, for
> > > instance). Obviously this means retries of those messages fail to send
> > > the correct data, and my application gets stuck.
> > >
> > > I have ensured that the refcount is not reaching 0. Each new mbuf
> > > immediately has its refcnt incremented by 1. I was concerned that
> > > retries might need the refcnt bumped again, but if I bump the refcount
> > > every time I resend a specific mbuf to the NIC, the refcounts just
> > > keep climbing. So it looks like re-bumping on a resend is not
> > > necessary.
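> > >
> > > (To verify this, I log the refcount around each resend; something
> > > like the following, simplified:)
> > >
> > >     printf("mbuf %p refcnt before resend: %u\n",
> > >            (void *)m, rte_mbuf_refcnt_read(m));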
> > >
> > > I have ruled out other possible explanations: the mbufs are indeed
> > > being reused by rte_pktmbuf_alloc(). I even tried playing with the
> > > EAL settings related to the number of mbuf descriptors, and my changes
> > > directly correlated with how long it takes for the problem to occur.
> > > How do I really prevent the driver from reusing mbufs that I might
> > > still need to resend?
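> > >
> > > (Concretely, the knobs I was varying look like the ones below; this is
> > > a sketch, and the counts are placeholders for the values I tried:)
> > >
> > >     struct rte_mempool *pool = rte_pktmbuf_pool_create(
> > >             "MBUF_POOL", 8191 /* pool size */, 250 /* cache */, 0,
> > >             RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
> > >
> > >     /* the TX ring size also bounds how long the NIC holds mbufs */
> > >     rte_eth_tx_queue_setup(port, 0, 1024 /* nb_tx_desc */,
> > >             rte_eth_dev_socket_id(port), NULL);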
> > >
> > > Thanks in advance,
> > > -Alan
> >
> > Which driver? It could be a driver bug.
> >
> > Also, you should be able to trace mbuf functions, either with rte_trace
> > or some other facility.
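> >
> > For example, the EAL --trace option takes a regular expression of
> > tracepoint names; the exact names vary by DPDK version, so treat this
> > as a pattern rather than an exact invocation:
> >
> >     ./your_app --trace=lib.mempool.* ...
> >
> > By default the trace is written in CTF format under ~/dpdk-traces/ and
> > can be inspected with babeltrace.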
