On Mon, May 4, 2026 at 3:30 AM Eelco Chaudron <[email protected]> wrote:
>
>
> On 3 May 2026, at 13:57, Eli Britstein wrote:
>
> > On 30/04/2026 12:25, Eelco Chaudron wrote:
> >>
> >> On 1 Apr 2026, at 11:13, Eli Britstein wrote:
> >>
> >>> Introduce a new netdev type - netdev-doca.
> >>> In order to compile, need to install doca on the build machine.
> >> Hi Eli,
> >>
> >> I was running the DOCA test suite patch[1] on this series, and I get a
> >> few errors. The earlier tunnel related errors are gone though.
> >>
> >> 95: conntrack - IPv4 fragmentation FAILED (system-traffic.at:4881)
> >> 97: conntrack - IPv4 fragmentation expiry FAILED (system-traffic.at:5002)
> >> 105: conntrack - IPv6 fragmentation FAILED (system-traffic.at:5264)
> >> 107: conntrack - IPv6 fragmentation expiry FAILED (system-traffic.at:5390)
> >>
> >> These tests pass fine with check-dpdk and check-dpdk-offloads.
> >>
> >> I did not investigate the issues, as I'm focussing on the general
> >> review of the series, so please take a look.
> >
> > Thanks Eelco for this info.
> >
> > I looked into it.
> >
> > The root cause (RC) is a use-after-free (UAF) in the following scenario:
> >
> > 1. A fragment is added to frag_list in the ipf module and is not
> >    returned to the mempool.
> > 2. The datapath is destroyed. At this point, we first remove the
> >    ports, and only then destroy CT/IPF.
> > 3. When a doca port is destroyed, it frees its packet mempool
> >    (rte_mempool_free). In this process, the memory is zeroed in DPDK
> >    code: lib/eal/common/malloc_elem.c -> malloc_elem_free() ->
> >    memset(ptr, 0, data_len).
> > 4. When ipf_destroy occurs, the packet has already been freed. It
> >    crashes when IPF tries to do dp_packet_delete().
> >
> > With DPDK, there is this commit that frees a mempool only when it's
> > "full", meaning all its mbufs are returned to the pool:
> >
> > 91fccdad72a2 ("netdev-dpdk: Free mempool only when no in-use mbufs.")
> >
> > With this, the mempool is not destroyed at all unless another dpdk
> > port is configured; only then does the sweep mechanism destroy it.
> >
> > IMO, this behavior is a kind of workaround (WA), and I'm not sure what
> > the correct solution is.
> >
> > I will add a similar doca-sweep for now, but I think this is not correct.
> >
> > WDYT?
>
> I've included Mike and Kevin who worked on ipf before. Maybe they
> have some more insight into how to fix this.
>
The commit message indicates a non-ipf motivation for the mp_sweep thread,
and the current behaviour is very convenient because it allows ipf to age
off those packets as part of the frag expiry mechanism.

But I think it makes sense for ipf to walk the frag_lists and purge any
fragments that came in on an interface before we destroy that interface.
Aside from the UAF, the current behaviour could also allow a fragment from
a deleted interface to interfere with packet transmission on a new
interface.
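
To make the purge idea concrete, here is a minimal sketch of it, for
illustration only. It does not use the real ipf structures: struct
sketch_frag, sketch_frag_add() and sketch_frag_purge_port() are
hypothetical names, and a plain singly linked list stands in for the
actual fragment bookkeeping. It shows the linear walk that drops every
fragment received on one port while that port's buffers are still valid.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a queued fragment; NOT the real ipf structs. */
struct sketch_frag {
    struct sketch_frag *next;
    uint32_t in_port;   /* Port the fragment arrived on. */
    void *pkt;          /* Buffer that belongs to that port's mempool. */
};

/* Queues a fragment at the head of the list.  Returns 0, or -1 on OOM. */
static int
sketch_frag_add(struct sketch_frag **head, uint32_t in_port, void *pkt)
{
    struct sketch_frag *f;

    assert(head);
    f = malloc(sizeof *f);
    if (!f) {
        return -1;
    }
    f->next = *head;
    f->in_port = in_port;
    f->pkt = pkt;
    *head = f;
    return 0;
}

/* Unlinks and frees every fragment that arrived on 'port', releasing each
 * buffer through 'free_pkt' while the owning pool is still alive.  This is
 * a linear walk over all fragments; returns the number purged. */
static size_t
sketch_frag_purge_port(struct sketch_frag **head, uint32_t port,
                       void (*free_pkt)(void *))
{
    struct sketch_frag **pp = head;
    size_t n = 0;

    assert(head && free_pkt);
    while (*pp) {
        struct sketch_frag *f = *pp;

        if (f->in_port == port) {
            *pp = f->next;      /* Unlink before freeing. */
            free_pkt(f->pkt);
            free(f);
            n++;
        } else {
            pp = &f->next;
        }
    }
    return n;
}
```

The intended call order would be sketch_frag_purge_port() first and the
port's mempool teardown second, which is the opposite of the order in the
UAF scenario quoted above.
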
However, in the current implementation of ipf, that would be a very
expensive operation. We could add another index on in_port to help reduce
that cost. We may also want to revisit how the ipf_lock works: a read/write
lock on the frag_lists (including exp_list and complete_list) plus a
separate mutex in each ipf_list. This would improve parallelization, at the
expense of more complex code.
I could create an RFC for this if others agree.

Cheers,
M
> >
> > PS: I had a lot of trouble running the tests. There are a few fixes to
> > the infrastructure that I will post.
>
> Interested to find out what your problems are, as I had no problems
> following the documentation. Anyhow, I can include your changes (if you
> reply to the RFC patch), or you can include my patch in your series.
>
> Cheers,
>
> Eelco
>
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev