On 10/24/25 4:28 PM, Frode Nordahl wrote:
> The commit in the fixes tag extended struct dpif_execute with an
> upcall_pid member.
>
> This member was left uninitialized in dpif_execute_helper_cb(),
> leading to Open vSwitch providing the kernel with a PID for which
> it does not have a listener.
>
> This condition is caught by one of the existing system-traffic
> tests which is extended to explicitly check for presence of lost
> upcalls in the datapath.
>
> Fixes: 0d9dc8e9ca4a ("dpif-netlink: Provide original upcall pid in 'execute'
> commands.")
> Reported-at: https://launchpad.net/bugs/2127061
> Signed-off-by: Frode Nordahl <[email protected]>
> ---
> lib/dpif.c | 1 +
> tests/system-traffic.at | 2 ++
> 2 files changed, 3 insertions(+)
Good catch, Frode! Thanks for the patch.
>
> Wanted to note that retis was instrumental in tracking down the
> root cause if this issue, very useful tool!
>
> I also wanted to note that I spent considerable amonut of time
> trying to figure out a low touch way of creating a specifc test
> for this, but did not come up with any good alternatives.
>
> I appear to somehow have missed that an existing system test
> catches the condition, so I went ahead and expanded on that
> in the end.
Hmm. I can't reproduce the problem with this particular test.
There should be no upcalls after the packet is re-injected via
the 'execute' request. And also the pcap file check should be
failing if the packet was dropped. Could you explain a bit more
what you see in your test run?
>
> diff --git a/lib/dpif.c b/lib/dpif.c
> index 070fc0131..a062a4179 100644
> --- a/lib/dpif.c
> +++ b/lib/dpif.c
> @@ -1240,6 +1240,7 @@ dpif_execute_helper_cb(void *aux_, struct
> dp_packet_batch *packets_,
> execute.probe = false;
> execute.mtu = 0;
> execute.hash = 0;
> + execute.upcall_pid = 0;
Thinking a bit more about the problem, I think, we should be
getting the original upcall pid for this request. We have it
in the 'execute' structure before we're trying to execute with
help. We can store the value in the aux data and set it here.
The same as we do for the 'flow'. This will ensure that we're
keeping the upcall pid regardless if the execution is going
direct to kernel or some actions are executed in the userspace
first.
> aux->error = dpif_execute(aux->dpif, &execute);
> log_execute_message(aux->dpif, &this_module, &execute,
> true, aux->error);
> diff --git a/tests/system-traffic.at b/tests/system-traffic.at
> index 0c41c39ff..50a45383f 100644
> --- a/tests/system-traffic.at
> +++ b/tests/system-traffic.at
> @@ -3061,6 +3061,8 @@ NS_CHECK_EXEC([at_ns0], [$PYTHON3 $srcdir/sendpkt.py p0
> \
> "$(ovs-ofctl compose-packet --bare 'ICMP_PKT')"],
> [0], [ignore])
>
> +AT_CHECK([ovs-appctl dpctl/show | grep -q lookups.*lost:0])
> +
> dnl Check the expected decapsulated on the egress interface.
> OVS_WAIT_UNTIL([ovs-pcap p1.pcap | grep -q \
> "^$(ovs-ofctl compose-packet --bare 'ICMP_PKT')\$"])
Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev