On 11/10/25 3:31 PM, Frode Nordahl wrote:
> On 11/10/25 14:35, Ilya Maximets wrote:
>> On 10/30/25 12:34 AM, Frode Nordahl wrote:
>>> On 10/29/25 14:11, Ilya Maximets wrote:
>>>> On 10/27/25 11:16 PM, Frode Nordahl wrote:
>>>>> On 10/27/25 19:40, Ilya Maximets wrote:
>>>>>> On 10/24/25 4:28 PM, Frode Nordahl wrote:
>>>>>>> The commit in the fixes tag extended struct dpif_execute with an
>>>>>>> upcall_pid member.
>>>>>>>
>>>>>>> This member was left uninitialized in dpif_execute_helper_cb(),
>>>>>>> leading to Open vSwitch providing the kernel with a PID for which
>>>>>>> it does not have a listener.
>>>>>>>
>>>>>>> This condition is caught by one of the existing system-traffic
>>>>>>> tests which is extended to explicitly check for presence of lost
>>>>>>> upcalls in the datapath.
>>>>>>>
>>>>>>> Fixes: 0d9dc8e9ca4a ("dpif-netlink: Provide original upcall pid in 
>>>>>>> 'execute' commands.")
>>>>>>> Reported-at: https://launchpad.net/bugs/2127061
>>>>>>> Signed-off-by: Frode Nordahl <[email protected]>
>>>>>>> ---
>>>>>>>     lib/dpif.c              | 1 +
>>>>>>>     tests/system-traffic.at | 2 ++
>>>>>>>     2 files changed, 3 insertions(+)
>>>>>>
>>>>>> Good catch, Frode!  Thanks for the patch.
>>>>>
>>>>> Thank you for taking the time to review, Ilya, much appreciated!
>>>>>
>>>>>>>
>>>>>>> Wanted to note that retis was instrumental in tracking down the
>>>>>>> root cause if this issue, very useful tool!
>>>>>>>
>>>>>>> I also wanted to note that I spent considerable amonut of time
>>>>>>> trying to figure out a low touch way of creating a specifc test
>>>>>>> for this, but did not come up with any good alternatives.
>>>>>>>
>>>>>>> I appear to somehow have missed that an existing system test
>>>>>>> catches the condition, so I went ahead and expanded on that
>>>>>>> in the end.
>>>>>>
>>>>>> Hmm.  I can't reproduce the problem with this particular test.
>>>>>> There should be no upcalls after the packet is re-injected via
>>>>>> the 'execute' request.  And also the pcap file check should be
>>>>>> failing if the packet was dropped.  Could you explain a bit more
>>>>>> what you see in your test run?
>>>>>
>>>>> I identified this particular test as interesting by running the entire
>>>>> testsuite with retis and looking for evidence of a -111 return for
>>>>> upcalls in the kernel code.
>>>>>
>>>>> Rechecking this locally I see a difference in behavior depending on
>>>>> which compiler is used to build Open vSwitch, specifically I see the
>>>>> condition when compiling with gcc 14.2.0.
>>>>>
>>>>> Recompiling with gcc 15.2.0 on otherwise identical system makes the test
>>>>> pass.  I can only guess there is a difference in behavior for how the
>>>>> uninitialized value is treated.
>>>>>
>>>>> In any case, we cannot rely on compiler behavior for our test case, so I
>>>>> guess we need to find some other way of testing this.  Any suggestions
>>>>> welcome :-)
>>>>
>>>> One option, I believe, could be to have packet go through conntrack twice
>>>> (different zones) and add debug_slow into OpenFlow rules to force packets
>>>> to userspace after each recirculation.  That should end up with the packets
>>>> dropped.  At the same time it's still a gamble as we're relying on values
>>>> of the uninitialized memory either way.  So, not particularly reliable.
>>>>
>>>> We may omit the test for this issue entirely, since valgrind is actually
>>>> reporting the problem consistently while running this mpls test in
>>>> check-kernel-valgrind:
>>>
>>> That would certainly give more consistent results, and I think i favor
>>> this option.
>>>
>>>>    Thread 27 handler71:
>>>>    Conditional jump or move depends on uninitialised value(s)
>>>>       at 0x55986A: dpif_netlink_encode_execute (dpif-netlink.c:2028)
>>>>       by 0x55986A: dpif_netlink_operate__ (dpif-netlink.c:2112)
>>>>       by 0x559B75: dpif_netlink_operate_chunks (dpif-netlink.c:2428)
>>>>       by 0x559B75: dpif_netlink_operate (dpif-netlink.c:2487)
>>>>       by 0x48BDD9: dpif_operate (dpif.c:1370)
>>>>       by 0x48C386: dpif_execute (dpif.c:1324)
>>>>       by 0x48C386: dpif_execute (dpif.c:1314)
>>>>       by 0x48C386: dpif_execute_helper_cb (dpif.c:1243)
>>>>       by 0x4C277A: odp_execute_actions (odp-execute.c:1074)
>>>>       by 0x489AE0: dpif_execute_with_help (dpif.c:1299)
>>>>       by 0x48BD8A: dpif_operate (dpif.c:1435)
>>>>       by 0x446A8F: handle_upcalls (ofproto-dpif-upcall.c:1746)
>>>>       by 0x446A8F: recv_upcalls (ofproto-dpif-upcall.c:962)
>>>>       by 0x446C1E: udpif_upcall_handler (ofproto-dpif-upcall.c:857)
>>>>       by 0x5163C4: ovsthread_wrapper (ovs-thread.c:426)
>>>>       by 0x51CB199: start_thread (pthread_create.c:443)
>>>>       by 0x524F533: clone (clone.S:100)
>>>>     Uninitialised value was created by a stack allocation
>>>>       at 0x48C130: dpif_execute_helper_cb (dpif.c:1176)
>>>>
>>>>    Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s)
>>>>       at 0x5250F4D: __libc_sendmsg (sendmsg.c:28)
>>>>       by 0x5250F4D: sendmsg (sendmsg.c:25)
>>>>       by 0x57355D: nl_sock_transact_multiple__ (netlink-socket.c:874)
>>>>       by 0x573955: nl_sock_transact_multiple.part.0 (netlink-socket.c:1097)
>>>>       by 0x574ADF: nl_sock_transact_multiple (netlink-socket.c:1062)
>>>>       by 0x574ADF: nl_transact_multiple (netlink-socket.c:1862)
>>>>       by 0x55993A: dpif_netlink_operate__ (dpif-netlink.c:2141)
>>>>       by 0x559B75: dpif_netlink_operate_chunks (dpif-netlink.c:2428)
>>>>       by 0x559B75: dpif_netlink_operate (dpif-netlink.c:2487)
>>>>       by 0x48BDD9: dpif_operate (dpif.c:1370)
>>>>       by 0x48C386: dpif_execute (dpif.c:1324)
>>>>       by 0x48C386: dpif_execute (dpif.c:1314)
>>>>       by 0x48C386: dpif_execute_helper_cb (dpif.c:1243)
>>>>       by 0x4C277A: odp_execute_actions (odp-execute.c:1074)
>>>>       by 0x489AE0: dpif_execute_with_help (dpif.c:1299)
>>>>       by 0x48BD8A: dpif_operate (dpif.c:1435)
>>>>       by 0x446A8F: handle_upcalls (ofproto-dpif-upcall.c:1746)
>>>>       by 0x446A8F: recv_upcalls (ofproto-dpif-upcall.c:962)
>>>>       by 0x446C1E: udpif_upcall_handler (ofproto-dpif-upcall.c:857)
>>>>       by 0x5163C4: ovsthread_wrapper (ovs-thread.c:426)
>>>>       by 0x51CB199: start_thread (pthread_create.c:443)
>>>>       by 0x524F533: clone (clone.S:100)
>>>>     Address 0xfeefdec is on thread 27's stack
>>>>     in frame #4, created by dpif_netlink_operate__ (dpif-netlink.c:2039)
>>>>     Uninitialised value was created by a stack allocation
>>>>       at 0x48C130: dpif_execute_helper_cb (dpif.c:1176)
>>>>
>>>>
>>>> The tests are painfully slow, so we're not running them in CI.  But we
>>>> really should be running them at least once per release or something.
>>>
>>> +1
>>>
>>>> I'll make a note for myself to do that.
>>>
>>> We have some quality assurance jobs running for upstream OVS/OVN on a
>>> cadence too, I'll have a look and see if we can do something in this
>>> space too as an extension of those jobs.
>>>
>>>>>
>>>>>>>
>>>>>>> diff --git a/lib/dpif.c b/lib/dpif.c
>>>>>>> index 070fc0131..a062a4179 100644
>>>>>>> --- a/lib/dpif.c
>>>>>>> +++ b/lib/dpif.c
>>>>>>> @@ -1240,6 +1240,7 @@ dpif_execute_helper_cb(void *aux_, struct 
>>>>>>> dp_packet_batch *packets_,
>>>>>>>             execute.probe = false;
>>>>>>>             execute.mtu = 0;
>>>>>>>             execute.hash = 0;
>>>>>>> +        execute.upcall_pid = 0;
>>>>>>
>>>>>> Thinking a bit more about the problem, I think, we should be
>>>>>> getting the original upcall pid for this request.  We have it
>>>>>> in the 'execute' structure before we're trying to execute with
>>>>>> help.  We can store the value in the aux data and set it here.
>>>>>> The same as we do for the 'flow'.  This will ensure that we're
>>>>>> keeping the upcall pid regardless if the execution is going
>>>>>> direct to kernel or some actions are executed in the userspace
>>>>>> first.
>>>>>
>>>>> Ack, I'll address this in a v2.
>>>>>
>>>>
>>>> I have a vague memory of already suggesting this somewhere, but I don't
>>>> remember where...  We seem to have a similar problem with the mtu and
>>>> the hash that are cleared, while they probably shouldn't be.  So, what
>>>> we could do here is to store the entire original 'execute' in the aux,
>>>> then copy it and only update fields that need to be updated, i.e.
>>>> actions, packet and the needs_help.  The rest should stay the same as
>>>> it was in the original request.
>>>>
>>>> WDYT?
>>>
>>> Makes sense to me, I'll have a go at incorporating this in the next
>>> iteration.  Thanks for the input and guidance!
>>
>> No problem.
>>
>> Do you have an approximate timeline for when you'll be able to send a v2?
>> It would be great to have it soon, as we also started seeing this issue
>> in fedora, and users may start deploying OVS 3.6 on new kernels soon.
> 
> We'll post something this week, the delay was caused by a couple of 
> weeks with in-person sprinting.

Sure, no worries.

> Mj (added to Cc) showed some interest in this problem, and prospect of 
> reaching the new OVS contributor slide on the upcoming conference, so 
> I'll work with her in getting the v2 posted.

Nice!  Looking forward to it!

Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to