Thank you Eelco.

Code search shows we have `recv_upcalls` and `upcall_cb` which will call 
`upcall_uninit`.
And dp_netdev_upcall will call the dp->upcall_cb.
So we have call stacks like this:
i) handle_packet_upcall->dp_netdev_upcall->upcall_cb->upcall_uninit
ii) 
dp_execute_userspace_action->dp_netdev_upcall->upcall_cb->upcall_uninit



Cloud you confirm these calls?


For your change, I'll run tests with recoreded packets to verify.


Regards,


LIU Yulong
 
 
------------------ Original ------------------
From: &nbsp;"Eelco&nbsp;Chaudron"<[email protected]&gt;;
Date: &nbsp;Fri, Oct 17, 2025 08:08 PM
To: &nbsp;"LIU Yulong"<[email protected]&gt;; 
Cc: &nbsp;"dev"<[email protected]&gt;; 
Subject: &nbsp;Re: [ovs-dev] [PATCH] ofproto-dpif-upcall: Add ovsrcu_postpone 
for ukey_delete__.

&nbsp;
Hi Liu,

I looked at the change; however, upcall_uninit() is only called for newly 
created (never inserted) ukeys, so the ovs_postpone() call is not needed.

However, I did find an issue in upcall_receive(), where, in an error path, it 
could use an uninitialized upcall structure — causing a ukey to be freed that 
should not have been.

Can you try out the diff below to see if it fixes your problem?

Cheers,

Eelco

diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index b3b4b2d2f..53b906a16 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -1230,6 +1230,17 @@ upcall_receive(struct upcall *upcall, const struct 
dpif_backer *backer,
{
 int error;

+&nbsp;&nbsp;&nbsp; /* Initialize the minimal required fields in the upcall 
structure to ensure
+&nbsp;&nbsp;&nbsp;&nbsp; * upcall_uninit() does not operate on invalid data. */
+&nbsp;&nbsp;&nbsp; upcall-&gt;have_recirc_ref = false;
+&nbsp;&nbsp;&nbsp; upcall-&gt;xout_initialized = false;
+&nbsp;&nbsp;&nbsp; upcall-&gt;ukey_persists = false;
+&nbsp;&nbsp;&nbsp; upcall-&gt;ukey = NULL;
+&nbsp;&nbsp;&nbsp; ofpbuf_use_stub(&amp;upcall-&gt;odp_actions, 
upcall-&gt;odp_actions_stub,
+&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 sizeof upcall-&gt;odp_actions_stub);
+&nbsp;&nbsp;&nbsp; ofpbuf_init(&amp;upcall-&gt;put_actions, 0);
+
+
 upcall-&gt;type = classify_upcall(type, userdata, &amp;upcall-&gt;cookie);
 if (upcall-&gt;type == BAD_UPCALL) {
 return EAGAIN;
@@ -1258,19 +1269,11 @@ upcall_receive(struct upcall *upcall, const struct 
dpif_backer *backer,
 }

 upcall-&gt;recirc = NULL;
-&nbsp;&nbsp;&nbsp; upcall-&gt;have_recirc_ref = false;
 upcall-&gt;flow = flow;
 upcall-&gt;packet = packet;
 upcall-&gt;ufid = ufid;
 upcall-&gt;pmd_id = pmd_id;
-&nbsp;&nbsp;&nbsp; ofpbuf_use_stub(&amp;upcall-&gt;odp_actions, 
upcall-&gt;odp_actions_stub,
-&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 sizeof upcall-&gt;odp_actions_stub);
-&nbsp;&nbsp;&nbsp; ofpbuf_init(&amp;upcall-&gt;put_actions, 0);

-&nbsp;&nbsp;&nbsp; upcall-&gt;xout_initialized = false;
-&nbsp;&nbsp;&nbsp; upcall-&gt;ukey_persists = false;
-
-&nbsp;&nbsp;&nbsp; upcall-&gt;ukey = NULL;
 upcall-&gt;key = NULL;
 upcall-&gt;key_len = 0;
 upcall-&gt;mru = mru;


On 16 Oct 2025, at 3:12, LIU Yulong wrote:

&gt; We have such call stack of coredump:
&gt; *0&nbsp; 0x00007f7f197ae337 in raise () from /lib64/libc.so.6
&gt; *1&nbsp; 0x00007f7f197afa28 in abort () from /lib64/libc.so.6
&gt; *2&nbsp; 0x000055934ca4f4ee in ovs_abort_valist (err_no=<optimized 
out&gt;, format=<optimized out&gt;, args=args@entry=0x7f7f07530360) at 
lib/util.c:499
&gt; *3&nbsp; 0x000055934ca4f584 in ovs_abort (err_no=err_no@entry=0, 
format=format@entry=0x55934ccedd18 "%s: %s() passed uninitialized ovs_mutex") 
at lib/util.c:491
&gt; *4&nbsp; 0x000055934ca1a4a1 in ovs_mutex_trylock_at 
(l_=l_@entry=0x7f7ed4a43e58, where=where@entry=0x55934cccb318 
"ofproto/ofproto-dpif-upcall.c:3014") at lib/ovs-thread.c:106
&gt; *5&nbsp; 0x000055934c943181 in revalidator_sweep__ 
(revalidator=revalidator@entry=0x5593518c1720, purge=purge@entry=false) at 
ofproto/ofproto-dpif-upcall.c:3014
&gt; *6&nbsp; 0x000055934c9471a6 in revalidator_sweep 
(revalidator=0x5593518c1720) at ofproto/ofproto-dpif-upcall.c:3072
&gt; *7&nbsp; udpif_revalidator (arg=0x5593518c1720) at 
ofproto/ofproto-dpif-upcall.c:1086
&gt; *8&nbsp; 0x000055934ca1b05f in ovsthread_wrapper (aux_=<optimized out&gt;) 
at lib/ovs-thread.c:422
&gt; *9&nbsp; 0x00007f7f1b6ece65 in start_thread () from /lib64/libpthread.so.0
&gt; *10 0x00007f7f1987688d in clone () from /lib64/libc.so.6
&gt;
&gt; When calling ovs_mutex_trylock() on ukey-&gt;mutex, ovs_mutex_trylock_at
&gt; sees that the input is an "uninitialized ovs_mutex" (l-&gt;where is NULL),
&gt; and aborts.
&gt;
&gt; This state can only occur after the mutex has not been initialized or
&gt; has been destroyed. The mutex is definitely initialized in ukey_create__
&gt; by ukey, so the "uninitialized" state is almost certainly "destroyed".
&gt; Destruction occurs in ukey_delete__, which calls 
ovs_mutex_destroy(&amp;ukey-&gt;mutex)
&gt; and sets where to NULL.
&gt;
&gt; When revalidator_sweep__ is traversing cmap and trying to lock 
(&amp;ukey-&gt;mutex),
&gt; it encounters a ukey that has been directly destroyed by ukey_delete__,
&gt; However, the ukey is still visible to the revalidator (either still in
&gt; the cmap or has not yet passed the RCU grace period), resulting in an
&gt; abort. That is to say, there is a path where ukey_delete__ is directly
&gt; called during concurrent traversal of ukey, bypassing the
&gt; cmap_remove + ovsrcu_postpone semantics of ukey_delete.
&gt;
&gt; Modify upcall_uninit() to change direct ukey_delete__ to RCU deferred
&gt; release to avoid concurrent traversal conflicts with revalidator.
&gt; This ensures that ukey_delete__ is not executed until after the
&gt; global grace period, and that the CMAP_FOR_EACH within
&gt; revalidator_sweep__ will not encounter a destroyed mutex before
&gt; the end of running cycle.
&gt;
&gt; Some earlier email discussions:
&gt; [1] 
https://mail.openvswitch.org/pipermail/ovs-discuss/2024-March/052973.html
&gt; [2] 
https://mail.openvswitch.org/pipermail/ovs-discuss/2024-February/052949.html
&gt; [3] 
https://mail.openvswitch.org/pipermail/ovs-discuss/2024-March/052993.html
&gt;
&gt; Signed-off-by: LIU Yulong <[email protected]&gt;
&gt; ---
&gt;&nbsp; ofproto/ofproto-dpif-upcall.c | 2 +-
&gt;&nbsp; 1 file changed, 1 insertion(+), 1 deletion(-)
&gt;
&gt; diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
&gt; index 9dfa52d82..b3b4b2d2f 100644
&gt; --- a/ofproto/ofproto-dpif-upcall.c
&gt; +++ b/ofproto/ofproto-dpif-upcall.c
&gt; @@ -1386,7 +1386,7 @@ upcall_uninit(struct upcall *upcall)
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
ofpbuf_uninit(&amp;upcall-&gt;put_actions);
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (upcall-&gt;ukey) 
{
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 if (!upcall-&gt;ukey_persists) {
&gt; 
-&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 ukey_delete__(upcall-&gt;ukey);
&gt; 
+&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 ovsrcu_postpone(ukey_delete__, upcall-&gt;ukey);
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 }
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; } else if 
(upcall-&gt;have_recirc_ref) {
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 /* The reference was transferred to the ukey if one was created. */
&gt; -- 
&gt; 2.50.1 (Apple Git-155)
&gt;
&gt; _______________________________________________
&gt; dev mailing list
&gt; [email protected]
&gt; https://mail.openvswitch.org/mailman/listinfo/ovs-dev
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to