On 4/11/26 4:14 PM, Weiming Shi wrote: > The vport netlink reply helpers allocate a fixed-size skb with > nlmsg_new(NLMSG_DEFAULT_SIZE, ...) but serialize the full upcall PID > array via ovs_vport_get_upcall_portids(). Since > ovs_vport_set_upcall_portids() accepts any non-zero multiple of > sizeof(u32) with no upper bound, a CAP_NET_ADMIN user can install a PID > array large enough to overflow the reply buffer. On systems with > unprivileged user namespaces enabled (e.g., Ubuntu default), this is > reachable via unshare -Urn since all OVS vport genl operations use > GENL_UNS_ADMIN_PERM. > > When the subsequent nla_put() fails with -EMSGSIZE, five BUG_ON(err < 0) > sites fire and panic the kernel: > > kernel BUG at net/openvswitch/datapath.c:2414! > Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI > CPU: 1 UID: 0 PID: 65 Comm: poc Not tainted 7.0.0-rc7-00195-geb216e422044 #1 > RIP: 0010:ovs_vport_cmd_set+0x34c/0x400 > Call Trace: > <TASK> > genl_family_rcv_msg_doit (net/netlink/genetlink.c:1116) > genl_rcv_msg (net/netlink/genetlink.c:1194) > netlink_rcv_skb (net/netlink/af_netlink.c:2550) > genl_rcv (net/netlink/genetlink.c:1219) > netlink_unicast (net/netlink/af_netlink.c:1344) > netlink_sendmsg (net/netlink/af_netlink.c:1894) > __sys_sendto (net/socket.c:2206) > __x64_sys_sendto (net/socket.c:2209) > do_syscall_64 (arch/x86/entry/syscall_64.c:63) > entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) > </TASK> > Kernel panic - not syncing: Fatal exception > > Fix this by dynamically sizing the reply skb to account for the actual > PID array length, and replace the BUG_ON() calls with graceful error > returns.
Hi, Weiming. Thanks for working on this! The earlier attempt to fix this problem was here: https://lore.kernel.org/netdev/[email protected]/ CC: sunichi, maybe you can cooperate on the fix somehow. A few problems with the solution here: - You're locking to count the number of pids, then unlocking and then re-locking to fill the output. This is racy and may still cause the inability to put all the data into allocated space, since the number can theoretically change. - Failing the del command is very unfriendly to the userspace and also it becomes impossible for the user to delete the port without clearing the upcall pids first, which they will not know to do. - Failing the new command as done in this change is very confusing as the port is actually created while the user gets the error. - NLMSG_DEFAULT_SIZE + portids_size is not the actual size of the message, nothing in the code guarantees that the rest fits into NLMSG_DEFAULT_SIZE. - The number of pids being unbounded is a problem in itself. So, we need to approach this issue differently. As I suggested in the thread linked above, we should limit the maximum number of the pid to the number of CPUs, as it is done for the per-cpu dispatch socket array. There is no point in more sockets than CPUs and the userspace never creates that many. This way we can: - Fail the attempts to set up more pids than CPUs in the first place. - Know beforehand the maximum size to allocate - no need to check the actual number and re-lock. Create a function similar to ovs_dp_cmd_msg_size() to allocate based on what can be included in the worst case, which shouldn't be a lot. And all the BUG_ON() calls must remain BUG_ON()s as it is a bug if we're not counting correctly. > Fixes: b83d23a2a38b ("openvswitch: Introduce per-cpu upcall dispatch") This is not the right commit. The change you're making has nothing to do with the per-cpu dispatch. The actual commit you're looking for is much older: 5cd667b0a456 ("openvswitch: Allow each vport to have an array of 'port_id's.") Couple more notes: - Wait at least 24 hours between versions. Otherwise, people have no enough time to look at your patch. - Please, CC all maintainers including the ovs-dev list. It is moderated for new senders (the only reliable way to keep it out of spam) but we approve fast, so your next emails should go right through. Best regards, Ilya Maximets. _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
