On Tue, Dec 10, 2019 at 8:41 PM David Ahern <dsah...@gmail.com> wrote:
>
> On 12/10/19 12:37 PM, Matteo Croce wrote:
> > On Tue, Dec 10, 2019 at 8:13 PM David Ahern <dsah...@gmail.com> wrote:
> >>
> >> Hi Matteo:
> >>
> >> On a hypervisor running a 4.14.91 kernel and OVS 2.11 I am seeing a
> >> thundering herd wake up problem. Every packet punted to userspace wakes
> >> up every one of the handler threads. On a box with 96 cpus, there are 71
> >> handler threads which means 71 process wakeups for every packet punted.
> >>
> >> This is really easy to see, just watch sched:sched_wakeup tracepoints.
> >> With a few extra probes:
> >>
> >> perf probe sock_def_readable sk=%di
> >> perf probe ep_poll_callback wait=%di mode=%si sync=%dx key=%cx
> >> perf probe __wake_up_common wq_head=%di mode=%si nr_exclusive=%dx
> >> wake_flags=%cx key=%8
> >>
> >> you can see there is a single netlink socket and its wait queue contains
> >> an entry for every handler thread.
> >>
> >> This does not happen with the 2.7.3 version. Roaming commits it appears
> >> that the change in behavior comes from this commit:
> >>
> >> commit 69c51582ff786a68fc325c1c50624715482bc460
> >> Author: Matteo Croce <mcr...@redhat.com>
> >> Date: Tue Sep 25 10:51:05 2018 +0200
> >>
> >> dpif-netlink: don't allocate per thread netlink sockets
> >>
> >>
> >> Is this a known problem?
> >>
> >> David
> >>
> >
> > Hi David,
> >
> > before my patch, vswitchd created NxM sockets, being N the ports and M
> > the active cores,
> > because every thread opens a netlink socket per port.
> >
> > With my patch, a pool is created with N socket, one per port, and all
> > the threads polls the same list
> > with the EPOLLEXCLUSIVE flag.
> > As the name suggests, EPOLLEXCLUSIVE lets the kernel wakeup only one
> > of the waiting threads.
> >
> > I'm not aware of this problem, but it goes against the intended
> > behaviour of EPOLLEXCLUSIVE.
> > Such flag exists since Linux 4.5, can you check that it's passed
> > correctly to epoll()?
> >
>
> I get the theory, but the reality is that all threads are awakened.
> Also, it is not limited to the 4.14 kernel; I see the same behavior with
> 5.4.
>

So all threads are awakened, even if there is only a single upcall packet to
read? This is not good; epoll() should wake just one thread at a time, as the
manpage says.

I have to check this. Do you have a minimal setup? How do you generate
upcalls: is it BUM traffic or via action=userspace?

Bye,
-- 
Matteo Croce
per aspera ad upstream
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
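
For anyone who wants to try the wakeup behaviour outside of OVS, here is a
minimal sketch of a standalone check. It is not OVS code and makes several
assumptions: an eventfd stands in for the netlink socket, each thread gets its
own epoll instance watching that one fd, and the names count_wakeups,
waiter_main and struct waiter are made up for this example. It only counts how
many threads return from epoll_wait() after a single write; it does not
reproduce the netlink path David traced.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

#ifndef EPOLLEXCLUSIVE              /* headers older than Linux 4.5 */
#define EPOLLEXCLUSIVE (1u << 28)
#endif

/* Hypothetical per-thread state: one epoll instance per waiter. */
struct waiter {
    int epfd;
    atomic_int *woken;
};

static void *waiter_main(void *arg)
{
    struct waiter *w = arg;
    struct epoll_event ev;

    /* Wait up to 500 ms; count this thread if it was handed an event. */
    if (epoll_wait(w->epfd, &ev, 1, 500) == 1)
        atomic_fetch_add(w->woken, 1);
    return NULL;
}

/* Returns how many of nthreads waiters saw a single event on a shared fd.
 * Pass 0 or EPOLLEXCLUSIVE as extra_flags to compare the two behaviours. */
int count_wakeups(int nthreads, uint32_t extra_flags)
{
    enum { MAX_THREADS = 64 };
    pthread_t tids[MAX_THREADS];
    struct waiter waiters[MAX_THREADS];
    atomic_int woken = 0;
    uint64_t one = 1;
    ssize_t n;
    int i, rc;

    assert(nthreads > 0 && nthreads <= MAX_THREADS);

    /* The eventfd is an assumption: a stand-in for the shared socket. */
    int evfd = eventfd(0, EFD_NONBLOCK);
    assert(evfd >= 0);

    for (i = 0; i < nthreads; i++) {
        struct epoll_event ev = { .events = EPOLLIN | extra_flags,
                                  .data.fd = evfd };

        waiters[i].epfd = epoll_create1(0);
        waiters[i].woken = &woken;
        assert(waiters[i].epfd >= 0);

        rc = epoll_ctl(waiters[i].epfd, EPOLL_CTL_ADD, evfd, &ev);
        assert(rc == 0);
        rc = pthread_create(&tids[i], NULL, waiter_main, &waiters[i]);
        assert(rc == 0);
    }

    usleep(100 * 1000);             /* give the waiters time to block */

    n = write(evfd, &one, sizeof one);   /* exactly one event */
    assert(n == (ssize_t) sizeof one);

    for (i = 0; i < nthreads; i++) {
        pthread_join(tids[i], NULL);
        close(waiters[i].epfd);
    }
    close(evfd);
    (void) rc;
    (void) n;
    return atomic_load(&woken);
}
```

Calling count_wakeups(8, 0) and count_wakeups(8, EPOLLEXCLUSIVE) and comparing
the two results shows whether the flag makes a difference on a given kernel.
Note that epoll_ctl(2) only promises that "one or more" waiters receive the
event with EPOLLEXCLUSIVE, so a result above 1 is not by itself a kernel bug.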