On 2022-07-12 14:35 +02, Florian Obser <[email protected]> wrote:
> When the autoconf flag flaps around we might end up with multiple bpf
> FDs in flight. Things then get confusing. The kernel tells us we can
> read from the bpf FD but the data is actually "on the other FD", so
> read(2) returns 0.
>
> Found the hard way by, and patiently debugged with weerd@
>
> One way to trigger this is booting a vmm VM where dhcpleased(8)'s
> init_ifaces() loses a race against netstart(8). init_ifaces() would
> already see the autoconf flag and request a bpf FD.
> But then it would receive a RTM_IFINFO message without the autoconf flag
> set from when the interface came up. Then it will see another RTM_IFINFO
> message with the autoconf flag set and request yet another bpf FD. If
> the first bpf FD had not arrived yet we ended up with two in the frontend
> process.
>
> While here make sure a bpf FD has been configured before calling
> close(2) on it.
>
> OK?
>
Anyone? This is a slightly annoying bug... Difficult to hit though.
diff --git frontend.c frontend.c
index 2959a6c0d0f..275e5352e0b 100644
--- frontend.c
+++ frontend.c
@@ -1170,8 +1170,10 @@ remove_iface(uint32_t if_index)
return;
LIST_REMOVE(iface, entries);
- event_del(&iface->bpfev.ev);
- close(EVENT_FD(&iface->bpfev.ev));
+ if (event_initialized(&iface->bpfev.ev)) {
+ event_del(&iface->bpfev.ev);
+ close(EVENT_FD(&iface->bpfev.ev));
+ }
if (iface->udpsock != -1)
close(iface->udpsock);
free(iface);
@@ -1185,7 +1187,13 @@ set_bpfsock(int bpfsock, uint32_t if_index)
if ((iface = get_iface_by_id(if_index)) == NULL) {
/*
* The interface disappeared while we were waiting for the
- * parent process to open the raw socket.
+ * parent process to open the bpf socket.
+ */
+ close(bpfsock);
+ } else if (event_initialized(&iface->bpfev.ev)) {
+ /*
+ * The autoconf flag is flapping and we have multiple bpf
sockets in
+ * flight. We don't need this one because we already got one.
*/
close(bpfsock);
} else {
--
I'm not entirely sure you are real.