>Number: 165863 >Category: kern >Synopsis: [panic] [patch] in_lltable_prefix_free() races with >lla_lookup() and arptimer() >Confidential: no >Severity: critical >Priority: medium >Responsible: freebsd-bugs >State: open >Quarter: >Keywords: >Date-Required: >Class: sw-bug >Submitter-Id: current-users >Arrival-Date: Thu Mar 08 23:00:27 UTC 2012 >Closed-Date: >Last-Modified: >Originator: Eric van Gyzen >Release: 8.2-RELEASE >Organization: Dell Compellent >Environment: FreeBSD 8.2-RELEASE amd64 >Description: in_lltable_prefix_free() does not acquire the if_afdata_lock, so it can free the llentry while another thread is in lla_lookup(), such as during packet processing.
While working on a fix, it quickly became apparent that in_lltable_prefix_free() does not safely drain the callout, so it races with arptimer(). If arptimer() has already tested callout_active() before in_lltable_prefix_free() calls callout_drain(), arptimer() will have freed the llentry before callout_drain() returns. I can reliably reproduce the first problem (lla_lookup) on 8.1-RELEASE and 8.2-RELEASE. I cannot reproduce it on 9.0-RELEASE or head, but since the relevant code has not changed, I suspect it still exists. I have not earnestly tried to reproduce the second problem (arptimer). The same problem exists in the IPv6 code, but that code is never called. >How-To-Repeat: In a second process, repeatedly and rapidly call SIOCSIFADDR and SIOCDIFADDR on the /only/ interface address by which that neighbor is reached. This drives in_lltable_prefix_free(). In one process, flood ping an IPv4 neighbor. This drives lla_lookup() via the processing of ARP replies, packets queued on la_hold, and echo request packets. This reliably panics in 10-20 seconds. >Fix: With the attached patch, my systems survive the test for over 30 minutes. Patch attached with submission follows: --- sys/netinet/in.c.orig 2012-03-08 12:57:29.000000000 -0600 +++ sys/netinet/in.c 2012-03-08 12:14:26.000000000 -0600 @@ -1351,21 +1351,39 @@ struct llentry *lle, *next; register int i; +restart: + IF_AFDATA_WLOCK(llt->llt_ifp); for (i=0; i < LLTBL_HASHTBL_SIZE; i++) { LIST_FOREACH_SAFE(lle, &llt->lle_head[i], lle_next, next) { - if (IN_ARE_MASKED_ADDR_EQUAL((struct sockaddr_in *)L3_ADDR(lle), + if (IN_ARE_MASKED_ADDR_EQUAL(satosin(L3_ADDR(lle)), pfx, msk)) { - int canceled; + int canceled = 0; - canceled = callout_drain(&lle->la_timer); LLE_WLOCK(lle); - if (canceled) + if (!callout_active(&lle->la_timer) || + (canceled = callout_stop(&lle->la_timer))) { + if (canceled) + LLE_REMREF(lle); + llentry_free(lle); + } else { + /* + * The callout is in progress. + * Since we deactivated it, it won't + * do anything. Drop its reference. + * Drop our locks to drain it, which + * requires restarting this function. + */ LLE_REMREF(lle); - llentry_free(lle); + LLE_WUNLOCK(lle); + IF_AFDATA_WUNLOCK(llt->llt_ifp); + (void) callout_drain(&lle->la_timer); + goto restart; + } } } } + IF_AFDATA_WUNLOCK(llt->llt_ifp); } >Release-Note: >Audit-Trail: >Unformatted: _______________________________________________ freebsd-bugs@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-bugs To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"