On 17/05/21(Mon) 16:24, Alexandr Nedvedicky wrote:
> Hrvoje,
>
> managed to trigger diagnostic panics with diff [1] sent by bluhm@
> some time ago. The panic Hrvoje sees comes from ether_input() here:
>
>  414
>  415         /*
>  416          * Third phase: bridge processing.
>  417          *
>  418          * Give the packet to a bridge interface, ie, bridge(4),
>  419          * switch(4), or tpmr(4), if it is configured. A bridge
>  420          * may take the packet and forward it to another port, or it
>  421          * may return it here to ether_input() to support local
>  422          * delivery to this port.
>  423          */
>  424
>  425         ac = (struct arpcom *)ifp;
>  426
>  427         smr_read_enter();
>  428         eb = SMR_PTR_GET(&ac->ac_brport);
>  429         if (eb != NULL) {
>  430                 m = (*eb->eb_input)(ifp, m, dst, eb->eb_port);
>  431                 if (m == NULL) {
>  432                         smr_read_leave();
>  433                         return;
>  434                 }
>  435         }
>  436         smr_read_leave();
>  437
>
> In the current tree ether_input() is protected by NET_LOCK(), which is
> grabbed by the caller as a writer. bluhm's diff changes NET_LOCK() to a
> read lock, so ether_input() can run concurrently. Switching NET_LOCK() to
> an r-lock has implications for the smr read section above. The thing is
> that the call to eb->eb_input() can sleep now. This is something that
> needs to be avoided within an smr section.

Is the new sleeping point introduced by the fact that PF_LOCK() is a
rwlock?  Did you consider using a mutex, at least for the time being, to
avoid running into such issues?
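
To make the question concrete, here is a toy userland model of the two options. It is only a sketch: every model_* name is made up for illustration and none of it is kernel code. It assumes the sleeping point really is the pf lock taken somewhere below eb->eb_input(), which is exactly what I am asking. A rwlock is modelled as a possible sleeping point, a mutex as a lock that spins and never sleeps.

```c
#include <assert.h>

/* Hypothetical userland model; none of these names exist in the kernel. */
static int model_smr_depth;	/* > 0 while inside an smr read section */
static int model_violations;	/* sleeping points hit inside a read section */

static void
model_smr_read_enter(void)
{
	model_smr_depth++;
}

static void
model_smr_read_leave(void)
{
	assert(model_smr_depth > 0);
	model_smr_depth--;
}

/* Stand-in for a diagnostic that fires when a sleep is attempted. */
static void
model_assert_sleepable(void)
{
	if (model_smr_depth > 0)
		model_violations++;
}

/* A rwlock may sleep when contended, so taking it is a sleeping point. */
static void
model_rwlock_enter(void)
{
	model_assert_sleepable();
}

/* A mutex spins instead of sleeping, so there is nothing to check. */
static void
model_mutex_enter(void)
{
}

typedef void (*model_lock_fn)(void);

/*
 * Models the bridge phase of ether_input(): the input handler runs
 * inside the read section and takes the lock passed in as pf_lock.
 * Returns the number of sleeping points hit inside the section.
 */
static int
model_input_path(model_lock_fn pf_lock)
{
	model_violations = 0;
	model_smr_read_enter();
	pf_lock();		/* assumed to happen below eb_input() */
	model_smr_read_leave();
	return model_violations;
}
```

In this model model_input_path(model_rwlock_enter) reports one violation and model_input_path(model_mutex_enter) reports none. Whether a mutex is acceptable for the real pf lock depends on how long it is held and on what is done while holding it, which the model does not capture.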