On Wed, Jun 29, 2011 at 08:27:24PM -0400, Ted Unangst wrote:
> On Thu, 30 Jun 2011, David Gwynne wrote:
>
> > > This driver is filled with bad juju. This changes all the waitoks to not
> > > ok, so they are interrupt safe. It already appears to handle the failure
> > > case. The rwlock is also totally unsafe and unnecessary.
> >
> > the issue is that bnx_init is called from softclock when it looks like bnx
> > doesnt get any interrupts (so it doesnt do tx completions). i assumed
> > bnx_init was only called from the ioctl paths which have process context.
> >
> > this diff is also unsafe because you still init the pool with the nointr
> > allocator, but you're trying to fix the code so bnx_alloc_pkts via bnx_init
> > is ok to call from interrupt context.
> >
> > a simpler fix would be to have bnx_watchdog use the system workq to call
> > bnx_init to reset the chip.
>
> as you wish... :) I agree it's much simpler. Still needs testing.
>
> Index: if_bnx.c
> ===================================================================
> RCS file: /home/tedu/cvs/src/sys/dev/pci/if_bnx.c,v
> retrieving revision 1.95
> diff -u -r1.95 if_bnx.c
> --- if_bnx.c	22 Jun 2011 16:44:27 -0000	1.95
> +++ if_bnx.c	30 Jun 2011 00:25:38 -0000
> @@ -5125,7 +5125,7 @@
>
>  	/* DBRUN(BNX_FATAL, bnx_breakpoint(sc)); */
>
> -	bnx_init(sc);
> +	workq_add_task(NULL, 0, (workq_fn)bnx_init, sc, NULL);
>
>  	ifp->if_oerrors++;
>  }

With the above patch, the splasserts go away. However, the device still
gets into a weird state where it says it's active but no interrupts are
generated. When I run ifconfig bnx0 down; ifconfig bnx0 up, it comes back
fine for a while, but goes down again after an unspecified amount of time.

Tom