On Wed, Jun 07, 2006 at 03:34:56PM -0400, Lennart Sorensen wrote:
> On Wed, Jun 07, 2006 at 11:20:40AM -0700, Don Fry wrote:
> 
> > Some areas of concern that you may have addressed already, I have not
> > scanned your changes yet, are what happens if the ring size is changed
> > without bringing down the interface (via ethtool), or if the loopback
> > test is run in a similar fashion, or a tx timeout occurs.
> 
> The same thing as if it was done before enabling napi.  From a few
> messages I have seen, it doesn't work right now, and it won't work any
> better with my changes.  I have never tried changing the ring size on
> the fly, so I don't know.

Today without your NAPI changes the device can be up and operational,
and change the ring size without hanging or causing a panic or causing
problems with the driver.  The same can be said for running the loopback
test, or when a tx timeout occurs.

It does not look like the same can be said for your NAPI changes, yet.
Try changing the ring size a dozen times during a receive storm and see
what happens.

> 
> It appears that the port is stopped before the ring size change is done,
> although I can't really tell how it handles things if the queue is not
> empty when it stops the port.  Does it try to handle anything left in
> the ring first or does it just toss those packets? (That I would
> consider wrong).
> 
> > The lp->lock MUST be held whenever accessing the csr or bcr registers as
> > this is a multi-step process, and has been the source of problems in the
> > past.  Even on UP systems.
> 
> Hmm, I just followed what appeared to be in pcnet32_rx and how tulip and
> a few other drivers had done their napi conversions.  It certainly works
> for me the way I did it.  Haven't seen any lockups yet.  I do see that I
> am not holding the lock when I acknowledge IRQs in pcnet32_poll, which
> pcnet32_rx doesn't need to worry about since it is called from the
> interrupt handler which already holds the lock.  That should be fixed
> then.

Yes, that is the minimum that needs to be changed.
There have been a lot of changes made to the driver, by myself included,
that seem to work just fine, but later the timing changes and you lose
the race, resulting in problems.  Changing one part of the driver
without understanding the whole thing is very risky.

> 
> So I can do:
>         // Clear RX interrupts
>         spin_lock(&lp->lock);
>         lp->a.write_csr (ioaddr, 0, 0x1400);
>         spin_unlock(&lp->lock);
> That part seems simple enough to protect.
> 
> Is this safe without holding the lock?
>     } while(lp->a.read_csr (ioaddr, 0) & 0x1400);

No, see below.

> Not sure how to wrap a lock around that one without holding the lock for
> way too long.
> 
> perhaps:
>         spin_lock(&lp->lock);
>       state=lp->a.read_csr (ioaddr, 0) & 0x1400;
>         spin_unlock(&lp->lock);
>     } while(state);
> Does that seem more reasonable?

The lock must be held during ANY read or write to a csr or bcr.

The problem is that accessing a csr/bcr takes two steps.  One is to
write the address register indicating which register to read or write,
and second, read or write the register.  The problem is that without the
lock, entity A writes to the rap to access register X, then before it
can access that register entity B writes the rap to access register Y,
At that point entity A will read, or worse write, the contents of the
wrong register.

This problem is even worse when reading/writing a PHY register because
it entails writing bcr 33 and reading/writing bcr 34, without being
interrupted by something that might change bcr 33.  Not likely, but not
something that can be ignored.  It may not happen often, but it will
happen eventually.

> 
> Len Sorensen

-- 
Don Fry
[EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to