On Mon, Jun 12, 2006 at 09:42:14AM -0700, Mitch Williams wrote:
> On Sun, 2006-06-11 at 17:13 -0700, Neil Horman wrote:
> > Any further thoughts on this guys?  I still think my last solution
> > solves all of
> > the netpoll problems, and isn't going to have any noticable impact on
> > performance.
> > 
> I haven't had time to evaluate performance on your patch (sorry!), but
> after thinking about it, I agree that it should not have any noticeable
> impact.  OTOH, performance tuning is a funny thing, and things you think
> won't cause problems often do.
> 
Thats ok, I just didn't hear out of anyone on friday, so I was curious as to
where we were on this.  I don't have the ability to do any real world
performance testing here, but I'll try to record the run time of the interrupt
routine on a limited number of frames here.

> Anyway, I'm still not quite ready to ACK this because it's just not
> future-proof.  Eventually, we will need to support multiple RX queues,
> and this solution will not work in that situation.
> 
Not sure I understand this.  My patch:
http://marc.theaimsgroup.com/?l=linux-netdev&m=114970506406155&w=2
still allows for the use of multiple rx
queues in the nominal case.  Only when we have to use a relatively slow netpoll
driven operation (kgdb, netdump, etc), do we need to receive on the same
interface that we transmit on.

> A simpler short-term solution is just to schedule our NAPI polling on
> the "real" netdev instead of our polling netdev.  This is a trivial
> change and works correctly with a single queue.  But, like your patch,
> it isn't future-proof.
> 
Again, not sure what you mean here.  My last patch proposal:
http://marc.theaimsgroup.com/?l=linux-netdev&m=114970807606096&w=2
Does precisely what you describe, but it still allows for multiple rx queues in
the nominal (non poll_controller driven) receive case.

> So, I'm still thinking and pondering on this one.
> 
> If we get a patch in to fix the recursive loop in netpoll, my original
> patch will work, right?  Or is there still another issue?
> 
I assume you are referring to this patch:
http://marc.theaimsgroup.com/?l=linux-netdev&m=114970506406155&w=2
If so, then no, that patch is still broken for the reasons Jeff outlined
previously, The recursion patch that I proposed earlier today is related to a
different recursion problem, and while the e1000 driver might be suceptable to
it, your patch is also suceptible to the data corruption that arises from when
the poll_controller calls adapter->clean_rx at the same that that dev->poll is
called for the same adapter on another cpu.  If that happens we can have two
cpu's writing to the same private net_device data without the benefit of a
spinlock to protect them.  And yes, you can add a spin lock to protect the case
where you have a dev->poll_controller and a dev->poll operation at the same
time, but that seems to me like it will also re-serialize all the parallel
operations that you could otherwise do with multiple rx queues

I'll post again if I get a chance to do some performance measurements
Regrds
Neil

> -Mitch
> -
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
/***************************************************
 *Neil Horman
 *Software Engineer
 *gpg keyid: 1024D / 0x92A74FA1 - http://pgp.mit.edu
 ***************************************************/
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to