On Sun, 2006-06-08 at 16:16 -0700, Jesse Brandeburg wrote:
[..]
> 
> As for specifics, for TX_WAKE_THRESHOLD, i noticed that we were
> starting the queue after every packet was cleaned, so when the ring
> went full there was a lot of queue thrash.

Indeed, that is what used to happen and it was bad, so this is a huge
improvement.
What happens now in steady state under heavy transmit traffic is that,
instead of 1, you see E1000_TX_WEIGHT descriptors cleaned between queue
sleeps/wakes. I assume this is a given, since E1000_TX_WEIGHT is higher
than TX_WAKE_THRESHOLD. I am not sure I can vouch for further
improvement from mucking around with the value of E1000_TX_WEIGHT.

Can you please take a look at the patch I posted? I would like to submit
it for inclusion. It does two things:
a) makes pruning available to be invoked from elsewhere (I tried to do
   it from the tx path, but that gave me poor results)
b) makes E1000_TX_WEIGHT and TX_WAKE_THRESHOLD relative to the size of
   the transmit ring; I think this is a sane thing to do (rough sketch
   below).
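
Something like this is what I have in mind for (b); the divisors are
just placeholders to show the shape, not what the patch actually uses:

/* Illustration only: derive both knobs from the ring size instead of
 * hard-coding them, so a large ring is not throttled by constants that
 * were tuned for a small one.  The divisors below are placeholders.
 */
#define E1000_TX_WEIGHT(tx_ring)         ((tx_ring)->count / 4)
#define E1000_TX_WAKE_THRESHOLD(tx_ring) (MAX_SKB_FRAGS + (tx_ring)->count / 8)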

You could either extract the bits you want or I could resend it to you
as two separate patches. I have tested it and it works.

>   tg3 seemed to fix it in a
> smart way and so I did a similar fix.  Note we should have at least
> MAX_SKB_FRAGS (usually 32) + a few descriptors free before we should
> start the tx again, otherwise we run the risk of a maximum fragmented
> packet being unable to fit in the tx ring.

I noticed you check for that in the tx path.
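i.e. something along these lines (simplified sketch; desc_unused() here
just stands for however the driver computes free descriptors):

/* Only restart the queue once there is room for a maximally fragmented
 * skb plus some slack, otherwise we risk stopping again immediately and
 * thrashing the queue.
 */
if (netif_queue_stopped(netdev) &&
    desc_unused(tx_ring) >= MAX_SKB_FRAGS + 2)
        netif_wake_queue(netdev);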

> now, for E1000_TX_WEIGHT, that was more of an experiment as i noticed
> we could stay in transmit clean up forEVER (okay not literally) which
> would really violate our NAPI timeslice.  

Interesting. The only time I have seen the NAPI timeslice kick in is on
slow hardware or in emulators (like UML).
I wonder if the pruning path could be made faster? What is the most
expensive item? I realize there will be a substantial number of cache
misses. Maybe, in addition to pruning E1000_TX_WEIGHT descriptors, also
fire a timer to clean up the rest (so it is not accounted against the
NAPI timeslice;->). Essentially, I think you have something in the
pruning path that needs to be optimized; profiling and improving that
would help.
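
Hand-waving version of the timer idea (the timer and the "work remains"
test are made-up names, just to show the shape of it):

/* In the poll path: do the bounded clean, then defer whatever is left
 * to a timer so it is not charged against the NAPI timeslice.
 */
e1000_clean_tx_irq(adapter, tx_ring);        /* bounded by E1000_TX_WEIGHT */
if (tx_work_remains(tx_ring))                /* made-up helper */
        mod_timer(&adapter->tx_cleanup_timer, jiffies + 1);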

> I messed with some values
> and 64 didn't really seem like too bad a compromise (it does slow
> things down just a tad in the general case) while really helping a
> couple of tests where there were lots of outstanding transmits
> happening at the same time as lots of receives.
> 

The latter are the kind of tests I am running. If you are a router or a
busy server they apply; on slow machines a ping flood also applies, etc.

> This need for a tx weight is yet another global (design) problem with
> NAPI enabled drivers, 

Oh yes, the Intel cabal - blame NAPI first;->
IMO, the problem is that you are consuming too many cycles in the
receive path. NAPI has to be fair to all netdevices and cannot let one
netdevice hog the CPU just because it takes too many cycles to process
a packet.
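
That is the whole point of the per-poll budget: each netdevice gets a
quota, accounts for what it processed, and yields when the quota is
gone so the next device gets its turn. Roughly, for a generic old-style
->poll() (not e1000-specific; the two mydev_* helpers are made up):

/* Process at most the given quota, account for it, and yield if work
 * remains so other netdevices on the poll list get their share.
 */
static int mydev_poll(struct net_device *dev, int *budget)
{
        int limit = min(*budget, dev->quota);
        int done  = mydev_rx_clean(dev, limit);   /* made-up rx cleaner */

        *budget    -= done;
        dev->quota -= done;

        if (done < limit) {
                netif_rx_complete(dev);   /* no more work: leave poll list */
                mydev_enable_irq(dev);    /* made-up: re-enable interrupts */
                return 0;
        }
        return 1;                         /* more work: stay on poll list */
}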

> but someday I'll try to document some of the issues I've seen.

I think it would be invaluable. Just don't jump to the
blame-Canada^WNAPI conclusion, because it distracts;->

cheers,
jamal
