jamal ([EMAIL PROTECTED]) wrote:
> On Wed, 2007-05-09 at 14:55 +0100, James Chapman wrote:
> 
> > Thanks Jamal. Yes, I'd already read your paper. I think my idea is 
> > different to the ideas described in your paper 
> 
> I am hoping you can pick from the lessons of what has been tried and
> failed and the justification for critiquing something as "Failed". 
> If you have - I feel I accomplished something useful writing the paper.
> 
> > and I'm in the process of 
> > writing it up as an RFC to post to netdev. 
> 
> Please cc me if you want my feedback - I am backlogged by about 3000
> messages on netdev.
> 
> > Briefly though, the driver's 
> > ->poll() holds off doing netif_rx_complete() for 1-2 jiffies and keeps 
> > itself in poll_on mode for that time, consuming no quota. 
> > The net_rx 
> > softirq processing loop is modified to detect the case when the only 
> > devices in its poll list are doing no work (consuming no quota). The 
> > driver's ->poll() samples jiffies while it is considering when to do the 
> > netif_rx_complete() like your Parked state - no timers are used.
> 
> Ok, so the difference seems to be you actually poll for those
> jiffies instead of starting a timer in the parked state - is that right?
> 
> > If I missed that this approach has already been tried before and 
> > rejected, please let me know. I see better throughput and latency in my 
> > packet forwarding and LAN setups using it.
> 
> If you read the paper: There are no issues with high throughput - NAPI
> kicks in.
> The challenge to be overcome is at low traffic, if you have a real fast
> processor your cpu-cycles-used/bits-processed ratio is high....

I'm not sure cpu-cycles-used/bits-processed is the correct metric to use.
An alternative would be to look at cpu-cycles-used/unit-time (i.e.
CPU utilization or load) for a given bit-rate or packet-rate. This would
make an interesting graph.

At low packet-rate, CPU utilization is low so doing extra work per packet
is probably OK. Utilizing 2% of CPU versus 1% is negligible. But at higher
rates, when there is more CPU utilization, using 40% of CPU versus 60% is
significant. I think the absolute difference in CPU utilization is more
important than the relative difference. iow, 2% versus 1%, even though 
a 2x difference in cpu-cycles/packet, is negligible compared to 40% 
versus 60%.
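
To make the arithmetic concrete, here is a rough sketch of the two ways
of comparing. All numbers (the 2 GHz clock, the packet rates, the
cycles-per-packet figures) are made up for illustration and are not
measurements from my setups; the helper names are hypothetical too.

```python
# Hypothetical numbers throughout; none are measurements from this thread.

def cpu_utilization(cycles_per_packet, packets_per_sec, cpu_hz):
    """Fraction of one CPU spent on packet processing (cycles per unit time)."""
    return cycles_per_packet * packets_per_sec / cpu_hz


def compare(util_a, util_b):
    """Return (absolute difference in percentage points, relative ratio)."""
    hi, lo = max(util_a, util_b), min(util_a, util_b)
    return (hi - lo) * 100.0, hi / lo


CPU_HZ = 2_000_000_000  # assumed 2 GHz CPU

# Low rate: 10k packets/s at 2000 vs 4000 cycles/packet gives 1% vs 2%.
low = compare(cpu_utilization(2000, 10_000, CPU_HZ),
              cpu_utilization(4000, 10_000, CPU_HZ))

# High rate: 400k packets/s at 2000 vs 3000 cycles/packet gives 40% vs 60%.
high = compare(cpu_utilization(2000, 400_000, CPU_HZ),
               cpu_utilization(3000, 400_000, CPU_HZ))

print(f"low rate:  {low[0]:.1f} points apart, {low[1]:.2f}x per packet")
print(f"high rate: {high[0]:.1f} points apart, {high[1]:.2f}x per packet")
```

The low-rate case has the worse per-packet ratio (2x) but costs only one
percentage point of CPU; the high-rate case has the better ratio (1.5x)
but costs twenty points.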

> If you are polling (softirqs have high prio and depending on the cpu,
> there could be a few gazillion within those 1-2 jiffies), then isn't the
> end result still a high cpu-cycles used?
> That's what the timer tries to avoid (do nothing until some deferred
> point).

Using a timer might also behave better in a tick-less (CONFIG_NO_HZ) 
configuration.
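
For a feel of the scale involved, here is a toy upper-bound estimate of
how many empty ->poll() passes could fit in the hold-off window if the
softirq loop spun back-to-back, versus a single deferred timer wakeup.
The HZ value and the per-pass cost are assumptions I picked for
illustration, not measured values, and a real softirq loop would not
necessarily spin continuously.

```python
# Toy model only: HZ and the cost of one empty poll pass are assumed values.

NSEC_PER_SEC = 1_000_000_000


def holdoff_window_ns(holdoff_jiffies, hz):
    """Length of the hold-off window in nanoseconds (one jiffy = 1/HZ s)."""
    return holdoff_jiffies * NSEC_PER_SEC // hz


def max_empty_poll_passes(holdoff_jiffies, hz, pass_cost_ns):
    """Upper bound on back-to-back empty poll passes within the window."""
    return holdoff_window_ns(holdoff_jiffies, hz) // pass_cost_ns


HZ = 1000            # assumed tick rate
PASS_COST_NS = 500   # assumed cost of one empty poll pass

polled = max_empty_poll_passes(2, HZ, PASS_COST_NS)
timer_wakeups = 1    # a timer defers all work to a single later wakeup

print(f"polling for 2 jiffies: up to {polled} empty passes")
print(f"timer in parked state: {timer_wakeups} wakeup")
```

With these assumed values the polling case runs up to a few thousand
empty passes per hold-off window, where the timer case wakes once.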

> If you waste cpu cycles and poll, I can see that (to emphasize: For
> FastCPU-LowTraffic scenario), you will end up _not_ having latency
> issues i pointed out, but you surely have digressed from the original
> goal which is to address the cpu abuse at low traffic (because you abuse
> more cpu).
> 
> One thing that may be valuable is to show that the timers and polls are
> not much different in terms of cpu abuse (It's theoretically not true,
> but reality may not match).
> The other thing is now hrtimers are on, can you try using a piece of
> hardware that can get those kicked and see if you see any useful results
> on latency.
> 
> cheers,
> jamal
> 