Nate Nielsen wrote:
Sam Leffler wrote:

You might try explaining why you think polling helps your performance.
Unless you've significantly restructured the interrupt handling in the
driver, most work is deferred to a non-interrupt context.
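To be concrete: the usual shape is an interrupt handler that does little
more than ack/mask the hardware and hand the heavy lifting to a taskqueue
thread, where the scheduler has a say. A rough sketch assuming the
taskqueue(9) API (the foo_* names are made up; this is not the actual ath
code):

    #include <sys/param.h>
    #include <sys/kernel.h>
    #include <sys/taskqueue.h>

    struct foo_softc {
        struct task sc_rxtask;          /* deferred rx processing */
        /* ... registers, descriptor rings, etc. ... */
    };

    /* Thread context: buffer handling, decap, stack handoff. */
    static void
    foo_rx_proc(void *arg, int npending)
    {
        struct foo_softc *sc = arg;
        /* ... the expensive per-packet work happens here ... */
    }

    /* Interrupt context: kept as small as possible. */
    static void
    foo_intr(void *arg)
    {
        struct foo_softc *sc = arg;

        /* ack/mask the device interrupt (hardware specific), then: */
        taskqueue_enqueue(taskqueue_swi, &sc->sc_rxtask);
    }

The task is wired up once at attach time with
TASK_INIT(&sc->sc_rxtask, 0, foo_rx_proc, sc).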


Yes, I saw that. However, the interrupts themselves, when fired at
upwards of a thousand per second, greatly affect performance and
scheduling on a slow box.

Max pps for an ath device is 8-9K/sec. Typical is 3-4K for a single bidirectional TCP stream when the client has a strong signal.


I understand that it may not seem intuitive on faster systems, which are
completely capable of handling such a small interrupt load. But these
embedded systems are slow little puppies, running between 133 and 266
MHz (586-class chips).

I am familiar with embedded systems.


Adding polling to this driver does increase performance on embedded
systems. With my current patch (on a 233 MHz system), the throughput (in
this case a simple TCP stream) goes up by ~6 Mbit/s, from 18 to 24 Mbit/s.

I routinely get >20 Mb/s for a single client running upstream TCP netperf through a Soekris 4511. If you are seeing 6 Mb/s, you have something else wrong.


But that's not the main benefit. The main thing is the scheduling.
Without the thousands of interrupts, the box is better able to balance
RX/TX against its other work, such as encapsulation and packet
filtering.

When the entire box is polling-driven, its total throughput (Ethernet,
encapsulation, hardware encryption, wireless) increases greatly and it
does not exhibit livelock symptoms.

Without polling, these slow systems easily exhibit livelock. This is
where incoming traffic causes so many interrupts that outgoing traffic
is completely halted and throughput drops to zero or near zero. Under
these conditions, userland processes barely run, if at all. The
scheduler has no say in what's going on in the system, as interrupts
overwhelm all other activity.
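This is exactly what a polling loop prevents: the kernel hands the driver
an explicit work budget each tick instead of letting the hardware drive
it. A rough sketch of the handler shape under the 6.x DEVICE_POLLING API
(the foo_* helpers are made up; this is not my actual patch):

    #include <sys/param.h>
    #include <sys/socket.h>
    #include <net/if.h>
    #include <net/if_var.h>

    static void
    foo_poll(struct ifnet *ifp, enum poll_cmd cmd, int count)
    {
        struct foo_softc *sc = ifp->if_softc;

        /* Process at most "count" rx frames, then reap completed
         * tx descriptors; anything left over waits for the next
         * tick, so rx traffic can never monopolize the CPU. */
        foo_rx_poll(sc, count);
        foo_tx_reap(sc);

        if (cmd == POLL_AND_CHECK_STATUS)
            foo_check_status(sc);       /* errors, link state, ... */
    }

The handler is registered with ether_poll_register(foo_poll, ifp) when
polling is enabled, at which point hardware interrupts are masked.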

I've not seen livelock in any situation, though there are some issues with the priority of the taskqueue thread. Polling is not a panacea; you are potentially increasing latency, which has ramifications.



Also, the
driver in 6.0 and later does tx interrupt mitigation and rx interrupt
coalescing, so I wouldn't expect interrupt handling to be a performance
limitation.


Interesting. If there's an option to enable it, I may very well have
missed it.

It is always used.


However, it should be noted that the default behaviour (in the 6.0
release) seems to be that the hardware generates around 2000 interrupts
per second at around 15-18 Mbit/s of throughput.

You need to identify what kinds of interrupts these are and what type of ath hardware you are using. You can trivially reduce the tx interrupt load by turning off interrupts on EOL and just using the periodic interrupts generated every N tx descriptors. But if you profile, I suspect you will find the interrupt overhead is not significant relative to other costs.
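Concretely, that means setting the HAL's "interrupt request" flag on only
every Nth tx descriptor as frames are queued. From memory, the shape is
something like the following (treat the field names as illustrative
rather than exact):

    /* When building each tx descriptor: */
    if (++txq->axq_intrcnt >= sc->sc_txintrperiod) {
        txq->axq_intrcnt = 0;
        flags |= HAL_TXDESC_INTREQ;     /* ask for a tx interrupt */
    }
    /* Frames without INTREQ complete silently and are reaped in
     * a batch when a later descriptor's interrupt fires. */

Raising the period trades tx-completion latency for fewer interrupts,
which is the same trade-off polling makes globally.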



There are other issues that can affect performance, but you
haven't mentioned them...


Obviously these are not mainstream performance issues for people just
trying to connect to an access point. But when Atheros cards are used
in backhaul applications, running in the low-power embedded systems you
typically see on an antenna mast, polling makes a big difference.

I'm not convinced polling is worthwhile w/o a major restructuring of the driver. OTOH this shouldn't stop you from pushing forward...

        Sam

