On Sun, 24 Aug 2014, Jonathan Morton wrote:

I've done some reading on how wifi actually works, and what mechanisms the 
latest variants use to improve performance.  It might be helpful to summarise 
my understanding here - biased towards the newer variants, since they are by 
now widely deployed.

First a note on the variants themselves:

802.11 without suffix is obsolete and no longer in use.
802.11a was the original 5GHz band version, giving 54Mbps in 20MHz channels.
802.11b was the first "affordable" version, using 2.4GHz and giving 11Mbps in 
20MHz channels.
802.11g brought the 802.11a modulation schemes and (theoretical) performance to 
the 2.4GHz band.
802.11n is dual-band, but only optionally.  Adds aggregation, 40MHz channels, 
and single-target MIMO.
802.11ac is 5GHz only.  More aggregation, 80 & 160MHz channels, multi-target 
MIMO.  Rationalised options, dropping many 'n' features that are more trouble than 
they're worth.  Coexists nicely with older 20MHz-channel equipment, and nearby APs 
with overlapping spectrum.


My general impression is that 802.11ac makes a serious effort to improve matters in heavily-congested, many-clients scenarios, which was where earlier variants had the most trouble. If you're planning to set up or go to a major conference, the best easy thing you can do is get 'ac' equipment all round - if nothing else, it's guaranteed to support the 5GHz band. Of course, we're not just considering the easy solutions.

If ac had reasonable drivers available I would agree, but when you are limited to factory firmware, it's not good.

Now for some technical details:

The wireless spectrum is fundamentally a shared-access medium. It also has the complication of being noisy and having various path-loss mechanisms, and of the "hidden node" problem where one client might not be able to hear another client's transmission, even though both are in range of the AP.

Thus wifi uses a CSMA/CA algorithm as follows:

1) Listen for competing carrier. If heard, backoff and retry later. (Listening is continuous, and detected preambles are used to infer the time-length of packets when the data modulation is unreadable.)
2) Perform an RTS/CTS handshake.  If CTS doesn't arrive, backoff and retry 
later.
3) Transmit, and await acknowledgement. If no ack, backoff and retry later, possibly using different modulation.

This can be compared to Ethernet's CSMA/CD algorithm:

1) Listen for competing carrier.  If heard, backoff and retry later.
2) Transmit, listening for collision with a competing transmission. If collision, backoff and retry later.

In both cases, the backoff is random and exponentially increasing, to reduce the chance of repeated collisions.
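That binary exponential backoff can be sketched very simply. This is a toy model, not the exact 802.11 DCF rules - the cw_min/cw_max slot counts here are illustrative values, not the real per-variant parameters:

```python
import random

def backoff_slots(attempt, cw_min=16, cw_max=1024):
    """Pick a random number of idle slots to wait before retrying.

    The contention window doubles with each failed attempt
    (binary exponential backoff), capped at cw_max so repeated
    failures don't stall a station forever.
    """
    cw = min(cw_min * (2 ** attempt), cw_max)
    return random.randrange(cw)

# The window grows 16, 32, 64, ... and saturates at 1024 slots.
windows = [min(16 * 2 ** a, 1024) for a in range(8)]
```

The randomness is what reduces the chance of two stations that collided once colliding again on their very next attempt.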

The 2.4GHz band is chock-full of noise sources, from legacy 802.11b/g equipment to cordless phones, Bluetooth, and even microwave ovens - which generate the best part of a kilowatt of RF energy, but somehow manage to contain the vast majority of it within the cavity. It's also a relatively narrow band, with only three completely separate 20MHz channels available in most of the world (four in Japan).

This isn't a massive concern for home use, but consumers still notice the effects surprisingly often. Perhaps they live in an apartment block with lots of devices and APs crowded together in an unmanaged mess. Perhaps they have a large home to themselves, but a bunch of noisy equipment reduces the effective range and reliability of their network. It's not uncommon to hear about networks that drop out whenever the phone rings, thanks to an old cordless phone.

The 5GHz band is much less crowded. There are several channels which are shared with weather radar, so wifi equipment can't use those unless they are capable of detecting the radar transmissions, but even without those there are far more 20MHz channels available. There's also much less legacy equipment using it - even 802.11a is relatively uncommon (and is fairly benign in behaviour). The downside is that 5GHz doesn't propagate as far, or as easily through walls.

Wider bandwidth channels can be used to shorten the time taken for each transmission. However, this effect is not linear, because the RTS/CTS handshake and preamble are fixed overheads (since they must be transmitted at a low speed to ensure that all clients can hear them), taking the same length of time regardless of any other enhancements. This implies that in seriously geographically-congested scenarios, 20MHz channels (and lots of APs to use them all) are still the most efficient. MIMO can still be used to beneficial effect in these situations.
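The sub-linear effect is easy to see with some rough arithmetic. The timings below are invented round numbers chosen only to show the shape of the curve, not real 802.11 figures:

```python
def airtime_us(payload_bits, data_rate_mbps, fixed_overhead_us=100.0):
    """Total channel time for one transmission: a fixed preamble and
    handshake cost (sent at a low legacy rate, so it doesn't shrink
    with channel width) plus the payload time, which does scale with
    the data rate."""
    return fixed_overhead_us + payload_bits / data_rate_mbps

# Doubling the channel width roughly doubles the data rate, but the
# fixed overhead stays put, so the airtime saving is less than 2x.
t20 = airtime_us(12000, 50)    # ~1500-byte packet, 20MHz-ish rate
t40 = airtime_us(12000, 100)   # same packet at double the rate
speedup = t20 / t40            # noticeably less than 2.0
```

The shorter the payloads, the more the fixed overhead dominates and the less a wide channel buys you - which is exactly why 20MHz channels plus more APs can win in dense deployments.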

Another good reason for sticking to 20MHz channels is that it gives you more channels available, so you can deploy more APs without them interfering with each other's footprints. This can significantly reduce the distance between the mobile user and the closest AP.

Multi-target MIMO allows an AP to transmit to several clients simultaneously, without requiring the client to support MIMO themselves. This requires the AP's antennas and radios to be dynamically reconfigured for beamforming - giving each client a clear version of its own signal and a null for the other signals - which is a tricky procedure. APs that do implement this well are highly valuable in congested situations.

How many different targets can such APs handle? If it's only a small number, I'm not sure it helps much.

Also, is this a transmit-only feature? Or can it help decipher multiple mobile devices transmitting at the same time?

Single-target MIMO allows higher bandwidth between one client at a time and the AP. Both the AP and the client must support MIMO for this to work. There are physical constraints which limit the ability for handheld devices to support MIMO. In general, this form of MIMO improves throughput in the home, but is not very useful in congested situations. High individual throughput is not what's needed in a crowded arena; rather, reliable if slow individual throughput, reasonable latency, and high aggregate throughput.

Well, if the higher bandwidth to an individual user ended up reducing the airtime that user takes up, it could help. But I suspect that the devices that do this couldn't keep track of a few dozen endpoints.

Choosing the most effective radio bandwidth and modulation is a difficult problem. The Minstrel algorithm seems to be an effective solution for general traffic. Some manual constraints may be appropriate in some circumstances, such as reducing the maximum radio bandwidth (trading throughput of one AP against coexistence with other APs) and increasing the modulation rate of management broadcasts (reducing per-packet overhead).

agreed.
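For the curious, the core of the Minstrel idea can be caricatured in a few lines: keep a moving average of delivery probability per rate, and pick the rate that maximises probability-weighted throughput. The real algorithm also does lookaround sampling and multi-rate retry chains, which this sketch omits, and the rates and alpha here are illustrative:

```python
class RateStats:
    """Delivery statistics for one modulation rate."""
    def __init__(self, rate_mbps):
        self.rate = rate_mbps
        self.prob = 1.0          # EWMA of delivery probability

    def update(self, delivered, attempted, alpha=0.25):
        # Blend the newest delivery ratio into the running average.
        self.prob = (1 - alpha) * self.prob + alpha * (delivered / attempted)

    @property
    def expected_throughput(self):
        return self.rate * self.prob

def pick_rate(stats):
    # Choose the rate with the best probability-weighted throughput.
    return max(stats, key=lambda s: s.expected_throughput)

rates = [RateStats(r) for r in (6, 24, 54)]
for _ in range(10):
    rates[2].update(delivered=1, attempted=10)   # 54Mbps failing badly
    rates[1].update(delivered=9, attempted=10)   # 24Mbps delivering well
best = pick_rate(rates)                          # settles on 24Mbps
```

Note that a fast rate that rarely gets through scores worse than a slower rate that almost always does - which is the whole point.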

Packet aggregation allows several IP packets to be combined into a single wireless transmission. This avoids performing the CSMA/CA steps repeatedly, which is a considerable overhead. There are several types of packet aggregation - the type adopted by 802.11ac allows individual IP packets within a transmission to be link-layer acknowledged separately, so that a minor corruption doesn't require retransmission of the entire aggregate. By contrast, 802.11n also supported a version which did require that, despite its slightly lower overhead.

There are other overheads saved with this as well: since the TCP packets are encapsulated in the wireless transmission, things like link-layer encryption and other encapsulation overhead benefit from this aggregation.

But with the 'n'-style 'all or nothing' mode, the fact that the transmission takes longer, and is therefore more likely to get clobbered, is a much more significant problem.

This needs to be tweakable. In low-congestion, high-throughput situations you want to do a lot of aggregation; in high-congestion situations, you want to limit it.

Note, "low-congestion, high-throughput" doesn't have to mean a small number of stations. It could be a significant number of mobile devices that are all watching streaming video from the AP. The AP could be transmitting nearly continuously, but the mobile devices transmit only in response, so there would be very little contention.

Implicit in the packet-aggregation system is the problem of collecting packets to aggregate. Each transmission is between the AP and one client, so the packets aggregated by the AP all have to be for the same client. (The client can assume that all packets go to the AP.) A fair-queueing algorithm could have the effect of forming per-client queues, so several suitable packets could easily be located in such a queue. In a straight FIFO queue, however, packets for the same client are likely to be separated in the queue and thus difficult to find. It is therefore *obviously* in the AP's interest to implement a fair-queueing algorithm based on client MAC address, even if it does nothing else to manage congestion.
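A minimal sketch of that per-client queueing: packets are binned by destination MAC, and the scheduler round-robins over clients, pulling up to an aggregate's worth of packets for whichever client is next. The class name, MAC strings, and the max_aggregate of 4 are all illustrative choices, not anything from a real driver:

```python
from collections import OrderedDict, deque

class PerClientQueues:
    def __init__(self, max_aggregate=4):
        self.queues = OrderedDict()      # dest MAC -> deque of packets
        self.max_aggregate = max_aggregate

    def enqueue(self, dst_mac, packet):
        self.queues.setdefault(dst_mac, deque()).append(packet)

    def next_aggregate(self):
        """Round-robin over clients; batch up to max_aggregate packets
        for a single client, since one transmission has one target."""
        if not self.queues:
            return None, []
        mac, q = self.queues.popitem(last=False)     # oldest client first
        batch = [q.popleft() for _ in range(min(len(q), self.max_aggregate))]
        if q:                            # client still has backlog: requeue
            self.queues[mac] = q         # ...at the back of the rotation
        return mac, batch

q = PerClientQueues()
for i in range(6):
    q.enqueue("aa:bb", f"pkt{i}")
q.enqueue("cc:dd", "pkt-x")
mac1, batch1 = q.next_aggregate()   # 4 packets for aa:bb
mac2, batch2 = q.next_aggregate()   # then cc:dd gets a turn
```

With a FIFO you'd have to scan the whole queue to find the next packet for the same client; here the candidates are already sitting together, and inter-client fairness falls out for free.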

NB: if a single aggregate could be intended to be heard by more than one client, then the complexity of multi-target beamforming MIMO would not be necessary. This is how I infer the strict one-to-one nature of data transmissions, as distinct from management broadcasts.

Yes, multicast has a lot of potential benefits, but it's never lived up to its promises in the real world. In effect, everything is unicast: even if you have a lot of people watching the same video, they are all at slightly different points, needing slightly different packets retransmitted, and so on.

In a radio environment this is even more so: one station may hear something perfectly while another is unable to hear the same packet due to a hidden-node transmission.

On 23 Aug, 2014, at 10:26 pm, Michael Welzl wrote:

Because of the "function" I wrote above: the more you retry, the more you need to buffer when traffic continuously arrives, because you're stuck trying to send a frame again.

Huh, I'm missing something here; retrying sends would require you to buffer more when sending.

Aren't you then saying the same thing as I am? Sorry if so, I might have expressed it confusingly somehow.

There should be enough buffering to allow effective aggregation, but as little as possible on top of that. I don't know how much aggregation can be done, but I assume that there is a limit, and that it's not especially high in terms of full-length packets. After all, tying up the channel for long periods of time is unfair to other clients - a typical latency/throughput tradeoff.

Aggregation is not necessarily worth pursuing.

Equally clearly, in a heavily congested scenario the AP benefits from having a lot of buffer divided among a large number of clients, but each client should have only a small buffer.

The key thing is how long the data sits in the buffer. If it sits too long, it doesn't matter that it's the only packet for this client; it's still too much buffering.

If people are retrying when they really don't need to, that cuts down on the available airtime.

Yes

Given that TCP retries on loss, and UDP protocols are generally loss-tolerant to a degree, there should therefore be a limit on how hard the link-layer stuff tries to get each individual packet through. Minstrel appears to be designed around a time limit for that sort of thing, which seems sane - and they explicitly talk about TCP retransmit timers in that context.
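One way to express that kind of limit is a per-packet time budget rather than a bare retry count. This is a hedged sketch of the idea, not Minstrel's actual mechanism, and the 30ms budget is an invented number for illustration:

```python
def should_retry(first_tx_ms, now_ms, budget_ms=30.0):
    """Keep link-layer retrying only while the packet's total time in
    flight stays under a budget chosen to sit well below a typical TCP
    retransmit timeout - past that, drop it and let the transport see
    a loss instead of a huge latency spike."""
    return (now_ms - first_tx_ms) < budget_ms

keep = should_retry(first_tx_ms=0.0, now_ms=10.0)   # still within budget
drop = not should_retry(first_tx_ms=0.0, now_ms=45.0)  # give up
```

A time budget naturally charges slow-modulation retries more than fast ones, which a fixed retry count doesn't.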

With that said, link-layer retries are a valid mechanism to minimise unnecessarily lost packets. It's also not new - bus/hub Ethernet does this on collision detection. What Ethernet doesn't have is the link-layer ack, so there's an additional set of reasons why a backoff-and-retry might happen in wifi.

Modern wifi variants use packet aggregation to improve efficiency. This only works when there are multiple packets to send at a time from one place to a specific other place - which is more likely when the link is congested. In the event of a retry, it makes sense to aggregate newly buffered packets with the original ones, to reduce the number of negotiation and retry cycles.

Up to a point. It could easily be that the right thing to do is NOT to aggregate the new packets, because it will make it far more likely that they will all fail ('ac' mitigates this in theory, but until there is real driver support, the practice is questionable).

But if you have continual transmissions taking place, so you have a hard time getting a chance to send your traffic, then you really do have congestion and should be dropping packets to let the sender know that it shouldn't try to generate as much.

Yes; but the complexity that I was pointing at (but maybe it's a simple parameter, more like a 0 or 1 situation in practice?) lies in the word "continual". How long do you try before you decide that the sending TCP should really think it *is* congestion? To really optimize the behavior, that would have to depend on the RTT, which you can't easily know.

There are TCP congestion algorithms which explicitly address this (e.g. Westwood+), by reacting only a little to individual drops, but reacting more rapidly if drops occur frequently. In principle they should also react quickly to ECN, because that is never triggered by random noise loss alone.

correct.
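Very roughly, the Westwood-style trick is: instead of halving cwnd on every loss, set it from an estimate of the bandwidth the flow was actually achieving. This is a toy rendition of that adaptive-decrease idea, not the real algorithm, and the traffic numbers are made up:

```python
def cwnd_after_loss(bw_est_pkts_per_s, rtt_min_s, cwnd, classic=False):
    """Classic TCP halves cwnd on loss; a Westwood-style sender instead
    sets it to the estimated bandwidth-delay product, so one stray
    (non-congestion) loss barely dents a flow that is running well."""
    if classic:
        return cwnd / 2
    return max(bw_est_pkts_per_s * rtt_min_s, 2)

# A flow actually moving 1000 pkt/s over a 50ms path keeps cwnd near
# its BDP (~50 packets) after a single loss, instead of dropping to 30.
new_cwnd = cwnd_after_loss(1000, 0.05, cwnd=60)
```

On a lossy radio link, where a fair fraction of drops are noise rather than queue overflow, this distinction matters a lot.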

David Lang
_______________________________________________
Bloat mailing list
Bloat@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/bloat
