On Mon, Nov 09, 2020 at 09:09:37AM +0100, Tobias Waldekranz wrote: > On Mon, Nov 09, 2020 at 02:30, Vladimir Oltean <olte...@gmail.com> wrote: > > On Mon, Nov 09, 2020 at 12:59:39AM +0100, Andrew Lunn wrote: > >> We also need to make sure the static entries get removed correctly > >> when a host moves. The mv88e6xxx will not replace a static entry with > >> a dynamically learned one. It will probably rise an ATU violation > >> interrupt that frames have come in the wrong port. > > > > This is a good one. Currently every implementer of .port_fdb_add assumes > > a static entry is what we want, but that is not the case here. We want > > an entry that can expire or the switch can move it to a different port > > when there is evidence that it's wrong. Should we add more arguments to > > the API? > > I don't think that would help. You would essentially be trading one > situation where station moves causes loss of traffic for another > one. But now you have also increased the background load of an already > choked resource, the MDIO bus.
In practice, DSA switches are already very demanding of their management interface throughput, for PTP and things like that. I do expect that if you spent any significant amount of time with DSA, you already know the ins and outs of your MDIO/SPI/I2C controller and it would already be optimized for efficiency. But ok, we can add this to the list of cons. > At least on mv88e6xxx, your only option to allow the hardware to move > the station to another port autonomously is to add the entry as a > dynamically learnt one. However, since the switch does not perform any > SA learning on the CPU port in this world, the entry would have to be > refreshed by software, otherwise it would just age out. > > Then you run in to this situation: > > A and B are communicating. > > br0 > .----'|'----. > | | | > swp0 swp1 wlan0 > | | > A B > > The switch's FDB: > A: swp0 > B: cpu0 (due to this patchset) > > Now B roams to an AP somewhere behind swp1 and continues to communicate > with A. > > br0 > .----'|'----. > | | | > swp0 swp1 wlan0 > | | > A B > > The switch's FDB: > A: swp0 > B: swp1 > > But br0 sees none of this, so at whatever interval we choose we will > refresh the FDB, moving the station back to the cpu: > > A: swp0 > B: cpu0 No, br0 should see some traffic from station B. Not the unicast traffic towards station A, of course (because that has already been learnt to go towards swp0), but some broadcast ARP, or some multicast ND. This is the big assumption behind any solution: that the stations are not silent and make their presence known somehow. > So now you have traded the issue of having to wait for the hardware to > age out its entry, to the issue of having to wait for br0 to age out its > entry. Right? That's the thing. The software bridge will never expire its entry in br_fdb_update if traffic is continuously coming in. fdb->updated will just keep getting larger and larger after each incoming packet. But the hardware bridge is not aware of this traffic. So: - if the hardware bridge has a dynamic entry installed (one that's subject to ageing), that entry will eventually expire within 5 minutes when its software equivalent won't. Then no switchdev event will ever come back to update the hardware bridge, since from the software's perspective it was never supposed to expire. It's as if we _do_ want the entry to be static. But: - if the hardware bridge has a static entry installed, then that entry might become wrong and cause connectivity loss until the software bridge figures it out. It's what Andrew described as a 'hybrid' entry. We would want a 'static' entry (one that doesn't age out based on a timer) that is 'weak' (can be overridden when traffic comes in on a different port). I'm not sure either that such thing exists. So for now, static entries are the best we've got. Let's re-run the simulation knowing that we're working with static addresses towards the CPU, to see how bad things are. AP 1: +------------------------------------------------------------------------+ | br0 | +------------------------------------------------------------------------+ +------------+ +------------+ +------------+ +------------+ +------------+ | swp0 | | swp1 | | swp2 | | swp3 | | wlan0 | +------------+ +------------+ +------------+ +------------+ +------------+ | ^ ^ | | | | | | | Client A Client B | | | +------------+ +------------+ +------------+ +------------+ +------------+ | swp0 | | swp1 | | swp2 | | swp3 | | wlan0 | +------------+ +------------+ +------------+ +------------+ +------------+ +------------------------------------------------------------------------+ | br0 | +------------------------------------------------------------------------+ AP 2 - br0 of AP 1 will lean that Clients A and B are reachable via wlan0. The DSA switch will snoop these and add static entries towards the CPU port. - the hardware fdb of the DSA switch, as well as br0 on AP 2, will learn that Clients A and B are reachable through swp0, because of our assumption of non-silent stations. There are no static entries involved on AP 2 for now. Client B disconnects from AP 1 and roams to AP 2. AP 1: +------------------------------------------------------------------------+ | br0 | +------------------------------------------------------------------------+ +------------+ +------------+ +------------+ +------------+ +------------+ | swp0 | | swp1 | | swp2 | | swp3 | | wlan0 | +------------+ +------------+ +------------+ +------------+ +------------+ | ^ | | | Client A | | | Client B | | | v +------------+ +------------+ +------------+ +------------+ +------------+ | swp0 | | swp1 | | swp2 | | swp3 | | wlan0 | +------------+ +------------+ +------------+ +------------+ +------------+ +------------------------------------------------------------------------+ | br0 | +------------------------------------------------------------------------+ AP 2 - br0 of AP 1 still knows that Client A is reachable via wlan0 (no change) - In the general case, br0 of AP 1 will _NOT_ know that Client B has left wlan0. So there is still a static entry for Client B towards the CPU port. - Right now, any attempt from Client A to directly address Client B via unicast would result, if the FDB were to be consulted, in packet drops, because the switch on AP 1 would say 'wait a minute, I'm receiving a packet for Client B from the CPU port, but Client B is reachable via the CPU port!'. Luckily for us, the switches that we're working with are not looking up the FDB for CPU injected traffic, remember? So I don't think this is a problem. So unicast packets would be delivered to anywhere that the software bridge wanted to. Right now, even the software bridge has a wrong impression of where Client B is. - remember the assumption that Client B is not silent at startup. So some broadcast packets with Client B's source MAC address will reach the Ethernet segment. The hardware switch on AP 1 will have no problem accepting these packets, since they are broadcast/multicast. They will reach the software bridge. At this point, the software bridge finally learns the new destination for Client B, and it emits a new SWITCHDEV_FDB_ADD_TO_DEVICE event. Today we ignore that, because added_by_user will be false. That's what we do wrong/incomplete in this RFC patch set. We should keep track of static addresses installed on the CPU port, and if we ever receive a !added_by_user notification on a DSA switch port for one of those addresses, we should update the hardware FDB of the DSA switch on AP 1. So there you have it, it's not that bad. More work needs to be done, but IMO it's still workable. But now maybe it makes more sense to treat the switches that perform hardware SA learning on the CPU port separately, after I've digested this a bit.