On Tue, Apr 13, 2021 at 01:06, Vladimir Oltean <olte...@gmail.com> wrote: > On Mon, Apr 12, 2021 at 11:49:22PM +0200, Tobias Waldekranz wrote: >> On Tue, Apr 13, 2021 at 00:34, Vladimir Oltean <olte...@gmail.com> wrote: >> > On Mon, Apr 12, 2021 at 11:22:45PM +0200, Tobias Waldekranz wrote: >> >> On Mon, Apr 12, 2021 at 21:30, Marek Behun <marek.be...@nic.cz> wrote: >> >> > On Mon, 12 Apr 2021 14:46:11 +0200 >> >> > Tobias Waldekranz <tob...@waldekranz.com> wrote: >> >> > >> >> >> I agree. Unless you only have a few really wideband flows, a LAG will >> >> >> typically do a great job with balancing. This will happen without the >> >> >> user having to do any configuration at all. It would also perform well >> >> >> in "router-on-a-stick"-setups where the incoming and outgoing port is >> >> >> the same. >> >> > >> >> > TLDR: The problem with LAGs how they are currently implemented is that >> >> > for Turris Omnia, basically in 1/16 of configurations the traffic would >> >> > go via one CPU port anyway. >> >> > >> >> > >> >> > >> >> > One potencial problem that I see with using LAGs for aggregating CPU >> >> > ports on mv88e6xxx is how these switches determine the port for a >> >> > packet: only the src and dst MAC address is used for the hash that >> >> > chooses the port. >> >> > >> >> > The most common scenario for Turris Omnia, for example, where we have 2 >> >> > CPU ports and 5 user ports, is that into these 5 user ports the user >> >> > plugs 5 simple devices (no switches, so only one peer MAC address for >> >> > port). So we have only 5 pairs of src + dst MAC addresses. If we simply >> >> > fill the LAG table as it is done now, then there is 2 * 0.5^5 = 1/16 >> >> > chance that all packets would go through one CPU port. >> >> > >> >> > In order to have real load balancing in this scenario, we would either >> >> > have to recompute the LAG mask table depending on the MAC addresses, or >> >> > rewrite the LAG mask table somewhat randomly periodically. (This could >> >> > be in theory offloaded onto the Z80 internal CPU for some of the >> >> > switches of the mv88e6xxx family, but not for Omnia.) >> >> >> >> I thought that the option to associate each port netdev with a DSA >> >> master would only be used on transmit. Are you saying that there is a >> >> way to configure an mv88e6xxx chip to steer packets to different CPU >> >> ports depending on the incoming port? >> >> >> >> The reason that the traffic is directed towards the CPU is that some >> >> kind of entry in the ATU says so, and the destination of that entry will >> >> either be a port vector or a LAG. Of those two, only the LAG will offer >> >> any kind of balancing. What am I missing? >> >> >> >> Transmit is easy; you are already in the CPU, so you can use an >> >> arbitrarily fancy hashing algo/ebpf classifier/whatever to load balance >> >> in that case. >> > >> > Say a user port receives a broadcast frame. Based on your understanding >> > where user-to-CPU port assignments are used only for TX, which CPU port >> > should be selected by the switch for this broadcast packet, and by which >> > mechanism? >> >> AFAIK, the only option available to you (again, if there is no LAG set >> up) is to statically choose one CPU port per entry. But hopefully Marek >> can teach me some new tricks! >> >> So for any known (since the broadcast address is loaded in the ATU it is >> known) destination (b/m/u-cast), you can only "load balance" based on >> the DA. You would also have to make sure that unknown unicast and >> unknown multicast is only allowed to egress one of the CPU ports. >> >> If you have a LAG OTOH, you could include all CPU ports in the port >> vectors of those same entries. The LAG mask would then do the final >> filtering so that you only send a single copy to the CPU. > > I forgot that mv88e6xxx keeps the broadcast address in the ATU. I wanted > to know what is done in the flooding case, therefore I should have asked > about unknown destination traffic. It is sent to one CPU but not the > other based on what information? > > And for destinations loaded into the ATU, how is user port isolation > performed? Say lan0 and lan1 have the same MAC address of 00:01:02:03:04:05, > but lan0 goes to the eth0 DSA master and lan1 goes to eth1. How many ATU > entries would there be for host addresses, and towards which port would > they point? Are they isolated by a port private VLAN or something along > those lines?
This is what I do not understand. This is what I hope that Marek can tell me. To my knowledge, there is no way to per-port load balancing from the switch to the CPU. In any given FID, there can be only one entry per address, and that entry can only point to either a vector or a LAG. So my theory is that the only way of getting any load balancing, however flawed, on receive (from switch to CPU) is by setting up a LAG. Hopefully there is some trick that I do not know about which means we have another option available to us.