On Tue, Apr 13, 2021 at 17:14, Marek Behun <marek.be...@nic.cz> wrote:
> On Tue, 13 Apr 2021 16:46:32 +0200
> Tobias Waldekranz <tob...@waldekranz.com> wrote:
>
>> On Tue, Apr 13, 2021 at 02:27, Marek Behun <marek.be...@nic.cz> wrote:
>> > On Tue, 13 Apr 2021 01:54:50 +0200
>> > Marek Behun <marek.be...@nic.cz> wrote:
>> >
>> >> I will look into this, maybe ask some follow-up questions.
>> >
>> > Tobias,
>> >
>> > it seems that currently the LAGs in mv88e6xxx driver do not use the
>> > HashTrunk feature (which can be enabled via bit 11 of the
>> > MV88E6XXX_G2_TRUNK_MAPPING register).
>>
>> This should be set at the bottom of mv88e6xxx_lag_sync_masks.
>>
>> > If we used this feature and if we knew what hash function it uses, we
>> > could write a userspace tool that could recompute new MAC
>> > addresses for the CPU ports in order to avoid the problem I explained
>> > previously...
>> >
>> > Or the tool can simply inject frames into the switch and try different
>> > MAC addresses for the CPU ports until desired load-balancing is reached.
>> >
>> > What do you think?
>>
>> As you concluded in your followup, not being able to have a fixed MAC
>> for the CPU seems weird.
>>
>> Maybe you could do the inverse? Allow userspace to set the masks for an
>> individual bond/team port in a hash-based LAG, then you can offload that
>> to DSA.
>
> What masks?
The table defined in Global2/Register7.

When a frame is mapped to a LAG (e.g. by an ATU lookup), all member
ports will be added to the frame's destination vector. The mask table
is the block that then filters the vector to only include a single
member. By modifying that table, you can choose which buckets are
assigned to which member ports. This includes assigning 7 buckets to
one member and 1 to the other, for example.

At the moment, mv88e6xxx will statically determine this mapping (in
mv88e6xxx_lag_set_port_mask), by trying to spread the buckets as evenly
as possible. It will also rebalance the assignments whenever a link
goes down, or is "detached" in LACP terms.

You could imagine a different mode in which the DSA driver would
receive the bucket allocation from the bond/team driver (which in turn
could come all the way from userspace). Userspace could then implement
whatever strategy it wants to maximize utilization, though still bound
by the limitations of the hardware in terms of fields considered during
hashing of course.
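
To make the bucket idea concrete, here is a rough userspace-style model in
Python. It is only a sketch, not the driver's code: the function names, the
4-port switch and the round-robin "even" split are all made up for
illustration, and it assumes 8 buckets (as in the 7 + 1 example above) and
that each mask entry is a port bitmask ANDed with the destination vector.
The layout that mv88e6xxx_lag_set_port_mask actually produces may differ.

```python
NUM_BUCKETS = 8  # assumption, taken from the "7 buckets + 1 bucket" example


def even_allocation(members):
    """Assign buckets to member ports round-robin (one possible even split)."""
    return [members[b % len(members)] for b in range(NUM_BUCKETS)]


def weighted_allocation(weights):
    """Expand {port: bucket_count} into a per-bucket list of owning ports."""
    if sum(weights.values()) != NUM_BUCKETS:
        raise ValueError("bucket counts must add up to NUM_BUCKETS")
    allocation = []
    for port, count in weights.items():
        allocation.extend([port] * count)
    return allocation


def build_mask_table(num_ports, members, allocation):
    """Return one port bitmask per bucket.

    Ports outside the LAG stay set in every entry; of the LAG members,
    only the port that owns the bucket stays set.
    """
    all_ports = (1 << num_ports) - 1
    lag_ports = sum(1 << p for p in members)
    return [(all_ports & ~lag_ports) | (1 << allocation[b])
            for b in range(NUM_BUCKETS)]


def filter_vector(dest_vector, mask_table, bucket):
    """Apply the mask entry for a frame's hash bucket to its destination vector."""
    return dest_vector & mask_table[bucket]


# Usage: LAG of ports 1 and 2 on a hypothetical 4-port switch, 7:1 split.
members = [1, 2]
table = build_mask_table(4, members, weighted_allocation({1: 7, 2: 1}))
dest = (1 << 1) | (1 << 2)  # frame mapped to the LAG: both members set
egress = [filter_vector(dest, table, b) for b in range(NUM_BUCKETS)]
```

With the 7:1 allocation, seven of the eight buckets leave only port 1 in the
vector and the last bucket leaves only port 2. In the mode suggested above,
the allocation list is the part that the bond/team driver (or userspace)
would supply instead of the driver computing it.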