On Mon, Nov 09, 2020 at 12:59:39AM +0100, Andrew Lunn wrote: > On Sun, Nov 08, 2020 at 07:23:55PM +0200, Vladimir Oltean wrote: > > On Sun, Nov 08, 2020 at 10:09:25PM +0800, DENG Qingfang wrote: > > > Can it be turned off for switches that support SA learning from CPU? > > > > Is there a good reason I would add another property per switch and not > > just do it unconditionally? > > Just throwing out ideas, i've no idea if they are relevant. I wonder > if we can get into issues with fast ageing with a topology change? We > don't have too much control over the hardware. I think some devices > just flush everything, or maybe just one port. So we have different > life times for CPU port database entries and user port database > entries?
A quick scan for "port_fast_age" did not find any implementers who do not act upon the "port" argument. > We might also run into bugs with flushing removing static database > entries which should not be. But that would be a bug. I can imagine that happening, when there are multiple bridges spanning a DSA switch, each bridge also contains a "foreign" interface, and the 2 bridging domains service 2 stations that have the same MAC address. In that case, since the fdb_add and fdb_del are not reference-counted on the shared DSA CPU port, we would indeed trigger this bug. I was on the fence on whether to include the reference counting patch I have for host MDBs, and to make these addresses refcounted as well. What do you think? > Also, dumping the database might run into bugs since we have not had > entries for the CPU port before. I don't see what conditions can make this happen. > We also need to make sure the static entries get removed correctly > when a host moves. The mv88e6xxx will not replace a static entry with > a dynamically learned one. It will probably rise an ATU violation > interrupt that frames have come in the wrong port. This is a good one. Currently every implementer of .port_fdb_add assumes a static entry is what we want, but that is not the case here. We want an entry that can expire or the switch can move it to a different port when there is evidence that it's wrong. Should we add more arguments to the API? > What about switches which do not implement port_fdb_add? Do these > patches at least do something sensible? dsa_slave_switchdev_event -> dsa_slave_switchdev_event_work -> dsa_port_fdb_add -> dsa_port_notify(DSA_NOTIFIER_FDB_ADD) -> dsa_switch_fdb_add -> if (!ds->ops->port_fdb_add) return -EOPNOTSUPP; -> an error is printed with dev_dbg, and dsa_fdb_offload_notify(switchdev_work) is not called. On dsa_port_fdb_del error, there is also an attempt to call dev_close() on error, but only on user ports, which the CPU port is not. So, we do something almost sensible, but mostly by mistake it seems. I think the simplest would be to simply avoid all this nonsense right away in dsa_slave_switchdev_event: diff --git a/net/dsa/slave.c b/net/dsa/slave.c --- a/net/dsa/slave.c +++ b/net/dsa/slave.c @@ -2188,6 +2188,9 @@ static int dsa_slave_switchdev_event(struct notifier_block *unused, dp = p->dp->cpu_dp; } + if (!dp->ds->ops->port_fdb_add || !dp->ds->ops->port_fdb_del) + return NOTIFY_DONE; + switchdev_work = kzalloc(sizeof(*switchdev_work), GFP_ATOMIC); if (!switchdev_work) return NOTIFY_BAD;