On 10/4/2025 10:27 AM, Linus Lüssing wrote:
On Wed, Sep 17, 2025 at 02:30:51PM +0300, Ido Schimmel wrote:
But before making changes, I want to better understand the problem you
are seeing. Is it specific to the offloaded data path? I believe the
problem was fixed in the software data path by this commit:
Two issues I noticed recently, even without any hardware switch
offloading, on plain soft bridges:
1) (Probably not the issue here, but to avoid additional confusion:)
we don't seem to converge properly to the lowest IP address, which is
a bug and a violation of the RFCs.
For example, if we receive an IGMP/MLD query from a foreign host with an
address like fe80::2 and select it, and then enable our own multicast
querier with a lower address like fe80::1 on our bridge interface, then,
if I recall correctly, we won't send our own queries and won't re-elect
ourselves. (Not too critical though, as at least we have a querier on
the link. But I find the election code a bit confusing and I wouldn't
dare to touch it without adding some tests.)
I agree that there might be some corner cases which the current election
code does not handle very well (one of them is outlined below).
2) Without Ido's suggested workaround, when bridge multicast snooping
+ querier is enabled before IPv6 DAD has completed, our first
IGMP/MLD query will fizzle and not be transmitted.
This (#2) is what this patch is trying to address. With DAD enabled, the
first MLD Query is never transmitted. That essentially means that the
Robustness Variable is 1 (which is not very robust).
However, (at least for a non-hardware-offloaded bridge) as far as I
recall, this shouldn't create any multicast packet loss and should
operate as "normal", flooding multicast data packets first, with
multicast snooping activating on multicast data after another IGMP/MLD
querier interval has elapsed (default: 125 sec.)?
Some systems could not afford to flood multicast traffic. Think of some
resource-constrained low power sensors connected to a network with high
volume multicast video traffic for example. The multicast traffic could
easily choke the sensors and is essentially a DDoS attack.
Which is indeed confusing and could be optimized; this delay could
be avoided. Is that the issue you mean, Joseph?
(I'd consider it more an optimization, so for net-next, not
net though.)
I'm not sure this should be categorized as an optimization. If we never
intend to send Startup Queries, that's a different story. But if we
intend to send them and fail to, I think that should be considered a bug.
In the current implementation, :: always wins the election
That would be news to me.
RFC2710, section 5:
To be valid, the Query message MUST come from a link-
local IPv6 Source Address
RFC3810, section 5.1.14, is even more explicit:
5.1.14. Source Addresses for Queries
All MLDv2 Queries MUST be sent with a valid IPv6 link-local source
address. If a node (router or host) receives a Query message with
the IPv6 Source Address set to the unspecified address (::), or any
other address that is not a valid IPv6 link-local address, it MUST
silently discard the message and SHOULD log a warning.
So :: can't be used as a source address for an MLD query.
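As an illustration of the RFC 3810 rule quoted above, here is a minimal
user-space sketch of such a source-address check. It is not the bridge
code; the function name mld_query_source_ok() is hypothetical, and it
only assumes the standard IN6_IS_ADDR_* macros from <netinet/in.h>.

```c
#include <stdbool.h>
#include <netinet/in.h>

/* Hypothetical helper, not the kernel's check: accept an MLD query
 * source only if it is a link-local address (fe80::/10). The
 * unspecified address (::) is rejected explicitly for clarity, even
 * though it would fail the link-local test anyway. */
bool mld_query_source_ok(const struct in6_addr *saddr)
{
	if (IN6_IS_ADDR_UNSPECIFIED(saddr))
		return false;

	return IN6_IS_ADDR_LINKLOCAL(saddr) != 0;
}
```

Under this sketch, a query from :: or from a global address would be
dropped, while one from fe80::1 would be accepted.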
And since 2014 with "bridge: multicast: add sanity check for query source
addresses"
(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6565b9eeef194afbb3beec80d6dd2447f4091f8c)
we should be adhering to that requirement? Let me know if I'm missing
something.
This is what I meant by ":: always wins":
In br_multicast_select_querier(),
	if (ipv6_addr_cmp(&saddr->src.ip6, &querier->addr.src.ip6) <= 0)
		goto update;
If querier->addr.src.ip6 is 0, nothing can be less than that, so "::
always wins".
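To make the comparison concrete, here is a small user-space sketch. It
assumes ipv6_addr_cmp() behaves like a bytewise memcmp() over the 16
address bytes; addr_cmp() and query_wins() are hypothetical names, and
only the quoted condition is modelled, not the rest of the real
function.

```c
#include <stdbool.h>
#include <string.h>
#include <netinet/in.h>

/* Hypothetical stand-in for the kernel's ipv6_addr_cmp(): a bytewise
 * comparison of two 128-bit addresses (assumption, see lead-in). */
int addr_cmp(const struct in6_addr *a, const struct in6_addr *b)
{
	return memcmp(a, b, sizeof(*a));
}

/* Models only the quoted condition: a received query replaces the
 * stored querier when its source address is lower than or equal to
 * the stored one. */
bool query_wins(const struct in6_addr *saddr, const struct in6_addr *stored)
{
	return addr_cmp(saddr, stored) <= 0;
}
```

With the stored address left at all zeros, query_wins() returns false
for fe80::1 and fe80::2 alike, which is the ":: always wins" behaviour
described above.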
However,
1. querier->addr.src.ip6 is (un)initialized(?) to 0 (I couldn't find the
place where ip6_querier.addr is initialized)
2. Querier election cannot take place due to the comparison above, until
the bridge selects itself first via br_multicast_select_own_querier()
3. the bridge only selects itself after the first successful Query is
sent to the host
4. br_ip6_multicast_alloc_query() will fail if the IPv6 address is not valid
So, without this patch, a system would have to wait for
31.25 seconds (for the second Query to the host, after which the bridge
selects itself) +
~125 seconds (for the next Query from the real Querier to arrive)
in order to receive multicast traffic. For some embedded devices that's
a very long time (imagine turning on a TV and having to wait two and a
half minutes before it starts working).
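The arithmetic behind that estimate, as a tiny sketch. The two
constants are the values quoted in this thread; the macro and function
names are hypothetical and do not correspond to kernel identifiers.

```c
/* Values quoted in the thread, in milliseconds (names are hypothetical). */
#define STARTUP_QUERY_DELAY_MS  31250L   /* wait until the second startup query */
#define QUERY_INTERVAL_MS      125000L   /* default querier interval */

/* Worst-case wait before multicast traffic is received: the second
 * startup query must go out first, then the next query from the real
 * querier must arrive. */
long worst_case_wait_ms(long startup_delay_ms, long query_interval_ms)
{
	return startup_delay_ms + query_interval_ms;
}
```

Calling worst_case_wait_ms(STARTUP_QUERY_DELAY_MS, QUERY_INTERVAL_MS)
gives 156250 ms, i.e. roughly two and a half minutes.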
Thanks,
Joseph
For IPv4 and 0.0.0.0 this is a different story though... I'm not
aware of a requirement in RFCs to avoid 0.0.0.0 in IGMP
queries. And "intuitively" one would prefer 0.0.0.0 to be the
least preferred querier address. But when taking the IGMP RFCs
literally then 0.0.0.0 would be the lowest one and always win... And RFC4541
unfortunately does not clarify the use of 0.0.0.0 for IGMP queries.
Not quite sure what the common practice among other layer 2 multicast
snooping implementations across other vendors is.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0888d5f3c0f183ea6177355752ada433d370ac89
And Linus is working [1][2] on reflecting it to device drivers so that
the hardware data path will act like the software data path and flood
unregistered multicast traffic to all the ports as long as no querier
was detected.
Right, for hardware offloading bridges/switches I'm on it, next
revision shouldn't take much longer...
Regards, Linus