Hi Josh,
On 1/11/24 06:47, Josh Coombs wrote:
> I've got a brand new r151050e install using a Mellanox CNX5 card, dual
> 25Gb ports paired up in an aggr to a Juniper EX4650 cluster. It will
> only work if I start snoop on the aggr. Without doing so, it won't pass
> traffic. I ran into this back in 2019 with bnx devices after upgrading
> to r151030 and was never able to find a fix; on that box I ended up
> changing NICs to Intel to get around the problem.
>
> It also works if I do a snoop -P -d aggr0 so it may not be
> promiscuous mode directly that's 'fixing' things?

I've seen this bug as well. When I dtrace the calls into mlxcx, what I
see is that aggr never gives the driver any VLAN tag filters for the
default group (but it does give MAC filters), so no traffic other than
that on the default tag ends up being received.
If you perturb the MAC state of the aggr enough, it will switch to
explicit VLAN tag filters and work fine (e.g. if you add a VNIC as well
as the VLAN interface, the existence of the VNIC will fix it, since that
causes MAC to add an explicit VLAN tag filter for the VLAN DL).
I think this is a semantic bug -- I suspect MAC is assuming that if it
adds just MAC filters and no VLAN filters to a NIC, that means all
tagged traffic for that MAC should be matched, not just untagged.
Unfortunately the documentation (mac_capab_rings.9e and mac.9e) is not
very clear on this point, and some drivers (definitely mlxcx, which I
can speak for since I wrote most of it) have interpreted it differently.
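
To make the two readings concrete, here is a rough toy model in Python.
It is not illumos or mlxcx code: the RxGroup class, the function names
and the example MAC address are all made up for illustration, and it
assumes untagged frames always pass the VLAN check.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class RxGroup:
    """Hypothetical receive group: just a set of MAC and VLAN filters."""
    mac_filters: Set[str] = field(default_factory=set)
    vlan_filters: Set[int] = field(default_factory=set)


def accepts_permissive(group: RxGroup, dst_mac: str,
                       vlan: Optional[int]) -> bool:
    """Reading MAC seems to assume: no VLAN filters means any tag matches."""
    if dst_mac not in group.mac_filters:
        return False
    if vlan is None or not group.vlan_filters:
        return True
    return vlan in group.vlan_filters


def accepts_strict(group: RxGroup, dst_mac: str,
                   vlan: Optional[int]) -> bool:
    """Reading the driver took: tagged frames need an explicit VLAN filter."""
    if dst_mac not in group.mac_filters:
        return False
    if vlan is None:
        # Assumption for this sketch: untagged frames always pass.
        return True
    return vlan in group.vlan_filters


if __name__ == "__main__":
    mac = "02:00:00:00:00:01"          # made-up address
    group = RxGroup(mac_filters={mac})

    # MAC filter only, no VLAN filters: the two readings disagree.
    print(accepts_permissive(group, mac, 100))  # True
    print(accepts_strict(group, mac, 100))      # False

    # Once an explicit VLAN filter exists, both readings agree.
    group.vlan_filters.add(100)
    print(accepts_strict(group, mac, 100))      # True
```

With only a MAC filter installed the strict reading drops the tagged
frame, which matches the symptom; adding the explicit VLAN filter (what
creating a VNIC ends up doing) makes both readings accept it.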
For now I've just been always using VNICs, since those always generate
explicit filters and work fine. But we should get this fixed up, and
probably the documentation adjusted to spell it out more clearly so no
other new drivers make the same mistake going forwards.
------------------------------------------
illumos: illumos-discuss
Permalink:
https://illumos.topicbox.com/groups/discuss/T608dab80e5db30f6-M7a4bf3d343ab2df1faf1eed8