Hi Josh,

On 1/11/24 06:47, Josh Coombs wrote:
> I've got a brand new r151050e install using a Mellanox CNX5 card, dual 25Gb ports paired up in an aggr to a Juniper EX4650 cluster. It will only work if I start snoop on the aggr. Without doing so, it won't pass traffic. I ran into this back in 2019 with bnx devices after upgrading to r151030 and was never able to find a fix; on that box I ended up changing NICs to Intel to get around the problem.
>
> It also works if I do a snoop -P -d aggr0, so it may not be promiscuous mode directly that's 'fixing' things?


I've seen this bug as well. When I dtrace the calls into mlxcx, what I see is that aggr never gives the driver any VLAN tag filters for the default group (though it does give MAC filters), so no traffic other than on the default tag ends up being received.

If you perturb the MAC state of the aggr enough, it will switch to explicit VLAN tag filters and work fine. For example, if you add a VNIC as well as the VLAN interface, the existence of the VNIC fixes it, since that causes MAC to add an explicit VLAN tag filter for the VLAN datalink.

I think this is a semantic bug -- I suspect MAC is assuming that if it adds just MAC filters and no VLAN filters to a NIC, all traffic for that MAC should be matched, tagged as well as untagged. Unfortunately the documentation (mac_capab_rings.9e and mac.9e) is not very clear on this point, and some drivers (definitely mlxcx, which I can speak for since I wrote most of it) have interpreted it differently.
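
To make the two readings concrete, here is a toy model in Python. It is only a sketch of the semantics as I understand them, not driver or MAC code: the RxGroup class and both function names are made up for illustration, and I'm assuming for simplicity that untagged frames are always accepted once the MAC address matches.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class RxGroup:
    """Toy stand-in for one receive group's filters (hypothetical, not real driver state)."""
    mac_filters: Set[str] = field(default_factory=set)
    vlan_filters: Set[int] = field(default_factory=set)


def accepts_permissive(group: RxGroup, dst_mac: str, vlan_id: Optional[int]) -> bool:
    """The reading MAC seems to assume: no VLAN filters means any tag is fine."""
    if dst_mac not in group.mac_filters:
        return False
    if vlan_id is None:
        return True  # assumption: untagged frames always pass on a MAC match
    if not group.vlan_filters:
        return True  # empty VLAN filter set is treated as "match every tag"
    return vlan_id in group.vlan_filters


def accepts_strict(group: RxGroup, dst_mac: str, vlan_id: Optional[int]) -> bool:
    """The reading some drivers took: tagged frames need an explicit VLAN filter."""
    if dst_mac not in group.mac_filters:
        return False
    if vlan_id is None:
        return True  # assumption: untagged frames always pass on a MAC match
    return vlan_id in group.vlan_filters  # empty set rejects every tagged frame


if __name__ == "__main__":
    mac = "02:00:00:00:00:01"  # made-up address
    group = RxGroup(mac_filters={mac})
    print("permissive, tag 100:", accepts_permissive(group, mac, 100))
    print("strict, tag 100:", accepts_strict(group, mac, 100))
    group.vlan_filters.add(100)  # roughly what adding a VNIC ends up causing
    print("strict, tag 100, explicit filter:", accepts_strict(group, mac, 100))
```

With only a MAC filter installed, the two readings disagree on every tagged frame, which matches the symptom: tagged traffic is dropped until something installs an explicit VLAN filter.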

For now I've just always been using VNICs, since those always generate explicit filters and work fine. But we should get this fixed, and the documentation should probably be adjusted to spell it out more clearly so that no new drivers make the same mistake going forward.
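
As a rough sketch of that workaround, the snippet below builds (but does not run) a dladm create-vnic command line for a VNIC on a given VLAN over the aggr. The helper name, the link name aggr0, VLAN ID 100 and the VNIC name vnic100 are all placeholders I picked for illustration; substitute your own.

```python
import shlex
from typing import List


def vnic_over_aggr_cmd(aggr: str, vlan_id: int, vnic: str) -> List[str]:
    """Build the argv for creating a VNIC with an explicit VLAN ID over a link.

    Hypothetical helper: it only assembles the command, it does not execute it.
    """
    if not 1 <= vlan_id <= 4094:
        raise ValueError(f"VLAN ID out of range: {vlan_id}")
    # dladm create-vnic: -l names the underlying link, -v sets the VLAN ID
    return ["dladm", "create-vnic", "-l", aggr, "-v", str(vlan_id), vnic]


if __name__ == "__main__":
    # Placeholder names; print the command so it can be reviewed before running
    print(shlex.join(vnic_over_aggr_cmd("aggr0", 100, "vnic100")))
```

Printing the command rather than executing it keeps this safe to try on any machine; run the resulting line by hand on the affected host once the names are right.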

------------------------------------------
illumos: illumos-discuss
Permalink: https://illumos.topicbox.com/groups/discuss/T608dab80e5db30f6-M7a4bf3d343ab2df1faf1eed8