Re: Radv proto sending adverts on wrong interface
Hi Kees,

On Mon, Mar 13, 2023 at 07:23:17AM +0100, Kees Meijs | Nefos wrote:
> About VLAN configuration: I guess one should never ever use VLAN_DEFAULT
> i.e. VLAN 1 at all. Vendors often think differently about this case,
> sometimes allowing to have a .1Q tag, sometimes not. Or sort of "both" as
> well. In combination with protocols such as LACP or (variations on) STP
> resulting in great fun, which of course is very sarcastic.

I changed the default using `default-vlan-id 4095` for that very
reason. Perhaps this is still a default vid related bug. I'll see if I can
reproduce this with some other vid.

> My two cents after opening up a box with a new switch, starting your
> configuration: explicitly disable VLAN1 and continue using others. Same goes
> for VLAN > 4090 by the way. These numbers are sometimes "reserved".

The Brocade docs list the reserved vlans (4091, 4092) explicitly and allow
changing them, just like the default vid. The 4095 default vid I'm using is
the example they use in the docs, so I'd be surprised if that's broken.

Using such low vlan ids is a holdover from using some ath9k wifi APs that
only supported 4-bit vids, so we had to make every number count ;) I'll
definitely reconsider that in the future haha.

Thanks,
--Daniel
Re: Radv proto sending adverts on wrong interface
Hi,

On 13-03-2023 05:52, d...@darkboxed.org wrote:
> It looks like I made a mistake when testing my patch. It does in fact not
> fix the problem.
>
> I then did some more reading of the linux scriptures and it turns out
> PACKET_OUTGOING ("Out" in tcpdump) should actually be reliable, so the
> "M" means that the packet is actually coming in from the outside.
>
> Lo and behold, I had an unintentional, but at a glance harmless, vlan
> configuration on the switch both enp1s0 and enp2s0 are connected to.
> Essentially enp2 is untagged vlan 1 and enp1 is untagged vlan 4 and tagged
> vlan 1 on the switch side.
>
> When sending the (untagged) RA on enp2 I would expect to receive it
> with a vlan 1 tag on enp1, which would have made it obvious what is going
> on, but no, it was coming in untagged. Smells like a switch bug[1] to me, or
> maybe I don't understand 802.1Q VLANs as well as I thought...
>
> Sorry for the noise.

About VLAN configuration: I guess one should never ever use VLAN_DEFAULT
i.e. VLAN 1 at all. Vendors often think differently about this case,
sometimes allowing to have a .1Q tag, sometimes not. Or sort of "both" as
well. In combination with protocols such as LACP or (variations on) STP
resulting in great fun, which of course is very sarcastic.

My two cents after opening up a box with a new switch, starting your
configuration: explicitly disable VLAN1 and continue using others. Same goes
for VLAN > 4090 by the way. These numbers are sometimes "reserved".

Cheers,
Kees
Re: Radv proto sending adverts on wrong interface
Hi Ondrej,

It looks like I made a mistake when testing my patch. It does in fact not
fix the problem.

I then did some more reading of the linux scriptures and it turns out
PACKET_OUTGOING ("Out" in tcpdump) should actually be reliable, so the
"M" means that the packet is actually coming in from the outside.

Lo and behold, I had an unintentional, but at a glance harmless, vlan
configuration on the switch both enp1s0 and enp2s0 are connected to.
Essentially enp2 is untagged vlan 1 and enp1 is untagged vlan 4 and tagged
vlan 1 on the switch side.

When sending the (untagged) RA on enp2 I would expect to receive it
with a vlan 1 tag on enp1, which would have made it obvious what is going
on, but no, it was coming in untagged. Smells like a switch bug[1] to me, or
maybe I don't understand 802.1Q VLANs as well as I thought...

Sorry for the noise.

Thanks,
--Daniel

[1]: This is with a Brocade ICX 6450 running R08030u. Relevant config
snippets:

    vlan 1 by port
     tagged ethe 1/1/1 1/1/3
     router-interface ve 1
    vlan 4 by port
     tagged ethe 1/1/1 1/1/3
    interface ethernet 1/1/3
     dual-mode 1

I can see untagged multicast going into 1/1/3 (enp2s0) coming out 1/1/1 as
untagged despite 1/1/3 being in dual-mode. Interestingly this also happens
for unicasts, but only in one direction. If I add the enp1s0 lladdr to the
neighbour table I can see pings through enp2s0 come in untagged on enp1s0,
but the return seems to be filtered, which is why ND doesn't work (remember:
ND responses are sent as unicast).

Here's to hoping affordable open Linux NOS switches come onto the second
hand market eventually...
Re: Radv proto sending adverts on wrong interface
On Sun, Mar 12, 2023 at 02:36:50PM +0100, d...@darkboxed.org wrote:
> I noticed something in tcpdump just now, when using -iany the incorrect RA
> advert shows up as an "M" (multicast) as opposed to "Out" on the correct
> interface. This only happens when sending an RA on enp2s0 not any of the
> stacked vlan interfaces:
>
>     enp2s0    Out IP6 fe80::debb > ff02::1: ICMP6, router advertisement, length 112
>     enp1s0    M   IP6 fe80::debb > ff02::1: ICMP6, router advertisement, length 112
>     enp2s0.20 Out IP6 fe80::20d:b9ff:fe4e:9055 > ff02::1: ICMP6, router advertisement, length 80
>     enp2s0.40 Out IP6 fe80::debb > ff02::1: ICMP6, router advertisement, length 80
>     enp2s0.10 Out IP6 fe80::20d:b9ff:fe4e:9055 > ff02::1: ICMP6, router advertisement, length 48
>
> Not sure what to make of this.
>
> (Note: fe80::debb isn't the link-local address of enp1s0, which seems
> weird)

Which interface has the fe80::debb address?

--
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: santi...@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."
Re: Radv proto sending adverts on wrong interface
> The field sin6_scope_id should be used only for link-local addresses (to
> define their scope), not as a way to route multicasts.
>
> (Hmm, ff02::/16 is defined as link-local multicast address, so perhaps
> setting sin6_scope_id makes sense.)

FWIW, babeld uses the sin6_scope_id when sending its multicast packets (it
does not do the setsockopt), and we've never received a report about packets
going out the wrong interface.

-- Juliusz
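As a rough illustration of that approach, here is a minimal sketch of building a destination address that carries the outgoing interface in `sin6_scope_id`. It assumes plain POSIX sockets and is not babeld's or BIRD's actual code; `fill_scoped_dest` and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Fill `sa` with the IPv6 destination `text` and `port`, and record the
 * outgoing interface index in sin6_scope_id. The caller is expected to
 * pass a link-scoped destination such as ff02::1 or an fe80:: address.
 * Returns 0 on success, -1 if `text` is not a valid IPv6 address.
 * (fill_scoped_dest is a hypothetical helper for illustration only.)
 */
int fill_scoped_dest(struct sockaddr_in6 *sa, const char *text,
                     uint16_t port, unsigned int ifindex)
{
    assert(sa != NULL && text != NULL);

    memset(sa, 0, sizeof(*sa));
    sa->sin6_family = AF_INET6;
    sa->sin6_port = htons(port);

    if (inet_pton(AF_INET6, text, &sa->sin6_addr) != 1)
        return -1;

    /* The interface index tells the kernel which link the address is on. */
    sa->sin6_scope_id = ifindex;
    return 0;
}
```

The filled structure would then be handed to `sendto()` or `sendmsg()`, with the index obtained from something like `if_nametoindex("enp2s0")` (declared in `<net/if.h>`).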
Re: Radv proto sending adverts on wrong interface
Hi Ondrej,

On Sun, Mar 12, 2023 at 01:29:23PM +0100, Ondrej Zajicek wrote:
> I do not really get this. For multicast, the outgoing interface is defined
> by setsockopt(IPV6_MULTICAST_IF) in sk_setup_multicast6().

Hmm, I hadn't seen that. That is odd indeed. Looking at this again I also
noticed we set the interface using PKTINFO (ipi6_ifindex) too, which seems
to make setting MULTICAST_IF redundant.

> The field sin6_scope_id should be used only for link-local addresses (to
> define their scope), not as a way to route multicasts.

Not sure what is going on either, but I have absolutely positively observed
the packets on the wrong interface and my patch fixed this.

I have three physical en interfaces (enp{1,2,3}s0) and a couple of vlan ones
on top of enp2s0. The RAs should be going out enp2s0* only but enp1s0 is
still getting one. enp3s0 is also up but isn't receiving this erroneous RA.

I noticed something in tcpdump just now, when using -iany the incorrect RA
advert shows up as an "M" (multicast) as opposed to "Out" on the correct
interface. This only happens when sending an RA on enp2s0 not any of the
stacked vlan interfaces:

    enp2s0    Out IP6 fe80::debb > ff02::1: ICMP6, router advertisement, length 112
    enp1s0    M   IP6 fe80::debb > ff02::1: ICMP6, router advertisement, length 112
    enp2s0.20 Out IP6 fe80::20d:b9ff:fe4e:9055 > ff02::1: ICMP6, router advertisement, length 80
    enp2s0.40 Out IP6 fe80::debb > ff02::1: ICMP6, router advertisement, length 80
    enp2s0.10 Out IP6 fe80::20d:b9ff:fe4e:9055 > ff02::1: ICMP6, router advertisement, length 48

Not sure what to make of this.

(Note: fe80::debb isn't the link-local address of enp1s0, which seems
weird)

> (Hmm, ff02::/16 is defined as link-local multicast address, so perhaps
> setting sin6_scope_id makes sense.)
>
> If sending (IPv6) multicasts does not work properly, that should also be
> noticed in OSPFv3/RIPng, but I am not aware of such an issue.

Well, there are quite a few options here: perhaps those protos just have a
subtly different socket setup, or perhaps there's something particular about
my setup that makes it go wrong on the kernel side?

Thanks,
--Daniel
Re: Radv proto sending adverts on wrong interface
On Sat, Mar 11, 2023 at 06:28:58AM +0100, Daniel Gröber wrote:
> Hi,
>
> I'm using bird as a replacement for radvd since the latter has a
> longstanding issue with sending adverts on unconfigured interfaces under
> complex conditions.
>
> Turns out bird has a similar issue :)
>
> Looking at the code, when opening the socket for an interface in
> radv_sk_open we set sk->iface as you'd expect, which should cause packets
> to be sent directly via this interface.
>
> However radv sends packets to the all-nodes multicast address for periodic
> adverts, see radv_send_ra. This then calls sk_send_to which (eventually)
> calls sockaddr_fill6. There we find this code:
>
>     if (ifa && ipa_is_link_local(a))
>       sa->sin6_scope_id = ifa->index;
>
> This would seem to be the problem to me, since a=ff02::1 doesn't pass this
> check so the sendmsg call goes out without the interface-index being
> communicated to the kernel.

Hi

I do not really get this. For multicast, the outgoing interface is defined
by setsockopt(IPV6_MULTICAST_IF) in sk_setup_multicast6().

The field sin6_scope_id should be used only for link-local addresses (to
define their scope), not as a way to route multicasts.

(Hmm, ff02::/16 is defined as link-local multicast address, so perhaps
setting sin6_scope_id makes sense.)

If sending (IPv6) multicasts does not work properly, that should also be
noticed in OSPFv3/RIPng, but I am not aware of such an issue.

--
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: santi...@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."
Radv proto sending adverts on wrong interface
Hi,

I'm using bird as a replacement for radvd since the latter has a
longstanding issue with sending adverts on unconfigured interfaces under
complex conditions.

Turns out bird has a similar issue :)

Looking at the code, when opening the socket for an interface in
radv_sk_open we set sk->iface as you'd expect, which should cause packets
to be sent directly via this interface.

However radv sends packets to the all-nodes multicast address for periodic
adverts, see radv_send_ra. This then calls sk_send_to which (eventually)
calls sockaddr_fill6. There we find this code:

    if (ifa && ipa_is_link_local(a))
      sa->sin6_scope_id = ifa->index;

This would seem to be the problem to me, since a=ff02::1 doesn't pass this
check so the sendmsg call goes out without the interface-index being
communicated to the kernel.

It looks like that causes it to just pick a random interface, though I would
expect it to just multicast this. Not sure why that is, but it's broken on
our side either way.

I'd add a check here for whether the saddr is link-local too. That should
cover this case.

Any comments/objections?

Thanks,
--Daniel
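To make the observation about ff02::1 concrete, here is a small standalone sketch. It does not use BIRD's `ip_addr` type or `ipa_is_link_local()`; it assumes the standard `struct in6_addr` and the `IN6_IS_ADDR_LINKLOCAL` / `IN6_IS_ADDR_MC_LINKLOCAL` macros from `<netinet/in.h>`, and all three function names are hypothetical. The relaxed variant is only one possible way to widen the check and is not the patch proposed in this message, which looks at the source address instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Parse a textual IPv6 address; returns false if it is not valid. */
bool parse_ip6(const char *text, struct in6_addr *out)
{
    assert(text != NULL && out != NULL);
    return inet_pton(AF_INET6, text, out) == 1;
}

/*
 * Stand-in for the condition quoted above, assuming it matches only
 * link-local unicast addresses (fe80::/10).
 */
bool needs_scope_strict(const struct in6_addr *a)
{
    return IN6_IS_ADDR_LINKLOCAL(a);
}

/*
 * Hypothetical relaxed variant: also treat link-local scope multicast
 * (ff02::/16, which includes all-nodes ff02::1) as needing a scope id.
 */
bool needs_scope_relaxed(const struct in6_addr *a)
{
    return IN6_IS_ADDR_LINKLOCAL(a) || IN6_IS_ADDR_MC_LINKLOCAL(a);
}
```

With these definitions, ff02::1 fails the strict check and passes the relaxed one, while a global address such as 2001:db8::1 fails both.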