> -----Original Message----- > From: Alexander Duyck [mailto:alexander.du...@gmail.com] > Sent: Tuesday, January 30, 2018 2:39 PM > To: Keller, Jacob E <jacob.e.kel...@intel.com> > Cc: Shannon Nelson <shannon.nel...@oracle.com>; netdev@vger.kernel.org; > Duyck, Alexander H <alexander.h.du...@intel.com> > Subject: Re: macvlan devices and vlan interaction > > On Tue, Jan 30, 2018 at 2:20 PM, Keller, Jacob E > <jacob.e.kel...@intel.com> wrote: > >> -----Original Message----- > >> From: Alexander Duyck [mailto:alexander.du...@gmail.com] > >> Sent: Tuesday, January 30, 2018 12:49 PM > >> To: Shannon Nelson <shannon.nel...@oracle.com> > >> Cc: Keller, Jacob E <jacob.e.kel...@intel.com>; netdev@vger.kernel.org; > Duyck, > >> Alexander H <alexander.h.du...@intel.com> > >> Subject: Re: macvlan devices and vlan interaction > >> > >> > Hi Jake, > >> > > >> > The current behavior seems logical to me, but I suppose Alex might argue > >> > differently. The macvlan was put onto the default lowerdev assuming the > >> > lowerdev will hand it all the default traffic, and then the macvlan > >> > splits > >> > out its own vlan traffic. As soon as the lowerdev assumption changes, > >> > it is > >> > going to change what gets pushed up to the macvlan dev. If the lowerdev > >> > is > >> > separating the vlan traffic out of the "default" flow headed to the > >> > macvlan, > >> > then the initial assumption has changed and the vlan traffic has been > >> > vectored off before it can be delivered up the stack to the macvlan. > >> > >> It depends on what your goal is. In my mind making macvlan VLAN > >> challenged is the easier solution since you just have to add some > >> pass-thru ops to the VLAN drivers and you can guarantee that you are > >> passing MAC-VLAN pair for each address on the interface for the call. > >> The alternative gets to be a bit more complex since it requires > >> multiple rules, one for non-tagged and one per VLAN for tagged > >> traffic. > >> > >> > There's an argument that the lowerdev shouldn't know anything about the > >> > upperdev's routing, just deliver to the upperdev and let the upperdev > >> > worry > >> > about it. But perhaps this becomes is a question of precedence: does the > >> > lowerdev split traffic first by mac address or by vlan tag. > >> > >> That is where things get messy. We found it splits by VLAN tag if the > >> VLAN is present on the lowerdev, or it splits by MAC if it is not. > >> That is why as Jake pointed out adding the VLAN to the lower dev > >> causes issues. > >> > > > > Yes, right now the problem is that it splits differently depending on > > whether or > not a VLAN is present on the lower dev. > > > >> > I don't like your option 1: as you point out, it breaks current > >> > functionality, likely depended upon in some containers that are using > >> > macvlans to manage their traffic. We don't know what's going on inside > >> > that > >> > container and I don't think we want to break its ability to split its own > >> > vlans. > >> > >> Maybe we should look at an option 1.5. Mark the lowerdev as VLAN > >> challenged if any macvlan is operating with any VLANs enabled on it > >> since we can only really allow VLAN filtering to occur at one level > >> reliably. Either that or maybe we look at making VLANs and rx_handler > >> setups mutually exclusive. > >> > > > > Actually.. what if we changed the order of splitting, so that we always > > check > macvlan MAC address first, before checking VLANs? > > > > This should work in both cases of macvlan -> VLAN -> lowerdev, or VLAN -> > macvlan -> lowerdev. > > The thing you have to then watch out for is how something like this > would impact bonding or bridging since both of those use the Rx > handler as well from my understanding. I suppose it would make sense > though to do Rx handler first and then VLAN since the Rx handler > should be placed on the VLAN itself if you are > bridging/bonding/macvlan over a VLAN versus the reverse. > > > In the first case, the macvlan isn't directly attached to the lowerdev, so > > we'd do > VLAN filtering first, and then the VLAN would check MAC address. > > Right, that bit works without any issues. > > > In the second case, even if lowerdev also had the VLAN, we'd do macvlan > filtering first, and things would work. > > That piece makes sense, at least for macvlan. > > > Both the lowerdev VLAN and upperdev macvlan should receive traffic correctly > in this case. > > > > I think this resolves the problem of which device goes to which VLAN. > > Right, this works for macvlan. The only concern I have then is bond > and bridging. It is probably fine but it wouldn't hurt to check. >
Yea, it's quite possible bonding and bridging won't work quite right... > > I don't know if it resolves the issues with leaked VLANs, where a VLAN > > added to > the macvlan device causes traffic for that VLAN to be received by all the MAC > addresses of the lowerdev... > > > > I suppose this might not be considered a problem? The traffic could be > > received > either way if you're in promiscuous mode. It's not like we have a sense of > "trusted" configuration either. > > > > I think some separate work for the case of macvlan on top of VLAN on top of > lower dev can be done as well, to enable offloading in this case. I'll have > some > more thoughts on that soon. > > > > Thanks, > > Jake > > So the issue with all of this apears to be: > > commit 2425717b27eb92b175335ca4ff0bb218cbe0cb64 > Author: John Fastabend <john.r.fastab...@intel.com> > Date: Mon Oct 10 09:16:41 2011 +0000 > > net: allow vlan traffic to be received under bond > > The problem is that patch made it so that you could put a bond on two > interfaces and still peel out traffic via VLANs. Prior to this patch > the code is the way we were already discussing. > Interesting. I am not sure how to resolve this without breaking one or the other case. The commit message itself calls out that: Putting a VLAN.228 on both the bond0 and eth2 device will result in eth2.228 receiving the skb. I don't think this is completely unexpected and was the result prior to the rx_handler result. So.. to me this *was* unexpected, but I guess it matches how things used to work? Thanks, Jake