Hi Han,

thanks a lot for your comments. I've just sent v3 of the patch that should cover the issues you mentioned. More comments inline below.

Thanks!
Ihar

On 5/5/20 2:31 AM, Han Zhou wrote:


On Mon, May 4, 2020 at 9:52 AM Ihar Hrachyshka <ihrac...@redhat.com> wrote:
>
> On Fri, May 1, 2020 at 3:28 AM Han Zhou <hz...@ovn.org> wrote:
> >
> >
> >
> > On Tue, Apr 28, 2020 at 1:38 PM Ihar Hrachyshka <ihrac...@redhat.com> wrote:
> > >
> > > Assuming only a single localnet port is actually mapped on
> > > each chassis, this allows maintaining disjoint networks plugged into
> > > the same switch.  This is useful to simplify resource management for
> > > OpenStack "routed provider networks" feature [1] where a single
> > > "network" (which traditionally maps to logical switches in OVN) is
> > > comprised of multiple L2 segments and assumes external L3 routing
> > > implemented between the segments.
> > >
> > > TODO: consider E-W routing between localnet vlan tagged LSs
> > >       (ovn-chassis-mac-mappings).
> > >
> > > [1]: https://docs.openstack.org/ocata/networking-guide/config-routed-networks.html
> >
> > After reading the context and related discussions, I am a little concerned about this approach. It changes the concept of a logical switch in OVN, which was an L2 concept. It may be ok to change the concept if all the aspects are understood and properly documented and implemented. However, at least it is not clear to me how it would work in the physical pipeline when VIF1 on chassis A sends a packet to VIF2 on chassis B through the physical network, when the two VIFs are on the same LS that has multiple localnet ports. Today only one localnet port is allowed, so the packet is handled by sending it through *the* localnet port (without a localnet port it goes through a tunnel). With this patch it seems it would choose the last localnet port, regardless of the physical location (L2 segment) of each VIF. I would expect the data structure change in "local_datapath" to maintain a list of localnet ports instead of only one, and the logic in physical.c to pick the desired localnet port according to the L2 segment it physically belongs to. (It shouldn't be a mapping between chassis and physical network/VLAN, because it is natural to have multiple VLANs connected to a single chassis and have VIFs on different VLANs.)
>
> The assumption in this patch is, while the north database may have
> multiple localnet ports per switch, only one segment can actually be
> plugged per chassis. Which reduces the problem of multiple localnet
> ports and selecting the right one to the current situation where
> there's a single localnet port per switch for each chassis.

OK. I think I didn't get the real meaning of this assumption earlier. It is reasonable to have this assumption, just like the earlier assumption that we only allowed one localnet port for an L2 LS. Now that the LS is actually modeling an L3 fabric connected with multiple L2 segments, it is reasonable to assume that:
1. Only one localnet port should exist for each L2 segment. (So if there are N localnet ports in one LS, we assume they represent N L2 segments connected by an L3 fabric.)
2. Each chassis should plug into only one L2 segment for each such L3 LS. (I think this needs to be validated in binding.c to at least warn about such a misconfiguration.)


It's already validated: we log a warning message on finding a second localnet port that has a bridge mapping on the chassis. We don't fail to start the controller or stop configuring flows; I don't think that's needed, but please clarify if you think it is.



It seems the implementation supports these assumptions well, but it could be documented more clearly: the L3 LS concept (I am not sure what a better word would be, since an LS appears to be an L2 concept), the use cases, and the above assumptions/constraints as well.

In addition, it is better to make it clear that a chassis can still be plugged into multiple localnet ports belonging to different L3 LSes. This is a typical scenario for the routed network use case with VRFs. It should still work with this patch, but I'd suggest adding a test case to cover such a scenario to make sure.
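Concretely, such a VRF-style setup might be configured along these lines (a sketch; the switch, port, physnet, and bridge names are illustrative). Each LS gets its own localnet port, and the chassis maps both physical networks, so both localnet ports are plugged on the same chassis:

```shell
# Two routed logical switches, each with its own localnet port.
ovn-nbctl ls-add ls1
ovn-nbctl lsp-add ls1 ln-ls1
ovn-nbctl lsp-set-type ln-ls1 localnet
ovn-nbctl lsp-set-addresses ln-ls1 unknown
ovn-nbctl lsp-set-options ln-ls1 network_name=physnet1

ovn-nbctl ls-add ls2
ovn-nbctl lsp-add ls2 ln-ls2
ovn-nbctl lsp-set-type ln-ls2 localnet
ovn-nbctl lsp-set-addresses ln-ls2 unknown
ovn-nbctl lsp-set-options ln-ls2 network_name=physnet2

# One chassis maps both physical networks to local bridges, so both
# localnet ports (of *different* LSes) are plugged here at once.
ovs-vsctl set open . \
    external-ids:ovn-bridge-mappings="physnet1:br-1,physnet2:br-2"
```

The one-localnet-port-per-chassis constraint discussed above applies per LS, so this configuration should not trigger the duplicate-mapping warning.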


I updated the documentation with - hopefully - better wording, clarifying how this is supposed to be used in real life.



> (The
> assumption is implicit right now but we can also enforce it if
> needed.) Perhaps in the future we should consider expanding the
> feature to allow for multiple localnet ports plugged to fabric per
> chassis, but that's not the goal here.
>
For different L3 LSes, I think it is supported already (and should be). As for multiple localnet ports of the same L3 LS on one chassis, I am not sure if that will be reasonable in the future, but at least I totally agree that it is not a goal for now.

> >
> > Furthermore, if we change the concept of an LS being a single L2 domain, we may need to consider more changes in the L2 logical pipelines, such as:
> > - The L2 lookup stage should only look up destinations for the ports that are in the same segment.
> > - BUM traffic should only go to the localnet port that belongs to the same L2 segment of the LS.
> >
I thought about the BUM (broadcast, unknown, multicast) traffic again. It seems the current physical pipeline handling for localnet would also handle the case of multiple localnet ports in the same LS well. (I implemented that logic initially and never thought about this use case, and I am surprised that it seems it would just work here, given the earlier assumptions.) I'd suggest at least adding one or two test cases to ensure it works and to avoid future regressions, since there are implications to making this work and I am afraid further changes could break it. The test cases can be:
- For an "unknown MAC", the packet is sent out through the physical bridge, once and only once.
- For broadcast traffic (e.g. ARP), the packet is sent to all ports that are physically located on the same L2 segment (some on the same chassis, some on other chassis but the same L2), once and only once.

> > All this suggests that we need to introduce a *segment* abstraction in OVN, with multiple segments belonging to the same LS. I wonder if it is even more complex to implement this in OVN than in the Neutron plugin. Maybe I missed something here and made it more complex than necessary. Please correct me if I am wrong :)
>
> Indeed, if OVN models the OpenStack multi-segment network concept
> natively, 1) it would be easier for OpenStack and 2) it would just
> shift complexity from OpenStack to OVN. (I don't think it would be
> MORE complex than implementing it in the neutron plugin, but about the
> same.) This patch series is an attempt to facilitate neutron
> implementation of routed provider networks (multi-segment networks
> with L3-adjacent L2-domains) with as little blood on both sides.
>
Initially I was thinking about full L2 segment modeling within the LS. In theory it would be the same level of complexity as implementing it in the Neutron plugin, but C would usually take more effort than Python for such things. However, I realized that the modeling change may not be necessary in OVN to support the routed network use case, given that the above mentioned assumptions are strictly followed. Each localnet port is the modeling of an L2 segment - we just can't tell which port belongs to which L2 segment from the NB objects. The mapping is only reflected by the port bindings and chassis configuration. So L2 becomes the physical part, and the NB only tells the L3 model for the routed network. With this design, there are still things that would look confusing because of the modeling missing in the NB, such as:
- the logical flow generation blindly generates L2 lookup flows, ARP response flows, etc. for all ports on the LS
- the MC_Groups (MC_Flood, MC_Unknown, which are L2 concepts) generated in the SB contain all ports in the L3 network

In most cases the tricks we did with the localnet port in the physical pipeline handle this well. One weird scenario I can think of: when VIF1 sends an ARP request for the IP of VIF2, which is on a different chassis on a different L2 segment, VIF1 would still get the ARP resolved, while this would never happen in a physical network. However, I don't think this creates any real problem. I hope this will not result in any unexpected limitations because of the "simplified" modeling in the NB for routed networks.

To summarize, I agree with the solution of this patch, with the below comments:
- Document the concept change of the LS (aka L3 LS), the use case, and the assumptions and constraints more clearly, in the localnet description in ovn-architecture.xml.
- Validate the assumption and print warning logs if it is not met.
- Add test cases to cover:
  a) The VRF use case of routed networks, i.e. M logical networks (LSes), each with N segments (so in total there are M x N localnet ports); the M localnet ports of the M LSes can still work on the same chassis.
  b) For an "unknown MAC", the packet is sent out through the physical bridge, once and only once.
  c) For broadcast traffic (e.g. ARP), the packet is sent to all ports that are physically located on the same L2 segment (some on the same chassis, some on other chassis but the same L2), once and only once.


I've updated the test suite with more test scenarios to - hopefully - cover the items on your list that were not already covered.



Thanks,
Han

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
