Hi Guru,

Thanks for taking a look and providing feedback.
Regarding:
"
I had a talk with Numan in IRC and I think I understand the problem better. So 
my above understanding was clearly wrong. Let me re-summarize the problem 
statement in another mail.
"

No worries, please take your time and feel free to put up your understanding; I
will be happy to explain.

Thanks

Regards,
Ankur

________________________________
From: Guru Shetty <g...@ovn.org>
Sent: Thursday, November 8, 2018 8:35:07 AM
To: Ankur Sharma
Cc: ovs-dev
Subject: Re: [ovs-dev] OVN based distributed virtual routing for VLAN backed 
networks



On Wed, 7 Nov 2018 at 15:01, Guru Shetty <g...@ovn.org> wrote:


On Fri, 19 Oct 2018 at 15:17, Ankur Sharma <ankur.sha...@nutanix.com> wrote:

Hi Guru,

Thanks for taking a look.
Please find the detailed explanation of problem statement inline.

Thanks

Regards,
Ankur



From: Guru Shetty <g...@ovn.org>
Sent: Friday, October 19, 2018 9:35 AM
To: Ankur Sharma <ankur.sha...@nutanix.com>
Cc: ovs-dev <ovs-dev@openvswitch.org>
Subject: Re: [ovs-dev] OVN based distributed virtual routing for VLAN backed 
networks





On Tue, 16 Oct 2018 at 15:43, Ankur Sharma <ankur.sha...@nutanix.com> wrote:

Hi,

We have spent some effort evaluating the use of OVN for
Distributed Virtual Routing (DVR) on VLAN-backed networks.



Would you mind explaining the above statement in more detail? I would
like to understand the problem well before looking at the proposed solution.

[ANKUR]:
a. OVN provides logical routing and switching, but was designed for scenarios
where the logical switch is of type "overlay".

     i.e., packets going on the wire are always encapsulated (the exception
being communication with an external network, via NAT).

b. Our proposal is to enhance OVN to support cases where the logical switch is
of type "vlan",

     i.e., there is no encapsulation and the "inner" packet goes on the wire
"as is".



c. We evaluated the current OVN implementation for both logical switches and
logical routers.



d. Our proposal is meant to highlight the gaps (as of now) and how we plan to
fix them.

Let me summarize my understanding of your mail; you can poke holes in it if
I am wrong, and probably rephrase your problem statement.

Current state of OVN
================

The following is the current state of OVN's "localnet" feature. I haven't used
the "localnet" feature for a long time, and this is my understanding based on
reading the man pages and the localnet test in tests/ovn.at today.

OVN currently supports logical switches and lets you interconnect them with
logical routers. In a particular logical switch, you can have many logical
ports, each backed by a VM running on a different hypervisor. The VMs across
multiple hypervisors that are backed by OVN logical ports only talk via
overlay networks. In addition, you can add a logical port of type "localnet" to
these logical switches. This "localnet" logical port has a "tag" associated
with it. The localnet port is used to let the VMs of an OVN logical switch talk
to other network endpoints that are not in OVN (e.g., a bare-metal machine)
but are in the same subnet. When OVN (i.e., br-int) receives an ARP broadcast
packet from an OVN logical port in a logical switch, it will be sent to all
other logical ports, including the "localnet" port.
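
For reference, this is roughly how a localnet port is created today with
ovn-nbctl (switch, port and network names are illustrative):

    ovn-nbctl lsp-add ls1 ln-port
    ovn-nbctl lsp-set-type ln-port localnet
    ovn-nbctl lsp-set-addresses ln-port unknown
    ovn-nbctl lsp-set-options ln-port network_name=physnet1
    # The "tag" column carries the VLAN tag mentioned above:
    ovn-nbctl set Logical_Switch_Port ln-port tag=100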

You can potentially connect two of these logical switches via an OVN router.
The OVN-known logical ports will talk to each other via overlay networks. But
if you want to exit the network, you need to go out of a gateway port.

You can also use an "l2gateway" port to connect to multiple VLAN-backed
machines in the same logical switch that are outside OVN.
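
A minimal sketch of the l2gateway variant, going by the ovn-nb man page (port
and chassis names illustrative):

    ovn-nbctl lsp-add ls1 l2gw-port
    ovn-nbctl lsp-set-type l2gw-port l2gateway
    ovn-nbctl lsp-set-options l2gw-port network_name=physnet1 \
        l2gateway-chassis=chassis-1
    ovn-nbctl lsp-set-addresses l2gw-port unknown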

I had a talk with Numan in IRC and I think I understand the problem better. So 
my above understanding was clearly wrong. Let me re-summarize the problem 
statement in another mail.





My understanding of your problem statement
===================================

You want to achieve the same feature set as what currently exists in OVN, but
you don't want to use overlay networks when two known logical ports of OVN (say
backed by VMs) exist in a "vlan backed logical switch". Is that the only
difference? If so, my question is "why?". Why do you want to avoid overlay
networks? Do you gain anything else out of it? What is it?

We would like to take it forward with the community.

We understand that some of the work could overlap with existing
patches in review.

We would appreciate feedback and would be happy to update our patches
to avoid known overlaps.

This email explains the proposal. We will follow it up with patches.
Each "CODE CHANGES" section summarizes the change that the corresponding
patch would contain.


DISTRIBUTED VIRTUAL ROUTING FOR VLAN BACKED NETWORKS
======================================================


1. OVN Bridge Deployment
------------------------------------

Our design follows the following OVN bridge deployment model
(please refer to figure OVN Bridge deployment).
    i. br-int ==> OVN managed bridge.
       br-pif ==> learning bridge, where physical NICs are connected.

   ii. Any packet that should be on the physical network travels from br-int
       to br-pif via patch ports (localnet ports); see the example below.
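
For illustration, a typical per-chassis setup of this model might look like
the following (bridge, NIC and physical network names are assumptions):

    # Learning bridge for the physical NICs:
    ovs-vsctl add-br br-pif
    ovs-vsctl add-port br-pif eth1
    # Map the physical network name used by localnet ports to br-pif;
    # ovn-controller then creates the br-int <-> br-pif patch ports:
    ovs-vsctl set Open_vSwitch . \
        external-ids:ovn-bridge-mappings=physnet1:br-pif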

2. Layer 2
-------------

   DESIGN:
   ~~~~~~~
   a. Leverage the localnet logical port type as the patch port between br-int
       and br-pif.
   b. Each VLAN-backed logical switch will have a localnet port connected
       to it.
   c. Tagging and untagging of VLAN headers happens at the localnet port
       boundary.

   PIPELINE EXECUTION:
   ~~~~~~~~~~~~~~~~~~~
   a. Unlike the Geneve-encap-based solution, where we execute the ingress
       pipeline on the source chassis and the egress pipeline on the
       destination chassis, for VLAN-backed logical switches the packet will
       go through the ingress pipeline on the destination chassis as well.

   PACKET FLOW (Figure 1. shows topology and Figure 2. shows the packet flow):
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   a. VM1 sends unicast traffic (destined to VM2_MAC) to br-int.
   b. For br-int, the destination MAC is not local, so it forwards the packet
       to the localnet port (by design), which is attached to br-pif. This is
       the stage at which the VLAN tag is added. br-pif forwards the packet
       to the physical interface.
   c. br-pif on the destination chassis sends the received traffic to the
       patch ports on br-int (as unicast or unknown unicast).
   d. br-int does the VLAN tag check, strips the VLAN header and sends
       the packet to the ingress pipeline of the corresponding datapath.


   KEY DIFFERENCES AS COMPARED TO OVERLAY:
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   a. No encapsulation.
   b. Both the ingress and egress pipelines of the logical switch are executed
       on both the source and destination hypervisors (unlike overlay, where
       the ingress pipeline is executed on the source hypervisor and the
       egress on the destination).

   CODE CHANGES:
   ~~~~~~~~~~~~~
   a. ovn-nb.ovsschema:
        1. Add a new column to the table Logical_Switch.
        2. The column name would be "type".
        3. Values would be either "vlan" or "overlay", with "overlay"
            being the default.

   b. ovn-nbctl:
        1. Add a new CLI command which sets the "type" of a logical switch
            (see the sketch after this list):
            ovn-nbctl ls-set-network-type SWITCH TYPE

   c. ovn-northd:
        1. Add a new enum to the ovn_datapath struct, which will indicate
            whether the logical_switch datapath type is overlay or vlan.
        2. Populate a new key-value pair in the southbound database for the
            Datapath_Binding of each Logical_Switch.
        3. Key-value pair: <logical-switch-type, "vlan" or "overlay">; the
            default will be overlay.
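
To make the proposal concrete, here is a rough sketch of what the proposed
northbound schema fragment and CLI usage might look like. Neither exists
today; the column and command names are the ones proposed above:

    "Logical_Switch": {
        "columns": {
            ...,
            "type": {"type": {"key": {"type": "string",
                                      "enum": ["set", ["vlan", "overlay"]]},
                              "min": 0, "max": 1}}}}

    # Proposed command; not yet part of ovn-nbctl:
    ovn-nbctl ls-set-network-type ls1 vlan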


3. Layer 3 East West
--------------------

   DESIGN:
   ~~~~~~~
   a. Since the router port is distributed and there is no encapsulation,
       packets with the router port MAC as source MAC cannot go on the wire.
   b. We propose replacing the router port MAC with a chassis-specific MAC
       whenever the packet goes on the wire.
   c. The number of chassis_macs per chassis could depend on the number of
       physical NICs and the corresponding bond policy on br-pif.

      As of now, we propose only one chassis_mac per chassis
      (shared by all resident logical routers). However, we are analyzing
      whether br-pif's bond policy would require more MACs per chassis.

   PIPELINE EXECUTION:
   ~~~~~~~~~~~~~~~~~~~
   a. For a DVR E-W flow, both the ingress and egress pipelines for the
       logical_router will execute on the source chassis only.

   PACKET FLOW (Figure 3. shows topology and Figure 4. shows the packet flow):
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   a. VM1 sends a packet (destined to IP2) to br-int.
   b. On the source hypervisor, the packet goes through the following
      pipelines:
      1. Ingress: logical-switch 1
      2. Egress:  logical-switch 1
      3. Ingress: logical-router
      4. Egress:  logical-router
      5. Ingress: logical-switch 2
      6. Egress:  logical-switch 2

      On the wire, the packet goes out with the destination logical switch's
      VLAN. As mentioned in the design, the source MAC (RP2_MAC) is replaced
      with CHASSIS_MAC and the destination MAC is that of VM2.

   c. The packet reaches the destination chassis and enters the
       logical-switch 2 pipeline in br-int.
   d. The packet goes through the logical-switch 2 pipeline (both ingress and
       egress) and gets forwarded to VM2.

   CODE CHANGES:
   ~~~~~~~~~~~~~
   a. ovn-sb.ovsschema:
        1. Add a new column to the table Chassis.
        2. The column name would be "chassis_macs", the type being string,
            with no limit on the range of values.
        3. This column will hold a list of chassis-unique MACs.
        4. This column will be populated by ovn-controller.

   b. ovn-sbctl:
        1. CLI to add/delete chassis_macs to/from the southbound database.

   c. ovn-controller:
        1. Read chassis MACs from the OVS Open_vSwitch table and populate the
            southbound database.
        2. In table=65, add a new flow at priority 150, which does the
            following (see the illustrative flow after this list):
           a. Match: source_mac == router_port_mac, metadata ==
               destination_logical_switch, logical_outport == localnet_port
           b. Action: replace the source MAC with chassis_mac, add the
               VLAN tag.
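
Purely illustrative, in ovs-ofctl flow syntax with placeholder values, the
proposed priority-150 flow in table 65 could look roughly like this (field
and value choices are assumptions, not the final implementation):

    # metadata = destination logical switch's datapath key,
    # reg15 = localnet port key, dl_src = distributed router port MAC,
    # 0a:00:00:00:00:01 = chassis MAC, port 5 = patch port to br-pif:
    table=65, priority=150, metadata=0x2, reg15=0x3,
        dl_src=00:00:01:01:02:03
      actions=mod_dl_src:0a:00:00:00:00:01,mod_vlan_vid:100,output:5

One possible way for ovn-controller to learn chassis MACs from the local
Open_vSwitch table (the external-ids key name here is hypothetical):

    ovs-vsctl set Open_vSwitch . \
        external-ids:ovn-chassis-macs="0a:00:00:00:00:01"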


4. LAYER 3 North South (NO NAT)
-------------------------------

   DESIGN:
   ~~~~~~~
   a. For talking to an external network endpoint, we need a gateway
      on the OVN DVR.
   b. We propose to use the gateway_chassis construct to achieve this
       (see the example after this list).
   c. The LRP will be attached to one or more gateway chassis, and only the
       active chassis will respond to ARP requests for the LRP IP from the
       underlay network.
   d. If NATing (keeping state) is not involved, traffic need not always go
       via the gateway chassis, i.e., traffic from an OVN chassis to the
       external network can go out directly.
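
Gateway chassis are configurable with existing commands; for example, to
attach an LRP to two gateway chassis with priorities (names and addresses
illustrative):

    ovn-nbctl lrp-add lr0 lrp-ext 00:00:20:20:12:13 172.16.1.1/24
    ovn-nbctl lrp-set-gateway-chassis lrp-ext chassis-1 20
    ovn-nbctl lrp-set-gateway-chassis lrp-ext chassis-2 10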

   PIPELINE EXECUTION:
   ~~~~~~~~~~~~~~~~~~~
   a. From an endpoint on an OVN chassis to an endpoint on the underlay:
      i. Like DVR E-W, the logical_router ingress and egress pipelines are
         executed on the source chassis.

   b. From an endpoint on the underlay to an endpoint on an OVN chassis:
      i. The logical_router ingress and egress pipelines are executed on the
         gateway chassis.

   PACKET FLOW LS ENDPOINT to UNDERLAY ENDPOINT (Figure 5. shows topology):
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   a. The packet flow in this case is exactly the same as for Layer 3 E-W.


   PACKET FLOW UNDERLAY ENDPOINT to LS ENDPOINT (Figure 5. shows topology and
   Figure 6. shows the packet flow):
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   a. The gateway for endpoints behind the DVR will be resident only on the
       gateway chassis.
   b. Unicast packets will come to the gateway chassis, with the destination
       MAC being RP2_MAC.
   c. From then on, it is like the L3 E-W flow.

   CODE CHANGES:
   ~~~~~~~~~~~~~
   a. ovn-northd:
        1. Changes to respond to a VLAN-backed router port's ARP from the
           uplink only on a gateway chassis.
        2. Changes to make sure that, in the absence of NAT configuration,
           OVN-chassis-to-external-network traffic does not go via the
           gateway chassis.

   b. ovn-controller:
        1. Send out GARPs from the active gateway chassis, advertising the
           IP/MAC of VLAN-backed router ports that have a gateway chassis
           attached to them.


5. LAYER 3 North South (NAT)
----------------------------

   SNAT, DNAT, SNAT_AND_DNAT (without external mac):
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   a. Our proposal aligns with the following patch series, which is out for
       review: http://patchwork.ozlabs.org/patch/952119/

   b. However, our implementation deviates from that proposal in the
      following areas:
      i. Usage of lr_in_ip_routing:
         Our implementation sets the redirect flag after the routing decision
         is taken. This ensures that a user-entered static route will not
         affect the redirect decision (unless it is meant to).

     ii. Using the tenant VLAN ID for "redirection":
         Our implementation uses the external network router port's VLAN ID
         (the router port that has a gateway chassis attached to it) for
         redirection. This is because the chassisredirect port is NOT on the
         tenant network, and logically the packet is being forwarded to the
         chassisredirect port.


   SNAT_AND_DNAT (with external mac):
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   a. The current OVN implementation of not going via the gateway chassis
       aligns with our design and worked fine in our testing; see the example
       configuration below.
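
For reference, the NAT configurations discussed in this section can be set up
with the existing ovn-nbctl commands (addresses, router and port names
illustrative):

    # SNAT for a tenant subnet:
    ovn-nbctl lr-nat-add lr0 snat 172.16.1.10 10.0.0.0/24
    # dnat_and_snat with an external MAC pinned to a logical port; this is
    # the case that already bypasses the gateway chassis:
    ovn-nbctl lr-nat-add lr0 dnat_and_snat 172.16.1.11 10.0.0.3 vm1 \
        00:00:00:00:10:01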


This is just an initial proposal. We have identified more areas that need
work; we will submit patches (and put forth topics/designs for discussion)
as we make progress.


Thanks

Regards,
Ankur
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
