On 01/03/2022 19:18, Numan Siddique wrote:
On Tue, Mar 1, 2022 at 6:31 AM Brendan Doyle<brendan.do...@oracle.com>  wrote:


On 01/03/2022 01:12, Numan Siddique wrote:
On Mon, Feb 28, 2022 at 11:40 AM Brendan Doyle<brendan.do...@oracle.com>  wrote:

On 20/02/2022 23:38, Han Zhou wrote:



On Thu, Feb 17, 2022 at 3:23 AM Brendan Doyle<brendan.do...@oracle.com>  wrote:
Hi,

So I have a Distributed Gateway Port (DGP) on a Gateway through which
VMs in the overlay can access underlay networks. If the VM is not on the
chassis where the DGP is scheduled, then the traffic takes the extra
tunneled hop to the chassis where the DGP is and is then sent to the
underlay via the localnet switch there. It would be great if I could
avoid that extra hop, whilst still having the Gateway do NAT and
routing. So I'm wondering if 'reside-on-redirect-chassis' or
'redirect-type' can be used to do this, and if so, which one? Also, will
normal traffic between two VMs in the overlay on different chassis still
be tunneled through the underlay?

I'll do some experimentation later, but a nod in the right direction
would be appreciated.

Hi Brendan,

Not sure if I understood your question well. If you want to avoid the gateway hop, you 
can just configure NAT rules with "dnat_and_snat" to do distributed NAT.
'reside-on-redirect-chassis' and 'redirect-type' don't seem to help for your 
use case because you mentioned your VMs are in overlay, while those options are 
for VLAN based logical networks.


I must admit I found the ovn-nb/ovn-architecture docs quite confusing on
this. I tracked down the patches and some ovn-discuss exchanges, and I
think I understand it better now. It seems that rather than avoiding the
extra hop to the gateway chassis, these redirect options just let you
take that extra hop as a VLAN-tagged packet in the underlay rather than
as a Geneve-encapsulated packet, which is not what I was hoping they
would do. I'm not sure what you mean by 'you can just configure NAT
rules with "dnat_and_snat" to do distributed NAT'; I could not find any
reference to "distributed NAT" in the man pages. We use a distributed
router port Gateway for HA. Consider this config:

switch 2ffbd2d8-5421-452a-b776-e32cc55f4789 (ls_vcn3)
      port 269089c4-9464-41ec-9f63-6b3804b34b07
          addresses: ["52:54:00:30:38:35 192.16.1.5"]
      port ls_vcn3_net1-lr_vcn3_net1
          type: router
          addresses: ["40:44:00:00:00:90"]
          router-port: lr_vcn3_net1-ls_vcn3_net1
      port 284195d2-9280-4334-900e-571ecd00327a
          addresses: ["52:54:00:02:55:96 192.16.1.6"]

switch af0f96b0-d7af-41ef-b5b7-fa8ae0186c2c (ls_vcn3_backbone)
      port lsb_vcn3-lr_vcn3
          type: router
          router-port: lr_vcn3-lsb_vcn3
      port lsb_vcn3_net1-lr_vcn3_net1
          type: router
          router-port: lr_vcn3_net1-lsb_vcn3_net1

switch 29bdf630-8326-4aa4-aee1-d9bca9ea298c (ls_vcn3_external_ugw)
      port ls_vcn3_external_ugw-lr_vcn3
          type: router
          router-port: lr_vcn3-ls_vcn3_external_ugw
      port ln-ls_vcn3_external_ugw
          type: localnet
          addresses: ["unknown"]

router 638a1401-4e5f-450b-9d35-fe74defc8769 (lr_vcn3)
      port lr_vcn3-ls_vcn3_external_ugw
          mac: "40:44:00:00:01:60"
          networks: ["253.255.80.4/16"]
          gateway chassis: [sca15-rain17 sca15-rain05 sca15-rain06]
      port lr_vcn3-lsb_vcn3
          mac: "40:44:00:00:01:50"
          networks: ["253.255.25.2/25"]
      nat cb880ef9-d51c-4265-abdc-db6accaa4d4a
          external ip: "253.255.80.4"
          logical ip: "192.16.0.0/16"
          type: "snat"

router f370260a-c90c-4824-91a8-fba787272dc2 (lr_vcn3_net1)
      port lr_vcn3_net1-lsb_vcn3_net1
          mac: "40:44:00:00:00:a0"
          networks: ["253.255.25.1/25"]
      port lr_vcn3_net1-ls_vcn3_net1
          mac: "40:44:00:00:00:90"
          networks: ["192.16.1.1/24"]

ovn-nbctl lr-nat-list lr_vcn3
TYPE             EXTERNAL_IP        EXTERNAL_PORT    LOGICAL_IP            EXTERNAL_MAC         LOGICAL_PORT
snat             253.255.80.4                        192.16.0.0/16

ovn-sbctl show

Chassis sca15-rain05
      hostname: sca15-rain05
      Encap geneve
          ip: "253.255.2.5"
          options: {csum="true"}
      Port_Binding "269089c4-9464-41ec-9f63-6b3804b34b07"
      Port_Binding cr-lr_vcn3-ls_vcn3_external_ugw

Chassis sca15-rain06
      hostname: sca15-rain06
      Encap geneve
          ip: "253.255.2.6"
          options: {csum="true"}
      Port_Binding "284195d2-9280-4334-900e-571ecd00327a"


So if the overlay VM with IP 192.16.1.5 (hosted on chassis sca15-rain05)
is sending traffic to the underlay, it will use the "physical"/localnet
bridge on sca15-rain05. But if the overlay VM with IP 192.16.1.6 (hosted
on chassis sca15-rain06) is sending traffic to the underlay, the traffic
is tunneled to sca15-rain05 first, because that is where the DR port is,
and then it is sent to the underlay via the physical/localnet bridge
there. It was that extra hop I was trying to avoid, i.e. have the NAT
done by the flows on chassis sca15-rain06 and have the traffic sent to
the underlay by the localnet/physical bridge on that chassis. But it
seems like this is not possible?

If you want to avoid this extra hop, you need to create a NAT entry of
type "dnat_and_snat" on router lr_vcn3 and also set the external_mac and
logical_port columns.

Something like this:

ovn-nbctl lr-nat-add lr_vcn3 dnat_and_snat 253.255.80.100 192.16.1.6 \
    284195d2-9280-4334-900e-571ecd00327a 30:54:00:00:00:04

With the topology you shared above, you have created a NAT entry of type
"snat" in the router lr_vcn3. OVN doesn't support distributed SNAT, and
I'm not sure if it is even possible (I could be wrong).

If you define a dnat_and_snat entry as I showed in the example above,
then ovn-controller running on chassis sca15-rain06 will send GARPs to
the underlay network for the IP-MAC pair
(253.255.80.100 - 30:54:00:00:00:04). That ovn-controller will also do
the NATting from 192.16.1.6 to 253.255.80.100. This way, the packet
will not be tunneled to sca15-rain05. Hope this makes sense.

Thanks Numan, yes this helps somewhat, but if I understand this
correctly, then since I can have multiple VMs on multiple chassis, and
the distributed router port on the lr_vcn3 gateway
(cr-lr_vcn3-ls_vcn3_external_ugw) can be scheduled on and move between
multiple chassis, to ensure no extra hop I'd have to have a
dnat_and_snat entry on lr_vcn3 with a unique underlay IP and MAC for
every VM. So in this case, where there are just two VMs, I'd have to
have:

ovn-nbctl lr-nat-add lr_vcn3 dnat_and_snat 253.255.80.100 192.16.1.6 \
    284195d2-9280-4334-900e-571ecd00327a 30:54:00:00:00:04
ovn-nbctl lr-nat-add lr_vcn3 dnat_and_snat 253.255.80.101 192.16.1.5 \
    269089c4-9464-41ec-9f63-6b3804b34b07 30:54:00:00:00:05
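
With more than a couple of VMs, the per-VM entries would probably be
generated rather than typed. The following is only a rough sketch of
that idea: the helper name gen_nat_cmds and the four-column input layout
are assumptions made for illustration, not anything provided by OVN. It
only prints lr-nat-add commands in the form shown above and does not
touch the NB database.

```shell
#!/bin/sh
# Hypothetical helper (not part of OVN): print one dnat_and_snat
# command per VM. Nothing is executed against the NB database.
# Each stdin line: EXTERNAL_IP LOGICAL_IP LOGICAL_PORT EXTERNAL_MAC
gen_nat_cmds() {
    router=$1
    while read -r ext_ip log_ip lport ext_mac; do
        # Skip blank lines in the input table.
        [ -n "$ext_ip" ] || continue
        echo "ovn-nbctl lr-nat-add $router dnat_and_snat $ext_ip $log_ip $lport $ext_mac"
    done
}

# The two VMs from this thread; every VM needs its own underlay IP and MAC.
gen_nat_cmds lr_vcn3 <<'EOF'
253.255.80.100 192.16.1.6 284195d2-9280-4334-900e-571ecd00327a 30:54:00:00:00:04
253.255.80.101 192.16.1.5 269089c4-9464-41ec-9f63-6b3804b34b07 30:54:00:00:00:05
EOF
```

The printed lines can be reviewed first and then piped to sh once they
look right.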


Which unfortunately does not work for all the gateway use cases we have.
In some cases we do need SNAT for a CIDR, not a single IP.

Also the ovn-nbctl man page states

"The logical_port and external_mac are only accepted when router is a
distributed router (rather than a gateway  router) "

And lr_vcn3 is a gateway router.

lr_vcn3 is not a gateway router. It is a distributed router with a
gateway router port - lr_vcn3-ls_vcn3_external_ugw

To create a gateway router,  you need to pin a router to  a particular
chassis and all the routing for the router
is handled on that chassis.

Eg. ovn-nbctl set logical_router lr_vcn3 options:chassis=sca15-rain17
sca15-rain05 sca15-rain06

Sorry, I'm a little confused by that. You said to "pin a router to a
*particular* chassis", yet the command above specifies three chassis.
Does that mean it will be pinned to one of the three, and that if that
chassis fails it will not fail over to one of the others, as is the case
for a distributed router with a gateway router port?

will make the router lr_vcn3 a gateway router.  And we don't
support the dnat_and_snat type with external_mac and logical_port
for gateway routers.
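
Going by the description above (a gateway router is one pinned to a
chassis through options:chassis), a quick way to tell the two kinds
apart is to look at the router's options column. This is a minimal
sketch under that assumption; classify_router is a hypothetical helper
that only inspects a string, and the sample values are illustrative.

```shell
#!/bin/sh
# Hypothetical helper (not part of OVN): classify a logical router from
# the text of its options column. Assumes, as described in this thread,
# that only a gateway router has options:chassis set.
classify_router() {
    case $1 in
        *chassis=*) echo "gateway router" ;;
        *)          echo "distributed router" ;;
    esac
}

# Illustrative inputs: an empty options map and a pinned router.
classify_router '{}'
classify_router '{chassis=sca15-rain05}'

# Against a live NB database (not executed here):
#   classify_router "$(ovn-nbctl get logical_router lr_vcn3 options)"
```

For lr_vcn3 as configured in this thread, the gateway chassis list sits
on the router port rather than in the router's options, which is why it
counts as a distributed router with a gateway router port.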


Also as I mentioned OVN doesn't support distributed SNAT.  In your
case, you have configured an snat entry for IP - 253.255.80.4.

Which means it is not possible for every compute node to do SNAT from
the overlay network IPs of the VMs to 253.255.80.4 (for south-to-north
external traffic), as the upstream router would see the IP-MAC pair
(253.255.80.4 - 40:44:00:00:01:60) from multiple ports.

Yes, but if I removed the snat entry and replaced it with:

ovn-nbctl lr-nat-add lr_vcn3 dnat_and_snat 253.255.80.100 192.16.1.6 \
    284195d2-9280-4334-900e-571ecd00327a 30:54:00:00:00:04
ovn-nbctl lr-nat-add lr_vcn3 dnat_and_snat 253.255.80.101 192.16.1.5 \
    269089c4-9464-41ec-9f63-6b3804b34b07 30:54:00:00:00:05

Then would I have a distributed router with a gateway router port
where, regardless of where that port is scheduled (sca15-rain05 or
sca15-rain06), traffic from all VMs always goes directly to the
underlay via the physical bridge on their host chassis?

Hope this makes it clear.

Numan


I think Han also suggested the same.

Since your overlay logical switch ls_vcn3 doesn't have a localnet
port, the 'reside-on-redirect-chassis' option doesn't come into the
picture at all.
This option is used when you create VMs on provider networks (like for
example ls_vcn3_external_ugw) and you connect these logical switches
to an OVN router.

Thanks
Numan


Brendan





Thanks,
Han

Thanks


Brendan

_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://urldefense.com/v3/__https://mail.openvswitch.org/mailman/listinfo/ovs-discuss__;!!ACWV5N9M2RV99hQ!cTLgyBQpfKdvUEuLP1yuMMoCJ2NS80YuD9ZUheR22GgpnxrtsibaJUX9seQ0dj1Suzg$