On Thu, May 16, 2019 at 11:40 AM Numan Siddique <nusid...@redhat.com> wrote:
>
> On Thu, May 16, 2019 at 10:20 PM Han Zhou <zhou...@gmail.com> wrote:
>
>> On Wed, May 15, 2019 at 11:59 PM Numan Siddique <nusid...@redhat.com> wrote:
>>
>>> On Thu, May 16, 2019 at 1:12 AM Han Zhou <zhou...@gmail.com> wrote:
>>>
>>>> On Wed, May 15, 2019 at 11:55 AM Numan Siddique <nusid...@redhat.com> wrote:
>>>>
>>>>> On Thu, May 16, 2019 at 12:10 AM Han Zhou <zhou...@gmail.com> wrote:
>>>>>
>>>>>> On Wed, May 15, 2019 at 11:36 AM <nusid...@redhat.com> wrote:
>>>>>> >
>>>>>> > From: Numan Siddique <nusid...@redhat.com>
>>>>>> >
>>>>>> > This new type is added for the following reasons:
>>>>>> >
>>>>>> > - When a load balancer is created in an OpenStack deployment with the Octavia service, it creates a logical port 'VIP' for the virtual ip.
>>>>>> >
>>>>>> > - This logical port is not bound to any VIF.
>>>>>> >
>>>>>> > - The Octavia service creates a service VM (with another logical port 'P' which belongs to the same logical switch).
>>>>>> >
>>>>>> > - The virtual ip 'VIP' is configured on this service VM.
>>>>>> >
>>>>>> > - This service VM provides the load balancing for the VIP with the configured backend IPs.
>>>>>> >
>>>>>> > - The Octavia service can be configured to create a few service VMs in active-standby mode, with the active VM configured with the VIP. The VIP can move between these service nodes.
>>>>>> >
>>>>>> > Presently there are a few problems:
>>>>>> >
>>>>>> > - When a floating ip (externally reachable IP) is associated with the VIP and the compute nodes have external connectivity, the external traffic cannot reach the VIP using the floating ip because the VIP logical port would be down. The dnat_and_snat entry in the NAT table for this VIP will have 'external_mac' and 'logical_port' configured.
>>>>>> >
>>>>>> > - The only way to make it work is to clear the 'external_mac' entry so that the gateway chassis does the DNAT for the VIP.
>>>>>> >
>>>>>> > To solve these problems, this patch proposes a new logical port type - virtual. The CMS, when creating the logical port for the VIP, should:
>>>>>> >
>>>>>> > - set the type as 'virtual'
>>>>>> >
>>>>>> > - configure the VIP in the newly added column Logical_Switch_Port.virtual_ip
>>>>>> >
>>>>>> > - set the virtual parents in the newly added column Logical_Switch_Port.virtual_parents. These virtual parents are the ones which can be configured with the VIP.
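>>>>>> >
>>>>>> > For example, the CMS could create and configure such a port roughly as follows with ovn-nbctl (just a sketch: the switch name 'sw0' is assumed, the port/parent names and VIP are the ones used in the example below, and the column names are the ones proposed by this patch):
>>>>>> >
>>>>>> >   # Create the VIP port on the same logical switch as the parent ports
>>>>>> >   # and mark it as virtual, with its VIP and candidate parent ports.
>>>>>> >   ovn-nbctl lsp-add sw0 sw0-vip
>>>>>> >   ovn-nbctl set Logical_Switch_Port sw0-vip type=virtual \
>>>>>> >       virtual_ip=10.0.0.10 virtual_parents="sw0-p1,sw0-p2"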
>>>>>> >
>>>>>> > Suppose the virtual_ip is configured to 10.0.0.10 on a virtual logical port 'sw0-vip' and the virtual_parents are set to [sw0-p1, sw0-p2]. Then the below logical flows are added in the ls_in_arp_rsp stage of the logical switch ingress pipeline:
>>>>>> >
>>>>>> > - table=11(ls_in_arp_rsp), priority=100,
>>>>>> >   match=(inport == "sw0-p1" && ((arp.op == 1 && arp.spa == 10.0.0.10 && arp.tpa == 10.0.0.10) || (arp.op == 2 && arp.spa == 10.0.0.10))),
>>>>>> >   action=(bind_vport("sw0-vip", inport); next;)
>>>>>> >
>>>>>> > - table=11(ls_in_arp_rsp), priority=100,
>>>>>> >   match=(inport == "sw0-p2" && ((arp.op == 1 && arp.spa == 10.0.0.10 && arp.tpa == 10.0.0.10) || (arp.op == 2 && arp.spa == 10.0.0.10))),
>>>>>> >   action=(bind_vport("sw0-vip", inport); next;)
>>>>>> >
>>>>>> > The action bind_vport will claim the logical port sw0-vip on the chassis where this action is executed. Since the port sw0-vip is claimed by a chassis, the dnat_and_snat rule for the VIP will be handled by the compute node.
>>>>>> >
>>>>>> > Signed-off-by: Numan Siddique <nusid...@redhat.com>
>>>>>>
>>>>>> Hi Numan, this looks interesting. I haven't reviewed the code yet, but just some questions to better understand the feature.
>>>>>>
>>>>>> Firstly, can Octavia be implemented by using the distributed LB feature of OVN, instead of using a dedicated node? What's the major gap for using the OVN LB?
>>>>>
>>>>> Yes. It's possible to use the native OVN LB feature. There's already a provider driver in Octavia for OVN (https://github.com/openstack/networking-ovn/blob/master/networking_ovn/octavia/ovn_driver.py). When creating the LB, providing the option --provider-driver=ovn will create an OVN LB.
>>>>>
>>>>> However, OVN LB is limited to L4 and there are no health checks. The Octavia amphora driver supports lots of features like L7, health checks, etc. I think we should definitely look into adding the health monitor feature for OVN LB, but I think supporting L7 LBs is out of the question for OVN LB. For complex load balancer needs, I think it's better to rely on external load balancers like the Octavia amphora driver. The Octavia amphora driver creates a service VM and runs an haproxy instance inside it to provide the load balancing.
>>>>
>>>> Thanks for the detailed explanation. Yes, this makes sense!
>>>>
>>>>>> Secondly, how is associating the floating-ip with the VIP configured currently?
>>>>>
>>>>> networking-ovn creates a dnat_and_snat entry when a floating ip is associated with the VIP port. Right now it doesn't set the external_mac and logical_port columns for DVR deployments, and this has been a drawback.
>>>>
>>>> Ok, thanks. Maybe I should check more details on neutron networking-ovn.
>>>>
>>>>>> Thirdly, can a static route be used to route the VIP to the VM, instead of creating a lport for the VIP? I.e. create a route in the logical router: destination - VIP, next hop - service VM IP.
>>>>>
>>>>> I am not sure on this one. I think it may work in one scenario where the Octavia amphora driver creates one instance of the service VM.
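>>>>>
>>>>> If I understand the suggestion, that would be roughly something like the below (just a sketch; the router name 'lr0' and the service VM IP 10.0.0.5 are made up, only the VIP 10.0.0.10 is from the example):
>>>>>
>>>>>   # Route the VIP to the single service VM's IP on the logical router.
>>>>>   ovn-nbctl lr-route-add lr0 10.0.0.10/32 10.0.0.5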
>>>>>
>>>>> The Octavia amphora driver also provides the option of HA for the VIP. It creates multiple service VMs in active-standby mode, and HA is managed by keepalived. The master VM configures the VIP and runs the haproxy instance. If the master VM goes down, then the keepalived cluster will choose another master and configure the VIP there. I am not sure whether the static route option would work in this scenario, as the service VM IP could change.
>>>>>
>>>>> When a standby service VM becomes master, keepalived running there sends a GARP for the VIP. That's the reason I took the approach of binding the VIP port when an ARP packet is seen for the VIP.
>>>>
>>>> For HA, it should work if the CMS updates the static route to point to the IP of the standby VM when it detects that the active VM is down, right?
>>>
>>> I need to check with the networking-ovn folks. I have CCed Lucas, Daniel and Carlos (Octavia developer) on this thread.
>>>
>>> But I don't know how the CMS would detect it. In the case of OpenStack, it's Octavia which creates the service VMs, so I'm not sure how Octavia and networking-ovn could coordinate.
>>>
>>>> Alternatively, the static route's next hop for the VIP can be a virtual IP for the active-standby VM pair instead of the individual VM IP. This virtual IP can be announced by the active VM sending GARP, just like the way you mentioned.
>>>
>>> I think this would require changes in the Octavia amphora VM management. Right now the VIP is moved around using keepalived. This would require creating another neutron port for the virtual ip.
>>>
>>>> The benefit of the static routes is that a large number of VIPs can be summarized into a small number of subnets (and routes), which is much more scalable. A third approach (without the scale benefit) may be simply binding the individual VIP on the active VM and sending GARP to re-announce it when fail-over happens.
>>>
>>> I think this is the approach taken by Octavia.
>>>
>>> But I don't think any of these approaches solves the DVR issue, i.e. if the compute nodes have access to the external network, then the FIP of the VIP should be reachable directly via the compute node hosting the master service VM.
>>>
>>> With the proposed patch, since we will be binding the VIP port to a chassis, that chassis will send the GARP for the FIP of the VIP. When failover happens, the ovn-controller of the new master service VM will take care of sending the GARP for this FIP.
>>
>> Ok, so the GARP from the service VMs is for the FIP instead of the VIP. But I am confused again here.
>
> No no. I think there is some confusion here. The service VM sends the GARP for the VIP.

Ok, I think I misread the sentence: "... service VM will take care of sending the GARP for this FIP" and ignored "the ovn-controller of ...". That was the only confusion to me and now it is clear. Thank you!

>> I think typically a LB node (the service VM) should take care of VIP announcement. Now if the VIP subnet is internal and we need an external IP (Floating IP) for external hosts to access the VIP, then NAT should naturally happen on the Gateway nodes.
>
> That's right. But we have the DVR use case where the compute nodes have external connectivity. In this case, the external traffic doesn't need to go to the Gateway nodes. Instead DNAT happens in the compute node itself.
>
> Please see the "external_mac" and "logical_port" columns of the NAT table in the Northbound DB.
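>
> For example, such an entry can be added with something like the below (just a sketch; the router name 'lr0', the floating IP 172.24.4.10 and the MAC are made up, while the VIP 10.0.0.10 and the logical port sw0-vip are from the earlier example):
>
>   # dnat_and_snat for the VIP, with logical_port and external_mac set so
>   # that the chassis claiming sw0-vip does the NAT and sends the GARP.
>   ovn-nbctl lr-nat-add lr0 dnat_and_snat 172.24.4.10 10.0.0.10 sw0-vip 0a:00:00:00:00:10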
>
> When there is a NAT entry of type dnat_and_snat with external_mac and logical_port set, the ovn-controller which has claimed this logical port sends the GARP for the FIP (the external_ip column in the NAT table) so that external routers learn that this FIP is reachable via this compute node.
>
> With this patch, the ovn-controller hosting the service VM with the VIP configured binds the VIP logical port (of type virtual), so that it sends the GARP for the FIP of the VIP. And when any external entity wants to reach the VIP using the FIP, the packet will be handled in this chassis and the FIP will be un-DNATed to the VIP and then finally delivered to the service VM.
>
>> If the "external host" is actually hosted in the same OVN virtual environment, then the NAT may be done in a distributed way, and it seems to be what this patch is supposed to solve. I am not sure if I am correct so far. If I am correct, this doesn't explain why the GARP is for the FIP.
>
> As I mentioned above, the GARP for the FIP is already handled in ovn-controller when external_mac and logical_port are set.
>
> Does this make sense to you now?
>
> Please let me know if there are any questions.
>
> Thanks
> Numan
>
>> I think it should just be the VIP, and since the FIP is NATed to the VIP, the traffic to the FIP should finally reach the service VM. Did I misunderstand anything?
>>
>>> Thanks
>>> Numan
>>>
>>>> I think all the above alternatives don't require any change in OVN. Please correct me if it doesn't work in particular scenarios. I am particularly interested in this because we are also working on LB use cases with OVN. I believe the approach taken by your patch would certainly work. I just want to know if current OVN is sufficient to support your requirement without any changes.

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev