Hi vivek,

On Fri, Dec 19, 2014 at 10:44 AM, Narasimhan, Vivekanandan
<vivekanandan.narasim...@hp.com> wrote:
> Hi Mike,
>
> Few clarifications inline [Vivek]
>
> -----Original Message-----
> From: Mike Kolesnik [mailto:mkole...@redhat.com]
> Sent: Thursday, December 18, 2014 10:58 PM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [Neutron][L2Pop][HA Routers] Request for
> comments for a possible solution
>
> Hi Mathieu,
>
> Thanks for the quick reply, some comments inline..
>
> Regards,
> Mike
>
> ----- Original Message -----
>> Hi mike,
>>
>> thanks for working on this bug :
>>
>> On Thu, Dec 18, 2014 at 1:47 PM, Gary Kotton <gkot...@vmware.com> wrote:
>> >
>> >
>> > On 12/18/14, 2:06 PM, "Mike Kolesnik" <mkole...@redhat.com> wrote:
>> >
>> >>Hi Neutron community members.
>> >>
>> >>I wanted to query the community about a proposal of how to fix HA
>> >>routers not working with L2Population (bug 1365476[1]).
>> >>This bug is important to fix especially if we want to have HA
>> >>routers and DVR routers working together.
>> >>
>> >>[1] https://bugs.launchpad.net/neutron/+bug/1365476
>> >>
>> >>What's happening now?
>> >>* HA routers use distributed ports, i.e. the port with the same IP & MAC
>> >>  details is applied on all nodes where an L3 agent is hosting this router.
>> >>* Currently, the port details have a binding pointing to an arbitrary node
>> >>  and this is not updated.
>> >>* L2pop takes this "potentially stale" information and uses it to create:
>> >>  1. A tunnel to the node.
>> >>  2. An FDB entry that directs traffic for that port to that node.
>> >>  3. If ARP responder is on, ARP requests will not traverse the network.
>> >>* Problem is, the master router wouldn't necessarily be running on the
>> >>  reported agent.
>> >>  This means that traffic would not reach the master node but some arbitrary
>> >>  node where the router master might be running, but might be in another
>> >>  state (standby, fail).
>> >>
>> >>What is proposed?
>> >>Basically the idea is not to do L2Pop for HA router ports that
>> >>reside on the tenant network.
>> >>Instead, we would create a tunnel to each node hosting the HA router
>> >>so that the normal learning switch functionality would take care of
>> >>switching the traffic to the master router.
>> >
>> > In Neutron we just ensure that the MAC address is unique per network.
>> > Could a duplicate MAC address cause problems here?
>>
>> gary, AFAIU, from a Neutron POV, there is only one port, which is the
>> router Port, which is plugged twice. One time per port.
>> I think that the capacity to bind a port to several hosts is also a
>> prerequisite for a clean solution here. This will be provided by
>> patches to this bug :
>> https://bugs.launchpad.net/neutron/+bug/1367391
>>
>>
>> >>This way no matter where the master router is currently running, the
>> >>data plane would know how to forward traffic to it.
>> >>This solution requires changes on the controller only.
>> >>
>> >>What's to gain?
>> >>* Data plane only solution, independent of the control plane.
>> >>* Lowest failover time (same as HA routers today).
>> >>* High backport potential:
>> >>  * No APIs changed/added.
>> >>  * No configuration changes.
>> >>  * No DB changes.
>> >>  * Changes localized to a single file and limited in scope.
>> >>
>> >>What's the alternative?
>> >>An alternative solution would be to have the controller update the
>> >>port binding on the single port so that the plain old L2Pop happens
>> >>and notifies about the location of the master router.
>> >>This basically negates all the benefits of the proposed solution,
>> >>but is wider.
>> >>This solution depends on the report-ha-router-master spec which is
>> >>currently in the implementation phase.
>> >>
>> >>It's important to note that these two solutions don't collide and
>> >>could be done independently. The one I'm proposing just makes more
>> >>sense from an HA viewpoint because of its benefits which fit the HA
>> >>methodology of being fast & having as little outside dependency as
>> >>possible.
>> >>It could be done as an initial solution which solves the bug for
>> >>mechanism drivers that support normal learning switch (OVS), and
>> >>later kept as an optimization to the more general, controller based,
>> >>solution which will solve the issue for any mechanism driver working
>> >>with L2Pop (Linux Bridge, possibly others).
>> >>
>> >>Would love to hear your thoughts on the subject.
>>
>> You will have to clearly update the doc to mention that deployments
>> with Linuxbridge+l2pop are not compatible with HA.
>
> Yes this should be added and this is already the situation right now.
> However if anyone would like to work on a LB fix (the general one or some
> specific one) I would gladly help with reviewing it.
>
>>
>> Moreover, this solution is downgrading the l2pop solution, by
>> disabling the ARP-responder when VMs want to talk to a HA router.
>> This means that ARP requests will be duplicated to every overlay
>> tunnel to feed the OVS Mac learning table.
>> This is something that we were trying to avoid with l2pop. But maybe
>> this is acceptable.
>
> Yes basically you're correct, however this would be only limited to those
> tunnels that connect to the nodes where the HA router is hosted, so we would
> still limit the amount of traffic that is sent across the underlay.
>
> Also bear in mind that ARP is actually good (at least in OVS case) since it
> helps the VM locate on which tunnel the master is, so once it receives the
> ARP response it records a flow that directs the traffic to the correct
> tunnel, so we just get hit by the one ARP broadcast but it's sort of a
> necessary evil in order to locate the master..
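
To make the learning behaviour described above concrete, here is a toy
Python model of a MAC-learning table. It is only a sketch: the class, the
tunnel names and the MAC address are made up for illustration, and this is
not how OVS or Neutron implement it. Unknown destinations are flooded only
on the tunnels to the nodes hosting the HA router; once a frame sourced from
the router MAC is seen on a tunnel (for example the ARP reply), traffic goes
to that tunnel alone, and a later frame from the same MAC on another tunnel
moves the entry.

```python
class LearningSwitch:
    """Toy MAC-learning table (illustrative only, not OVS or Neutron code)."""

    def __init__(self, ha_tunnels):
        # Tunnels to every node hosting the HA router (hypothetical names).
        self.ha_tunnels = list(ha_tunnels)
        self.mac_to_tunnel = {}

    def receive(self, src_mac, tunnel):
        # Learn (or relearn) the tunnel on which a source MAC was last seen.
        self.mac_to_tunnel[src_mac] = tunnel

    def output_tunnels(self, dst_mac):
        # Known MAC: send on the learned tunnel only.
        # Unknown MAC: flood, but only on the tunnels to HA router nodes.
        learned = self.mac_to_tunnel.get(dst_mac)
        return [learned] if learned is not None else list(self.ha_tunnels)


ROUTER_MAC = "fa:16:3e:00:00:01"  # made-up address for the example

switch = LearningSwitch(["tun-node-a", "tun-node-b"])

# Nothing learned yet, so the ARP request goes out on both tunnels.
before = switch.output_tunnels(ROUTER_MAC)

# The ARP reply comes back from the node where the master runs.
switch.receive(ROUTER_MAC, "tun-node-a")
after_reply = switch.output_tunnels(ROUTER_MAC)

# A later frame with the same source MAC arrives on the other tunnel.
switch.receive(ROUTER_MAC, "tun-node-b")
after_move = switch.output_tunnels(ROUTER_MAC)

print(before, after_reply, after_move)
```

The point of the sketch is that the table is driven purely by observed
frames, so no controller update is needed for the entry to follow the
router MAC.
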
>
> [Vivek] When the failover happens, the VMs would be actually sending traffic
> to the old master node.
> They won't be getting any response back.
>
> At this time do the VMs redo an ARP request for the HA Router?
> And that again sets up the learned rules correctly again in br-tun, so that
> the routed traffic from VM continues on to the new master..

The new master will send a gratuitous ARP (gARP) packet to update the
learning tables.

>
>
>>
>> I know that ofagent is also using l2pop, I would like to know if
>> ofagent deployment will be compatible with the workaround that you are
>> proposing.
>
> I would like to know that too, hopefully someone from OFagent can shed some
> light.
>
>>
>> My concern is that, with DVR, there are at least two major features
>> that are not compatible with Linuxbridge.
>> Linuxbridge is not running in the gate. I don't know if anybody is
>> running a 3rd party testing with Linuxbridge deployments. If anybody
>> does, it would be great to have it voting on gerrit!
>>
>> But I really wonder what is the future of linuxbridge compatibility?
>> should we keep on improving OVS solution without taking into account
>> the linuxbridge implementation?
>
> I don't know actually, but my capability is to fix it for OVS the best way
> possible.
> As I said the situation for LB won't become worse than it already is, legacy
> routers would still function as always.. This fix also will not block fixing
> LB in any other way since it can be easily adjusted (if necessary) to work
> only for supporting mechanisms (OVS AFAIK).
>
> Also if anyone is willing to pick up the glove and implement the general
> controller based fix, or something more focused on LB I will happily help
> review what I can.
>
> [Vivek] Also by this proposal, will the HA Router be able to co-operate with
> DVR which actually mandates L2-Pop?
>
> --
> Thanks,
>
> Vivek
>
>>
>> Regards,
>>
>> Mathieu
>>
>> >>
>> >>Regards,
>> >>Mike
>> >>
>> >>_______________________________________________
>> >>OpenStack-dev mailing list
>> >>OpenStack-dev@lists.openstack.org
>> >>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev