Re: [openstack-dev] [Neutron] Alternative approaches for L3 HA
Anil Venkata wrote: > On Thu, Feb 23, 2017 at 12:10 AM, Miguel Angel Ajo Pelayo > wrote: > > On Wed, Feb 22, 2017 at 11:28 AM, Adam Spiers wrote: > >> With help from others, I have started an analysis of some of the > >> different approaches to L3 HA: > >> > >> https://ethercalc.openstack.org/Pike-Neutron-L3-HA > >> > >> (although I take responsibility for all mistakes ;-) > > Did you test with this patch https://review.openstack.org/#/c/255237/ ? It > was merged in the Newton cycle. > With this patch, HA+L2pop doesn't depend on the control plane during failover, > hence failover should be faster (same as without l2pop). Thanks Anil! I've updated the spreadsheet to take this into account. > >> It would be great if someone from RH or RDO could provide information > >> on how this RDO (and/or RH OSP?) solution based on Pacemaker + > >> keepalived works - if so, I volunteer to: > >> > >> - help populate column E of the above sheet so that we can > >> understand if there are still remaining gaps in the solution, and > >> > >> - document it (e.g. in the HA guide). Even if this only ended up > >> being considered as a shorter-term solution, I think it's still > >> worth documenting so that it's another option available to > >> everyone. > >> > >> Thanks! > > > I have updated the spreadsheet. Thanks a lot Miguel and everyone else who contributed to the spreadsheet so far! After a very productive meeting this morning at the PTG, I think it is quite close to completion now, and I am already working with the docs team on moving it into official documentation, either in the HA Guide (which I am trying to help maintain) or the Networking Guide. I don't have strong opinions on where it should live - if anyone does then please let us know now. I also attempted to write up a mini-report summarising this morning's meeting for future reference; it's (currently) at line 279 onwards of: https://etherpad.openstack.org/p/neutron-ptg-pike-final but I'll reproduce it here for convenience.
The conclusion, at least as I understand it, is as follows:

- The l3_ha solution is already working pretty well in many deployments, especially when coupled with a few extra benefits from Pacemaker (although https://bugs.launchpad.net/neutron/+bugs?field.tag=l3-ha might suggest otherwise ...)

- Some more refinements to this solution could be made to reduce the remaining corner cases where failures are not handled well.

- I (and hopefully others) will work towards documenting this solution in more detail.

- In the meantime, Ann Taraday and anyone else interested may continue out-of-tree experiments with different architectures such as tooz/etcd. It is expected that these would be invasive changes, possibly taking at least 1-2 release cycles to stabilise, but they might still be worth it.

- If a PoC is submitted for review and looks promising, we can decide whether it makes sense to aim to replace the existing keepalived solution, or instead offer it as an alternative by introducing pluggable L3 drivers. However, adding a driver abstraction layer would also be costly and expand the test matrix, at a time when developer resources are scarce. So there would need to be a compelling reason to do this.

I hope that's a reasonably accurate representation of the outcome from this morning - obviously feel free to submit comments if I missed or mistook anything. Thanks for a great meeting!

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Neutron] Alternative approaches for L3 HA
On Wed, Feb 22, 2017 at 1:40 PM, Miguel Angel Ajo Pelayo wrote: > I have updated the spreadsheet. In the case of RH/RDO we're using the same > architecture. In the case of HA, pacemaker is not taking care of those anymore since the > HA-NG implementation. > > We let systemd take care of restarting the services that die, and we worked > with the community > to make sure that agents and services are robust in case of dependent > services (database, rabbitmq) failures, to make sure they reconnect and continue when those become > available. Thanks Miguel, I added a little bit of info to the spreadsheet as well. > > On Wed, Feb 22, 2017 at 11:28 AM, Adam Spiers wrote: >> >> Kosnik, Lubosz wrote: >> > About the success of RDO we need to remember that this deployment utilizes >> > Pacemaker and when I was working on this feature and even I spoke with >> > Assaf this external application was doing everything to make this solution >> > working. >> > Pacemaker was responsible for checking external and internal >> > connectivity. To detect split brain. Elect master, even keepalived was >> > running but Pacemaker was automatically killing all services and moving >> > FIP. >> > Assaf - is there any change in this implementation in RDO? Or you’re >> > still doing everything outside of Neutron? >> > >> > Because if RDO's success is built on Pacemaker it means that yes, Neutron >> > needs some solution which will be available for more than RH deployments. >> >> Agreed. >> >> With help from others, I have started an analysis of some of the >> different approaches to L3 HA: >> >> https://ethercalc.openstack.org/Pike-Neutron-L3-HA >> >> (although I take responsibility for all mistakes ;-) >> >> It would be great if someone from RH or RDO could provide information >> on how this RDO (and/or RH OSP?)
solution based on Pacemaker + >> keepalived works - if so, I volunteer to: >> >> - help populate column E of the above sheet so that we can >> understand if there are still remaining gaps in the solution, and >> >> - document it (e.g. in the HA guide). Even if this only ended up >> being considered as a shorter-term solution, I think it's still >> worth documenting so that it's another option available to >> everyone. >> >> Thanks!
Re: [openstack-dev] [Neutron] Alternative approaches for L3 HA
On 13 February 2017 at 23:23, Kosnik, Lubosz wrote: > So from my perspective I can tell that the problem is completely in the > architecture and even without something outside of Neutron we cannot solve > that. > Two releases ago I started to work on hardening that feature but all my > ideas were killed by Armando and Assaf. They decided that adding an outside > dependency would open the door for new bugs from dependencies into > Neutron [1]. > I am pretty sure it wasn't our intention to 'kill' your ideas, but otherwise set you on the right path for fixing the bug. I still believe that a complete and robust L3 HA solution cannot be built with Neutron alone, and that's what I was trying to say with the comment referenced below. > > You need to know that there are two outstanding bugs in this feature. > There is an internal and external connectivity split brain. [2] this patch > made by me is “fixing” part of the problem. It allows you to specify > additional tests to verify connectivity from the router to the GW. > Also there is a problem with connectivity between network nodes. It’s more > problematic and, like you said, it’s unsolvable in my opinion without using an > external mechanism. > > If there will be any need to help with anything I would love to help by > sharing my knowledge about this feature and what exactly is not working. If > anyone needs any help with anything about this please ping me on email or > IRC. > > [1] https://bugs.launchpad.net/neutron/+bug/1375625/comments/31 > [2] https://review.openstack.org/#/c/273546/ > > Lubosz > > On Feb 13, 2017, at 4:10 AM, Anna Taraday > wrote: > > To avoid a dependency of the data plane on the control plane it is possible to > deploy a separate key-value storage cluster on the data plane side, using the > same network nodes. > I'm proposing to make some changes to enable experimentation in this > field; we are yet to come up with any other concrete solution.
> > On Mon, Feb 13, 2017 at 2:01 PM wrote: > >> Hi, >> >> We also operate using Juno with the VRRP HA implementation and had to >> patch through several bugs before getting to the Mitaka release. >> >> A pluggable, drop-in alternative would be highly appreciated. However >> our experience has been that the decoupling of VRRP from the control plane >> is actually a benefit, as when the control plane is down the traffic is not >> affected. >> >> In a solution where the L3 HA implementation becomes tied to the >> availability of the control plane (etcd cluster or any other KV store), an >> operator would have to account for extra failure scenarios for the KV >> store which would affect multiple routers rather than the outage of a single L3 >> node, which is the case we usually have to account for now. >> >> Just my $.02 >> >> Cristian >> >> *From:* Anna Taraday [mailto:akamyshnik...@mirantis.com] >> *Sent:* Monday, February 13, 2017 11:45 AM >> *To:* OpenStack Development Mailing List (not for usage questions) >> *Subject:* Re: [openstack-dev] [Neutron] Alternative approaches for L3 HA >> >> In etcd, for each HA router we can store a key which identifies which >> agent is active. L3 agents will "watch" this key. >> All these tools have a leader election mechanism which can be used to get the >> agent which is active for the current HA router. >> >> On Mon, Feb 13, 2017 at 7:02 AM zhi wrote: >> >> Hi, we are using L3 HA in our production environment now. Router >> instances communicate with each other via the VRRP protocol. In my opinion, >> although VRRP is a control plane thing, the real VRRP traffic uses the >> data plane NIC, so router namespaces sometimes cannot talk to each other >> when the data plane is busy. If we used etcd (or other), >> does every router instance register one "id" in etcd?
>> Thanks >> Zhi Chang >> -- >> Regards, >> Ann Taraday
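The etcd-based scheme Ann describes (each HA router stores a key identifying the active agent, and L3 agents watch that key) can be sketched in a few lines. The following is a hypothetical, self-contained simulation: the class and method names are invented for illustration and are not part of Neutron or tooz, and an in-memory dictionary with TTLs stands in for etcd.

```python
import time

class FakeKVStore:
    """In-memory stand-in for etcd: keys carry a TTL, like an etcd lease."""
    def __init__(self):
        self._data = {}  # key -> (value, expiry timestamp)

    def put_if_absent(self, key, value, ttl):
        """Atomic compare-and-set: succeed only if the key is missing or expired."""
        entry = self._data.get(key)
        if entry is None or entry[1] < time.monotonic():
            self._data[key] = (value, time.monotonic() + ttl)
            return True
        return False

    def get(self, key):
        entry = self._data.get(key)
        if entry and entry[1] >= time.monotonic():
            return entry[0]
        return None

class L3AgentSim:
    """Each agent tries to become active for a router by grabbing its key."""
    def __init__(self, agent_id, store):
        self.agent_id = agent_id
        self.store = store

    def try_acquire(self, router_id, ttl=3.0):
        key = "neutron/l3ha/%s/master" % router_id
        if self.store.put_if_absent(key, self.agent_id, ttl):
            return True  # this agent is now active; it would configure the VIP etc.
        # Key already held: active only if we are the current holder.
        return self.store.get(key) == self.agent_id

store = FakeKVStore()
a1 = L3AgentSim("agent-1", store)
a2 = L3AgentSim("agent-2", store)
# The first agent to grab the key becomes active; the other stays standby.
assert a1.try_acquire("router-X")
assert not a2.try_acquire("router-X")
```

A real implementation would use etcd leases and watches (or tooz's leader election API) so that a standby agent takes over automatically when the active one stops refreshing its lease, rather than polling as this sketch implies.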
Re: [openstack-dev] [Neutron] Alternative approaches for L3 HA
On Thu, Feb 23, 2017 at 12:10 AM, Miguel Angel Ajo Pelayo <majop...@redhat.com> wrote: > I have updated the spreadsheet. In the case of RH/RDO we're using the same > architecture. In the case of HA, pacemaker is not taking care of those anymore since the > HA-NG implementation. > > We let systemd take care of restarting the services that die, and we worked > with the community > to make sure that agents and services are robust in case of dependent > services (database, rabbitmq) failures, to make sure they reconnect and continue when those become > available. > > On Wed, Feb 22, 2017 at 11:28 AM, Adam Spiers wrote: > >> Kosnik, Lubosz wrote: >> > About the success of RDO we need to remember that this deployment utilizes >> Pacemaker and when I was working on this feature and even I spoke with >> Assaf this external application was doing everything to make this solution >> working. >> > Pacemaker was responsible for checking external and internal >> connectivity. To detect split brain. Elect master, even keepalived was >> running but Pacemaker was automatically killing all services and moving >> FIP. >> > Assaf - is there any change in this implementation in RDO? Or you’re >> still doing everything outside of Neutron? >> > >> > Because if RDO's success is built on Pacemaker it means that yes, >> Neutron needs some solution which will be available for more than RH >> deployments. >> >> Agreed. >> >> With help from others, I have started an analysis of some of the >> different approaches to L3 HA: >> >> https://ethercalc.openstack.org/Pike-Neutron-L3-HA >> >> (although I take responsibility for all mistakes ;-) >> > Did you test with this patch https://review.openstack.org/#/c/255237/ ? It was merged in the Newton cycle. With this patch, HA+L2pop doesn't depend on the control plane during failover, hence failover should be faster (same as without l2pop). > >> It would be great if someone from RH or RDO could provide information >> on how this RDO (and/or RH OSP?)
solution based on Pacemaker + >> keepalived works - if so, I volunteer to: >> >> - help populate column E of the above sheet so that we can >> understand if there are still remaining gaps in the solution, and >> >> - document it (e.g. in the HA guide). Even if this only ended up >> being considered as a shorter-term solution, I think it's still >> worth documenting so that it's another option available to >> everyone. >> >> Thanks!
Re: [openstack-dev] [Neutron] Alternative approaches for L3 HA
I have updated the spreadsheet. In the case of RH/RDO we're using the same architecture. In the case of HA, pacemaker is not taking care of those anymore since the HA-NG implementation. We let systemd take care of restarting the services that die, and we worked with the community to make sure that agents and services are robust in case of dependent services (database, rabbitmq) failures, to make sure they reconnect and continue when those become available. On Wed, Feb 22, 2017 at 11:28 AM, Adam Spiers wrote: > Kosnik, Lubosz wrote: > > About the success of RDO we need to remember that this deployment utilizes > Pacemaker and when I was working on this feature and even I spoke with > Assaf this external application was doing everything to make this solution > working. > > Pacemaker was responsible for checking external and internal > connectivity. To detect split brain. Elect master, even keepalived was > running but Pacemaker was automatically killing all services and moving > FIP. > > Assaf - is there any change in this implementation in RDO? Or you’re > still doing everything outside of Neutron? > > > > Because if RDO's success is built on Pacemaker it means that yes, Neutron > needs some solution which will be available for more than RH deployments. > > Agreed. > > With help from others, I have started an analysis of some of the > different approaches to L3 HA: > > https://ethercalc.openstack.org/Pike-Neutron-L3-HA > > (although I take responsibility for all mistakes ;-) > > It would be great if someone from RH or RDO could provide information > on how this RDO (and/or RH OSP?) solution based on Pacemaker + > keepalived works - if so, I volunteer to: > > - help populate column E of the above sheet so that we can > understand if there are still remaining gaps in the solution, and > > - document it (e.g. in the HA guide).
Even if this only ended up > being considered as a shorter-term solution, I think it's still > worth documenting so that it's another option available to > everyone. > > Thanks!
Re: [openstack-dev] [Neutron] Alternative approaches for L3 HA
Kosnik, Lubosz wrote: > About the success of RDO we need to remember that this deployment utilizes > Pacemaker and when I was working on this feature and even I spoke with Assaf > this external application was doing everything to make this solution working. > Pacemaker was responsible for checking external and internal connectivity. > To detect split brain. Elect master, even keepalived was running but > Pacemaker was automatically killing all services and moving FIP. > Assaf - is there any change in this implementation in RDO? Or you’re still > doing everything outside of Neutron? > > Because if RDO's success is built on Pacemaker it means that yes, Neutron > needs some solution which will be available for more than RH deployments. Agreed. With help from others, I have started an analysis of some of the different approaches to L3 HA: https://ethercalc.openstack.org/Pike-Neutron-L3-HA (although I take responsibility for all mistakes ;-) It would be great if someone from RH or RDO could provide information on how this RDO (and/or RH OSP?) solution based on Pacemaker + keepalived works - if so, I volunteer to: - help populate column E of the above sheet so that we can understand if there are still remaining gaps in the solution, and - document it (e.g. in the HA guide). Even if this only ended up being considered as a shorter-term solution, I think it's still worth documenting so that it's another option available to everyone. Thanks!
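For context on the keepalived side of this discussion: Neutron's L3 agent renders a keepalived.conf per HA router, and the kind of gateway connectivity check discussed in this thread maps naturally onto keepalived's track_script mechanism. The fragment below is an illustrative sketch only, not the exact configuration Neutron generates; the interface name, VIP and check script path are made up.

```
# Hypothetical keepalived.conf fragment for one HA router namespace.
# A failing gateway check lowers the instance priority, which can
# trigger a VRRP failover to a healthier node.
vrrp_script check_gateway {
    script "/usr/local/bin/check_gw.sh"   # e.g. ping the external gateway
    interval 5
    fall 2
    weight -20
}

vrrp_instance VR_1 {
    state BACKUP
    interface ha-1234abcd        # HA network port inside the router namespace
    virtual_router_id 1
    priority 50
    nopreempt
    advert_int 2
    virtual_ipaddress {
        169.254.0.1/24 dev ha-1234abcd
    }
    track_script {
        check_gateway
    }
}
```

The design point worth noting is that VRRP keepalive traffic and the track script both run entirely on the data plane side, which is why this scheme keeps working when the Neutron control plane is down.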
Re: [openstack-dev] [Neutron] Alternative approaches for L3 HA
About the success of RDO we need to remember that this deployment utilizes Pacemaker and when I was working on this feature and even I spoke with Assaf this external application was doing everything to make this solution working. Pacemaker was responsible for checking external and internal connectivity. To detect split brain. Elect master, even keepalived was running but Pacemaker was automatically killing all services and moving FIP. Assaf - is there any change in this implementation in RDO? Or you’re still doing everything outside of Neutron? Because if RDO's success is built on Pacemaker it means that yes, Neutron needs some solution which will be available for more than RH deployments. Lubosz On Feb 15, 2017, at 3:22 AM, Anna Taraday <akamyshnik...@mirantis.com> wrote: If I propose some concrete solution, the discussion will be about that one solution, not about making things flexible. At first I wanted to propose some PoC for another approach, but during my experiments I understood that we may have different approaches, but for all of them we need a pluggable HA router in Neutron. The thing that bothers me about L3 HA - it is complex. Yes, we fixed a bunch of races and John did a significant refactor, but it is still too complex. In the end we want to use L3 HA + DVR but DVR is pretty complex by itself. We would like to try to offload this complexity to an external service to replace management of keepalived instances and networks within Neutron. Router rescheduling is not really an alternative for L3 HA. RDO with L3 HA is a great example of success, but we want to have the ability to try something else that can suit other OpenStack deployments better. I wrote this email to understand whether the community has interest in something like this, so that it will be worth doing. On Tue, Feb 14, 2017 at 10:20 PM Assaf Muller <as...@redhat.com> wrote: On Fri, Feb 10, 2017 at 12:27 PM, Anna Taraday <akamyshnik...@mirantis.com> wrote: > Hello everyone!
> > In Juno, the L3 HA feature based on Keepalived (VRRP) was implemented in Neutron. > During the next cycles it was improved; we performed scale testing [1] to find > weak places and tried to fix them. The only alternative for L3 HA with VRRP > is router rescheduling performed by the Neutron server, but it is significantly > slower and depends on the control plane. > > What issues have we experienced with L3 HA VRRP? > > Bugs in Keepalived (bad versions) [2] > Split brain [3] > Complex structure (ha networks, ha interfaces) - which actually causes races > that we were fixing during Liberty, Mitaka and Newton. > > This is all not critical, but it is a bad experience and not everyone is > ready (or wants) to use the Keepalived approach. > > I think we can make things more flexible. For example, we can allow users to > use external services like etcd instead of Keepalived to synchronize the current > HA state across agents. I've done several experiments and got failover > times comparable to L3 HA with VRRP. Tooz [4] can be used to abstract from the > concrete backend. For example, it can allow us to use Zookeeper, Redis and > other backends to store HA state. > > What do I want to propose? > > I want to bring up the idea that Neutron should have some general classes for L3 > HA which will allow using not only Keepalived but also other backends for > HA state. This would at least make it easier to try some other approaches and > compare them with existing ones. > > Does this sound reasonable? I understand that the intention is to add pluggability upstream so that you could examine the viability of alternative solutions. I'd advise instead to do the research locally, and if you find concrete benefits to an alternative solution, come back, show your work and have a discussion about it then. Merging extra complexity in the form of a plug point without knowing if we're actually going to need it seems risky.
On another note, after years of work the stability issues have largely been resolved and L3 HA is in a good state with modern releases of OpenStack. It's not an authoritative solution in the sense that it doesn't cover every possible failure mode, but it covers the major ones and in that sense is better than not having any form of HA, and as you pointed out the existing alternatives are not in a better state. The subtext in your email is that now L3 HA is technically where we want it, but some users are resisting adoption because of bad PR or a bad past experience, but not for technical reasons. If that is the case, then perhaps some good PR would be a more cost-effective investment than investigating, implementing, stabilizing and maintaining a different backend that will likely take at least a cycle to get merged and another 1 to 2 cycles to iron out kinks. Would you have a critical mass of developers ready to support a pluggable L3 HA now and in the long term? Finally, I can share that L3 HA has been the default in RDO-land for a few cycles now and is being used widely and successfully, in some cases at significant scale.
Re: [openstack-dev] [Neutron] Alternative approaches for L3 HA
If I propose some concrete solution, the discussion will be about that one solution, not about making things flexible. At first I wanted to propose some PoC for another approach, but during my experiments I understood that we may have different approaches, but for all of them we need a pluggable HA router in Neutron. The thing that bothers me about L3 HA - it is complex. Yes, we fixed a bunch of races and John did a significant refactor, but it is still too complex. In the end we want to use L3 HA + DVR but DVR is pretty complex by itself. We would like to try to offload this complexity to an external service to replace management of keepalived instances and networks within Neutron. Router rescheduling is not really an alternative for L3 HA. RDO with L3 HA is a great example of success, but we want to have the ability to try something else that can suit other OpenStack deployments better. I wrote this email to understand whether the community has interest in something like this, so that it will be worth doing. On Tue, Feb 14, 2017 at 10:20 PM Assaf Muller wrote: On Fri, Feb 10, 2017 at 12:27 PM, Anna Taraday wrote: > Hello everyone! > > In Juno, the L3 HA feature based on Keepalived (VRRP) was implemented in Neutron. > During the next cycles it was improved; we performed scale testing [1] to find > weak places and tried to fix them. The only alternative for L3 HA with VRRP > is router rescheduling performed by the Neutron server, but it is significantly > slower and depends on the control plane. > > What issues have we experienced with L3 HA VRRP? > > Bugs in Keepalived (bad versions) [2] > Split brain [3] > Complex structure (ha networks, ha interfaces) - which actually causes races > that we were fixing during Liberty, Mitaka and Newton. > > This is all not critical, but it is a bad experience and not everyone is > ready (or wants) to use the Keepalived approach. > > I think we can make things more flexible.
For example, we can allow users to > use external services like etcd instead of Keepalived to synchronize the current > HA state across agents. I've done several experiments and got failover > times comparable to L3 HA with VRRP. Tooz [4] can be used to abstract from the > concrete backend. For example, it can allow us to use Zookeeper, Redis and > other backends to store HA state. > > What do I want to propose? > > I want to bring up the idea that Neutron should have some general classes for L3 > HA which will allow using not only Keepalived but also other backends for > HA state. This would at least make it easier to try some other approaches and > compare them with existing ones. > > Does this sound reasonable? I understand that the intention is to add pluggability upstream so that you could examine the viability of alternative solutions. I'd advise instead to do the research locally, and if you find concrete benefits to an alternative solution, come back, show your work and have a discussion about it then. Merging extra complexity in the form of a plug point without knowing if we're actually going to need it seems risky. On another note, after years of work the stability issues have largely been resolved and L3 HA is in a good state with modern releases of OpenStack. It's not an authoritative solution in the sense that it doesn't cover every possible failure mode, but it covers the major ones and in that sense is better than not having any form of HA, and as you pointed out the existing alternatives are not in a better state. The subtext in your email is that now L3 HA is technically where we want it, but some users are resisting adoption because of bad PR or a bad past experience, but not for technical reasons. If that is the case, then perhaps some good PR would be a more cost-effective investment than investigating, implementing, stabilizing and maintaining a different backend that will likely take at least a cycle to get merged and another 1 to 2 cycles to iron out kinks.
Would you have a critical mass of developers ready to support a pluggable L3 HA now and in the long term? Finally, I can share that L3 HA has been the default in RDO-land for a few cycles now and is being used widely and successfully, in some cases at significant scale. > > [1] - > http://docs.openstack.org/developer/performance-docs/test_results/neutron_features/index.html > [2] - https://bugs.launchpad.net/neutron/+bug/1497272 > https://bugs.launchpad.net/neutron/+bug/1433172 > [3] - https://bugs.launchpad.net/neutron/+bug/1375625 > [4] - http://docs.openstack.org/developer/tooz/ > > -- > Regards, > Ann Taraday
Re: [openstack-dev] [Neutron] Alternative approaches for L3 HA
On Fri, Feb 10, 2017 at 12:27 PM, Anna Taraday wrote: > Hello everyone! > > In Juno, the L3 HA feature based on Keepalived (VRRP) was implemented in Neutron. > During the next cycles it was improved; we performed scale testing [1] to find > weak places and tried to fix them. The only alternative for L3 HA with VRRP > is router rescheduling performed by the Neutron server, but it is significantly > slower and depends on the control plane. > > What issues have we experienced with L3 HA VRRP? > > Bugs in Keepalived (bad versions) [2] > Split brain [3] > Complex structure (ha networks, ha interfaces) - which actually causes races > that we were fixing during Liberty, Mitaka and Newton. > > This is all not critical, but it is a bad experience and not everyone is > ready (or wants) to use the Keepalived approach. > > I think we can make things more flexible. For example, we can allow users to > use external services like etcd instead of Keepalived to synchronize the current > HA state across agents. I've done several experiments and got failover > times comparable to L3 HA with VRRP. Tooz [4] can be used to abstract from the > concrete backend. For example, it can allow us to use Zookeeper, Redis and > other backends to store HA state. > > What do I want to propose? > > I want to bring up the idea that Neutron should have some general classes for L3 > HA which will allow using not only Keepalived but also other backends for > HA state. This would at least make it easier to try some other approaches and > compare them with existing ones. > > Does this sound reasonable? I understand that the intention is to add pluggability upstream so that you could examine the viability of alternative solutions. I'd advise instead to do the research locally, and if you find concrete benefits to an alternative solution, come back, show your work and have a discussion about it then. Merging extra complexity in the form of a plug point without knowing if we're actually going to need it seems risky.
On another note, after years of work the stability issues have largely been resolved and L3 HA is in a good state with modern releases of OpenStack. It's not an authoritative solution in the sense that it doesn't cover every possible failure mode, but it covers the major ones and in that sense is better than not having any form of HA, and as you pointed out the existing alternatives are not in a better state. The subtext in your email is that now L3 HA is technically where we want it, but some users are resisting adoption because of bad PR or a bad past experience, but not for technical reasons. If that is the case, then perhaps some good PR would be a more cost-effective investment than investigating, implementing, stabilizing and maintaining a different backend that will likely take at least a cycle to get merged and another 1 to 2 cycles to iron out kinks. Would you have a critical mass of developers ready to support a pluggable L3 HA now and in the long term? Finally, I can share that L3 HA has been the default in RDO-land for a few cycles now and is being used widely and successfully, in some cases at significant scale. > > [1] - > http://docs.openstack.org/developer/performance-docs/test_results/neutron_features/index.html > [2] - https://bugs.launchpad.net/neutron/+bug/1497272 > https://bugs.launchpad.net/neutron/+bug/1433172 > [3] - https://bugs.launchpad.net/neutron/+bug/1375625 > [4] - http://docs.openstack.org/developer/tooz/ > > -- > Regards, > Ann Taraday
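To make the "plug point" discussion above concrete: the pluggability Ann proposes would amount to an interface roughly like the following. This is a hypothetical sketch, with all names invented for illustration (none of this is Neutron code), of what a minimal L3 HA state backend abstraction might look like, with a trivial in-process driver standing in for a real keepalived or etcd backend.

```python
import abc

class HAStateBackend(abc.ABC):
    """Hypothetical plug point: how an L3 agent learns and changes HA state."""

    @abc.abstractmethod
    def start(self, router_id, state_change_cb):
        """Begin participating in master election for a router.

        state_change_cb(state) is invoked with 'master' or 'backup'
        whenever this agent's role for the router changes.
        """

    @abc.abstractmethod
    def stop(self, router_id):
        """Stop participating (e.g. the router was removed from this agent)."""

class DummyBackend(HAStateBackend):
    """Trivial driver for testing: the first participant becomes master."""
    _masters = {}  # router_id -> backend instance holding mastership

    def start(self, router_id, state_change_cb):
        if router_id not in self._masters:
            self._masters[router_id] = self
            state_change_cb("master")
        else:
            state_change_cb("backup")

    def stop(self, router_id):
        if self._masters.get(router_id) is self:
            del self._masters[router_id]

states = []
b1, b2 = DummyBackend(), DummyBackend()
b1.start("router-1", states.append)
b2.start("router-1", states.append)
# The first backend to start wins mastership; the second is backup.
assert states == ["master", "backup"]
```

A keepalived driver would implement start() by writing keepalived.conf and spawning the process, while a tooz/etcd driver would run leader election; either way the agent code above the interface stays the same, which is exactly the trade-off Assaf raises about carrying an abstraction layer without a second proven backend.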
Re: [openstack-dev] [Neutron] Alternative approaches for L3 HA
Hey Lubosz,

First of all, good job responding to the concerns of users and keeping a healthy OpenStack ecosystem. I have one comment about this email: maybe it is a cultural thing, but I found it a little aggressive, especially the use of some pronouns and names. I know you, and I am completely sure that your intention was to help others in the best way; hopefully this is not a blind spot.

Regards,
Victor Morales

From: "Kosnik, Lubosz" <lubosz.kos...@intel.com>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev@lists.openstack.org>
Date: Monday, February 13, 2017 at 10:23 PM
To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev@lists.openstack.org>
Subject: Re: [openstack-dev] [Neutron] Alternative approaches for L3 HA

From my perspective, I can tell that the problem is entirely in the architecture, and we cannot solve it without something outside of Neutron. Two releases ago I started to work on hardening this feature, but all my ideas were killed by Armando and Assaf. They decided that adding an outside dependency would open the door for new bugs from dependencies into Neutron [1].

You need to know that there are two outstanding bugs in this feature: a split brain affecting both internal and external connectivity. [2] is a patch of mine that "fixes" part of the problem: it allows you to specify additional tests to verify connectivity from the router to the gateway. There is also a problem with connectivity between network nodes; that is more problematic and, like you said, in my opinion unsolvable without an external mechanism.

I would love to share my knowledge about this feature and what exactly is not working. If anyone needs help with any of this, please ping me on email or IRC.
[1] https://bugs.launchpad.net/neutron/+bug/1375625/comments/31
[2] https://review.openstack.org/#/c/273546/

Lubosz

On Feb 13, 2017, at 4:10 AM, Anna Taraday <akamyshnik...@mirantis.com> wrote:

To avoid a dependency of the data plane on the control plane, it is possible to deploy a separate key-value storage cluster on the data plane side, using the same network nodes. I am proposing to make some changes to enable experimentation in this field; we have yet to come up with any other concrete solution.

On Mon, Feb 13, 2017 at 2:01 PM <cristi.ca...@orange.com> wrote:

Hi,

We also operated the VRRP HA implementation with Juno and had to patch through several bugs before getting to the Mitaka release. A pluggable, drop-in alternative would be highly appreciated. However, our experience has been that the decoupling of VRRP from the control plane is actually a benefit: when the control plane is down, traffic is not affected.

In a solution where the L3 HA implementation becomes tied to the availability of the control plane (an etcd cluster or any other KV store), an operator would have to account for extra failure scenarios in which the KV store affects multiple routers, rather than the outage of a single L3 node, which is the case we usually have to account for now.

Just my $.02

Cristian

From: Anna Taraday [mailto:akamyshnik...@mirantis.com]
Sent: Monday, February 13, 2017 11:45 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Neutron] Alternative approaches for L3 HA

In etcd, for each HA router we can store a key which identifies which agent is active. L3 agents will "watch" this key. All of these tools have a leader election mechanism which can be used to determine the active agent for a given HA router.

On Mon, Feb 13, 2017 at 7:02 AM zhi <changzhi1...@gmail.com> wrote:

Hi, we are using L3 HA in our production environment now.
Router instances communicate with each other via the VRRP protocol. In my opinion, although VRRP is a control plane mechanism, the actual VRRP traffic uses a data plane NIC, so router namespaces sometimes cannot talk to each other when the data plane is busy. If we used etcd (or another store), would every router instance register an "id" in etcd?

Thanks
Zhi Chang

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

--
Regards,
Ann Taraday
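Anna's suggestion above of running the key-value cluster on the network nodes themselves could look roughly like the following for a three-node etcd cluster. This is purely an illustrative configuration sketch; the node names and 192.0.2.x addresses are invented for the example, and each network node would run the analogous command with its own name and addresses:

```
etcd --name net-node-1 \
  --initial-advertise-peer-urls http://192.0.2.11:2380 \
  --listen-peer-urls http://192.0.2.11:2380 \
  --listen-client-urls http://192.0.2.11:2379 \
  --advertise-client-urls http://192.0.2.11:2379 \
  --initial-cluster net-node-1=http://192.0.2.11:2380,net-node-2=http://192.0.2.12:2380,net-node-3=http://192.0.2.13:2380 \
  --initial-cluster-state new
```

Note that this keeps the HA-state store off the main control plane, but as Cristian points out, the operator still has to plan for the failure modes of this extra cluster.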
Re: [openstack-dev] [Neutron] Alternative approaches for L3 HA
From my perspective, I can tell that the problem is entirely in the architecture, and we cannot solve it without something outside of Neutron. Two releases ago I started to work on hardening this feature, but all my ideas were killed by Armando and Assaf. They decided that adding an outside dependency would open the door for new bugs from dependencies into Neutron [1].

You need to know that there are two outstanding bugs in this feature: a split brain affecting both internal and external connectivity. [2] is a patch of mine that "fixes" part of the problem: it allows you to specify additional tests to verify connectivity from the router to the gateway. There is also a problem with connectivity between network nodes; that is more problematic and, like you said, in my opinion unsolvable without an external mechanism.

I would love to share my knowledge about this feature and what exactly is not working. If anyone needs help with any of this, please ping me on email or IRC.

[1] https://bugs.launchpad.net/neutron/+bug/1375625/comments/31
[2] https://review.openstack.org/#/c/273546/

Lubosz

On Feb 13, 2017, at 4:10 AM, Anna Taraday <akamyshnik...@mirantis.com> wrote:

To avoid a dependency of the data plane on the control plane, it is possible to deploy a separate key-value storage cluster on the data plane side, using the same network nodes. I am proposing to make some changes to enable experimentation in this field; we have yet to come up with any other concrete solution.

On Mon, Feb 13, 2017 at 2:01 PM <cristi.ca...@orange.com> wrote:

Hi,

We also operated the VRRP HA implementation with Juno and had to patch through several bugs before getting to the Mitaka release. A pluggable, drop-in alternative would be highly appreciated. However, our experience has been that the decoupling of VRRP from the control plane is actually a benefit: when the control plane is down, traffic is not affected.

In a solution where the L3 HA implementation becomes tied to the availability of the control plane (an etcd cluster or any other KV store), an operator would have to account for extra failure scenarios in which the KV store affects multiple routers, rather than the outage of a single L3 node, which is the case we usually have to account for now.

Just my $.02

Cristian

From: Anna Taraday [mailto:akamyshnik...@mirantis.com]
Sent: Monday, February 13, 2017 11:45 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Neutron] Alternative approaches for L3 HA

In etcd, for each HA router we can store a key which identifies which agent is active. L3 agents will "watch" this key. All of these tools have a leader election mechanism which can be used to determine the active agent for a given HA router.

On Mon, Feb 13, 2017 at 7:02 AM zhi <changzhi1...@gmail.com> wrote:

Hi, we are using L3 HA in our production environment now. Router instances communicate with each other via the VRRP protocol. In my opinion, although VRRP is a control plane mechanism, the actual VRRP traffic uses a data plane NIC, so router namespaces sometimes cannot talk to each other when the data plane is busy. If we used etcd (or another store), would every router instance register an "id" in etcd?

Thanks
Zhi Chang

--
Regards,
Ann Taraday

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Neutron] Alternative approaches for L3 HA
To avoid a dependency of the data plane on the control plane, it is possible to deploy a separate key-value storage cluster on the data plane side, using the same network nodes. I am proposing to make some changes to enable experimentation in this field; we have yet to come up with any other concrete solution.

On Mon, Feb 13, 2017 at 2:01 PM <cristi.ca...@orange.com> wrote:
> Hi,
>
> We also operated the VRRP HA implementation with Juno and had to patch
> through several bugs before getting to the Mitaka release.
>
> A pluggable, drop-in alternative would be highly appreciated. However, our
> experience has been that the decoupling of VRRP from the control plane is
> actually a benefit: when the control plane is down, traffic is not
> affected.
>
> In a solution where the L3 HA implementation becomes tied to the
> availability of the control plane (an etcd cluster or any other KV store),
> an operator would have to account for extra failure scenarios in which the
> KV store affects multiple routers, rather than the outage of a single L3
> node, which is the case we usually have to account for now.
>
> Just my $.02
>
> Cristian
>
> *From:* Anna Taraday [mailto:akamyshnik...@mirantis.com]
> *Sent:* Monday, February 13, 2017 11:45 AM
> *To:* OpenStack Development Mailing List (not for usage questions)
> *Subject:* Re: [openstack-dev] [Neutron] Alternative approaches for L3 HA
>
> In etcd, for each HA router we can store a key which identifies which
> agent is active. L3 agents will "watch" this key.
> All of these tools have a leader election mechanism which can be used to
> determine the active agent for a given HA router.
>
> On Mon, Feb 13, 2017 at 7:02 AM zhi wrote:
>
> Hi, we are using L3 HA in our production environment now. Router instances
> communicate with each other via the VRRP protocol. In my opinion, although
> VRRP is a control plane mechanism, the actual VRRP traffic uses a data
> plane NIC, so router namespaces sometimes cannot talk to each other when
> the data plane is busy. If we used etcd (or another store), would every
> router instance register an "id" in etcd?
>
> Thanks
> Zhi Chang
>
> --
> Regards,
> Ann Taraday

--
Regards,
Ann Taraday

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Neutron] Alternative approaches for L3 HA
Hi,

We also operated the VRRP HA implementation with Juno and had to patch through several bugs before getting to the Mitaka release. A pluggable, drop-in alternative would be highly appreciated. However, our experience has been that the decoupling of VRRP from the control plane is actually a benefit: when the control plane is down, traffic is not affected.

In a solution where the L3 HA implementation becomes tied to the availability of the control plane (an etcd cluster or any other KV store), an operator would have to account for extra failure scenarios in which the KV store affects multiple routers, rather than the outage of a single L3 node, which is the case we usually have to account for now.

Just my $.02

Cristian

From: Anna Taraday [mailto:akamyshnik...@mirantis.com]
Sent: Monday, February 13, 2017 11:45 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Neutron] Alternative approaches for L3 HA

> In etcd, for each HA router we can store a key which identifies which
> agent is active. L3 agents will "watch" this key.
> All of these tools have a leader election mechanism which can be used to
> determine the active agent for a given HA router.
>
> On Mon, Feb 13, 2017 at 7:02 AM zhi <changzhi1...@gmail.com> wrote:
>
> Hi, we are using L3 HA in our production environment now. Router instances
> communicate with each other via the VRRP protocol. In my opinion, although
> VRRP is a control plane mechanism, the actual VRRP traffic uses a data
> plane NIC, so router namespaces sometimes cannot talk to each other when
> the data plane is busy. If we used etcd (or another store), would every
> router instance register an "id" in etcd?
>
> Thanks
> Zhi Chang
>
> --
> Regards,
> Ann Taraday

_
This message and its attachments may contain confidential or privileged information that may be protected by law; they should not be distributed, used or copied without authorisation. If you have received this email in error, please notify the sender and delete this message and its attachments. As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified. Thank you.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Neutron] Alternative approaches for L3 HA
In etcd, for each HA router we can store a key which identifies which agent is active. L3 agents will "watch" this key. All of these tools have a leader election mechanism which can be used to determine the active agent for a given HA router.

On Mon, Feb 13, 2017 at 7:02 AM zhi wrote:
> Hi, we are using L3 HA in our production environment now. Router instances
> communicate with each other via the VRRP protocol. In my opinion, although
> VRRP is a control plane mechanism, the actual VRRP traffic uses a data
> plane NIC, so router namespaces sometimes cannot talk to each other when
> the data plane is busy. If we used etcd (or another store), would every
> router instance register an "id" in etcd?
>
> Thanks
> Zhi Chang

--
Regards,
Ann Taraday

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
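The per-router key scheme Anna describes can be sketched with a small in-memory stand-in for etcd. This is a hypothetical illustration of the underlying pattern (an atomic create-if-absent on a key with a TTL lease, refreshed by the active agent as a heartbeat), not Neutron code: the FakeKV class, the `routers/<id>/active` key layout, and the agent names are all invented for the example.

```python
import time

class FakeKV:
    """In-memory stand-in for an etcd-like store: keys carry a TTL lease
    and creation is atomic (create-if-absent)."""

    def __init__(self):
        self._data = {}  # key -> (value, lease expiry time)

    def put_if_absent(self, key, value, ttl=3.0):
        # Succeeds only if the key is absent or its lease has expired.
        now = time.monotonic()
        cur = self._data.get(key)
        if cur is not None and cur[1] > now:
            return False  # another agent already holds a live lease
        self._data[key] = (value, now + ttl)
        return True

    def refresh(self, key, value, ttl=3.0):
        # Heartbeat: only the current holder may extend its lease.
        cur = self._data.get(key)
        if cur is None or cur[0] != value:
            return False
        self._data[key] = (value, time.monotonic() + ttl)
        return True

    def get(self, key):
        cur = self._data.get(key)
        if cur is None or cur[1] <= time.monotonic():
            return None
        return cur[0]

def elect_active_agent(kv, router_id, agent_id):
    """Each L3 agent races to create routers/<id>/active; the winner
    becomes the active instance, everyone else stays standby and
    would watch the key for the lease expiring."""
    key = 'routers/%s/active' % router_id
    return 'active' if kv.put_if_absent(key, agent_id) else 'standby'

kv = FakeKV()
# Two agents race for the same HA router; the first write wins.
assert elect_active_agent(kv, 'r1', 'agent-a') == 'active'
assert elect_active_agent(kv, 'r1', 'agent-b') == 'standby'
assert kv.get('routers/r1/active') == 'agent-a'
```

Failover falls out of the lease: if the active agent stops heartbeating, the key expires and a standby's next `put_if_absent` succeeds. With Tooz, the equivalent would be its group leader-election API rather than hand-rolled keys.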
Re: [openstack-dev] [Neutron] Alternative approaches for L3 HA
Hi, we are using L3 HA in our production environment now. Router instances communicate with each other via the VRRP protocol. In my opinion, although VRRP is a control plane mechanism, the actual VRRP traffic uses a data plane NIC, so router namespaces sometimes cannot talk to each other when the data plane is busy. If we used etcd (or another store), would every router instance register an "id" in etcd?

Thanks
Zhi Chang

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [Neutron] Alternative approaches for L3 HA
Hello everyone!

In Juno, Neutron gained an L3 HA feature based on Keepalived (VRRP). During the following cycles it was improved; we performed scale testing [1] to find weak spots and tried to fix them. The only alternative to L3 HA with VRRP is router rescheduling performed by the Neutron server, but it is significantly slower and depends on the control plane.

What issues have we experienced with L3 HA VRRP?

1. Bugs in Keepalived (bad versions) [2]
2. Split brain [3]
3. Complex structure (HA networks, HA interfaces), which caused races that we were fixing during Liberty, Mitaka and Newton.

None of this is critical, but it was a bad experience, and not everyone is ready (or wants) to use the Keepalived approach.

I think we can make things more flexible. For example, we could allow users to use external services like etcd instead of Keepalived to synchronize the current HA state across agents. I've done several experiments and got failover times comparable to L3 HA with VRRP. Tooz [4] can be used to abstract away the concrete backend; for example, it could allow us to use Zookeeper, Redis and other backends to store HA state.

What do I want to propose?

I want to bring up the idea that Neutron should have some general classes for L3 HA which would allow the use of not only Keepalived but also other backends for HA state. This would at least make it easier to try other approaches and compare them with the existing one.

Does this sound reasonable?

[1] - http://docs.openstack.org/developer/performance-docs/test_results/neutron_features/index.html
[2] - https://bugs.launchpad.net/neutron/+bug/1497272
      https://bugs.launchpad.net/neutron/+bug/1433172
[3] - https://bugs.launchpad.net/neutron/+bug/1375625
[4] - http://docs.openstack.org/developer/tooz/

--
Regards,
Ann Taraday

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
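For readers less familiar with the existing backend: the Keepalived approach works by rendering a per-router keepalived.conf inside each router namespace and letting VRRP elect the master over the dedicated HA network. A simplified sketch of what such a config looks like is below; the interface names and addresses here are illustrative, not literal Neutron output:

```
vrrp_instance VR_1 {
    state BACKUP              # every instance starts as backup; VRRP elects the master
    interface ha-3a4b5c6d     # dedicated HA interface inside the router namespace
    virtual_router_id 1
    priority 50
    nopreempt
    advert_int 2
    track_interface {
        ha-3a4b5c6d
    }
    virtual_ipaddress {
        169.254.0.1/24 dev ha-3a4b5c6d
    }
    virtual_ipaddress_excluded {
        10.0.0.1/24 dev qr-7e8f9a0b   # the router's real IPs follow the VRRP master
    }
}
```

The complexity mentioned in issue 3 above is visible here: every HA router needs its own HA network, HA interfaces on each hosting agent, and a VRID, all of which must be kept in sync with the Neutron database.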