Re: [openstack-dev] [heat] health maintenance in autoscaling groups
Doug Wiegley <do...@a10networks.com> wrote on 07/23/2014 11:24:32 PM:

> Hi Mike,
>
> > and listed the possible values of the status field, including INACTIVE. Other sources are telling me that status=INACTIVE when the health monitor thinks the member is unhealthy, status!=INACTIVE when the health monitor thinks the member is healthy. What's going on here?
>
> Indeed, the code will return a server status of INACTIVE if the lbaas agent marks a member ‘DOWN’. But nowhere can I find that it actually ever does so. My statements about the status field for lbaas/neutron came from the author of the ref lbaas driver; I’ll check with him tomorrow and see if I misunderstood.

I did an experiment, and found that the PoolMember status did indeed switch between ACTIVE and INACTIVE depending on the health of the member.

Thanks,
Mike

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
Hi Mike,

Awesome, thanks for chasing that down. Now I need to close the loop and figure out where that linkage is, so I don't go crazy.

Thanks,
Doug

On Jul 24, 2014, at 10:06 PM, Mike Spreitzer <mspre...@us.ibm.com> wrote:

> Doug Wiegley <do...@a10networks.com> wrote on 07/23/2014 11:24:32 PM:
>
> > Hi Mike,
> >
> > > and listed the possible values of the status field, including INACTIVE. Other sources are telling me that status=INACTIVE when the health monitor thinks the member is unhealthy, status!=INACTIVE when the health monitor thinks the member is healthy. What's going on here?
> >
> > Indeed, the code will return a server status of INACTIVE if the lbaas agent marks a member ‘DOWN’. But nowhere can I find that it actually ever does so. My statements about the status field for lbaas/neutron came from the author of the ref lbaas driver; I’ll check with him tomorrow and see if I misunderstood.
>
> I did an experiment, and found that the PoolMember status did indeed switch between ACTIVE and INACTIVE depending on the health of the member.
>
> Thanks,
> Mike
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
Doug Wiegley <do...@a10networks.com> wrote on 07/16/2014 04:58:52 PM:

> You do recall correctly, and there are currently no mechanisms for notifying anything outside of the load balancer backend when the health monitor/member state changes.

But there *is* a mechanism for some outside thing to query the load balancer for the health of a pool member, right? I am thinking specifically of http://docs.openstack.org/api/openstack-network/2.0/content/GET_showMember__v2.0_pools__pool_id__members__member_id__lbaas_ext_ops_member.html --- whose response includes a status field for the member. Is there documentation for what values can appear in that field, and what each value means?

Supposing we can leverage the pool member status, there remains an issue: establishing a link between an OS::Neutron::PoolMember and the corresponding scaling group member. We could conceivably expand the scaling group code so that if the member type is a stack then the contents of the stack are searched (perhaps recursively) for resources of type OS::Neutron::PoolMember, but that is a tad too automatic for my taste. It could pick up irrelevant PoolMembers. And such a level of implicit behavior is outside our normal style of doing things.

We could follow the AWS style, by adding an optional property to the scaling group resource types --- where the value of that property can be the UUID of an OS::Neutron::LoadBalancer or an OS::Neutron::Pool. But that still does not link up an individual scaling group member with its corresponding PoolMember. Remember that if we are doing this at all, each scaling group member must be a stack. I think the simplest way to solve this would be to define a way that such a stack can put in its outputs the ID of the corresponding PoolMember. I would be willing to settle for simply saying that if such a stack has an output of type string and name __OS_pool_member then the value of that output is taken to be the ID of the corresponding PoolMember.

Some people do not like reserved names; if that must be avoided then we can expand the schema language with a way to identify which stack output carries the PoolMember ID. Another alternative would be to add an optional scaling group property to carry the name of the stack output in question.

> There is also currently no way for an external system to inject health information about an LB or its members.

I do not know that the injection has to be to the LB; in AWS the injection is to the scaling group. That would be acceptable to me too.

Thoughts?

Thanks,
Mike
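The reserved-output convention proposed above could be made concrete with a small lookup helper; this is a minimal sketch, assuming stack outputs arrive as the usual Heat list of output_key/output_value dicts (the helper name is illustrative, and the `__OS_pool_member` convention is only a proposal in this thread, not existing Heat behavior):

```python
# Hypothetical helper: given a nested stack's outputs (a list of
# {"output_key": ..., "output_value": ...} dicts, as the Heat API returns
# them), find the PoolMember ID via the reserved output name proposed above.
POOL_MEMBER_OUTPUT = "__OS_pool_member"

def pool_member_id(stack_outputs):
    """Return the PoolMember ID declared by the stack, or None if absent."""
    for output in stack_outputs:
        if output.get("output_key") == POOL_MEMBER_OUTPUT:
            return output.get("output_value")
    return None
```

A scaling group implementation could call this on each member stack's outputs to map members to PoolMembers, falling back to "no health data" for members that declare nothing.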
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
> But there *is* a mechanism for some outside thing to query the load balancer for the health of a pool member, right? I am thinking specifically of http://docs.openstack.org/api/openstack-network/2.0/content/GET_showMember__v2.0_pools__pool_id__members__member_id__lbaas_ext_ops_member.html --- whose response includes a status field for the member. Is there documentation for what values can appear in that field, and what each value means?

The state of the world today: ‘status’ in the neutron database is configuration/provisioning status, not operational status. Neutron-wide thing. We were discussing adding operational status fields (or a neutron REST call to get the info from the backend) last month, but it’s something that isn’t planned for a serious conversation until Kilo, at present.

The current possible lbaas values (from neutron/plugins/common/constants.py):

    # Service operation status constants
    ACTIVE = "ACTIVE"
    DOWN = "DOWN"
    PENDING_CREATE = "PENDING_CREATE"
    PENDING_UPDATE = "PENDING_UPDATE"
    PENDING_DELETE = "PENDING_DELETE"
    INACTIVE = "INACTIVE"
    ERROR = "ERROR"

It does look like you can make a stats() call for some backends and get limited operational information, but it will not be uniform, nor universally supported.

Thanks,
doug

From: Mike Spreitzer <mspre...@us.ibm.com>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev@lists.openstack.org>
Date: Wednesday, July 23, 2014 at 1:27 PM
To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev@lists.openstack.org>
Subject: Re: [openstack-dev] [heat] health maintenance in autoscaling groups

> Doug Wiegley <do...@a10networks.com> wrote on 07/16/2014 04:58:52 PM:
>
> > You do recall correctly, and there are currently no mechanisms for notifying anything outside of the load balancer backend when the health monitor/member state changes.
>
> But there *is* a mechanism for some outside thing to query the load balancer for the health of a pool member, right? I am thinking specifically of http://docs.openstack.org/api/openstack-network/2.0/content/GET_showMember__v2.0_pools__pool_id__members__member_id__lbaas_ext_ops_member.html --- whose response includes a status field for the member. Is there documentation for what values can appear in that field, and what each value means?
>
> Supposing we can leverage the pool member status, there remains an issue: establishing a link between an OS::Neutron::PoolMember and the corresponding scaling group member. We could conceivably expand the scaling group code so that if the member type is a stack then the contents of the stack are searched (perhaps recursively) for resources of type OS::Neutron::PoolMember, but that is a tad too automatic for my taste. It could pick up irrelevant PoolMembers. And such a level of implicit behavior is outside our normal style of doing things.
>
> We could follow the AWS style, by adding an optional property to the scaling group resource types --- where the value of that property can be the UUID of an OS::Neutron::LoadBalancer or an OS::Neutron::Pool. But that still does not link up an individual scaling group member with its corresponding PoolMember. Remember that if we are doing this at all, each scaling group member must be a stack. I think the simplest way to solve this would be to define a way that such a stack can put in its outputs the ID of the corresponding PoolMember. I would be willing to settle for simply saying that if such a stack has an output of type string and name __OS_pool_member then the value of that output is taken to be the ID of the corresponding PoolMember. Some people do not like reserved names; if that must be avoided then we can expand the schema language with a way to identify which stack output carries the PoolMember ID. Another alternative would be to add an optional scaling group property to carry the name of the stack output in question.
>
> > There is also currently no way for an external system to inject health information about an LB or its members.
>
> I do not know that the injection has to be to the LB; in AWS the injection is to the scaling group. That would be acceptable to me too.
>
> Thoughts?
>
> Thanks,
> Mike
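An auto-scaling consumer of the member status field could classify the constants Doug lists above along these lines; note that whether INACTIVE actually tracks health is exactly the ambiguity debated later in this thread, so this mapping is an assumption, not settled API behavior:

```python
# Sketch of interpreting a pool member's 'status' field using the constants
# quoted above from neutron/plugins/common/constants.py. Treating INACTIVE
# and DOWN as "unhealthy" is an assumption under discussion in this thread.
def member_seems_healthy(status):
    """Return True if status suggests a healthy member, False if it
    suggests an unhealthy one, None if there is no health signal."""
    if status == "ACTIVE":
        return True
    if status in ("INACTIVE", "DOWN"):
        return False
    return None  # PENDING_* or ERROR: provisioning state, not health
```

A scaling group polling members would only act on True/False results, ignoring transitional and error states rather than guessing.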
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
Doug Wiegley <do...@a10networks.com> wrote on 07/23/2014 03:43:02 PM:

> From: Doug Wiegley <do...@a10networks.com>
> ...
> The state of the world today: ‘status’ in the neutron database is configuration/provisioning status, not operational status. Neutron-wide thing. We were discussing adding operational status fields (or a neutron REST call to get the info from the backend) last month, but it’s something that isn’t planned for a serious conversation until Kilo, at present.

Thanks for the prompt response. Let me just grasp at one last straw: is there any chance that Neutron will soon define and implement Ceilometer metrics that reveal PoolMember health?

Thanks,
Mike
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
Great question, and to my knowledge, not at present. There is an ongoing discussion about a common usage framework for ceilometer, for all the various *aaS things, but status is not included (yet!). I think that spec is in gerrit.

Thanks,
Doug

From: Mike Spreitzer <mspre...@us.ibm.com>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev@lists.openstack.org>
Date: Wednesday, July 23, 2014 at 2:03 PM
To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev@lists.openstack.org>
Subject: Re: [openstack-dev] [heat] health maintenance in autoscaling groups

> Doug Wiegley <do...@a10networks.com> wrote on 07/23/2014 03:43:02 PM:
>
> > From: Doug Wiegley <do...@a10networks.com>
> > ...
> > The state of the world today: ‘status’ in the neutron database is configuration/provisioning status, not operational status. Neutron-wide thing. We were discussing adding operational status fields (or a neutron REST call to get the info from the backend) last month, but it’s something that isn’t planned for a serious conversation until Kilo, at present.
>
> Thanks for the prompt response. Let me just grasp at one last straw: is there any chance that Neutron will soon define and implement Ceilometer metrics that reveal PoolMember health?
>
> Thanks,
> Mike
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
It's probably worth pointing out that most of the Neutron LBaaS team are spending most of our time doing a major revision to Neutron LBaaS. How stats processing should happen has definitely been discussed but not resolved at present-- and in any case it was apparent to those working on the project that it has secondary importance compared to the revision work presently underway.

I personally would like to have queries about most objects in the stats API to Neutron LBaaS return a dictionary or statuses for child objects which then a UI or auto-scaling system can interpret however it wishes.

Your points are certainly well made, and I agree that it might also be useful to inject status information externally, or have some kind of hook there to get event notifications when individual member statuses change. But this is really a discussion that needs to happen once the current code drive is near fruition (i.e., for Kilo).

Stephen

On Wed, Jul 23, 2014 at 1:27 PM, Doug Wiegley <do...@a10networks.com> wrote:

> Great question, and to my knowledge, not at present. There is an ongoing discussion about a common usage framework for ceilometer, for all the various *aaS things, but status is not included (yet!). I think that spec is in gerrit.
>
> Thanks,
> Doug
>
> From: Mike Spreitzer <mspre...@us.ibm.com>
> Date: Wednesday, July 23, 2014 at 2:03 PM
> Subject: Re: [openstack-dev] [heat] health maintenance in autoscaling groups
>
> > Doug Wiegley <do...@a10networks.com> wrote on 07/23/2014 03:43:02 PM:
> >
> > > The state of the world today: ‘status’ in the neutron database is configuration/provisioning status, not operational status. Neutron-wide thing. We were discussing adding operational status fields (or a neutron REST call to get the info from the backend) last month, but it’s something that isn’t planned for a serious conversation until Kilo, at present.
> >
> > Thanks for the prompt response. Let me just grasp at one last straw: is there any chance that Neutron will soon define and implement Ceilometer metrics that reveal PoolMember health?
> >
> > Thanks,
> > Mike

--
Stephen Balukoff
Blue Box Group, LLC
(800)613-4305 x807
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
Stephen Balukoff <sbaluk...@bluebox.net> wrote on 07/23/2014 09:14:35 PM:

> It's probably worth pointing out that most of the Neutron LBaaS team are spending most of our time doing a major revision to Neutron LBaaS. How stats processing should happen has definitely been discussed but not resolved at present-- and in any case it was apparent to those working on the project that it has secondary importance compared to the revision work presently underway.
>
> I personally would like to have queries about most objects in the stats API to Neutron LBaaS return a dictionary or

I presume you meant "of" rather than "or".

> statuses for child objects which then a UI or auto-scaling system can interpret however it wishes.

That last part makes me a little nervous. I have seen "can interpret however it wishes" mean "can not draw any useful inferences because there are no standards for that content". I presume that as the grand and glorious future arrives, it will be with due respect for backwards compatibility.

In the present, I am getting what appears to be conflicting information on the status field of the responses of http://docs.openstack.org/api/openstack-network/2.0/content/GET_showMember__v2.0_pools__pool_id__members__member_id__lbaas_ext_ops_member.html

Doug Wiegley wrote "‘status’ in the neutron database is configuration/provisioning status, not operational status" and listed the possible values of the status field, including INACTIVE. Other sources are telling me that status=INACTIVE when the health monitor thinks the member is unhealthy, status!=INACTIVE when the health monitor thinks the member is healthy. What's going on here?

Thanks,
Mike
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
Hi Mike,

> and listed the possible values of the status field, including INACTIVE. Other sources are telling me that status=INACTIVE when the health monitor thinks the member is unhealthy, status!=INACTIVE when the health monitor thinks the member is healthy. What's going on here?

Indeed, the code will return a server status of INACTIVE if the lbaas agent marks a member ‘DOWN’. But nowhere can I find that it actually ever does so. My statements about the status field for lbaas/neutron came from the author of the ref lbaas driver; I’ll check with him tomorrow and see if I misunderstood.

Thanks,
doug

From: Mike Spreitzer <mspre...@us.ibm.com>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev@lists.openstack.org>
Date: Wednesday, July 23, 2014 at 9:14 PM
To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev@lists.openstack.org>
Subject: Re: [openstack-dev] [heat] health maintenance in autoscaling groups

> Stephen Balukoff <sbaluk...@bluebox.net> wrote on 07/23/2014 09:14:35 PM:
>
> > It's probably worth pointing out that most of the Neutron LBaaS team are spending most of our time doing a major revision to Neutron LBaaS. How stats processing should happen has definitely been discussed but not resolved at present-- and in any case it was apparent to those working on the project that it has secondary importance compared to the revision work presently underway.
> >
> > I personally would like to have queries about most objects in the stats API to Neutron LBaaS return a dictionary or
>
> I presume you meant "of" rather than "or".
>
> > statuses for child objects which then a UI or auto-scaling system can interpret however it wishes.
>
> That last part makes me a little nervous. I have seen "can interpret however it wishes" mean "can not draw any useful inferences because there are no standards for that content". I presume that as the grand and glorious future arrives, it will be with due respect for backwards compatibility.
>
> In the present, I am getting what appears to be conflicting information on the status field of the responses of http://docs.openstack.org/api/openstack-network/2.0/content/GET_showMember__v2.0_pools__pool_id__members__member_id__lbaas_ext_ops_member.html
>
> Doug Wiegley wrote "‘status’ in the neutron database is configuration/provisioning status, not operational status" and listed the possible values of the status field, including INACTIVE. Other sources are telling me that status=INACTIVE when the health monitor thinks the member is unhealthy, status!=INACTIVE when the health monitor thinks the member is healthy. What's going on here?
>
> Thanks,
> Mike
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
Though, this is probably a good time to talk requirements, and to start thinking about whether this is an lbaas issue, or an advanced services (*aaS) issue, so we can have some useful discussions at the summit, and not solve this scaling metrics problem 8 different ways.

Doug

From: Stephen Balukoff <sbaluk...@bluebox.net>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev@lists.openstack.org>
Date: Wednesday, July 23, 2014 at 7:14 PM
To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev@lists.openstack.org>
Subject: Re: [openstack-dev] [heat] health maintenance in autoscaling groups

> It's probably worth pointing out that most of the Neutron LBaaS team are spending most of our time doing a major revision to Neutron LBaaS. How stats processing should happen has definitely been discussed but not resolved at present-- and in any case it was apparent to those working on the project that it has secondary importance compared to the revision work presently underway.
>
> I personally would like to have queries about most objects in the stats API to Neutron LBaaS return a dictionary or statuses for child objects which then a UI or auto-scaling system can interpret however it wishes.
>
> Your points are certainly well made, and I agree that it might also be useful to inject status information externally, or have some kind of hook there to get event notifications when individual member statuses change. But this is really a discussion that needs to happen once the current code drive is near fruition (i.e., for Kilo).
>
> Stephen
>
> On Wed, Jul 23, 2014 at 1:27 PM, Doug Wiegley <do...@a10networks.com> wrote:
>
> > Great question, and to my knowledge, not at present. There is an ongoing discussion about a common usage framework for ceilometer, for all the various *aaS things, but status is not included (yet!). I think that spec is in gerrit.
> >
> > Thanks,
> > Doug
> >
> > From: Mike Spreitzer <mspre...@us.ibm.com>
> > Date: Wednesday, July 23, 2014 at 2:03 PM
> > Subject: Re: [openstack-dev] [heat] health maintenance in autoscaling groups
> >
> > > Doug Wiegley <do...@a10networks.com> wrote on 07/23/2014 03:43:02 PM:
> > >
> > > > The state of the world today: ‘status’ in the neutron database is configuration/provisioning status, not operational status. Neutron-wide thing. We were discussing adding operational status fields (or a neutron REST call to get the info from the backend) last month, but it’s something that isn’t planned for a serious conversation until Kilo, at present.
> > >
> > > Thanks for the prompt response. Let me just grasp at one last straw: is there any chance that Neutron will soon define and implement Ceilometer metrics that reveal PoolMember health?
> > >
> > > Thanks,
> > > Mike
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
Thomas Herve <thomas.he...@enovance.com> wrote on 07/17/2014 02:06:13 AM:

> There are 4 resources related to neutron load balancing. OS::Neutron::LoadBalancer is probably the least useful and the one you can *not* use, as it's only there for compatibility with AWS::AutoScaling::AutoScalingGroup. OS::Neutron::HealthMonitor does the health checking part, although maybe not in the way you want it.

OK, let's work with these. My current view is this: supposing the Convergence work delivers monitoring of health according to a member's status in its service and reacts accordingly, the gaps (compared to AWS functionality) are the abilities to (1) get member health from application level pings (e.g., URL polling) and (2) accept member health declarations from an external system, with consistent reaction to health information from all sources.

Source (1) is what an OS::Neutron::HealthMonitor specifies, and an OS::Neutron::Pool is the thing that takes such a spec. So we could complete the (1) part if there were a way to tell a scaling group to poll the member health information developed by an OS::Neutron::Pool. Does that look like the right approach?

For (2), this would amount to having an API that an external system (with proper authorization) can use to declare member health. In the grand and glorious future when scaling groups have true APIs rather than being Heat hacks, such a thing would be part of those APIs. In the immediate future we could simply add this to the Heat API. Such an operation would take something like a stack name or UUID, the name or UUID of a resource that is a scaling group, the name or UUID of the member Resource whose health is being declared, and health_status=unhealthy. Does that look about right?

For both of these new sources, the remaining question is how to get the right reaction. In the case that the member is actually deleted already, life is easy. Let's talk about the other cases. Note that AWS admits that there might be false detection of unhealth as a member's contents finish getting into regular operation; AWS handles this by saying that the right reaction is to react only after unhealth has been consistently detected for a configured amount of time. The simplest thing for a scaling group to do might be to include that hysteresis and eventually effect removal of a member by generating a new template that excludes the to-be-deleted member and doing an UPDATE on itself (qua stack) with that new template. Does that look about right?

Thanks,
Mike
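The AWS-style hysteresis described above (react only after unhealth has been observed continuously for a configured period) can be sketched as a small debouncer; the class name and structure are illustrative, not Heat code:

```python
import time

# Sketch of hysteresis for health reactions: only report a member as
# actionably unhealthy after unhealth has persisted for a grace period.
# The injectable clock exists purely to make the logic testable.
class HealthDebouncer:
    def __init__(self, grace_seconds, clock=time.monotonic):
        self.grace = grace_seconds
        self.clock = clock
        self.unhealthy_since = None

    def observe(self, healthy):
        """Feed one observation; return True only once the member has
        been continuously unhealthy for at least the grace period."""
        if healthy:
            self.unhealthy_since = None  # any healthy report resets the timer
            return False
        if self.unhealthy_since is None:
            self.unhealthy_since = self.clock()
        return self.clock() - self.unhealthy_since >= self.grace
```

A scaling group could feed this from any health source (pool status, URL polls, external declarations) and trigger the member-removing stack UPDATE only when observe() returns True.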
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
Excerpts from Mike Spreitzer's message of 2014-07-18 09:12:21 -0700:

> Thomas Herve <thomas.he...@enovance.com> wrote on 07/17/2014 02:06:13 AM:
>
> > There are 4 resources related to neutron load balancing. OS::Neutron::LoadBalancer is probably the least useful and the one you can *not* use, as it's only there for compatibility with AWS::AutoScaling::AutoScalingGroup. OS::Neutron::HealthMonitor does the health checking part, although maybe not in the way you want it.
>
> OK, let's work with these. My current view is this: supposing the Convergence work delivers monitoring of health according to a member's status in its service and reacts accordingly, the gaps (compared to AWS functionality) are the abilities to (1) get member health from application level pings (e.g., URL polling) and (2) accept member health declarations from an external system, with consistent reaction to health information from all sources.

Convergence will not deliver monitoring, though I understand how one might have that misunderstanding. Convergence will check with the API that controls a physical resource to determine what Heat should consider its status to be for the purpose of ongoing orchestration.

> Source (1) is what an OS::Neutron::HealthMonitor specifies, and an OS::Neutron::Pool is the thing that takes such a spec. So we could complete the (1) part if there were a way to tell a scaling group to poll the member health information developed by an OS::Neutron::Pool. Does that look like the right approach?
>
> For (2), this would amount to having an API that an external system (with proper authorization) can use to declare member health. In the grand and glorious future when scaling groups have true APIs rather than being Heat hacks, such a thing would be part of those APIs. In the immediate future we could simply add this to the Heat API. Such an operation would take something like a stack name or UUID, the name or UUID of a resource that is a scaling group, the name or UUID of the member Resource whose health is being declared, and health_status=unhealthy. Does that look about right?

Isn't (2) covered already by the cloudwatch API in Heat? I am going to claim ignorance of it a bit, as I've never used it, but it seems like the same thing.
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
Clint Byrum <cl...@fewbar.com> wrote on 07/18/2014 12:56:32 PM:

> Excerpts from Mike Spreitzer's message of 2014-07-18 09:12:21 -0700:
>
> > ... OK, let's work with these. My current view is this: supposing the Convergence work delivers monitoring of health according to a member's status in its service and reacts accordingly, the gaps (compared to AWS functionality) are the abilities to (1) get member health from application level pings (e.g., URL polling) and (2) accept member health declarations from an external system, with consistent reaction to health information from all sources.
>
> Convergence will not deliver monitoring, though I understand how one might have that misunderstanding. Convergence will check with the API that controls a physical resource to determine what Heat should consider its status to be for the purpose of ongoing orchestration.

If I understand correctly, your point is that healing is not automatic. Since a scaling group is a nested stack, the observing part of Convergence will automatically note in the DB when the physical resource behind a scaling group member (in its role as a stack resource) is deleted. And when the convergence engine gets around to acting on that Resource, the backing physical resource will be automatically re-created. But there is nothing that automatically links the notice of divergence to the converging action. Have I got that right?

> > Source (1) is what an OS::Neutron::HealthMonitor specifies, and an OS::Neutron::Pool is the thing that takes such a spec. So we could complete the (1) part if there were a way to tell a scaling group to poll the member health information developed by an OS::Neutron::Pool. Does that look like the right approach?
> >
> > For (2), this would amount to having an API that an external system (with proper authorization) can use to declare member health. In the grand and glorious future when scaling groups have true APIs rather than being Heat hacks, such a thing would be part of those APIs. In the immediate future we could simply add this to the Heat API. Such an operation would take something like a stack name or UUID, the name or UUID of a resource that is a scaling group, the name or UUID of the member Resource whose health is being declared, and health_status=unhealthy. Does that look about right?
>
> Isn't (2) covered already by the cloudwatch API in Heat? I am going to claim ignorance of it a bit, as I've never used it, but it seems like the same thing.

I presume that by cloudwatch API you mean Ceilometer. Today a Ceilometer alarm can be given a URL to invoke but can not be told about any special headers or body to use in the invocation (i.e., no parameters for the HTTP operation). More to the point, the idea here is supporting a general external system that might determine health in its own way, not necessarily through programming Ceilometer to detect it.

Thanks,
Mike
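The "declare member health" operation proposed in this exchange might take a request body along these lines; every field name here is an assumption for illustration, since no such Heat operation exists today:

```python
# Hypothetical request-body builder for the proposed "declare member health"
# Heat API call. The field names and allowed values are assumptions drawn
# from the proposal in this thread, not an existing API.
def build_health_declaration(stack_id, group_resource, member_id,
                             health_status="unhealthy"):
    """Return a dict suitable for JSON-encoding as the request body."""
    if health_status not in ("healthy", "unhealthy"):
        raise ValueError("health_status must be 'healthy' or 'unhealthy'")
    return {
        "stack": stack_id,           # stack name or UUID
        "resource": group_resource,  # name or UUID of the scaling group
        "member": member_id,         # name or UUID of the member Resource
        "health_status": health_status,
    }
```

An external monitoring system would POST such a body (with proper authorization) and let the scaling group apply its own reaction policy, keeping the reaction consistent across all health sources.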
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
Excerpts from Mike Spreitzer's message of 2014-07-18 10:38:32 -0700:

> Clint Byrum <cl...@fewbar.com> wrote on 07/18/2014 12:56:32 PM:
>
> > Convergence will not deliver monitoring, though I understand how one might have that misunderstanding. Convergence will check with the API that controls a physical resource to determine what Heat should consider its status to be for the purpose of ongoing orchestration.
>
> If I understand correctly, your point is that healing is not automatic. Since a scaling group is a nested stack, the observing part of Convergence will automatically note in the DB when the physical resource behind a scaling group member (in its role as a stack resource) is deleted. And when the convergence engine gets around to acting on that Resource, the backing physical resource will be automatically re-created. But there is nothing that automatically links the notice of divergence to the converging action. Have I got that right?

Yes you have it right. I just wanted to be clear, that is not monitoring.
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
The check url is already a part of Neutron LBaaS IIRC. Yep. LBaaS is a work in progress, right? You mean more than OpenStack in general? :) The LBaaS API in Neutron has been working fine since Havana. It certainly has shortcomings, and it seems there is a big refactoring planned, though. Those of us using Nova networking are not feeling the love, unfortunately. That's to be expected. nova-network is going to be supported, but you won't get new features for it. As far as Heat goes, there is no LBaaS resource type. The OS::Neutron::LoadBalancer resource type does not have any health checking properties. There are four resources related to Neutron load balancing. OS::Neutron::LoadBalancer is probably the least useful and the one you can *not* use, as it's only there for compatibility with AWS::AutoScaling::AutoScalingGroup. OS::Neutron::HealthMonitor does the health checking part, although maybe not in the way you want it. -- Thomas
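For reference, wiring an OS::Neutron::HealthMonitor into a pool looks roughly like this. The property names are recalled from the Icehouse-era resource reference and should be checked against current documentation before use:

```yaml
heat_template_version: 2013-05-23
parameters:
  subnet:
    type: string
resources:
  monitor:
    type: OS::Neutron::HealthMonitor
    properties:
      type: HTTP
      url_path: /healthcheck
      expected_codes: "200"
      delay: 5
      max_retries: 3
      timeout: 2
  pool:
    type: OS::Neutron::Pool
    properties:
      protocol: HTTP
      lb_method: ROUND_ROBIN
      subnet_id: { get_param: subnet }
      vip: { protocol_port: 80 }
      monitors: [ { get_resource: monitor } ]
```

Note that, as discussed in this thread, the monitor's verdict only changes member status inside the load balancer backend; nothing is pushed outward to a scaling group.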
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
Clint Byrum cl...@fewbar.com wrote on 07/02/2014 01:54:49 PM: Excerpts from Qiming Teng's message of 2014-07-02 00:02:14 -0700: Just some random thoughts below ... On Tue, Jul 01, 2014 at 03:47:03PM -0400, Mike Spreitzer wrote: In AWS, an autoscaling group includes health maintenance functionality --- both an ability to detect basic forms of failures and an ability to react properly to failures detected by itself or by a load balancer. What is the thinking about how to get this functionality in OpenStack? Since We are prototyping a solution to this problem at IBM Research - China lab. The idea is to leverage oslo.messaging and ceilometer events for instance (possibly other resources such as port, security group ...) failure detection and handling. Hm.. perhaps you should be contributing some reviews here as you may have some real insight: https://review.openstack.org/#/c/100012/ This sounds a lot like what we're working on for continuous convergence. I noticed that health checking in AWS goes beyond convergence. In AWS an ELB can be configured with a URL to ping, for application-level health checking. And an ASG can simply be *told* the health of a member by a user's own external health system. I think we should have analogous functionality in OpenStack. Does that make sense to you? If so, do you have any opinion on the right way to integrate, so that we do not have three completely independent health maintenance systems? Thanks, Mike
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
Excerpts from Mike Spreitzer's message of 2014-07-16 10:50:42 -0700: ... I noticed that health checking in AWS goes beyond convergence. In AWS an ELB can be configured with a URL to ping, for application-level health checking. And an ASG can simply be *told* the health of a member by a user's own external health system. I think we should have analogous functionality in OpenStack. Does that make sense to you? If so, do you have any opinion on the right way to integrate, so that we do not have three completely independent health maintenance systems? The check url is already a part of Neutron LBaaS IIRC. What may not be a part is notifications for when all members are reporting down (which might be something to trigger scale-up). If we don't have push checks in our auto scaling implementation then we don't have a proper auto scaling implementation.
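The "URL to ping" style of application-level check is simple to sketch. This is stand-alone illustrative Python, not Heat or LBaaS code; the consecutive-failure rule mimics (but is not) AWS's UnhealthyThreshold behavior:

```python
import urllib.error
import urllib.request

def probe(url, timeout=2.0):
    """Return True if a GET on url yields a 2xx status, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.getcode() < 300
    except (urllib.error.URLError, OSError):
        return False

def member_health(results, max_retries=3):
    """Debounce a sequence of probe results: declare a member unhealthy
    only after max_retries consecutive failed probes, so a transient
    network blip does not trigger replacement."""
    streak = 0
    for ok in results:
        streak = 0 if ok else streak + 1
        if streak >= max_retries:
            return "unhealthy"
    return "healthy"
```

The debounce matters for the point raised later in the thread: a member failure caused by a temporary communication problem may clear quickly, and a naive single-probe rule would already have kicked off a replacement.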
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
On 7/16/14, 2:43 PM, Clint Byrum cl...@fewbar.com wrote: Excerpts from Mike Spreitzer's message of 2014-07-16 10:50:42 -0700: ... The check url is already a part of Neutron LBaaS IIRC. What may not be a part is notifications for when all members are reporting down (which might be something to trigger scale-up). You do recall correctly, and there are currently no mechanisms for notifying anything outside of the load balancer backend when the health monitor/member state changes. 
There is also currently no way for an external system to inject health information about an LB or its members. Both would be interesting additions. doug If we don't have push checks in our auto scaling implementation then we don't have a proper auto scaling implementation.
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
Doug Wiegley do...@a10networks.com wrote on 07/16/2014 04:58:52 PM: On 7/16/14, 2:43 PM, Clint Byrum cl...@fewbar.com wrote: Excerpts from Mike Spreitzer's message of 2014-07-16 10:50:42 -0700: ... I noticed that health checking in AWS goes beyond convergence. In AWS an ELB can be configured with a URL to ping, for application-level health checking. And an ASG can simply be *told* the health of a member by a user's own external health system. I think we should have analogous functionality in OpenStack. Does that make sense to you? If so, do you have any opinion on the right way to integrate, so that we do not have three completely independent health maintenance systems? The check url is already a part of Neutron LBaaS IIRC. Yep. LBaaS is a work in progress, right? Those of us using Nova networking are not feeling the love, unfortunately. As far as Heat goes, there is no LBaaS resource type. The OS::Neutron::LoadBalancer resource type does not have any health checking properties. The AWS::ElasticLoadBalancing::LoadBalancer does have a parameter that prescribes health checking --- but, as far as I know, there is no way to ask such a load balancer for its opinion of a member's health. What may not be a part is notifications for when all members are reporting down (which might be something to trigger scale-up). I do not think we want an ASG to react only when all members are down; I think an ASG should maintain at least its minimum size (although I have to admit that I do not understand why the current code has an explicit exception to that). You do recall correctly, and there are currently no mechanisms for notifying anything outside of the load balancer backend when the health monitor/member state changes. This is true in AWS as well. The AWS design is that you can configure the ASG to poll the ELB for its opinion of member health. 
The idea seems to be that an ASG can get health information from three kinds of sources (polling status in EC2, polling the ELB, and being explicitly informed), synthesize its own summary opinion, and react in due time. There is also currently no way for an external system to inject health information about an LB or its members. Both would be interesting additions. doug If we don't have push checks in our auto scaling implementation then we don't have a proper auto scaling implementation. I am not sure what is meant by push checks. Thanks, Mike
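That three-sources-one-opinion design can be sketched as follows. The combination rule here ("any reporting source saying down wins") and the planning step are my own illustrative choices, not documented AWS semantics:

```python
def summarize(ec2_status=None, elb_opinion=None, external_report=None):
    """Synthesize one per-member health opinion from up to three sources.

    Each argument is "up", "down", or None (no information).  A member is
    considered down if any reporting source says so; with no information
    at all, we assume it is up.
    """
    reports = [s for s in (ec2_status, elb_opinion, external_report)
               if s is not None]
    return "down" if "down" in reports else "up"

def plan(members, min_size):
    """Given {member: opinion}, decide what the group should do: delete
    the members judged down, and create enough replacements to keep at
    least min_size healthy members."""
    healthy = [m for m, op in members.items() if op == "up"]
    doomed = [m for m, op in members.items() if op == "down"]
    return {"delete": doomed, "create": max(min_size - len(healthy), 0)}
```

Note this is exactly the accounting discussed later in the thread: the group's reaction to bad health is first to correct its count of live members, and only then to converge back toward the desired size.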
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
Zane Bitter zbit...@redhat.com wrote on 07/01/2014 06:58:47 PM: On 01/07/14 15:47, Mike Spreitzer wrote: In AWS, an autoscaling group includes health maintenance functionality --- both an ability to detect basic forms of failures and an ability to react properly to failures detected by itself or by a load balancer. What is the thinking about how to get this functionality in OpenStack? Since OpenStack's OS::Heat::AutoScalingGroup has a more general member type, what is the thinking about what failure detection means (and how it would be accomplished, communicated)? I have not found design discussion of this; have I missed something? Yes :) https://review.openstack.org/#/c/95907/ The idea is that Convergence will provide health maintenance for _all_ forms of resources in Heat. Once this is implemented, autoscaling gets it for free by virtue of the fact that it manages resources using Heat stacks. Ah, right. My reading of that design is not quite so simple. Note that in the User Stories section it calls for different treatment of Compute instances depending on whether they are in a scaling group. That's why I was thinking of this from a scaling group perspective. But perhaps the more natural approach is to take the pervasive perspective and figure out how to suppress convergence for the Compute instances to which it should not apply. Thanks, Mike
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
Just some random thoughts below ... On Tue, Jul 01, 2014 at 03:47:03PM -0400, Mike Spreitzer wrote: In AWS, an autoscaling group includes health maintenance functionality --- both an ability to detect basic forms of failures and an ability to react properly to failures detected by itself or by a load balancer. What is the thinking about how to get this functionality in OpenStack? Since We are prototyping a solution to this problem at IBM Research - China lab. The idea is to leverage oslo.messaging and ceilometer events for instance (possibly other resources such as port, security group ...) failure detection and handling. OpenStack's OS::Heat::AutoScalingGroup has a more general member type, what is the thinking about what failure detection means (and how it would be accomplished, communicated)? When most OpenStack services are making use of oslo.notify, in theory, a service should be able to send/receive events related to resource status. In our current prototype, at least host failure (detected in Nova and reported with a patch), VM failure (detected by Nova), and some lifecycle events of other resources can be detected and then collected by Ceilometer. There is certainly a possibility to listen to the message queue directly from Heat, but we only implemented the Ceilometer-centric approach. I have not found design discussion of this; have I missed something? I suppose the natural answer for OpenStack would be centered around webhooks. An OpenStack scaling group (OS SG = OS::Heat::AutoScalingGroup or AWS::AutoScaling::AutoScalingGroup or OS::Heat::ResourceGroup or OS::Heat::InstanceGroup) could generate a webhook per member, with the meaning of the webhook being that the member has been detected as dead and should be deleted and removed from the group --- and a replacement member created if needed to respect the group's minimum size. Well, I would suggest we generalize this into an event messaging or signaling solution, instead of just 'webhooks'. 
The reason is that webhooks as implemented today do not carry a payload of useful information -- I'm referring to the alarms in Ceilometer. There are other cases as well. A member failure could be caused by a temporary communication problem, which means it may show up quickly when a replacement member is already being created. It may mean that we have to respond to an 'online' event in addition to an 'offline' event? When the member is a Compute instance and Ceilometer exists, the OS SG could define a Ceilometer alarm for each member (by including these alarms in the template generated for the nested stack that is the SG), programmed to hit the member's deletion webhook when death is detected (I imagine there are a few ways to write a Ceilometer condition that detects instance death). Yes. Compute instance failure can be detected with a Ceilometer plugin. In our prototype, we developed a Dispatcher plugin that can handle events like 'compute.instance.delete.end', 'compute.instance.create.end' after they have been processed based on an event_definitions.yaml file. There could be other ways, I think. The problem here today is about the recovery of an SG member. If it is a compute instance, we can 'reboot', 'rebuild', 'evacuate', 'migrate' it, just to name a few options. The most brutal way to do this is like what HARestarter is doing today -- delete followed by a create. When the member is a nested stack and Ceilometer exists, it could be the member stack's responsibility to include a Ceilometer alarm that detects the member stack's death and hit the member stack's deletion webhook. This is difficult. A '(nested) stack' is a Heat-specific abstraction -- recall that we have to annotate a nova server resource in its metadata to indicate to which stack this server belongs. Besides the 'visible' resources specified in a template, Heat may create internal data structures and/or resources (e.g. users) for a stack. 
I am not quite sure a stack's death can be easily detected from outside Heat. It would be at least cumbersome to have Heat notify Ceilometer that a stack is dead, and then have Ceilometer send back a signal. There is a small matter of how the author of the template used to create the member stack writes some template snippet that creates a Ceilometer alarm that is specific to a member stack that does not exist yet. How about just one signal responder per ScalingGroup? A SG is supposed to be in a better position to make the judgement: do I have to recreate a failed member? am I recreating it right now, or do I wait a few seconds? maybe I should recreate the member on some specific AZs? If there is only one signal responder per SG, then the 'webhook' (or resource signal, my preference) needs to carry a payload indicating when and which member is down. I suppose we could stipulate that if the member template includes a parameter with name member_name and type string then the OS SG
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
On Wed, Jul 02, 2014 at 03:02:14PM +0800, Qiming Teng wrote: Just some random thoughts below ... On Tue, Jul 01, 2014 at 03:47:03PM -0400, Mike Spreitzer wrote: In AWS, an autoscaling group includes health maintenance functionality --- both an ability to detect basic forms of failures and an ability to react properly to failures detected by itself or by a load balancer. What is the thinking about how to get this functionality in OpenStack? Since We are prototyping a solution to this problem at IBM Research - China lab. The idea is to leverage oslo.messaging and ceilometer events for instance (possibly other resources such as port, security group ...) failure detection and handling. This sounds interesting; are you planning to propose a spec for heat describing this work and submit your patches to heat? OpenStack's OS::Heat::AutoScalingGroup has a more general member type, what is the thinking about what failure detection means (and how it would be accomplished, communicated)? When most OpenStack services are making use of oslo.notify, in theory, a service should be able to send/receive events related to resource status. In our current prototype, at least host failure (detected in Nova and reported with a patch), VM failure (detected by Nova), and some lifecycle events of other resources can be detected and then collected by Ceilometer. There is certainly a possibility to listen to the message queue directly from Heat, but we only implemented the Ceilometer-centric approach. It has been pointed out a few times that in large deployments, different services may not share the same message bus. So while *an* option could be heat listening to the message bus, I'd prefer that we maintain the alarm notifications via the ReST API as the primary signalling mechanism. I have not found design discussion of this; have I missed something? I suppose the natural answer for OpenStack would be centered around webhooks. 
An OpenStack scaling group (OS SG = OS::Heat::AutoScalingGroup or AWS::AutoScaling::AutoScalingGroup or OS::Heat::ResourceGroup or OS::Heat::InstanceGroup) could generate a webhook per member, with the meaning of the webhook being that the member has been detected as dead and should be deleted and removed from the group --- and a replacement member created if needed to respect the group's minimum size. Well, I would suggest we generalize this into an event messaging or signaling solution, instead of just 'webhooks'. The reason is that webhooks as implemented today do not carry a payload of useful information -- I'm referring to the alarms in Ceilometer. The resource signal interface used by ceilometer can carry whatever data you like, so the existing solution works fine; we don't need a new one, IMO. For example, look at this patch which converts WaitConditions to use the resource_signal interface: https://review.openstack.org/#/c/101351/2/heat/engine/resources/wait_condition.py We pass the data to the WaitCondition via a resource signal, the exact same transport that is used for alarm notifications from ceilometer. Note that a webhook really just means a pre-signed request, and the v2 AWS-style signing scheme (currently the only option for heat pre-signed requests) does not sign the request body. This is a security disadvantage (addressed by the v3 AWS signing scheme), but it does mean you can pass data via the pre-signed URL. An alternative to pre-signed URLs is simply to make an authenticated call to the native ReST API, but then whatever is signalling requires either credentials, a token, or a trust to impersonate the stack owner. Again, you can pass whatever data you want via this interface. There are other cases as well. A member failure could be caused by a temporary communication problem, which means it may show up quickly when a replacement member is already being created. 
It may mean that we have to respond to an 'online' event in addition to an 'offline' event? When the member is a Compute instance and Ceilometer exists, the OS SG could define a Ceilometer alarm for each member (by including these alarms in the template generated for the nested stack that is the SG), programmed to hit the member's deletion webhook when death is detected (I imagine there are a few ways to write a Ceilometer condition that detects instance death). Yes. Compute instance failure can be detected with a Ceilometer plugin. In our prototype, we developed a Dispatcher plugin that can handle events like 'compute.instance.delete.end', 'compute.instance.create.end' after they have been processed based on an event_definitions.yaml file. There could be other ways, I think. Are you aware of the 'instance' existence meter in ceilometer? http://docs.openstack.org/developer/ceilometer/measurements.html I noticed that recently and wondered if it provides an initial metric we could
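To make Steve's point about resource signals concrete: a health alarm delivered to a pre-signed resource-signal URL could carry a body like the following. The transport (a JSON body POSTed to the signal URL) is what the thread describes as real; the field names here are invented for illustration only:

```python
import json

# Hypothetical alarm body an external health system (or Ceilometer)
# might POST to a scaling group's single resource-signal URL.
signal = {
    "alarm_id": "member-health",
    "current": "alarm",                      # ceilometer-style state name
    "reason_data": {
        "member": "server-3",                # which group member diverged
        "observed_at": "2014-07-02T10:15:00Z",
        "health_status": "unhealthy",
    },
}
payload = json.dumps(signal)
print(payload)
```

With one signal responder per scaling group, a payload of this shape answers the "when and which member is down" question Qiming raises, without needing a webhook per member.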
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
Mike Spreitzer/Watson/IBM@IBMUS wrote on 07/02/2014 02:41:48 AM: Zane Bitter zbit...@redhat.com wrote on 07/01/2014 06:58:47 PM: On 01/07/14 15:47, Mike Spreitzer wrote: In AWS, an autoscaling group includes health maintenance functionality --- both an ability to detect basic forms of failures and an ability to react properly to failures detected by itself or by a load balancer. What is the thinking about how to get this functionality in OpenStack? Since OpenStack's OS::Heat::AutoScalingGroup has a more general member type, what is the thinking about what failure detection means (and how it would be accomplished, communicated)? I have not found design discussion of this; have I missed something? Yes :) https://review.openstack.org/#/c/95907/ The idea is that Convergence will provide health maintenance for _all_ forms of resources in Heat. Once this is implemented, autoscaling gets it for free by virtue of the fact that it manages resources using Heat stacks. Ah, right. My reading of that design is not quite so simple... There are a couple more issues that arise in this approach. The biggest one is how to integrate application-level failure detection. I added a comment to this effect on the Convergence spec. Another issue is that, at least initially, Convergence is not always on; rather, it is an operation that can be invoked on a stack. When would a scaling group invoke this action on itself (more precisely, on itself considered as a nested stack)? One obvious possibility is on a periodic basis. If the convergence operation is pretty cheap when no divergence has been detected, then that might be acceptable. Otherwise we might want the scaling group to set up some sort of notification, but that would be another batch of member-type specific code. Thanks, Mike
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
Qiming Teng teng...@linux.vnet.ibm.com wrote on 07/02/2014 03:02:14 AM: Just some random thoughts below ... On Tue, Jul 01, 2014 at 03:47:03PM -0400, Mike Spreitzer wrote: ... I have not found design discussion of this; have I missed something? I suppose the natural answer for OpenStack would be centered around webhooks... Well, I would suggest we generalize this into an event messaging or signaling solution, instead of just 'webhooks'. The reason is that webhooks as implemented today do not carry a payload of useful information -- I'm referring to the alarms in Ceilometer. OK, this is great (and Steve Hardy provided more details in his reply); I did not know about the existing ability to pass a payload. However, Ceilometer alarms are still deficient in that way, right? A Ceilometer alarm's action list is simply a list of URLs, right? I would be happy to say let's generalize Ceilometer alarms to allow a payload in an action. There are other cases as well. A member failure could be caused by a temporary communication problem, which means it may show up quickly when a replacement member is already being created. It may mean that we have to respond to an 'online' event in addition to an 'offline' event? ... The problem here today is about the recovery of an SG member. If it is a compute instance, we can 'reboot', 'rebuild', 'evacuate', 'migrate' it, just to name a few options. The most brutal way to do this is like what HARestarter is doing today -- delete followed by a create. We could get into arbitrary subtlety, and maybe eventually will do better, but I think we can start with a simple solution that is widely applicable. 
The simple solution is that once the decision has been made to do convergence on a member (note that this is distinct from merely detecting and noting a divergence) then it will be done regardless of whether the doomed member later appears to have recovered, and the convergence action for a scaling group member is to delete the old member and create a replacement (not in that order). When the member is a nested stack and Ceilometer exists, it could be the member stack's responsibility to include a Ceilometer alarm that detects the member stack's death and hit the member stack's deletion webhook. This is difficult. A '(nested) stack' is a Heat-specific abstraction -- recall that we have to annotate a nova server resource in its metadata to indicate to which stack this server belongs. Besides the 'visible' resources specified in a template, Heat may create internal data structures and/or resources (e.g. users) for a stack. I am not quite sure a stack's death can be easily detected from outside Heat. It would be at least cumbersome to have Heat notify Ceilometer that a stack is dead, and then have Ceilometer send back a signal. A (nested) stack is not only a Heat-specific abstraction but its semantics and failure modes are specific to the stack (at least, its template). I think we have no practical choice but to let the template author declare how failure is detected. It could be as simple as creating Ceilometer alarms that detect the death of one or more resources in the nested stack; it could be more complicated Ceilometer stuff; it could be based on something other than, or in addition to, Ceilometer. If today there are not enough sensors to detect failures of all kinds of resources, I consider that a gap in telemetry (and think it is small enough that we can proceed usefully today, and should plan on filling that gap over time). 
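Mike's replacement rule — create the new member first, confirm it, then delete the failed one, and never revisit the decision — can be sketched like this. The create/delete/wait_active callables are stand-ins for whatever a convergence engine would use, not real Heat APIs:

```python
def converge_member(group, failed, create, delete, wait_active):
    """Replace a member once it has been marked for convergence.

    The replacement is created and confirmed ACTIVE before the failed
    member is deleted, so the group never dips further below its
    desired size.  Crucially, nothing here re-checks the failed member:
    once marked, it is replaced even if it seems to have recovered.
    """
    replacement = create(group)
    wait_active(replacement)
    delete(failed)
    return replacement
```

The create-before-delete ordering is the whole point of the parenthetical "(not in that order)": a delete-first rule would shrink capacity exactly when the group is already degraded.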
There is a small matter of how the author of the template used to create the member stack writes some template snippet that creates a Ceilometer alarm that is specific to a member stack that does not exist yet. How about just one signal responder per ScalingGroup? A SG is supposed to be in a better position to make the judgement: do I have to recreate a failed member? am I recreating it right now or wait a few seconds? maybe I should recreate the member on some specific AZs? That is confusing two issues. The thing that is new here is making the scaling group recognize member failure; the primary reaction is to update its accounting of members (which, in the current code, must be done by making sure the failed member is deleted); recovery of other scaling group aspects is fairly old-hat, as it is analogous to the problems that the scaling group already solves when asked to increase its size. ... I suppose we could stipulate that if the member template includes a parameter with name member_name and type string then the OS SG takes care of supplying the correct value of that parameter; as illustrated in the asg_of_stacks.yaml of https://review.openstack.org/#/c/97366/ , a member template can use a template parameter to tag Ceilometer data for querying. The URL of the member stack's deletion webhook could be passed to the member
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
Steven Hardy sha...@redhat.com wrote on 07/02/2014 06:02:36 AM: On Wed, Jul 02, 2014 at 03:02:14PM +0800, Qiming Teng wrote: Just some random thoughts below ... On Tue, Jul 01, 2014 at 03:47:03PM -0400, Mike Spreitzer wrote: ... The resource signal interface used by ceilometer can carry whatever data you like, so the existing solution works fine, we don't need a new one IMO. That's great, I did not know about it. Thanks for the details on this, I do think it is an improvement. Yes, it is a slight security downgrade --- the signed URL is effectively a capability, and allowing a payload increases the cross section of what an attacker can do with one stolen capability. But I am willing to live with it. ... Are you aware of the Existence of instance meter in ceilometer? I am, and was thinking it might be very directly usable. Has anybody tested or demonstrated this? ... How about just one signal responder per ScalingGroup? A SG is supposed to be in a better position to make the judgement: do I have to recreate a failed member? am I recreating it right now or wait a few seconds? maybe I should recreate the member on some specific AZs? This is what we have already - you have one ScalingPolicy (which is a SignalResponder), and the ScalingPolicy is the place where you make the decision about what to do with the data provided from the alarm. I think the existing ScalingPolicy is about a different issue. ScalingPolicy is mis-named; it is really only a ScalingAction, and it is about how to adjust the desired size. It does not address the key missing piece here, which is how the scaling group updates its accounting of the number of members it has. That accounting is done simply by counting members. So if a member becomes dysfunctional but remains extant, the scaling group logic continues to count it. Hmm, can a scaling group today properly cope with member deletion if prodded to do a ScalingPolicy(Action) that is 'add 1 member'? 
(I had considered 'add 0 members' but that fails to produce a change in an important case --- when the size is now below minimum (fun fact about the code!).) ... I am not in favor of the per-member webhook design. But I vote for an additional *implicit* parameter to a nested stack of any groups. It could be an index or a name. I agree, we just need appropriate metadata in ceilometer, which can then be passed back to heat via the resource signal when the alarm happens. We need to get the relevant meter samples in Ceilometer tagged with something that is unique to the [scaling group, member] pair and referenceable in the template source. For the case of a scaling group whose member type is nested stack, you could invent a way to implicitly pass such tagging down through all the intervening abstractions. I was supposing the preferred solution would be for the template author to explicitly do this. ... Thanks, Mike
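The parenthetical about 'add 0 members' can be illustrated with a toy model of the size calculation. This is a guess at the behavior being described (a no-op adjustment skipping the minimum-size clamp), not the actual Heat code:

```python
def adjust(current, delta, min_size, max_size):
    """Toy model of a scaling-group resize: apply delta, then clamp.

    Hypothetical simplification: if the adjustment does not change the
    desired size, nothing happens at all -- including the clamp.
    """
    new = current + delta
    if new == current:
        return current              # no-op adjustments skip the clamp
    return max(min_size, min(new, max_size))

# With size 1 and a minimum of 2, 'add 1' restores the minimum,
# but 'add 0' leaves the group below it.
```

Under this model, prodding an under-minimum group with 'add 1 member' repairs the accounting, while 'add 0 members' does not — which is the "important case" the message points at.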
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
On 02/07/14 02:41, Mike Spreitzer wrote: Zane Bitter zbit...@redhat.com wrote on 07/01/2014 06:58:47 PM: On 01/07/14 15:47, Mike Spreitzer wrote: In AWS, an autoscaling group includes health maintenance functionality --- both an ability to detect basic forms of failures and an ability to react properly to failures detected by itself or by a load balancer. What is the thinking about how to get this functionality in OpenStack? Since OpenStack's OS::Heat::AutoScalingGroup has a more general member type, what is the thinking about what failure detection means (and how it would be accomplished, communicated)? I have not found design discussion of this; have I missed something? Yes :) https://review.openstack.org/#/c/95907/ The idea is that Convergence will provide health maintenance for _all_ forms of resources in Heat. Once this is implemented, autoscaling gets it for free by virtue of the fact that it manages resources using Heat stacks. Ah, right. My reading of that design is not quite so simple. Note that in the User Stories section it calls for different treatment of Compute instances depending on whether they are in a scaling group. I don't believe that is a correct reading. - ZB
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
On Wed, Jul 02, 2014 at 11:02:36AM +0100, Steven Hardy wrote:

On Wed, Jul 02, 2014 at 03:02:14PM +0800, Qiming Teng wrote:

Just some random thoughts below ...

On Tue, Jul 01, 2014 at 03:47:03PM -0400, Mike Spreitzer wrote:

In AWS, an autoscaling group includes health maintenance functionality --- both an ability to detect basic forms of failures and an ability to react properly to failures detected by itself or by a load balancer. What is the thinking about how to get this functionality in OpenStack? Since

We are prototyping a solution to this problem at IBM Research - China lab. The idea is to leverage oslo.messaging and ceilometer events for instance (and possibly other resources such as port, securitygroup ...) failure detection and handling.

This sounds interesting, are you planning to propose a spec for heat describing this work and submit your patches to heat?

Steve, this work is still a prototype; we have yet to close the loop. The basic idea is:

1. Ensure nova server redundancy by providing a VMCluster resource type in the form of a Heat plugin. It could be contributed back to the community if proved useful. I have two concerns: 1) it is not a generic solution yet, because it lacks support for template resources; 2) instead of a new resource type, maybe a better approach is to add an optional group of properties to the Nova server specifying its HA requirements.

2. Detection of Host/VM failures. Currently we rely on Nova's detection of VM lifecycle events. I'm not sure it is applicable to hypervisors other than KVM. We have some patches to the ServiceGroup service in Nova so that Host failures can be detected and reported too. This could be a patch to Nova.

3. Recovery from Host/VM failures. We can either use the Events collected by Ceilometer directly, or have Ceilometer convert Events into Samples so that we can reuse the Alarm service (evaluator + notifier). Neither way is working now.
For the Event path, we are blocked by the authentication problem; for the Alarm path, we don't know how to carry a payload via the AlarmUrl. Some help and guidance would be highly appreciated.

OpenStack's OS::Heat::AutoScalingGroup has a more general member type, what is the thinking about what failure detection means (and how it would be accomplished, communicated)?

When most OpenStack services are making use of oslo.notify, in theory, a service should be able to send/receive events related to resource status. In our current prototype, at least host failure (detected in Nova and reported with a patch), VM failure (detected by Nova), and some lifecycle events of other resources can be detected and then collected by Ceilometer. There is certainly a possibility to listen to the message queue directly from Heat, but we only implemented the Ceilometer-centric approach.

It has been pointed out a few times that in large deployments, different services may not share the same message bus. So while *an* option could be Heat listening to the message bus, I'd prefer that we maintain the alarm notifications via the ReST API as the primary signalling mechanism.

Agreed. IIRC, somewhere in the Ceilometer documentation, it was suggested to use different queues for different purposes. No objection to keeping alarms as the primary mechanism till we have a compelling reason to change.

I have not found design discussion of this; have I missed something? I suppose the natural answer for OpenStack would be centered around webhooks. An OpenStack scaling group (OS SG = OS::Heat::AutoScalingGroup or AWS::AutoScaling::AutoScalingGroup or OS::Heat::ResourceGroup or OS::Heat::InstanceGroup) could generate a webhook per member, with the meaning of the webhook being that the member has been detected as dead and should be deleted and removed from the group --- and a replacement member created if needed to respect the group's minimum size.
Well, I would suggest we generalize this into an event messaging or signaling solution, instead of just 'webhooks'. The reason is that webhooks as implemented today do not carry a payload of useful information -- I'm referring to the alarms in Ceilometer.

The resource signal interface used by Ceilometer can carry whatever data you like, so the existing solution works fine; we don't need a new one IMO. For example, look at this patch which converts WaitConditions to use the resource_signal interface: https://review.openstack.org/#/c/101351/2/heat/engine/resources/wait_condition.py We pass the data to the WaitCondition via a resource signal, the exact same transport that is used for alarm notifications from Ceilometer.

I can understand the Heat-side processing of the signal payload. From the triggering side, I haven't seen an example showing me how to encode additional payload into the 'alarmUrl' string. Resource signal ReST call uses a
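The transport being discussed here -- a resource signal that is just an authenticated POST whose JSON body Heat hands to the resource -- can be pictured with a small sketch. The field names below imitate the rough shape of a Ceilometer alarm notification and are assumptions on my part, not a documented contract.

```python
# Hedged sketch: the JSON payload a notifier might POST to a Heat
# resource-signal URL. Field names are illustrative assumptions.
import json

def build_signal_body(alarm_id: str, state: str, reason: str) -> str:
    """Build a JSON payload for a resource signal; Heat passes it through."""
    return json.dumps({
        'alarm_id': alarm_id,
        'current': state,   # e.g. 'alarm', 'ok', 'insufficient data'
        'reason': reason,
    })

body = build_signal_body('alarm-123', 'alarm', 'member unreachable')
print(json.loads(body)['current'])  # -> alarm
```

The point of the WaitCondition patch cited above is that the body is opaque to the transport: the receiving resource decides how to interpret it, which is why the same mechanism serves both wait conditions and alarm notifications.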
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
On Wed, Jul 02, 2014 at 12:29:31PM -0400, Mike Spreitzer wrote:

Qiming Teng teng...@linux.vnet.ibm.com wrote on 07/02/2014 03:02:14 AM:

Just some random thoughts below ...

On Tue, Jul 01, 2014 at 03:47:03PM -0400, Mike Spreitzer wrote:

... I have not found design discussion of this; have I missed something? I suppose the natural answer for OpenStack would be centered around webhooks...

Well, I would suggest we generalize this into an event messaging or signaling solution, instead of just 'webhooks'. The reason is that webhooks as implemented today do not carry a payload of useful information -- I'm referring to the alarms in Ceilometer.

OK, this is great (and Steve Hardy provided more details in his reply), I did not know about the existing abilities to have a payload. However Ceilometer alarms are still deficient in that way, right? A Ceilometer alarm's action list is simply a list of URLs, right? I would be happy to say let's generalize Ceilometer alarms to allow a payload in an action.

Yes. Steve kindly pointed out that an alarm could be used to carry a payload, though this is not yet implemented. My concern is actually about 'flexibility'. For different purposes, an alarm may be required to carry payloads of different formats. We need a specification/protocol between Heat and Ceilometer so that Heat can specify in an alarm:

- tell me when/which instance is down/up when sending me an alarm about instance lifecycle;
- tell me which instances from my group are affected when a host is down;
- (other use cases?)

There are other cases as well. A member failure could be caused by a temporary communication problem, which means it may show up quickly when a replacement member is already being created. It may mean that we have to respond to an 'online' event in addition to an 'offline' event?

...

The problem here today is about the recovery of an SG member. If it is a compute instance, we can 'reboot', 'rebuild', 'evacuate', or 'migrate' it, just to name a few options.
The most brutal way to do this is like what HARestarter is doing today -- a delete followed by a create.

We could get into arbitrary subtlety, and maybe eventually will do better, but I think we can start with a simple solution that is widely applicable. The simple solution is that once the decision has been made to do convergence on a member (note that this is distinct from merely detecting and noting a divergence), then it will be done regardless of whether the doomed member later appears to have recovered, and the convergence action for a scaling group member is to delete the old member and create a replacement (not in that order).

Umh ... For transient errors, it won't be uncommon that some members may appear unreachable (e.g. from a load balancer), as a result of, say, image downloading saturating network bandwidth. Solving this using convergence logic? The observer sees only 2 members running instead of the desired 3, so the convergence engine starts to create a new member. Now, the previously disappeared member shows up again. What should the observer do? Would it be smart enough to know that this is the old member coming back to life and thus cancel the creation of the new member? Would it be able to recognize that this instance was part of a Resource Group at all?

When the member is a nested stack and Ceilometer exists, it could be the member stack's responsibility to include a Ceilometer alarm that detects the member stack's death and hits the member stack's deletion webhook.

This is difficult. A '(nested) stack' is a Heat-specific abstraction -- recall that we have to annotate a Nova server resource in its metadata to say which stack the server belongs to. Besides the 'visible' resources specified in a template, Heat may create internal data structures and/or resources (e.g. users) for a stack. I am not quite sure a stack's death can be easily detected from outside Heat.
It would be at least cumbersome to have Heat notify Ceilometer that a stack is dead, and then have Ceilometer send back a signal.

A (nested) stack is not only a Heat-specific abstraction, but its semantics and failure modes are specific to the stack (at least, to its template). I think we have no practical choice but to let the template author declare how failure is detected. It could be as simple as creating Ceilometer alarms that detect the death of one or more resources in the nested stack; it could be more complicated Ceilometer stuff; it could be based on something other than, or in addition to, Ceilometer. If today there are not enough sensors to detect failures of all kinds of resources, I consider that a gap in telemetry (and think it is small enough that we can proceed usefully today, and should plan on filling that gap over time).

My opinion is that we cannot blame Ceilometer for the lack of
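The "template author declares how failure is detected" idea could look roughly like the fragment below: a member template that includes its own death detector wired to a webhook URL received as a parameter. This is a hedged sketch following the shape of the autoscaling examples of that era; the parameter names, meter choice, and thresholds are illustrative assumptions, not a recommendation.

```yaml
heat_template_version: 2013-05-23
parameters:
  deletion_webhook:
    type: string
    description: URL the group exposes for deleting this member (assumption)
resources:
  server:
    type: OS::Nova::Server
    properties:
      image: my-image    # assumption
      flavor: m1.small   # assumption
      metadata:
        metering.stack: { get_param: 'OS::stack_id' }
  death_alarm:
    type: OS::Ceilometer::Alarm
    properties:
      description: crude liveness check -- no CPU samples above zero
      meter_name: cpu_util
      statistic: avg
      period: 60
      evaluation_periods: 3
      threshold: 0
      comparison_operator: le
      alarm_actions: [ { get_param: deletion_webhook } ]
      matching_metadata:
        metadata.user_metadata.stack: { get_param: 'OS::stack_id' }
```

Whether `cpu_util` going silent is an adequate death signal is exactly the telemetry-gap question raised above; the point is only that the declaration lives in the member template, where the author knows what "dead" means.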
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
On Wed, Jul 02, 2014 at 10:54:49AM -0700, Clint Byrum wrote:

Excerpts from Qiming Teng's message of 2014-07-02 00:02:14 -0700:

Just some random thoughts below ...

On Tue, Jul 01, 2014 at 03:47:03PM -0400, Mike Spreitzer wrote:

In AWS, an autoscaling group includes health maintenance functionality --- both an ability to detect basic forms of failures and an ability to react properly to failures detected by itself or by a load balancer. What is the thinking about how to get this functionality in OpenStack? Since

We are prototyping a solution to this problem at IBM Research - China lab. The idea is to leverage oslo.messaging and ceilometer events for instance (and possibly other resources such as port, securitygroup ...) failure detection and handling.

Hm.. perhaps you should be contributing some reviews here, as you may have some real insight: https://review.openstack.org/#/c/100012/ This sounds a lot like what we're working on for continuous convergence.

Great. I will look into this spec and see if I can contribute some ideas.

OpenStack's OS::Heat::AutoScalingGroup has a more general member type, what is the thinking about what failure detection means (and how it would be accomplished, communicated)?

When most OpenStack services are making use of oslo.notify, in theory, a service should be able to send/receive events related to resource status. In our current prototype, at least host failure (detected in Nova and reported with a patch), VM failure (detected by Nova), and some lifecycle events of other resources can be detected and then collected by Ceilometer. There is certainly a possibility to listen to the message queue directly from Heat, but we only implemented the Ceilometer-centric approach.

I have not found design discussion of this; have I missed something? I suppose the natural answer for OpenStack would be centered around webhooks.
An OpenStack scaling group (OS SG = OS::Heat::AutoScalingGroup or AWS::AutoScaling::AutoScalingGroup or OS::Heat::ResourceGroup or OS::Heat::InstanceGroup) could generate a webhook per member, with the meaning of the webhook being that the member has been detected as dead and should be deleted and removed from the group --- and a replacement member created if needed to respect the group's minimum size.

Well, I would suggest we generalize this into an event messaging or signaling solution, instead of just 'webhooks'. The reason is that webhooks as implemented today do not carry a payload of useful information -- I'm referring to the alarms in Ceilometer.

There are other cases as well. A member failure could be caused by a temporary communication problem, which means it may show up quickly when a replacement member is already being created. It may mean that we have to respond to an 'online' event in addition to an 'offline' event?

The ideas behind convergence help a lot here. Skew happens in distributed systems, so we expect it constantly. In the extra-capacity situation above, we would just deal with it by scaling back down. There are also situations where we might accidentally create two physical resources because we got a 500 from the API even though the resource was actually being created. This is the same problem, and has the same answer: pick one and scale down (and if this is a critical server like a database, we'll need lifecycle callbacks that will prevent suddenly killing a node in a way that would cost you uptime or recovery time).

Glad to know this is considered and handled with a generic solution. As for recovering a server, I still suggest we have per-resource-type restart logic. In the case of a nested stack, callbacks seem the right way to go.
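The "skew happens, just scale back down" reasoning above can be sketched as a toy reconciliation step: compare observed members against the desired count, delete the unhealthy ones, and if a flapping member reappeared after a replacement was created, drop the surplus. The names (`Member`, `reconcile`) are illustrative, not Heat's actual convergence API.

```python
# Hedged sketch of the convergence idea: converge observed membership
# toward a desired count, tolerating skew (members coming back to life).
from dataclasses import dataclass

@dataclass(frozen=True)
class Member:
    name: str
    healthy: bool

def reconcile(observed: list[Member], desired_count: int):
    """Return (names_to_delete, num_to_create) to reach desired_count."""
    # Unhealthy members are always deletion candidates.
    to_delete = [m.name for m in observed if not m.healthy]
    survivors = [m for m in observed if m.healthy]
    if len(survivors) > desired_count:
        # Skew case from the thread: a "dead" member came back after a
        # replacement was already created -- pick surplus members and drop them.
        to_delete += [m.name for m in survivors[desired_count:]]
        survivors = survivors[:desired_count]
    return to_delete, desired_count - len(survivors)

# Example: 3 desired; one member flapped and a replacement already exists,
# so one healthy member is simply scaled back down.
members = [Member('a', True), Member('b', True),
           Member('b-replacement', True), Member('c', True)]
print(reconcile(members, 3))
```

Which surplus member gets dropped is arbitrary here; the lifecycle-callback point in the thread is precisely about making that choice safely for stateful nodes.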
When the member is a Compute instance and Ceilometer exists, the OS SG could define a Ceilometer alarm for each member (by including these alarms in the template generated for the nested stack that is the SG), programmed to hit the member's deletion webhook when death is detected (I imagine there are a few ways to write a Ceilometer condition that detects instance death).

Yes. Compute instance failure can be detected with a Ceilometer plugin. In our prototype, we developed a Dispatcher plugin that can handle events like 'compute.instance.delete.end' and 'compute.instance.create.end' after they have been processed based on an event_definitions.yaml file. There could be other ways, I think.

The problem here today is about the recovery of an SG member. If it is a compute instance, we can 'reboot', 'rebuild', 'evacuate', or 'migrate' it, just to name a few options. The most brutal way to do this is like what HARestarter is doing today -- a delete followed by a create.

Right, so lifecycle callbacks are useful here, as we can expose an interface for delaying and even cancelling a lifecycle event.
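The dispatcher-plugin idea described above -- routing notification event types to handlers -- can be sketched in a few lines. The handler names, event payload shape, and return values are illustrative assumptions; only the event type strings ('compute.instance.delete.end', 'compute.instance.create.end') come from the thread.

```python
# Hedged sketch of an event dispatcher: map Ceilometer/oslo event types
# (as named in an event_definitions.yaml) to recovery handlers.
from typing import Callable

HANDLERS: dict[str, Callable[[dict], str]] = {}

def handles(event_type: str):
    """Decorator registering a handler for one event type."""
    def register(fn):
        HANDLERS[event_type] = fn
        return fn
    return register

@handles('compute.instance.delete.end')
def on_instance_deleted(event: dict) -> str:
    # A group member disappeared: signal the group to replace it.
    return f"replace {event['instance_id']}"

@handles('compute.instance.create.end')
def on_instance_created(event: dict) -> str:
    # A member came up: start tracking its health.
    return f"track {event['instance_id']}"

def dispatch(event: dict) -> str:
    handler = HANDLERS.get(event['event_type'])
    return handler(event) if handler else 'ignored'

print(dispatch({'event_type': 'compute.instance.delete.end',
                'instance_id': 'vm-1'}))  # -> replace vm-1
```

A real plugin would of course emit a resource signal or scaling action rather than return a string, but the routing shape is the same.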
[openstack-dev] [heat] health maintenance in autoscaling groups
In AWS, an autoscaling group includes health maintenance functionality --- both an ability to detect basic forms of failures and an ability to react properly to failures detected by itself or by a load balancer. What is the thinking about how to get this functionality in OpenStack? Since OpenStack's OS::Heat::AutoScalingGroup has a more general member type, what is the thinking about what failure detection means (and how it would be accomplished, communicated)? I have not found design discussion of this; have I missed something?

I suppose the natural answer for OpenStack would be centered around webhooks. An OpenStack scaling group (OS SG = OS::Heat::AutoScalingGroup or AWS::AutoScaling::AutoScalingGroup or OS::Heat::ResourceGroup or OS::Heat::InstanceGroup) could generate a webhook per member, with the meaning of the webhook being that the member has been detected as dead and should be deleted and removed from the group --- and a replacement member created if needed to respect the group's minimum size.

When the member is a Compute instance and Ceilometer exists, the OS SG could define a Ceilometer alarm for each member (by including these alarms in the template generated for the nested stack that is the SG), programmed to hit the member's deletion webhook when death is detected (I imagine there are a few ways to write a Ceilometer condition that detects instance death).

When the member is a nested stack and Ceilometer exists, it could be the member stack's responsibility to include a Ceilometer alarm that detects the member stack's death and hit the member stack's deletion webhook. There is a small matter of how the author of the template used to create the member stack writes some template snippet that creates a Ceilometer alarm that is specific to a member stack that does not exist yet.
I suppose we could stipulate that if the member template includes a parameter with name member_name and type string, then the OS SG takes care of supplying the correct value of that parameter; as illustrated in the asg_of_stacks.yaml of https://review.openstack.org/#/c/97366/ , a member template can use a template parameter to tag Ceilometer data for querying. The URL of the member stack's deletion webhook could be passed to the member template via the same sort of convention.

When Ceilometer does not exist, it is less obvious to me what could usefully be done. Are there any useful SG member types besides Compute instances and nested stacks? Note that a nested stack could also pass its member deletion webhook to a load balancer (that is willing to accept such a thing, of course), so we get a lot of unity of mechanism between the case of detection by infrastructure vs. application-level detection.

I am not entirely happy with the idea of a webhook per member. If I understand correctly, generating webhooks is a somewhat expensive and problematic process. What would be the alternative?

Thanks, Mike
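The stipulated convention could look roughly like this member template skeleton: the group fills in member_name (and, by the same convention, the deletion webhook URL) when it instantiates each member. The member_name parameter name comes from the proposal above; the second parameter name and all resource details are illustrative assumptions.

```yaml
heat_template_version: 2013-05-23
parameters:
  member_name:
    type: string
    description: supplied by the scaling group, per the stipulated convention
  deletion_webhook:   # hypothetical companion parameter, same convention
    type: string
    description: URL to hit when this member is detected dead
resources:
  server:
    type: OS::Nova::Server
    properties:
      image: my-image   # assumption
      flavor: m1.small  # assumption
      metadata:
        # Tag this member's Ceilometer data for querying, as in
        # asg_of_stacks.yaml; an alarm elsewhere in this template could
        # then match on the tag and fire { get_param: deletion_webhook }.
        metering.member: { get_param: member_name }
```

The template author never names a concrete member; the group supplies the identity at stack-create time, which is what resolves the "alarm specific to a member stack that does not exist yet" problem.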
Re: [openstack-dev] [heat] health maintenance in autoscaling groups
On 01/07/14 15:47, Mike Spreitzer wrote:

In AWS, an autoscaling group includes health maintenance functionality --- both an ability to detect basic forms of failures and an ability to react properly to failures detected by itself or by a load balancer. What is the thinking about how to get this functionality in OpenStack? Since OpenStack's OS::Heat::AutoScalingGroup has a more general member type, what is the thinking about what failure detection means (and how it would be accomplished, communicated)? I have not found design discussion of this; have I missed something?

Yes :) https://review.openstack.org/#/c/95907/

The idea is that Convergence will provide health maintenance for _all_ forms of resources in Heat. Once this is implemented, autoscaling gets it for free by virtue of the fact that it manages resources using Heat stacks.

cheers, Zane.