Re: [Openstack] running HA cluster of guests within openstack
On 04/13/2012 12:53 PM, Pádraig Brady wrote:
> On 04/13/2012 10:31 AM, ikke wrote:
>> I likely am not the first one to ask this, but since I didn't find a thread about it I start one. Is there any shared experience available what are the capabilities of OpenStack to run cluster of guests in the cloud? Do you have experience of the following questions, or links to more info? The questions relate to running a legacy HA cluster in virtual env, and moving it into cloud...
>
> I'll just point out two early stage projects that used in combination can provide a HA solution.
>
> http://wiki.openstack.org/Heat
> http://wiki.openstack.org/ResourceMonitorAlertsandNotifications
>
> These are similar to AWS CloudFormation and CloudWatch respectively.

I notice Heat V4 has just been released. Here is some additional info on High Availability:
https://github.com/heat-api/heat/wiki/Roadmap-Feature:-High-Availability
and some notes on using it in its current form:
https://github.com/heat-api/heat/wiki/Using-HA

cheers,
Pádraig.

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help : https://help.launchpad.net/ListHelp
Re: [Openstack] running HA cluster of guests within openstack
On Fri, Apr 13, 2012 at 5:45 PM, Jason Kölker jkoel...@rackspace.com wrote:
> On Fri, 2012-04-13 at 12:31 +0300, ikke wrote:
>> 1. Private networks between guests
>>    - Doable now using Quantum
>> 1.1. Defining VLANs visible to guest machines to separate clusters internal traffic, VLAN tags should not be stripped by host (QinQ)
>
> VLANs and Quantum private networks are pretty much the same thing, why would you want both?

For legacy reasons. At the moment the cluster handles its internal network with VLANs, and for such a cluster the cloud layer should simply virtualize the HW functionality. It would need to provide the VLAN layer to the guests for the time being, until the guest can be modified not to require it and to handle VLAN network configuration via OpenStack interfaces instead.

Some of the questions are due to the legacy need. OpenStack would offer similar functionality, but if you intend to bring legacy apps as such into the cloud, plenty of modifications are needed to adapt the legacy SW to cloud concepts. Adaptation takes time, and in some cases it might be cheaper and faster to adapt the cloud layer to provide the legacy HW in virtualized form, as a HW abstraction layer. By legacy SW I mean a HUGE amount of code written over decades, which is not easily modifiable.

>> 1.2. Set pre-defined MAC addresses for the guests, needed by non-IP traffic within the guest cluster (layer2 addressing)
>
> If you send the mac address to Melange when you create the interface it will record it for that instance:
>
> http://melange.readthedocs.org/en/latest/apidoc.html#interfaces

Thanks for the link, it is exactly what I was looking for!

-it
Re: [Openstack] running HA cluster of guests within openstack
On Fri, Apr 13, 2012 at 2:53 PM, Pádraig Brady p...@draigbrady.com wrote:
> On 04/13/2012 10:31 AM, ikke wrote:
>
> I'll just point out two early stage projects that used in combination can provide a HA solution.
>
> http://wiki.openstack.org/Heat
> http://wiki.openstack.org/ResourceMonitorAlertsandNotifications
>
> cheers,
> Pádraig.

Thanks for the links, I'll look into them. It looks good to have a pluggable monitoring interface. From a quick look I don't see how the local driver connects to libvirt, or whether an alert is delivered quickly or is based on periodic polling. I need to take a further look into it.

Hopefully there could be a local HW watchdog emulated in Qemu that would somehow be connected to the plugin framework, to allow fast reaction times when a guest is stuck. It would also make sense to have some kind of local decision made immediately about rebooting a stuck guest, instead of taking time to report it centrally and wait for the central manager's decision.

cheers,
Ilkka
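
To illustrate the "local decision" idea above, here is a minimal Python sketch of a host-local watchdog for one guest. It is not an existing Nova, libvirt, or Qemu interface: `GuestWatchdog`, `kick`, `poll`, and the `on_expire` callback are hypothetical names, and how the host would actually receive the guest's sign of life is left open.

```python
import time


class GuestWatchdog:
    """Hypothetical host-local watchdog for a single guest (illustration only)."""

    def __init__(self, timeout_s, on_expire, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.on_expire = on_expire  # local action, e.g. reboot the stuck guest
        self.clock = clock          # injectable so the logic can be tested
        self.last_kick = clock()
        self.fired = False

    def kick(self):
        """Record a sign of life from the guest and re-arm the watchdog."""
        self.last_kick = self.clock()
        self.fired = False

    def poll(self):
        """Check the deadline; run the local action once if it was missed."""
        if not self.fired and self.clock() - self.last_kick > self.timeout_s:
            self.fired = True
            self.on_expire()
        return self.fired
```

The point of the sketch is only that the expiry action runs on the host itself, so the reaction time is bounded by the local timeout and poll interval rather than by a round trip to a central manager.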
Re: [Openstack] running HA cluster of guests within openstack
One more item for the HA features: hot plugging.

2.8. Hot plug pre-warning events
   - Nova should tell the registered client that a node/guest is going to be shut off, and the remote entity would be given time to ack that.
[Openstack] running HA cluster of guests within openstack
I likely am not the first one to ask this, but since I didn't find a thread about it, I'll start one.

Is there any shared experience available on OpenStack's capabilities for running a cluster of guests in the cloud? Do you have experience with the following questions, or links to more info? The questions relate to running a legacy HA cluster in a virtual environment and moving it into the cloud...

1. Private networks between guests
   - Doable now using Quantum
1.1. Defining VLANs visible to guest machines to separate the cluster's internal traffic; VLAN tags should not be stripped by the host (QinQ)
1.2. Set pre-defined MAC addresses for the guests, needed by non-IP traffic within the guest cluster (layer2 addressing)
   - will Melange do this? According to the docs it's not in the plans.

2. HA capabilities
2.1. Failure notification times need to be fast, i.e. no TCP timeout allowed
   - there seems to be some activity to integrate Pacemaker
2.2. Failure notification of both guests and hosts needs to be included
2.3. Guest cluster controller should be able to monitor the states, and get fast notifications of the events
   - rather in milliseconds than in seconds
   - basically the host should have the parent of the guest pid notifying of a child process failure
   - the host should have a virtual watchdog that notices a guest being stuck
2.4. Failure recovery time: how fast can OpenStack bring up a failed guest?
   - any measurements of the time from failure to noticing it, and the time until the guest is restarted and back up?
2.5. Virtual HW manager (guest isolation)
   - Any plans to integrate a piece from which the state of a guest could be reliably queried? E.g. guaranteeing that if I ask to power off another guest, it gets done within a given time (milliseconds) and does not depend on e.g. some TCP timeout, which would lead to a split-brain case of running two similar guests simultaneously. For example: starting another guest to replace one that was shut down, but due to some communications error the first one didn't really shut down before the new one is already up.
   - should be able to reliably cut off the guest's network and disk access to guard against the above case
2.6. Shared disks
   - Could there be a shared SCSI device concept for the legacy HW abstraction?
   - Qemu/KVM supports this; what would it take to make OpenStack understand such disk devices?
2.7. Isolation of redundant nodes
   - In some cases there are nodes that need to back up each other (2N, N+1); there should be a way to make sure they run on different hosts.
   - This project might be aiming for that? http://wiki.openstack.org/DistributedScheduler

This was something off the top of my head; it would be interesting to hear your thoughts about the issues. This need is coming from the telco world, which would need a telco-cloud with more such real-time features in it. Certainly the same applies to many other legacy environments too.

BR,
Ilkka Tengvall
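
The placement constraint in 2.7 (members of one redundant group must land on different hosts) can be sketched in a few lines of Python. This is only an illustration of the constraint, not how the Nova scheduler works; `place_redundant_group` and the free-slot dictionary are hypothetical.

```python
def place_redundant_group(members, free_slots):
    """Assign each member of a redundant group to a distinct host.

    members:    list of guest names that back each other up (2N, N+1, ...)
    free_slots: dict mapping host name -> number of free guest slots
    Returns a dict guest -> host, or raises ValueError if the group cannot
    be spread over distinct hosts.
    """
    # Only hosts with capacity qualify; prefer the emptiest host first,
    # breaking ties by name so the result is deterministic.
    candidates = sorted(
        (host for host, free in free_slots.items() if free > 0),
        key=lambda host: (-free_slots[host], host),
    )
    if len(candidates) < len(members):
        raise ValueError("not enough distinct hosts for the redundant group")
    # zip() pairs each member with a different host, which is the whole point.
    return dict(zip(members, candidates))
```

For example, `place_redundant_group(["node-a", "node-b"], {"h1": 1, "h2": 3, "h3": 0})` puts the two nodes on `h2` and `h1` and never on the same host.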
Re: [Openstack] running HA cluster of guests within openstack
On 04/13/2012 10:31 AM, ikke wrote:
> I likely am not the first one to ask this, but since I didn't find a thread about it I start one. Is there any shared experience available what are the capabilities of OpenStack to run cluster of guests in the cloud? Do you have experience of the following questions, or links to more info? The questions relate to running a legacy HA cluster in virtual env, and moving it into cloud...

I'll just point out two early stage projects that, used in combination, can provide an HA solution.

http://wiki.openstack.org/Heat
http://wiki.openstack.org/ResourceMonitorAlertsandNotifications

These are similar to AWS CloudFormation and CloudWatch respectively.

cheers,
Pádraig.
Re: [Openstack] running HA cluster of guests within openstack
On Apr 13, 2012, at 4:31 AM, ikke wrote:
> 2.5. virtual HW manager (guest isolation)
>    - Any plans to integrate a piece from which a state of guest could be reliably queried, e.g. guaranteeing that if I ask to power off another guest, it gets done in given time (millisecs), and not pending on e.g. some tcp timeout, and thus leading to split brain case of running two similar guest simultaneously. E.g. starting another guest to replace shut down one, but due to some communications error the first one didn't really shut before the new one is already up.
>    - should be able to reliably cut down the guests network and disk access to guarantee the above case

This would be a huge win for clustering. Having a reliable and immediate STONITH capability within a virtual environment would be really handy for environments which have sensitive needs for shared storage (whether it's remote iSCSI storage or DRBD). It would be relatively trivial to assemble a fencing daemon to make requests to the API to hard reboot a misbehaving member of a cluster.

Good points!

--
Major Hayden
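
As a rough sketch of the request such a fencing daemon might send, the Python below builds a hard-reboot call against the Compute API's server action resource (`POST .../servers/{id}/action` with a `reboot` body of type `HARD`). The endpoint URL, token, and server ID are placeholders, and the helper names `build_hard_reboot_request` and `fence` are made up for this example; authentication and retry handling are left out.

```python
import json
import urllib.request


def build_hard_reboot_request(compute_url, token, server_id):
    """Build (but do not send) a hard-reboot request for one server."""
    body = json.dumps({"reboot": {"type": "HARD"}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{compute_url.rstrip('/')}/servers/{server_id}/action",
        data=body,
        headers={"X-Auth-Token": token, "Content-Type": "application/json"},
        method="POST",
    )


def fence(compute_url, token, server_id, timeout_s=2.0):
    """Send the request; True means the API accepted it (any 2xx status).

    urlopen raises on 4xx/5xx responses and on timeout, so a caller must
    treat an exception as "fencing not confirmed".
    """
    req = build_hard_reboot_request(compute_url, token, server_id)
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return 200 <= resp.status < 300
```

Note that an accepted request only says the API took the order. It does not by itself prove that the guest has stopped touching shared storage, which is exactly the guarantee asked for in 2.5.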
Re: [Openstack] running HA cluster of guests within openstack
On Fri, 2012-04-13 at 12:31 +0300, ikke wrote:
> 1. Private networks between guests
>    - Doable now using Quantum
> 1.1. Defining VLANs visible to guest machines to separate clusters internal traffic, VLAN tags should not be stripped by host (QinQ)

VLANs and Quantum private networks are pretty much the same thing; why would you want both?

> 1.2. Set pre-defined MAC addresses for the guests, needed by non-IP traffic within the guest cluster (layer2 addressing)
>    - will Melange do this, according to docs it's not in plans?

If you send the MAC address to Melange when you create the interface, it will record it for that instance:

http://melange.readthedocs.org/en/latest/apidoc.html#interfaces

Happy Hacking!

7-11
Re: [Openstack] running HA cluster of guests within openstack
Hi Ikke,

great work! :-)

On 13.04.12 11:31, ikke wrote:
> I likely am not the first one to ask this, but since I didn't find a thread about it I start one. Is there any shared experience available what are the capabilities of OpenStack to run cluster of guests in the cloud? Do you have experience of the following questions, or links to more info? The questions relate to running a legacy HA cluster in virtual env, and moving it into cloud...
>
> 1. Private networks between guests
> [...]
> BR,
> Ilkka Tengvall

I think, as Major pointed out already, that the biggest problem right now is a certain lack of easy-to-use STONITH solutions for triggering STONITH events from within virtual machines. I have something cooking here using the latest version of Pacemaker; should this turn out to work, it would make many things a lot easier. I'll elaborate a little bit more on this once I have it working the way I want it.

Concerning the general subject of virtual machines (and clustered VMs for that matter) within OpenStack, I think there is some stuff missing in Nova that would be necessary (granted -- in one way or another, it would be possible to make Pacemaker deal with VMs that have failed within Nova, but in my eyes, that'd be crazy). Nova knows what VMs are supposed to be there and Nova can find out which VMs are in fact running and which are not, so I think Nova should make sure that those VMs that are supposed to run are, well, running :)

Best regards
Martin

--
Martin Gerhard Loschwitz
Chief Brand Officer, Principal Consultant
hastexo Professional Services
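
The reconciliation Martin describes (compare what should be running with what is running, then restart the difference) can be sketched as below. This is a hypothetical illustration, not Nova code: `instances_to_restart`, `reconcile`, and the `restart` callback are invented names, and how the two lists are obtained is assumed to be available elsewhere.

```python
def instances_to_restart(expected, running):
    """Return the instances that should be running but are not."""
    return sorted(set(expected) - set(running))


def reconcile(expected, running, restart):
    """Call restart(instance) for every missing instance; return that list."""
    missing = instances_to_restart(expected, running)
    for instance in missing:
        restart(instance)
    return missing
```

A periodic task could call `reconcile` with the set of instances recorded as active and the set reported by the hypervisor; instances that are running but not expected are deliberately ignored here.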