Re: [Openstack] running HA cluster of guests within openstack

2012-06-27 Thread Pádraig Brady
On 04/13/2012 12:53 PM, Pádraig Brady wrote:
 On 04/13/2012 10:31 AM, ikke wrote:
 I likely am not the first one to ask this, but since I didn't find a
 thread about it I start one.

 Is there any shared experience available what are the capabilities of
 OpenStack to run cluster of guests in the cloud? Do you have
 experience of the following questions, or links to more info? The
 questions relate to running a legacy HA cluster in virtual env, and
 moving it into cloud...
 
 I'll just point out two early stage projects
 that used in combination can provide a HA solution.
 
 http://wiki.openstack.org/Heat
 http://wiki.openstack.org/ResourceMonitorAlertsandNotifications
 
 These are similar to AWS CloudFormation and CloudWatch respectively.

I notice Heat V4 has just been released.

Here is some additional info on High Availability:
https://github.com/heat-api/heat/wiki/Roadmap-Feature:-High-Availability

and some notes on using it in its current form:
https://github.com/heat-api/heat/wiki/Using-HA

cheers,
Pádraig.

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] running HA cluster of guests within openstack

2012-04-16 Thread ikke
On Fri, Apr 13, 2012 at 5:45 PM, Jason Kölker jkoel...@rackspace.com wrote:
 On Fri, 2012-04-13 at 12:31 +0300, ikke wrote:

 1. Private networks between guests
   - Doable now using Quantum
 1.1. Defining VLANs visible to guest machines to separate clusters
 internal traffic,
        VLAN tags should not be stripped by host (QinQ)

 VLANs and Quantum private networks are pretty much the same thing, why
 would you want both?

For legacy reasons. The cluster at the moment handles the cluster
internal network with VLANs, and for such the cloud layer should just
virtualize the HW functionality. It would need to provide the VLAN
layer for guests for the time being until the guest could be modified
not to require it and handle VLAN network configuration via OpenStack
interfaces instead.

Some of the questions are due to the legacy need. OpenStack would offer
similar functionality, but if you intend to bring legacy apps as
such into the cloud, there are plenty of modifications needed to adapt
the legacy SW to cloud concepts. Adaptation takes time, and in some
cases it might be cheaper and faster to adapt the cloud layer to provide
the legacy HW as a virtualized HW abstraction layer.

When talking about legacy SW, I mean a HUGE amount of code written over
decades, which is not easily modifiable.

 1.2. Set pre-defined MAC addresses for the guests, needed by non-IP
        traffic within the guest cluster (layer2 addressing)
 If you send the mac address to Melange when you create the interface it
 will record it for that instance:

 http://melange.readthedocs.org/en/latest/apidoc.html#interfaces

Thanks for the link, it is exactly what I was looking for!

 -it



Re: [Openstack] running HA cluster of guests within openstack

2012-04-16 Thread ikke
On Fri, Apr 13, 2012 at 2:53 PM, Pádraig Brady p...@draigbrady.com wrote:
 On 04/13/2012 10:31 AM, ikke wrote:
 I'll just point out two early stage projects
 that used in combination can provide a HA solution.

 http://wiki.openstack.org/Heat
 http://wiki.openstack.org/ResourceMonitorAlertsandNotifications
 cheers,
 Pádraig.

Thanks for the links, I'll look into them. A pluggable monitoring
interface looks good. From a quick look I don't see how the local
driver connects to libvirt, or whether alerts are delivered quickly
or rely on periodic polling. I need to take a further look into it.

Hopefully there could be a local HW watchdog emulated in Qemu that would
somehow be connected to the plugin framework to allow fast reaction
times when a guest is stuck.

Also, it would make sense to have some kind of local decision made
immediately about rebooting a stuck guest, instead of taking time
to report it centrally and wait for the central manager's decision.
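
A minimal sketch of such a local decision, assuming the guest sends
periodic heartbeats to a monitor running on its own host. The class
name, the heartbeat mechanism and the reboot callback are all
hypothetical; this is not how Qemu, libvirt or the monitoring plugin
framework actually work.

```python
import time


class GuestWatchdog:
    """Host-local watchdog: a guest counts as stuck once its heartbeats stop."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock          # injectable so the logic can be tested
        self.last_seen = {}         # guest name -> time of last heartbeat

    def heartbeat(self, guest):
        """Record that the guest is alive right now."""
        self.last_seen[guest] = self.clock()

    def stuck_guests(self):
        """Return guests whose last heartbeat is older than the timeout."""
        now = self.clock()
        return sorted(g for g, t in self.last_seen.items()
                      if now - t > self.timeout_s)

    def check(self, reboot):
        """Call reboot(guest) for each stuck guest without a central round trip."""
        stuck = self.stuck_guests()
        for guest in stuck:
            reboot(guest)
            # Restart the timer so the guest gets a full timeout to come back.
            self.last_seen[guest] = self.clock()
        return stuck
```

The reboot callback is where a report to the central manager could
still be sent afterwards, so the local action does not have to wait
for it.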

cheers,
Ilkka



Re: [Openstack] running HA cluster of guests within openstack

2012-04-16 Thread ikke
One more item for the HA features: hot plugging.

2.8. Hot plug pre-warning events.
- Nova should tell the registered client that a node/guest is going to
  be shut off, and the remote entity would be given time to ack that.
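
A rough sketch of what such a pre-warning handshake could look like,
assuming clients register a callback and acknowledge within a bounded
time. Nothing here is an existing Nova interface; the class and method
names are made up for illustration.

```python
import threading
import time


class ShutdownNotifier:
    """Warn registered clients before a guest is shut off and wait for acks."""

    def __init__(self):
        self._clients = {}  # client name -> callback(guest, ack)

    def register(self, name, callback):
        """Register a client; callback receives the guest name and an ack function."""
        self._clients[name] = callback

    def warn_and_wait(self, guest, timeout_s):
        """Notify every client, then wait up to timeout_s in total for the acks.

        Returns the set of clients that did not ack in time. The caller
        decides whether to proceed with the shutdown anyway.
        """
        acks = {name: threading.Event() for name in self._clients}
        for name, callback in self._clients.items():
            callback(guest, acks[name].set)
        deadline = time.monotonic() + timeout_s
        missing = set()
        for name, event in acks.items():
            remaining = max(0.0, deadline - time.monotonic())
            if not event.wait(remaining):
                missing.add(name)
        return missing
```

The single shared deadline keeps the total delay bounded even when
several clients are slow to answer.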



[Openstack] running HA cluster of guests within openstack

2012-04-13 Thread ikke
I am likely not the first one to ask this, but since I didn't find a
thread about it, I'll start one.

Is there any shared experience of OpenStack's capabilities for running
a cluster of guests in the cloud? Do you have experience with the
following questions, or links to more info? The questions relate to
running a legacy HA cluster in a virtual environment and moving it
into the cloud...

1. Private networks between guests
  - Doable now using Quantum
1.1. Defining VLANs visible to guest machines to separate the cluster's
   internal traffic; VLAN tags should not be stripped by the host (QinQ)
1.2. Set pre-defined MAC addresses for the guests, needed by non-IP
   traffic within the guest cluster (layer 2 addressing)
  - will Melange do this? According to the docs it's not in the plans.
2. HA capabilities
2.1. Failure notification times need to be fast, i.e. no TCP timeout allowed
  - there seems to be some activity to integrate Pacemaker
2.2. Failure notification of both guests and hosts needs to be included
2.3. Guest cluster controller should be able to monitor the states,
  and get fast notifications of the events.
  - in milliseconds rather than seconds
  - basically the host should have the parent process of the guest PID
    report a child process failure.
  - Host should have a virtual watchdog that notices a stuck guest
2.4. Failure recovery time, how fast can OpenStack bring up a failed guest?
  - any measurements of the time from failure to noticing it,
    and of the time until the guest is restarted and back up?
2.5. virtual HW manager (guest isolation)
  - Any plans to integrate a piece from which the state of a guest could
    be reliably queried, e.g. guaranteeing that if I ask to power off
    another guest, it gets done in a given time (milliseconds) and does
    not depend on e.g. some TCP timeout, thus leading to a split-brain
    case of running two similar guests simultaneously. E.g. starting
    another guest to replace a shut-down one, but due to some
    communications error the first one didn't really shut down before
    the new one is already up.
  - should be able to reliably cut off the guest's network and disk
    access to rule out the above case
2.6. Shared disks
  - Could there be a shared SCSI device concept for the legacy HW
    abstraction?
  - Qemu/KVM supports this; what would it take to make OpenStack
    understand such disk devices?
2.7. Isolation of redundant nodes
  - In some cases there are nodes that need to back each other up
    (2N, N+1); there should be a way to make sure they run on
    different hosts.
  - This project might be aiming for that?
    http://wiki.openstack.org/DistributedScheduler
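
To make 2.7 concrete, here is a toy placement sketch, assuming each
redundant guest is tagged with a group name and no two members of a
group may share a host. It is only an illustration of the constraint,
not the DistributedScheduler's algorithm; the function and parameter
names are made up.

```python
def place_with_anti_affinity(guests, hosts, groups):
    """Assign guests to hosts so members of one redundancy group never share a host.

    guests: list of guest names, placed in order
    hosts:  list of host names
    groups: dict guest -> group name (guests not listed are unconstrained)
    Returns a dict guest -> host. Raises ValueError when a group has
    more members than there are hosts.
    """
    placement = {}
    used = {}                       # group -> hosts already holding a member
    load = {h: 0 for h in hosts}    # guests placed per host so far
    for guest in guests:
        group = groups.get(guest)
        taken = used.get(group, set()) if group is not None else set()
        candidates = [h for h in hosts if h not in taken]
        if not candidates:
            raise ValueError(f"no host left for {guest} in group {group}")
        # Pick the least-loaded candidate; ties go to the first host listed.
        host = min(candidates, key=lambda h: load[h])
        placement[guest] = host
        load[host] += 1
        if group is not None:
            used.setdefault(group, set()).add(host)
    return placement
```

For example, a 2N pair tagged with the same group name ends up on two
different hosts, while an untagged guest can go anywhere.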

This was something off the top of my head; it would be interesting to
hear your thoughts about the issues. This need comes from the
telco world, which would need a telco cloud with more real-time
features like these in it. Certainly the same applies to many other
legacy environments too.

BR,

 Ilkka Tengvall



Re: [Openstack] running HA cluster of guests within openstack

2012-04-13 Thread Pádraig Brady
On 04/13/2012 10:31 AM, ikke wrote:
 I likely am not the first one to ask this, but since I didn't find a
 thread about it I start one.
 
 Is there any shared experience available what are the capabilities of
 OpenStack to run cluster of guests in the cloud? Do you have
 experience of the following questions, or links to more info? The
 questions relate to running a legacy HA cluster in virtual env, and
 moving it into cloud...

I'll just point out two early stage projects
that used in combination can provide a HA solution.

http://wiki.openstack.org/Heat
http://wiki.openstack.org/ResourceMonitorAlertsandNotifications

These are similar to AWS CloudFormation and CloudWatch respectively.

cheers,
Pádraig.



Re: [Openstack] running HA cluster of guests within openstack

2012-04-13 Thread Major Hayden
On Apr 13, 2012, at 4:31 AM, ikke wrote:

 2.5. virtual HW manager (guest isolation)
   - Any plans to integrate a piece from which the state of a guest could
     be reliably queried, e.g. guaranteeing that if I ask to power off
     another guest, it gets done in a given time (milliseconds) and does
     not depend on e.g. some TCP timeout, thus leading to a split-brain
     case of running two similar guests simultaneously. E.g. starting
     another guest to replace a shut-down one, but due to some
     communications error the first one didn't really shut down before
     the new one is already up.
   - should be able to reliably cut off the guest's network and disk
     access to rule out the above case

This would be a huge win for clustering.

Having a reliable and immediate STONITH capability within a virtual environment
would be really handy for environments which have sensitive needs for shared
storage (whether it's remote iSCSI storage or DRBD).  It would be relatively
trivial to assemble a fencing daemon to make requests to the API to hard reboot
a misbehaving member of a cluster.
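
A minimal sketch of the request such a fencing daemon might send,
assuming the Compute API's server "reboot" action with type HARD and
token authentication via the X-Auth-Token header. The endpoint URL,
server ID and token are placeholders, and the helper names are
hypothetical. Note that the API accepting the request does not by
itself prove the guest has stopped, so a real fencing agent would
still need to confirm the result.

```python
import json
import urllib.request


def build_hard_reboot_request(compute_url, server_id, token):
    """Build (but do not send) a hard-reboot action request for one server."""
    body = json.dumps({"reboot": {"type": "HARD"}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{compute_url.rstrip('/')}/servers/{server_id}/action",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
    )


def fence(compute_url, server_id, token, timeout_s=2.0):
    """Send the request with a short timeout; True only if the API accepted it."""
    request = build_hard_reboot_request(compute_url, server_id, token)
    try:
        with urllib.request.urlopen(request, timeout=timeout_s) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError and socket timeouts are OSError subclasses:
        # treat every failure as "not fenced" rather than guessing.
        return False
```

Returning False on any error keeps the cluster manager from assuming
the peer is down when the request may never have arrived.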

Good points!

-- 
Major Hayden


Re: [Openstack] running HA cluster of guests within openstack

2012-04-13 Thread Jason Kölker
On Fri, 2012-04-13 at 12:31 +0300, ikke wrote:

 1. Private networks between guests
   - Doable now using Quantum
 1.1. Defining VLANs visible to guest machines to separate clusters
 internal traffic,
VLAN tags should not be stripped by host (QinQ)

VLANs and Quantum private networks are pretty much the same thing, why
would you want both?

 1.2. Set pre-defined MAC addresses for the guests, needed by non-IP
traffic within the guest cluster (layer2 addressing)
   - will Melange do this, according to docs it's not in plans?

If you send the mac address to Melange when you create the interface it
will record it for that instance:

http://melange.readthedocs.org/en/latest/apidoc.html#interfaces

Happy Hacking!

7-11




Re: [Openstack] running HA cluster of guests within openstack

2012-04-13 Thread Martin Gerhard Loschwitz
Hi Ikke,

great work! :-)

Am 13.04.12 11:31, schrieb ikke:
 I likely am not the first one to ask this, but since I didn't find a
 thread about it I start one.
 
 Is there any shared experience available what are the capabilities of
 OpenStack to run cluster of guests in the cloud? Do you have
 experience of the following questions, or links to more info? The
 questions relate to running a legacy HA cluster in virtual env, and
 moving it into cloud...
 
 1. Private networks between guests
 [...]
 
 BR,
 
  Ilkka Tengvall
 

I think, as Major pointed out already, that the biggest problem right now
is that there is a certain lack of easy-to-use STONITH solutions to trigger
STONITH events from within virtual machines. I have something cooking here
using the latest version of Pacemaker; should this turn out to work, it
would make many things a lot easier. I'll elaborate a little bit more on
this once I have it working the way I want it.

Concerning the general subject of virtual machines (and clustered VMs for
that matter) within OpenStack, I think there is some stuff missing in Nova
that would be necessary (granted -- in one way or another, it would be
possible to make Pacemaker deal with VMs that have failed within Nova, but
in my eyes, that'd be crazy). Nova knows what VMs are supposed to be there
and Nova can find out which VMs are in fact running and which are not, so
I think Nova should make sure that those VMs that are supposed to run are,
well, running :)
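
The idea in the paragraph above boils down to comparing what should
run against what does run. A small sketch of that comparison follows;
the function is hypothetical and how Nova would obtain the two lists
is left open.

```python
def reconcile(expected, running):
    """Compare the VMs that are supposed to run with the VMs actually running.

    Returns (to_start, unexpected): names that should be started again,
    and names that are running although nothing expects them.
    """
    expected, running = set(expected), set(running)
    return sorted(expected - running), sorted(running - expected)
```

A periodic task could call this and restart everything in the first
list, leaving cluster-level decisions to Pacemaker inside the guests.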

Best regards
Martin

-- 
Martin Gerhard Loschwitz
Chief Brand Officer, Principal Consultant
hastexo Professional Services

