Re: [Openstack] HA Compute & Instance Evacuation

2018-05-14 Thread Patil, Tushar
Hi Torin,

>> Do you have a timetable on when the patch will be merged? If it is a 
>> relatively small window of time, I would rather wait to use
>> the patched mainline code.
You should be able to test masakari successfully, as the three patches below
are already merged.

1. https://review.openstack.org/#/c/546492/15 - openstack/masakari-monitors (it 
doesn't use masakariclient any more)
2. https://review.openstack.org/#/c/567781/ - openstack/requirements 
(openstacksdk lower constraints updated to 0.13.0)
3. https://review.openstack.org/#/c/536653/ - openstack/masakari (change
service-type from "ha" to "instance-ha")

If you are planning to install OpenStack using the latest devstack, it will
install openstacksdk 0.13.0 by default and you do not need to take any further
action. Otherwise, you need to ensure that you have the correct version of
openstacksdk (0.13.0) and also register the masakari endpoint under the correct
service-type ("instance-ha"). I recommend installing the latest masakari using
devstack.
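
For anyone checking an existing environment by hand, the version requirement above can be sketched as a small comparison. This is only an illustration: the helper names are made up, and it assumes plain dotted numeric release strings (no "rc" or "dev" suffixes).

```python
def parse_version(text):
    """Turn a dotted release string such as '0.13.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed, minimum="0.13.0"):
    """Return True when the installed release is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    # 0.11.2 is the old constraint mentioned in this thread; 0.13.0 is the new one.
    for candidate in ("0.11.2", "0.13.0"):
        print(candidate, meets_minimum(candidate))
```

Comparing tuples of integers rather than raw strings matters here, because a string comparison would rank "0.9" above "0.13".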

4. https://review.openstack.org/#/c/557634/2 - python-masakariclient (this
patch needs to be merged ASAP)
If you are planning to use python-masakariclient to create failover segments,
add hosts, etc., you will need to wait until this patch is merged. We still
need to update it to specify the correct version of openstacksdk in
requirements.txt, and we will merge it by tomorrow. If you instead plan to add
failover segments and hosts by calling the RESTful API with curl or any other
method, you probably won't face any issues.
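
As a rough sketch of the "any other method" route, the snippet below builds (but does not send) the HTTP request that would create a failover segment. The endpoint URL, port, token, and segment name are placeholders, and the body layout reflects my reading of the instance-ha API reference, so please verify it against that reference before relying on it.

```python
import json
import urllib.request


def build_segment_request(endpoint, token, name, recovery_method="auto"):
    """Build (but do not send) a POST request that creates a failover segment."""
    body = {
        "segment": {
            "name": name,
            "recovery_method": recovery_method,
            "service_type": "COMPUTE",
        }
    }
    return urllib.request.Request(
        url=endpoint.rstrip("/") + "/segments",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder endpoint and token; substitute the values from your own cloud.
    request = build_segment_request(
        "http://controller:15868/v1", "TOKEN", "segment1"
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Passing the resulting object to urllib.request.urlopen would send it; the same body and headers can be given to curl instead.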

Regards,
Tushar Patil







Disclaimer: This email and any attachments are sent in strictest confidence for 
the sole use of the addressee and may contain legally privileged,confidential, 
and proprietary data. If you are not the intended recipient,please advise the 
sender by replying promptly to this email and then delete and destroy this 
email and any attachments without any further use, copying or forwarding.

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] HA Compute & Instance Evacuation

2018-05-11 Thread Torin Woltjer
On Friday, May 11, 2018 12:40:58 AM EDT Patil, Tushar wrote:
> I think this is what is needed to make it work.
> Install openstacksdk version 0.13.0.
> 
> Apply patch: https://review.openstack.org/#/c/546492/
> 
> In this patch, we need to bump openstacksdk version from 0.11.2 to 0.13.0.
> We will merge above patch soon.

Do you have a timetable on when the patch will be merged? If it is a 
relatively small window of time, I would rather wait to use the patched 
mainline code. Otherwise, I am willing to try to work with the patch. 
Additionally, patching python is something that I am not familiar with. Is 
there a good resource on doing this?

You have been a great help so far, thanks again.






Re: [Openstack] HA Compute & Instance Evacuation

2018-05-10 Thread Patil, Tushar
Hi Torin,

Presently, masakari-monitors is completely broken. Extremely sorry for the 
inconvenience.

I think this is what is needed to make it work.
Install openstacksdk version 0.13.0.

Apply patch: https://review.openstack.org/#/c/546492/

In this patch, we need to bump the openstacksdk version from 0.11.2 to 0.13.0.
We will merge the above patch soon.
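
For readers who have not applied an unmerged Gerrit change before, here is a minimal sketch of how the fetch command is derived. Gerrit publishes each patchset under refs/changes/<last two digits of the change number>/<change number>/<patchset>. The function names are illustrative, REPO_URL is a placeholder for wherever you clone masakari-monitors from, and patchset 15 is simply the one linked elsewhere in this thread; use whichever patchset the review page shows as current.

```python
def gerrit_change_ref(change_number, patchset):
    """Return the Git ref under which Gerrit publishes one patchset of a change."""
    # Gerrit shards refs by the last two digits of the change number.
    shard = "{:02d}".format(change_number % 100)
    return "refs/changes/{}/{}/{}".format(shard, change_number, patchset)


def fetch_command(repo_url, change_number, patchset):
    """Return the shell command that fetches and checks out that patchset."""
    ref = gerrit_change_ref(change_number, patchset)
    return "git fetch {} {} && git checkout FETCH_HEAD".format(repo_url, ref)


if __name__ == "__main__":
    # REPO_URL is a placeholder, not a real remote.
    print(fetch_command("REPO_URL", 546492, 15))
```

Running the printed command inside a clone of the project leaves the working tree at the patched code, which can then be installed into the environment (for example with "pip install .").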

Regards,
Tushar Patil



From: Torin Woltjer <torin.wolt...@granddial.com>
Sent: Thursday, May 10, 2018 11:08:58 PM
To: Patil, Tushar
Cc: jpetr...@coredial.com; openstack@lists.openstack.org
Subject: Re: [Openstack] HA Compute & Instance Evacuation

Hi Tushar,

I followed the documentation to set up the masakari monitors, after I
installed the masakari API. None of the monitor services seem to work. I keep
getting an error: "AttributeError: 'module' object has no attribute 'URI'"
Here is the full output: http://paste.openstack.org/show/720761/
Are you aware of what causes the issue? Can you provide any example configs
for a working masakari setup?








Re: [Openstack] HA Compute & Instance Evacuation

2018-05-07 Thread Pablo Iranzo Gómez

+++ Torin Woltjer [02/05/18 20:39 +]:
>> There is no HA behaviour for compute nodes.
>>
>> You are referring to HA of workloads running on compute nodes, not HA of
>> compute nodes themselves.
>
> It was a mistake for me to say HA when referring to compute and instances.
> Really I want to avoid a situation where one of my compute hosts gives up the
> ghost, and all of the instances are offline until someone reboots them on a
> different host. I would like them to automatically reboot on a healthy compute
> node.
>
>> Check out Masakari:
>>
>> https://wiki.openstack.org/wiki/Masakari
>
> This looks like the kind of thing I'm searching for.
>
> I'm seeing 3 components here, I'm assuming one goes on compute hosts and one or
> both of the others go on the control nodes? Is there any documentation
> outlining the procedure for deploying this? Will there be any problem running
> the Masakari API service on 2 machines simultaneously, sitting behind HAProxy?




Check for 'Instance HA':

https://blueprints.launchpad.net/tripleo/+spec/instance-ha

Which more or less came with:

https://github.com/beekhof/osp-ha-deploy/blob/master/pcmk/compute-managed.scenario
https://github.com/beekhof/osp-ha-deploy/blob/master/pcmk/controller-managed.scenario

Ansible scripts are at git://github.com/redhat-openstack/tripleo-quickstart-utils 


And enabled via:

ansible-playbook /home/stack/ansible-instanceha/playbooks/overcloud-instance-ha.yml \
    -e release="RELEASE"


This of course requires a valid HA deployment setup on the controllers
(usually tripleO or OSP Director).

Regards,
Pablo









--

Pablo Iranzo Gómez (pablo.ira...@redhat.com)  GnuPG: 0x5BD8E1E4
Principal Software Maintenance Engineer - OpenStack    iranzo @ IRC
RHC{A,SS,DS,VA,E,SA,SP,AOSP}, JBCAA    #110-215-852    RHCA Level V

Blog: https://iranzo.github.io   Citellus: https://citellus.org




Re: [Openstack] HA Compute & Instance Evacuation

2018-05-06 Thread Patil, Tushar
Hi Torin,

Masakari supports 4 different types of recovery methods, chosen when the
failover_segment is created.

1. auto: Nova decides which compute host the instances are evacuated to.

2. reserved_host: You first need to add reserved hosts to the failover
segment. The Masakari engine selects the first available reserved host from
the failover segment, enables its compute service in nova, and then uses that
reserved host to evacuate the instances from the failed compute host.

3. auto_priority: It first tries to evacuate instances using the "auto"
recovery method; if that fails, it attempts to evacuate using the
"reserved_host" recovery method.

4. rh_priority: The opposite of "auto_priority". It first tries to evacuate
instances using the "reserved_host" recovery method; if that fails, it
attempts to evacuate using the "auto" recovery method.

In your case you will need to use the "auto" recovery method.
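
The fallback order of the four methods can be summarised in a few lines. This is only a model of the behaviour described above, not Masakari's actual engine code; the table, the function, and the try_strategy callback are all hypothetical names.

```python
# Which strategies each recovery method tries, in order.
RECOVERY_ORDER = {
    "auto": ["auto"],
    "reserved_host": ["reserved_host"],
    "auto_priority": ["auto", "reserved_host"],
    "rh_priority": ["reserved_host", "auto"],
}


def evacuate(method, try_strategy):
    """Try each strategy for the segment's method in order.

    try_strategy is a hypothetical callback that returns True on success.
    Returns the name of the strategy that worked, or None if all failed.
    """
    for strategy in RECOVERY_ORDER[method]:
        if try_strategy(strategy):
            return strategy
    return None


if __name__ == "__main__":
    # Simulate a segment where nova's own scheduling fails but a reserved host works.
    outcome = evacuate("auto_priority", lambda s: s == "reserved_host")
    print(outcome)
```

With plain "auto" there is no fallback, so a failed evacuation in that mode simply returns None in this model.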

Please refer to the below documentation links for more details.

Masakari system architecture:
https://docs.openstack.org/masakari/latest/

Masakari api-ref:
https://developer.openstack.org/api-ref/instance-ha/

To install masakari-monitors with pacemaker/corosync:
https://review.openstack.org/#/c/489095/6/doc/source/install_and_configure_debian.rst

Other ways to reach us: the Masakari weekly meeting is held on the
#openstack-meeting IRC channel every Tuesday at 0400 UTC, or you can post your
queries on the #openstack-masakari IRC channel.

Regards,
Tushar




Re: [Openstack] HA Compute & Instance Evacuation

2018-05-04 Thread Torin Woltjer
Thank you very much for the information. Just for clarification, when you say 
reserved hosts, do you mean that I must keep unloaded virtualization hosts in 
reserve? Or can Masakari move instances from a downed host to an already loaded 
host that has open capacity?




Re: [Openstack] HA Compute & Instance Evacuation

2018-05-02 Thread John Petrini
Take this with a grain of salt because we're using the original version
before the project moved under the Big Tent and I'm not sure how much it's
evolved since then. I assume the basic functions are the same though.

You're correct; Corosync and Pacemaker are used to determine if a compute
node goes down. The masakari-host-monitor process runs on each compute node,
checks the cluster status, and sends a notification to masakari-controller
when a node goes down. The controller process keeps a list of reserved hosts
in its database and calls nova host-evacuate to move the instances to one of
the reserved hosts.
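
To make that flow concrete, here is a toy model of it in Python. It is not taken from Masakari; the function names, node names, and status strings are invented, and it assumes each reserved host absorbs at most one failed node.

```python
def offline_nodes(cluster_status):
    """Return the names of nodes the cluster reports as offline, sorted."""
    return sorted(n for n, state in cluster_status.items() if state == "offline")


def pick_reserved_host(reserved_hosts, failed):
    """Return the first reserved host that has not itself failed, or None."""
    for host in reserved_hosts:
        if host not in failed:
            return host
    return None


def plan_evacuations(cluster_status, reserved_hosts):
    """Map each failed node to the reserved host that would receive its instances."""
    failed = offline_nodes(cluster_status)
    available = list(reserved_hosts)
    plan = {}
    for node in failed:
        target = pick_reserved_host(available, failed)
        plan[node] = target
        if target is not None:
            available.remove(target)  # assumption: one failed node per reserved host
    return plan


if __name__ == "__main__":
    status = {"compute1": "online", "compute2": "offline", "reserve1": "online"}
    print(plan_evacuations(status, ["reserve1"]))
```

A node mapped to None in the plan is one for which no healthy reserved host is left, which is the case you would want to alert on.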

In our environment I also configured STONITH and I'd highly recommend it.
With STONITH, Pacemaker sends a shutdown command to the out-of-band
management card of the unreachable node to make sure that it can't come
back and cause a conflict.

There are two other components, masakari-process-monitor and
masakari-instance-monitor. These also run on your compute nodes. The former
watches the nova-compute service and the latter monitors running instances
and restarts them if necessary.

Looking here, it seems they've split Masakari into three different repos:
https://github.com/openstack?utf8=%E2%9C%93=masakari==

masakari - The controller service and API
masakari-monitors - Compute node monitoring services
python-masakariclient - The CLI tools


Re: [Openstack] HA Compute & Instance Evacuation

2018-05-02 Thread Torin Woltjer
I'm vaguely familiar with Pacemaker/Corosync, as I'm using it with HAProxy on
my controller nodes. I'm assuming in this instance that you use Pacemaker on
your compute hosts so masakari can detect host outages? If possible, could you
go into more detail about the configuration? I would like to use Masakari, and
I'm having trouble finding a step-by-step guide or other documentation to get
started with.




Re: [Openstack] HA Compute & Instance Evacuation

2018-05-02 Thread Torin Woltjer
> There is no HA behaviour for compute nodes.
>
> You are referring to HA of workloads running on compute nodes, not HA of
> compute nodes themselves.
It was a mistake for me to say HA when referring to compute and instances. 
Really I want to avoid a situation where one of my compute hosts gives up the 
ghost, and all of the instances are offline until someone reboots them on a 
different host. I would like them to automatically reboot on a healthy compute 
node.

> Check out Masakari:
>
> https://wiki.openstack.org/wiki/Masakari
This looks like the kind of thing I'm searching for.

I'm seeing 3 components here, I'm assuming one goes on compute hosts and one or 
both of the others go on the control nodes? Is there any documentation 
outlining the procedure for deploying this? Will there be any problem running 
the Masakari API service on 2 machines simultaneously, sitting behind HAProxy?



Re: [Openstack] HA Compute & Instance Evacuation

2018-05-02 Thread John Petrini
We're using the original Masakari project for this and it works really
well. In fact, just last week we lost a compute node and all of the VMs were
successfully migrated to a reserve host in under 5 minutes. It's a really
nice feeling when your infrastructure heals itself before you even get a
chance to start troubleshooting.

It does require a good deal of configuration to get it up and running,
especially the clustering with Pacemaker/Corosync, so be prepared to get
familiar with those tools and STONITH if you're not already. Worth it if
some of your infrastructure doesn't have redundancy built in at a higher
level.


Re: [Openstack] HA Compute & Instance Evacuation

2018-05-02 Thread Jay Pipes

On 05/02/2018 02:43 PM, Torin Woltjer wrote:

> I am working on setting up Openstack for HA and one of the last orders of
> business is getting HA behavior out of the compute nodes.

There is no HA behaviour for compute nodes.

> Is there a project that will automatically evacuate instances from a
> downed or failed compute host, and automatically reboot them on their
> new host?

Check out Masakari:

https://wiki.openstack.org/wiki/Masakari

> I'm curious what suggestions people have about this, or whatever
> advice you might have. Is there a best way of getting this
> functionality, or anything else I should be aware of?


You are referring to HA of workloads running on compute nodes, not HA of 
compute nodes themselves.


My advice would be to install Kubernetes on one or more VMs (with the
VMs acting as Kubernetes nodes) and use that project's excellent
orchestrator for daemonsets/statefulsets, which is essentially the use
case you are describing.


The OpenStack Compute API (implemented in Nova) is not an orchestration 
API. It's a low-level infrastructure API for executing basic actions on 
compute resources.


Best,
-jay



[Openstack] HA Compute & Instance Evacuation

2018-05-02 Thread Torin Woltjer
I am working on setting up Openstack for HA and one of the last orders of 
business is getting HA behavior out of the compute nodes. Is there a project 
that will automatically evacuate instances from a downed or failed compute 
host, and automatically reboot them on their new host? I'm curious what 
suggestions people have about this, or whatever advice you might have. Is 
there a best way of getting this functionality, or anything else I should be 
aware of?

Thanks,



