Re: [Openstack-operators] Remote pacemaker on compute nodes
Hello,
adding the remote compute node with:

pcs resource create computenode1 remote server=10.102.184.91

instead of:

pcs resource create computenode1 ocf:pacemaker:remote reconnect_interval=60 op monitor interval=20

SOLVES the issue when an unexpected compute node reboot happens: the node returns online and works fine.
Thanks to all for the help.
Regards

2017-05-13 16:55 GMT+02:00 Sam P:
> Hi,
>
> This might not be exactly what you are looking for, but you may be able to extend it.
> In Masakari [0], we use pacemaker-remote in masakari-monitors [1] to monitor node failures.
> In [1] there is hostmonitor.sh, which is going to be deprecated in the next cycle,
> but it is a straightforward way to do this.
> [0] https://wiki.openstack.org/wiki/Masakari
> [1] https://github.com/openstack/masakari-monitors/tree/master/masakarimonitors/hostmonitor
>
> Then there are the pacemaker resource agents:
> https://github.com/openstack/openstack-resource-agents/tree/master/ocf
>
> > I have already tried "pcs resource cleanup"; it cleans up all resources fine,
> > but not the remote nodes.
> > In any case, on Monday I'll send what you requested.
> Hope we can get more details on Monday.
>
> --- Regards,
> Sampath
Re: [Openstack-operators] Remote pacemaker on compute nodes
Hello again,
while I was connected to my lab I read some documentation.
Adding the remote compute node with the following commands:

pcs resource create computenode1 remote server=10.102.184.91
pcs property set --node computenode1 osprole=compute

instead of:

pcs resource create compute-1 ocf:pacemaker:remote reconnect_interval=60 op monitor interval=20
pcs property set --node compute-1 osprole=compute

I can avoid inserting aliases every time an unexpected compute reboot happens.
In any case I would prefer to avoid this workaround: I expect the remote compute resource to return online like any other resource in case of failure.
Regards
Ignazio

2017-05-13 16:55 GMT+02:00 Sam P:
> Hi,
>
> This might not be exactly what you are looking for, but you may be able to extend it.
> In Masakari [0], we use pacemaker-remote in masakari-monitors [1] to monitor node failures.
> In [1] there is hostmonitor.sh, which is going to be deprecated in the next cycle,
> but it is a straightforward way to do this.
> [0] https://wiki.openstack.org/wiki/Masakari
> [1] https://github.com/openstack/masakari-monitors/tree/master/masakarimonitors/hostmonitor
>
> Then there are the pacemaker resource agents:
> https://github.com/openstack/openstack-resource-agents/tree/master/ocf
>
> > I have already tried "pcs resource cleanup"; it cleans up all resources fine,
> > but not the remote nodes.
> > In any case, on Monday I'll send what you requested.
> Hope we can get more details on Monday.
>
> --- Regards,
> Sampath
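The two commands from this message can be generated for a list of nodes. The following is a minimal dry-run sketch, not something posted in the thread: it only prints the pcs commands and does not run them. The function name print_remote_setup and the "name address" input format are assumptions made for illustration; the node name and address are the ones quoted above.

```shell
#!/bin/sh
# Dry-run sketch: read "resource-name address" pairs on stdin and print the
# pcs commands used in the message above. Nothing is executed.
# print_remote_setup is a hypothetical helper name, not part of pcs.
print_remote_setup() {
    while read -r name addr; do
        # Skip blank lines and comment lines.
        case "$name" in ''|'#'*) continue ;; esac
        echo "pcs resource create $name remote server=$addr"
        echo "pcs property set --node $name osprole=compute"
    done
}

# Node name and address taken from the message above.
print_remote_setup <<'EOF'
# resource-name  address
computenode1 10.102.184.91
EOF
```

Reviewing the printed commands before pasting them on a controller keeps the sketch harmless on machines where pcs is not installed.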
Re: [Openstack-operators] Remote pacemaker on compute nodes
Hello,
this morning I connected remotely to my office to send the information you requested.
Attached are:
status: output of the command pcs status
resources: output of the command pcs resources
hosts: the controllers' /etc/hosts, where I added aliases for the compute nodes every time I rebooted a compute node to simulate an unexpected reboot

As you can see, the last simulation was on compute-node1. In fact it is marked offline, but its pacemaker remote service is online.

[root@compute-1 ~]# systemctl status pacemaker_remote.service
● pacemaker_remote.service - Pacemaker Remote Service
   Loaded: loaded (/usr/lib/systemd/system/pacemaker_remote.service; enabled; vendor preset: disabled)
   Active: active (running) since ven 2017-05-12 09:30:08 EDT; 1 day 20h ago
     Docs: man:pacemaker_remoted
           http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html/Pacemaker_Remote/index.html
 Main PID: 3756 (pacemaker_remot)
   CGroup: /system.slice/pacemaker_remote.service
           └─3756 /usr/sbin/pacemaker_remoted

mag 12 09:30:08 compute-1 systemd[1]: Started Pacemaker Remote Service.
mag 12 09:30:08 compute-1 systemd[1]: Starting Pacemaker Remote Service...
mag 12 09:30:08 compute-1 pacemaker_remoted[3756]: notice: Additional loggi...
mag 12 09:30:08 compute-1 pacemaker_remoted[3756]: notice: Starting a tls l...
mag 12 09:30:08 compute-1 pacemaker_remoted[3756]: notice: Listening on add...
Hint: Some lines were ellipsized, use -l to show in full.

Regards
Ignazio

2017-05-13 16:55 GMT+02:00 Sam P:
> Hi,
>
> This might not be exactly what you are looking for, but you may be able to extend it.
> In Masakari [0], we use pacemaker-remote in masakari-monitors [1] to monitor node failures.
> In [1] there is hostmonitor.sh, which is going to be deprecated in the next cycle,
> but it is a straightforward way to do this.
> [0] https://wiki.openstack.org/wiki/Masakari
> [1] https://github.com/openstack/masakari-monitors/tree/master/masakarimonitors/hostmonitor
>
> Then there are the pacemaker resource agents:
> https://github.com/openstack/openstack-resource-agents/tree/master/ocf
>
> > I have already tried "pcs resource cleanup"; it cleans up all resources fine,
> > but not the remote nodes.
> > In any case, on Monday I'll send what you requested.
> Hope we can get more details on Monday.
>
> --- Regards,
> Sampath
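The mismatch described in this message (the node is marked offline by the cluster while pacemaker_remote is active on the node itself) can be spotted from the controller side by filtering the pcs status text. This is a rough sketch under an assumption: it expects remote nodes to be listed on lines of the form "RemoteOFFLINE: [ name ... ]", as Pacemaker 1.1 status output usually does. The attached status file is not part of this archive, so the sample input below is hypothetical, and offline_remotes is an invented helper name.

```shell
#!/bin/sh
# Sketch: read `pcs status` text on stdin and print the remote nodes that the
# cluster reports as offline, one per line.
# Assumption: offline remote nodes appear as "RemoteOFFLINE: [ a b ]".
# offline_remotes is a hypothetical helper name, not part of pcs.
offline_remotes() {
    awk '/^[ \t]*RemoteOFFLINE:/ {
        for (i = 1; i <= NF; i++)
            if ($i != "RemoteOFFLINE:" && $i != "[" && $i != "]") print $i
    }'
}

# Hypothetical sample input (not the attachment from the thread).
offline_remotes <<'EOF'
Online: [ controller-1 controller-2 controller-3 ]
RemoteOnline: [ compute-0 ]
RemoteOFFLINE: [ compute-1 ]
EOF
```

On a controller the same function would be fed with `pcs status | offline_remotes`; any name it prints can then be checked on the node itself with systemctl status pacemaker_remote.service, as done above.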
Re: [Openstack-operators] Remote pacemaker on compute nodes
Hi,

This might not be exactly what you are looking for, but you may be able to extend it.
In Masakari [0], we use pacemaker-remote in masakari-monitors [1] to monitor node failures.
In [1] there is hostmonitor.sh, which is going to be deprecated in the next cycle, but it is a straightforward way to do this.
[0] https://wiki.openstack.org/wiki/Masakari
[1] https://github.com/openstack/masakari-monitors/tree/master/masakarimonitors/hostmonitor

Then there are the pacemaker resource agents:
https://github.com/openstack/openstack-resource-agents/tree/master/ocf

> I have already tried "pcs resource cleanup"; it cleans up all resources fine,
> but not the remote nodes.
> In any case, on Monday I'll send what you requested.
Hope we can get more details on Monday.

--- Regards,
Sampath

On Sat, May 13, 2017 at 9:52 PM, Ignazio Cassano wrote:
> Thanks Curtis.
> I have already tried "pcs resource cleanup"; it cleans up all resources fine,
> but not the remote nodes.
> In any case, on Monday I'll send what you requested.
> Regards
> Ignazio
>
> On 13 May 2017 14:27, "Curtis" wrote:
>
> On Fri, May 12, 2017 at 10:23 PM, Ignazio Cassano wrote:
>> Hi Curtis, at this time I am using remote pacemaker only for controlling
>> openstack services on compute nodes (neutron openvswitch-agent,
>> nova-compute, ceilometer compute). I wrote my own ansible playbooks to
>> install and configure all components.
>> A second step could be to expand it for VM high availability.
>> I did not find any procedure for cleaning up a compute node after rebooting,
>> and I googled a lot without luck.
>
> Can you paste some output of something like "pcs status" and I can try
> to take a look?
>
> I've only used pacemaker a little, but I'm fairly sure it's going to
> be something like "pcs resource cleanup "
>
> Thanks,
> Curtis.
>
>> Regards
>> Ignazio
>>
>> On 13 May 2017 00:32, "Curtis" wrote:
>>
>> On Fri, May 12, 2017 at 8:51 AM, Ignazio Cassano wrote:
>>> Hello All,
>>> I installed openstack newton p
>>> with a pacemaker cluster made up of 3 controllers and 2 compute nodes.
>>> All computers have CentOS 7.3.
>>> Compute nodes are provided with the remote pacemaker ocf resource.
>>> If, before shutting down a compute node, I disable the compute node
>>> resource in the cluster and enable it when the compute node comes back up,
>>> it works fine and the cluster shows it online.
>>> If the compute node goes down before the compute node resource is disabled
>>> in the cluster, it remains offline even after it is powered up.
>>> The only solution I found is removing the compute node resource from the
>>> cluster and adding it again with a different name (adding this new name to
>>> the /etc/hosts file on all controllers).
>>> With the above workaround it returns online for the cluster and all its
>>> resources (openstack-nova-compute etc.) work fine again.
>>> Please, does anyone know a better solution?
>>
>> What are you using pacemaker for on the compute nodes? I have not done
>> that personally, but my impression is that sometimes people do that in
>> order to have virtual machines restarted somewhere else should the
>> compute node go down outside of a maintenance window (i.e. "instance
>> high availability"). Is that your use case? If so, I would imagine
>> there is some kind of clean up procedure to put the compute node back
>> into use when pacemaker thinks it has failed. Did you use some kind of
>> openstack distribution or follow a particular installation document to
>> enable this pacemaker setup?
>>
>> It sounds like everything is working as expected (if my guess is
>> right) and you just need the right steps to bring the node back into
>> the cluster.
>>
>> Thanks,
>> Curtis.
>>
>>> Regards
>>> Ignazio
>>
>> --
>> Blog: serverascode.com
>
> --
> Blog: serverascode.com

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
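The planned-shutdown procedure from the original question (disable the remote resource before the node goes down, enable it once the node is back) can be written down as an ordered checklist. This is only a dry-run sketch: it prints the steps and runs nothing. planned_reboot_steps is a hypothetical helper name, and computenode1 is the resource name used elsewhere in the thread.

```shell
#!/bin/sh
# Dry-run sketch: print the order of operations for a planned compute node
# shutdown, as described in the original question. Nothing is executed.
# planned_reboot_steps is a hypothetical helper name, not part of pcs.
planned_reboot_steps() {
    node="${1:-}"
    if [ -z "$node" ]; then
        echo "usage: planned_reboot_steps <resource-name>" >&2
        return 2
    fi
    echo "pcs resource disable $node"
    echo "# shut down the compute node, do the maintenance, power it back on"
    echo "pcs resource enable $node"
}

planned_reboot_steps computenode1
```

This covers only the planned case; the unexpected-reboot case is the one the rest of the thread deals with.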