Re: [ClusterLabs] start a resource
On 2016-05-17 09:21, Ken Gaillot wrote:

> What happens after "pcs resource cleanup"? "pcs status" reports the
> time associated with each failure, so you can check whether you are
> seeing the same failure or a new one.
>
> The system log is usually the best starting point, as it will have
> messages from pacemaker, corosync and the resource agents.

Yes, it'd be much easier if I saw anything useful from pcs and/or in the logs.

I have another active/passive pair to set up; if I get a round tuit -- hopefully in the next few weeks -- I'll see if I can reproduce this.

Dimitri
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
Re: [ClusterLabs] start a resource
On 05/16/2016 12:22 PM, Dimitri Maziuk wrote:
> On 05/13/2016 04:31 PM, Ken Gaillot wrote:
>
>> That is definitely not a properly functioning cluster. Something
>> is going wrong at some level.
>
> Yeah, well... how do I find out what/where?

What happens after "pcs resource cleanup"? "pcs status" reports the time associated with each failure, so you can check whether you are seeing the same failure or a new one.

The system log is usually the best starting point, as it will have messages from pacemaker, corosync and the resource agents. You can look around the time of the failure(s) for details or anything unusual.

Pacemaker also has a detail log (by default, /var/log/pacemaker.log). In general, this is more useful to developers than administrators, but if the system log doesn't help, it can sometimes shed a little more light.

> One question: in corosync.conf I have
>
> nodelist {
>     node {
>         ring0_addr: node1_name
>         nodeid: 1
>     }
>     node {
>         ring0_addr: node2_name
>         nodeid: 2
>     }
> }
>
> Could 'pcs cluster stop/start' reset the interface that resolves
> to nodeX_name? If so, that would answer why ssh connections get
> killed.

No, Pacemaker and pcs don't touch the interfaces (unless of course you explicitly add a cluster resource to do so, which wouldn't work anyway for the interface(s) that corosync itself needs to use).
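
As a rough sketch of the "same failure or a new one" check described above: the helper below pulls the operation name and the last-rc-change timestamp out of the Failed Actions section of "pcs status" output, so the times can be compared before and after a cleanup. The function name failure_times is made up for illustration, and it assumes each entry looks like the one quoted in this thread (a "* <operation> on <node> ..." line, with last-rc-change='...' on that line or on a following one).

```shell
# failure_times: read `pcs status` output on stdin and print
# "<operation> <last-rc-change>" for every failed action found.
failure_times() {
    awk '
        # Remember the operation name from a "* <operation> on <node>" line.
        /^[ \t]*\* / { op = $2 }
        # \047 is a single quote; the timestamp sits between the quotes.
        match($0, /last-rc-change=\047[^\047]*\047/) {
            # Skip the 16 characters of "last-rc-change=" plus the opening
            # quote, and drop the closing quote.
            print op, substr($0, RSTART + 16, RLENGTH - 17)
        }
    '
}

# On a cluster node (needs pcs, so shown only as a comment):
#   pcs status | failure_times
```

If the timestamp printed after "pcs resource cleanup" is later than the one printed before it, the resource has failed again rather than the old failure simply still being displayed.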
Re: [ClusterLabs] start a resource
On 05/13/2016 04:31 PM, Ken Gaillot wrote:

> That is definitely not a properly functioning cluster. Something is
> going wrong at some level.

Yeah, well... how do I find out what/where?

One question: in corosync.conf I have

nodelist {
    node {
        ring0_addr: node1_name
        nodeid: 1
    }
    node {
        ring0_addr: node2_name
        nodeid: 2
    }
}

Could 'pcs cluster stop/start' reset the interface that resolves to nodeX_name? If so, that would answer why ssh connections get killed.

--
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
Re: [ClusterLabs] start a resource
On 05/06/2016 01:01 PM, Dimitri Maziuk wrote:
> On 05/06/2016 12:05 PM, Ian wrote:
>> Are you getting any other errors now that you've fixed the
>> config?
>
> It's running now that I did the cluster stop/start, but no: I
> wasn't getting any other errors. I did have a symlink resource
> "stopped" for no apparent reason and with no errors logged.
>
> The cluster is a basic active-passive pair. The relevant part of
> the setup is:
>
> drbd filesystem
> floating ip colocated with drbd filesystem +inf
> order drbd filesystem then floating ip
>
> ocf:heartbeat:symlink resource that does /etc/rsyncd.conf ->
> /drbd/etc/rsyncd.conf
> colocated with drbd filesystem +inf
> order drbd filesystem then the symlink
>
> ocf:heartbeat:rsyncd resource that is colocated with the symlink
> order symlink then rsyncd
> order floating ip then rsyncd
>
> (Looking at this, maybe I should also colocate rsyncd with floating
> ip to avoid any confusion in pacemaker's little brain.)

Not strictly necessary, since rsyncd is colocated with symlink, which is colocated with filesystem, and ip is also colocated with filesystem.

But it is a good idea to model all logical dependencies, since you don't know what changes you might make to the configuration in the future. If you want rsyncd to always be with the floating ip, then by all means add a colocation constraint.

> But this is not specific to rsyncd: the behaviour was exactly the
> same when a co-worker made a typo in apache config (which is
> another resource on the same cluster). The only way to restart
> apache was to "pcs cluster stop ; pcs cluster start" and that
> randomly killed ssh connections to the nodes' "proper" IPs.

That is definitely not a properly functioning cluster. Something is going wrong at some level.

When you say that "pcs resource cleanup" didn't fix the issue, what happened after that? Did "pcs status" still show an error for the resource? If so, there was an additional failure.
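
For the explicit rsyncd/floating ip colocation mentioned above, something along these lines should do it. This is only a sketch: the wrapper name colocate_with and the resource ids rsyncd and floating_ip are placeholders (use the ids that "pcs status" shows for your resources), and the INFINITY score is meant to match the +inf used for the other constraints in this setup.

```shell
# colocate_with DEPENDENT TARGET
# Ask pacemaker to keep DEPENDENT on whichever node runs TARGET.
# A score of INFINITY makes the colocation mandatory.
colocate_with() {
    pcs constraint colocation add "$1" with "$2" INFINITY
}

# Example with placeholder resource ids (run on a cluster node):
#   colocate_with rsyncd floating_ip
```

Colocation only says where resources run, not in what sequence they start, so the existing "order floating ip then rsyncd" constraint is still needed alongside it.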
Re: [ClusterLabs] start a resource
On 05/06/2016 12:05 PM, Ian wrote:

> Are you getting any other errors now that you've fixed the config?

It's running now that I did the cluster stop/start, but no: I wasn't getting any other errors. I did have a symlink resource "stopped" for no apparent reason and with no errors logged.

The cluster is a basic active-passive pair. The relevant part of the setup is:

drbd filesystem
floating ip colocated with drbd filesystem +inf
order drbd filesystem then floating ip

ocf:heartbeat:symlink resource that does /etc/rsyncd.conf -> /drbd/etc/rsyncd.conf
colocated with drbd filesystem +inf
order drbd filesystem then the symlink

ocf:heartbeat:rsyncd resource that is colocated with the symlink
order symlink then rsyncd
order floating ip then rsyncd

(Looking at this, maybe I should also colocate rsyncd with floating ip to avoid any confusion in pacemaker's little brain.)

But this is not specific to rsyncd: the behaviour was exactly the same when a co-worker made a typo in apache config (which is another resource on the same cluster). The only way to restart apache was to "pcs cluster stop ; pcs cluster start" and that randomly killed ssh connections to the nodes' "proper" IPs.

--
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
Re: [ClusterLabs] start a resource
Are you getting any other errors now that you've fixed the config? What does your config file look like?

On May 6, 2016 10:33 AM, "Dmitri Maziuk" wrote:

> On 2016-05-05 23:50, Moiz Arif wrote:
>
>> Hi Dimitri,
>>
>> Try cleanup of the fail count for the resource with any of the
>> commands below:
>>
>> via pcs : pcs resource cleanup rsyncd
>
> Tried it, didn't work. Tried pcs resource debug-start rsyncd -- got no
> errors, resource didn't start. Tried disable/enable.
>
> So far the only way I've been able to do this is pcs cluster stop ; pcs
> cluster start which is ridiculous on a production cluster with drbd and a
> database etc. (And it killed my ssh connection to the other node, again.)
>
> Any other suggestions?
> Thanks,
> Dima
Re: [ClusterLabs] start a resource
On 2016-05-05 23:50, Moiz Arif wrote:

> Hi Dimitri,
>
> Try cleanup of the fail count for the resource with any of the
> commands below:
>
> via pcs : pcs resource cleanup rsyncd

Tried it, didn't work. Tried pcs resource debug-start rsyncd -- got no errors, resource didn't start. Tried disable/enable.

So far the only way I've been able to do this is pcs cluster stop ; pcs cluster start which is ridiculous on a production cluster with drbd and a database etc. (And it killed my ssh connection to the other node, again.)

Any other suggestions?
Thanks,
Dima
Re: [ClusterLabs] start a resource
Hi Dimitri,

Try cleanup of the fail count for the resource with any of the commands below:

via pcs: pcs resource cleanup rsyncd
via crm: crm resource cleanup rsyncd

Hope it helps.

Moiz

To: users@clusterlabs.org
From: dmaz...@bmrb.wisc.edu
Date: Thu, 5 May 2016 14:15:09 -0500
Subject: [ClusterLabs] start a resource

Hi all,

I'm sure it must be a FAQ, but how do I start a resource? E.g.

Failed Actions:
* rsyncd_start_0 on tarpon 'unknown error' (1): call=78, status=complete, exitreason='Error. "pid file" entry required in the rsyncd config file by rsyncd OCF RA.',
    last-rc-change='Thu May 5 13:55:50 2016', queued=0ms, exec=51ms

OK, I fixed the config file, how do I restart rsyncd now?

TIA
--
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
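
Since the failure above was a missing "pid file" entry, it may be worth confirming the fix is really in the config before asking the cluster to retry. The helper below is a rough sketch, not what the resource agent itself runs: has_pid_file is a made-up name, and the pattern simply looks for an uncommented "pid file = ..." line.

```shell
# has_pid_file CONFIG
# Succeeds (exit 0) if CONFIG has an uncommented "pid file" setting.
has_pid_file() {
    grep -Eq '^[[:space:]]*pid[[:space:]]+file[[:space:]]*=' "$1"
}

# On the node, using the path and resource id from this thread
# (needs pcs, so shown only as a comment):
#   has_pid_file /etc/rsyncd.conf && pcs resource cleanup rsyncd
```

The cleanup clears the recorded failure so that pacemaker will try the start again; if the start fails once more, a new entry with a later last-rc-change shows up under Failed Actions.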