> -----Original Message-----
> From: Users [mailto:users-boun...@clusterlabs.org] On Behalf Of Ken Gaillot
> Sent: Wednesday, August 01, 2018 2:17 PM
> To: Cluster Labs - All topics related to open-source clustering welcomed
>     <users@clusterlabs.org>
> Subject: Re: [ClusterLabs] Why Won't Resources Move?
>
> On Wed, 2018-08-01 at 03:49 +0000, Eric Robinson wrote:
> > I have what seems to be a healthy cluster, but I can’t get resources
> > to move.
> >
> > Here’s what’s installed…
> >
> > [root@001db01a cluster]# yum list installed|egrep "pacem|coro"
> > corosync.x86_64                  2.4.3-2.el7_5.1      @updates
> > corosynclib.x86_64               2.4.3-2.el7_5.1      @updates
> > pacemaker.x86_64                 1.1.18-11.el7_5.3    @updates
> > pacemaker-cli.x86_64             1.1.18-11.el7_5.3    @updates
> > pacemaker-cluster-libs.x86_64    1.1.18-11.el7_5.3    @updates
> > pacemaker-libs.x86_64            1.1.18-11.el7_5.3    @updates
> >
> > Cluster status looks good…
> >
> > [root@001db01b cluster]# pcs status
> > Cluster name: 001db01ab
> > Stack: corosync
> > Current DC: 001db01b (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
> > Last updated: Wed Aug  1 03:44:47 2018
> > Last change: Wed Aug  1 03:22:18 2018 by root via cibadmin on 001db01a
> >
> > 2 nodes configured
> > 11 resources configured
> >
> > Online: [ 001db01a 001db01b ]
> >
> > Full list of resources:
> >
> > p_vip_clust01   (ocf::heartbeat:IPaddr2):    Started 001db01b
> > p_azip_clust01  (ocf::heartbeat:AZaddr2):    Started 001db01b
> > Master/Slave Set: ms_drbd0 [p_drbd0]
> >     Masters: [ 001db01b ]
> >     Slaves: [ 001db01a ]
> > Master/Slave Set: ms_drbd1 [p_drbd1]
> >     Masters: [ 001db01b ]
> >     Slaves: [ 001db01a ]
> > p_fs_clust01    (ocf::heartbeat:Filesystem): Started 001db01b
> > p_fs_clust02    (ocf::heartbeat:Filesystem): Started 001db01b
> > p_vip_clust02   (ocf::heartbeat:IPaddr2):    Started 001db01b
> > p_azip_clust02  (ocf::heartbeat:AZaddr2):    Started 001db01b
> > p_mysql_001     (lsb:mysql_001):             Started 001db01b
> >
> > Daemon Status:
> >   corosync: active/disabled
> >   pacemaker: active/disabled
> >   pcsd: active/enabled
> >
> > Constraints look like this…
> >
> > [root@001db01b cluster]# pcs constraint
> > Location Constraints:
> > Ordering Constraints:
> >   promote ms_drbd0 then start p_fs_clust01 (kind:Mandatory)
> >   promote ms_drbd1 then start p_fs_clust02 (kind:Mandatory)
> >   start p_fs_clust01 then start p_vip_clust01 (kind:Mandatory)
> >   start p_vip_clust01 then start p_azip_clust01 (kind:Mandatory)
> >   start p_fs_clust02 then start p_vip_clust02 (kind:Mandatory)
> >   start p_vip_clust02 then start p_azip_clust02 (kind:Mandatory)
> >   start p_vip_clust01 then start p_mysql_001 (kind:Mandatory)
> > Colocation Constraints:
> >   p_azip_clust01 with p_vip_clust01 (score:INFINITY)
> >   p_fs_clust01 with ms_drbd0 (score:INFINITY) (with-rsc-role:Master)
> >   p_fs_clust02 with ms_drbd1 (score:INFINITY) (with-rsc-role:Master)
> >   p_vip_clust01 with p_fs_clust01 (score:INFINITY)
> >   p_vip_clust02 with p_fs_clust02 (score:INFINITY)
> >   p_azip_clust02 with p_vip_clust02 (score:INFINITY)
> >   p_mysql_001 with p_vip_clust01 (score:INFINITY)
> > Ticket Constraints:
> >
> > But when I issue a move command, nothing at all happens.
> >
> > I see this in the log on one node…
> >
> > Aug 01 03:21:57 [16550] 001db01b   cib:  info: cib_perform_op: ++ /cib/configuration/constraints: <rsc_location id="cli-prefer-ms_drbd0" rsc="ms_drbd0" role="Started" node="001db01a" score="INFINITY"/>
> > Aug 01 03:21:57 [16550] 001db01b   cib:  info: cib_process_request: Completed cib_modify operation for section constraints: OK (rc=0, origin=001db01a/crm_resource/4, version=0.138.0)
> > Aug 01 03:21:57 [16555] 001db01b  crmd:  info: abort_transition_graph: Transition aborted by rsc_location.cli-prefer-ms_drbd0 'create': Configuration change | cib=0.138.0 source=te_update_diff:456 path=/cib/configuration/constraints complete=true
> >
> > And I see this in the log on the other node…
> >
> > notice: p_drbd1_monitor_60000:69196:stderr [ Error signing on to the CIB service: Transport endpoint is not connected ]
>
> The message likely came from the resource agent calling crm_attribute to
> set a node attribute. That message usually means the cluster isn't running
> on that node, so it's highly suspect. The cib might have crashed, which
> should be in the log as well. I'd look into that first.
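For context, "Transport endpoint is not connected" is the Linux strerror()
text for errno ENOTCONN (107), i.e. crm_attribute could not reach the local
cib at all. A quick sanity check on the affected node, assuming the standard
pacemaker command-line tools are installed, might look like:

    # Is the cib daemon process alive?
    pgrep -la cib

    # Can the CIB actually be queried over its local IPC socket?
    cibadmin --query > /dev/null && echo "CIB reachable"

Separately, the log above shows the move created cli-prefer-ms_drbd0 with
role="Started". Since both nodes already run a started instance of ms_drbd0
(one master, one slave), a Started-role preference is already satisfied and
changes nothing, which would explain why "nothing at all happens". Moving
the Master role of a master/slave resource typically needs the --master
flag (a sketch, assuming pcs 0.9.x as shipped with CentOS 7):

    # Move the Master role of ms_drbd0 to 001db01a
    pcs resource move ms_drbd0 001db01a --master

    # Later, drop the constraint the move created (id from the log above)
    pcs constraint remove cli-prefer-ms_drbd0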
I rebooted the server and afterwards I'm still getting tons of these...

Aug  2 01:43:40 001db01a drbd(p_drbd1)[18628]: ERROR: ha02_mysql: Called /usr/sbin/crm_master -Q -l reboot -v 10000
Aug  2 01:43:40 001db01a drbd(p_drbd0)[18627]: ERROR: ha01_mysql: Called /usr/sbin/crm_master -Q -l reboot -v 10000
Aug  2 01:43:40 001db01a drbd(p_drbd0)[18627]: ERROR: ha01_mysql: Exit code 107
Aug  2 01:43:40 001db01a drbd(p_drbd1)[18628]: ERROR: ha02_mysql: Exit code 107
Aug  2 01:43:40 001db01a drbd(p_drbd0)[18627]: ERROR: ha01_mysql: Command output:
Aug  2 01:43:40 001db01a drbd(p_drbd1)[18628]: ERROR: ha02_mysql: Command output:
Aug  2 01:43:40 001db01a lrmd[2025]: notice: p_drbd0_monitor_60000:18627:stderr [ Error signing on to the CIB service: Transport endpoint is not connected ]
Aug  2 01:43:40 001db01a lrmd[2025]: notice: p_drbd1_monitor_60000:18628:stderr [ Error signing on to the CIB service: Transport endpoint is not connected ]

> >
> > Any thoughts?
> >
> > --Eric
> --
> Ken Gaillot <kgail...@redhat.com>

_______________________________________________
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
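For reference, the "Exit code 107" in the drbd agent errors above is the
same ENOTCONN: crm_master is a thin wrapper around crm_attribute that
records the agent's master preference score as a transient (reboot-lifetime)
node attribute named master-<resource>, which pacemaker consults when
deciding which instance to promote. One way to reproduce the failing call
by hand on 001db01a (a sketch; the wrapper derives the attribute name from
OCF_RESOURCE_INSTANCE, assumed here to be p_drbd0) might be:

    # Mimic the drbd agent's helper call; this fails with the same
    # "Transport endpoint is not connected" if the local cib is unreachable
    OCF_RESOURCE_INSTANCE=p_drbd0 /usr/sbin/crm_master -Q -l reboot -v 10000

    # Roughly equivalent direct call to set the promotion score attribute
    crm_attribute -N 001db01a -n master-p_drbd0 -l reboot -v 10000

If these succeed interactively but the monitor operations keep logging
ENOTCONN, that points at the cib's local IPC rather than cluster membership.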