Re: [ClusterLabs] Start resource only if another resource is stopped
Ok, thank you for the advice. I learnt about the attribute resource; I did not know about it before.
I have dropped the idea of NFS failover and am instead switching an IP to the node which has all required components.

Node List:
  * Online: [ intranet-test1 intranet-test2 nas-sync-test1 nas-sync-test2 ]

Full List of Resources:
  * admin-ip (ocf::heartbeat:IPaddr2): Started nas-sync-test2
  * stonith-sbd (stonith:external/sbd): Started nas-sync-test1
  * data_2 (ocf::heartbeat:Filesystem): Started intranet-test2
  * data_1 (ocf::heartbeat:Filesystem): Started intranet-test1
  * nfs_export_1 (ocf::heartbeat:exportfs): Started nas-sync-test1
  * nfs_server_1 (systemd:nfs-server): Started nas-sync-test1
  * nfs_export_2 (ocf::heartbeat:exportfs): Started nas-sync-test2
  * nfs_server_2 (systemd:nfs-server): Started nas-sync-test2
  * nginx_1 (systemd:nginx): Started intranet-test1
  * nginx_2 (systemd:nginx): Started intranet-test2
  * mysql_1 (systemd:mysql): Started intranet-test1
  * mysql_2 (systemd:mysql): Started intranet-test2
  * php_1 (systemd:php5.6-fpm): Started intranet-test1
  * php_2 (systemd:php5.6-fpm): Started intranet-test2
  * intranet-ip (ocf::heartbeat:IPaddr2): Started intranet-test2
  * nginx_1_active (ocf::pacemaker:attribute): Started intranet-test1
  * nginx_2_active (ocf::pacemaker:attribute): Started intranet-test2

intranet-ip is allocated to the node which has all of the data_x, php_x, mysql_x and nginx_x resources.
data_x requires nfs_export_x and nfs_server_x to be running on the sync nodes.

All working well, thanks.

-----Original Message-----
From: Users On Behalf Of Andrei Borzenkov
Sent: Thursday, August 18, 2022 21:26
To: users@clusterlabs.org
Subject: Re: [ClusterLabs] Start resource only if another resource is stopped

On 17.08.2022 16:58, Miro Igov wrote:
> As you guessed i am using crm res stop nfs_export_1.
> I tried the solution with attribute and it does not work correct.
>

It does what you asked for originally, but you are shifting the goalposts ...

> When i stop nfs_export_1 it stops data_1 data_1_active, then it starts
> data_2_failover - so far so good.
>
> When i start nfs_export_1 it starts data_1, starts data_1_active and
> then stops data_2_failover as result of order
> data_1_active_after_data_1 and location data_2_failover_if_data_1_inactive.
>
> But stopping data_2_failover unmounts the mount and end result is
> having no NFS export mounted:
>

Nowhere before did you mention that you have two resources managing the same mount point.

...
> Aug 17 15:24:52 intranet-test1 Filesystem(data_1)[16382]: INFO:
> Running start for nas-sync-test1:/home/pharmya/NAS on
> /data/synology/pharmya_office/NAS_Sync/NAS
> Aug 17 15:24:52 intranet-test1 Filesystem(data_1)[16382]: INFO:
> Filesystem /data/synology/pharmya_office/NAS_Sync/NAS is already mounted.
...
> Aug 17 15:24:52 intranet-test1 Filesystem(data_2_failover)[16456]: INFO:
> Trying to unmount /data/synology/pharmya_office/NAS_Sync/NAS
> Aug 17 15:24:52 intranet-test1 systemd[1]:
> data-synology-pharmya_office-NAS_Sync-NAS.mount: Succeeded.

This configuration is wrong - period. Filesystem agent monitor action checks for mounted mountpoint, so pacemaker cannot determine which resource is started. You may get away with it because by default pacemaker does not run recurrent monitor for inactive resource, but any probe will give wrong results.

It is almost always wrong to have multiple independent pacemaker resources managing the same underlying physical resource.

It looks like you attempt to reimplement high available NFS server on client side. If you insist on this, I see as the only solution separate resource agent that monitors state of export/data resources and sets attribute accordingly. But effectively you will be duplicating pacemaker logic.

___
Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users
ClusterLabs home: https://www.clusterlabs.org/
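Editor's sketch: the arrangement described at the top of this message (attribute resources marking the node whose stack is complete, and intranet-ip following that marker) might look roughly like the crm configuration below. This is not the poster's actual configuration. The shared attribute name stack_active, the constraint IDs, the monitor interval, and the assumption that nginx_x is the last resource of each stack to start are all hypothetical. The name, active_value and inactive_value parameters belong to ocf:pacemaker:attribute; without name, the agent defaults to opa-<resource id>, which is what the logs elsewhere in this thread show.

```
# Hypothetical sketch, not the configuration actually used in this thread.
# One marker resource per intranet node. Both publish the same node
# attribute ("stack_active" is an assumed name) on the node they run on.
primitive nginx_1_active ocf:pacemaker:attribute \
    params name=stack_active active_value=1 inactive_value=0 \
    op monitor interval=10s
primitive nginx_2_active ocf:pacemaker:attribute \
    params name=stack_active active_value=1 inactive_value=0 \
    op monitor interval=10s

# Run each marker only with, and only after, the last service of its stack
# (assumed here to be nginx_x).
colocation nginx_1_active_with_nginx_1 inf: nginx_1_active nginx_1
order nginx_1_active_after_nginx_1 Mandatory: nginx_1 nginx_1_active
colocation nginx_2_active_with_nginx_2 inf: nginx_2_active nginx_2
order nginx_2_active_after_nginx_2 Mandatory: nginx_2 nginx_2_active

# Keep intranet-ip off any node where the marker is missing or not 1.
location intranet-ip_needs_stack intranet-ip \
    rule -inf: not_defined stack_active or stack_active ne 1
```

Using one attribute name for both markers keeps the location rule to a single expression, since node attributes are stored per node. With the default opa-<resource id> names, one rule per node would be needed instead.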
Re: [ClusterLabs] Start resource only if another resource is stopped
On Thu, Aug 18, 2022 at 8:26 PM Andrei Borzenkov wrote:
>
> On 17.08.2022 16:58, Miro Igov wrote:
> > As you guessed i am using crm res stop nfs_export_1.
> > I tried the solution with attribute and it does not work correct.
> >
>
> It does what you asked for originally, but you are shifting the
> goalposts ...
>
> > When i stop nfs_export_1 it stops data_1 data_1_active, then it starts
> > data_2_failover - so far so good.
> >
> > When i start nfs_export_1 it starts data_1, starts data_1_active and then
> > stops data_2_failover as result of order data_1_active_after_data_1 and
> > location data_2_failover_if_data_1_inactive.
> >
> > But stopping data_2_failover unmounts the mount and end result is having no
> > NFS export mounted:
> >
>
> Nowhere before did you mention that you have two resources managing the
> same mount point.
>
> ...
> > Aug 17 15:24:52 intranet-test1 Filesystem(data_1)[16382]: INFO: Running
> > start for nas-sync-test1:/home/pharmya/NAS on
> > /data/synology/pharmya_office/NAS_Sync/NAS
> > Aug 17 15:24:52 intranet-test1 Filesystem(data_1)[16382]: INFO: Filesystem
> > /data/synology/pharmya_office/NAS_Sync/NAS is already mounted.
> ...
> > Aug 17 15:24:52 intranet-test1 Filesystem(data_2_failover)[16456]: INFO:
> > Trying to unmount /data/synology/pharmya_office/NAS_Sync/NAS
> > Aug 17 15:24:52 intranet-test1 systemd[1]:
> > data-synology-pharmya_office-NAS_Sync-NAS.mount: Succeeded.
>
> This configuration is wrong - period. Filesystem agent monitor action
> checks for mounted mountpoint, so pacemaker cannot determine which
> resource is started. You may get away with it because by default
> pacemaker does not run recurrent monitor for inactive resource, but any
> probe will give wrong results.
>
> It is almost always wrong to have multiple independent pacemaker
> resources managing the same underlying physical resource.
>
> It looks like you attempt to reimplement high available NFS server on
> client side. If you insist on this, I see as the only solution separate
> resource agent that monitors state of export/data resources and sets
> attribute accordingly. But effectively you will be duplicating pacemaker
> logic.

As Ulrich already pointed out before in this thread, that sounds a bit as if the concept of promotable resources might be helpful here - as to have at least part of the logic done by pacemaker.
But as Andrei is saying - you'll need a custom resource-agent here.
Maybe it could be done in a generic way so that the community might adopt it in the end though.
I'm at least not aware that such a thing would be out there already but ...

Klaus
Re: [ClusterLabs] Start resource only if another resource is stopped
On 17.08.2022 16:58, Miro Igov wrote:
> As you guessed i am using crm res stop nfs_export_1.
> I tried the solution with attribute and it does not work correct.
>

It does what you asked for originally, but you are shifting the goalposts ...

> When i stop nfs_export_1 it stops data_1 data_1_active, then it starts
> data_2_failover - so far so good.
>
> When i start nfs_export_1 it starts data_1, starts data_1_active and then
> stops data_2_failover as result of order data_1_active_after_data_1 and
> location data_2_failover_if_data_1_inactive.
>
> But stopping data_2_failover unmounts the mount and end result is having no
> NFS export mounted:
>

Nowhere before did you mention that you have two resources managing the same mount point.

...
> Aug 17 15:24:52 intranet-test1 Filesystem(data_1)[16382]: INFO: Running
> start for nas-sync-test1:/home/pharmya/NAS on
> /data/synology/pharmya_office/NAS_Sync/NAS
> Aug 17 15:24:52 intranet-test1 Filesystem(data_1)[16382]: INFO: Filesystem
> /data/synology/pharmya_office/NAS_Sync/NAS is already mounted.
...
> Aug 17 15:24:52 intranet-test1 Filesystem(data_2_failover)[16456]: INFO:
> Trying to unmount /data/synology/pharmya_office/NAS_Sync/NAS
> Aug 17 15:24:52 intranet-test1 systemd[1]:
> data-synology-pharmya_office-NAS_Sync-NAS.mount: Succeeded.

This configuration is wrong - period. The Filesystem agent's monitor action checks whether the mount point is mounted, so pacemaker cannot determine which resource is started. You may get away with it because by default pacemaker does not run a recurring monitor for an inactive resource, but any probe will give wrong results.

It is almost always wrong to have multiple independent pacemaker resources managing the same underlying physical resource.

It looks like you are attempting to reimplement a highly available NFS server on the client side. If you insist on this, the only solution I see is a separate resource agent that monitors the state of the export/data resources and sets the attribute accordingly. But effectively you will be duplicating pacemaker logic.
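Editor's sketch: to make the conflict concrete, the two Filesystem resources appear, judging from the device and directory strings in the quoted log lines, to be defined roughly as below. The actual primitives were not posted in this thread, so this is a reconstruction; fstype=nfs in particular is an assumption.

```
# Reconstructed from the quoted log lines; fstype is an assumption.
primitive data_1 ocf:heartbeat:Filesystem \
    params device="nas-sync-test1:/home/pharmya/NAS" \
        directory="/data/synology/pharmya_office/NAS_Sync/NAS" fstype=nfs
primitive data_2_failover ocf:heartbeat:Filesystem \
    params device="nas-sync-test2:/home/pharmya/NAS" \
        directory="/data/synology/pharmya_office/NAS_Sync/NAS" fstype=nfs
# Both resources claim the same directory. Once either one has mounted it,
# starting the other reports "already mounted" and succeeds without doing
# anything, and stopping either one unmounts the directory for both.
```

This matches the sequence in the logs: the start of data_1 finds the directory already mounted (by data_2_failover), and the subsequent stop of data_2_failover unmounts it, leaving nothing mounted while pacemaker believes data_1 is running.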
Re: [ClusterLabs] Start resource only if another resource is stopped
As you guessed, I am using crm res stop nfs_export_1.
I tried the solution with attribute and it does not work correctly.

When I stop nfs_export_1 it stops data_1 and data_1_active, then it starts data_2_failover - so far so good.

When I start nfs_export_1 it starts data_1, starts data_1_active and then stops data_2_failover as a result of order data_1_active_after_data_1 and location data_2_failover_if_data_1_inactive.

But stopping data_2_failover unmounts the mount, and the end result is having no NFS export mounted:

Aug 17 15:24:52 intranet-test1 pacemaker-fenced[1038]: notice: Watchdog will be used via SBD if fencing is required and stonith-watchdog-timeout is nonzero
Aug 17 15:24:52 intranet-test1 Filesystem(data_1)[16382]: INFO: Running start for nas-sync-test1:/home/pharmya/NAS on /data/synology/pharmya_office/NAS_Sync/NAS
Aug 17 15:24:52 intranet-test1 Filesystem(data_1)[16382]: INFO: Filesystem /data/synology/pharmya_office/NAS_Sync/NAS is already mounted.
Aug 17 15:24:52 intranet-test1 pacemaker-controld[1042]: notice: Result of start operation for data_1 on intranet-test1: 0 (ok)
Aug 17 15:24:52 intranet-test1 pacemaker-controld[1042]: notice: Result of start operation for data_1_active on intranet-test1: 0 (ok)
Aug 17 15:24:52 intranet-test1 pacemaker-attrd[1040]: notice: Setting opa-data_1_active[intranet-test1]: 0 -> 1
Aug 17 15:24:52 intranet-test1 pacemaker-controld[1042]: notice: Result of monitor operation for data_1_active on intranet-test1: 0 (ok)
Aug 17 15:24:52 intranet-test1 Filesystem(data_2_failover)[16456]: INFO: Running stop for nas-sync-test2:/home/pharmya/NAS on /data/synology/pharmya_office/NAS_Sync/NAS
Aug 17 15:24:52 intranet-test1 pacemaker-attrd[1040]: notice: Setting opa-data_2_active[intranet-test2]: 1 -> 0
Aug 17 15:24:52 intranet-test1 Filesystem(data_2_failover)[16456]: INFO: Trying to unmount /data/synology/pharmya_office/NAS_Sync/NAS
Aug 17 15:24:52 intranet-test1 systemd[1]: data-synology-pharmya_office-NAS_Sync-NAS.mount: Succeeded.
Aug 17 15:24:52 intranet-test1 systemd[11103]: data-synology-pharmya_office-NAS_Sync-NAS.mount: Succeeded.
Aug 17 15:24:52 intranet-test1 Filesystem(data_2_failover)[16456]: INFO: unmounted /data/synology/pharmya_office/NAS_Sync/NAS successfully
Aug 17 15:24:52 intranet-test1 pacemaker-controld[1042]: notice: Result of stop operation for data_2_failover on intranet-test1: 0 (ok)
Aug 17 15:24:52 intranet-test1 pacemaker-attrd[1040]: notice: Setting opa-data_2_active[intranet-test2]: 0 -> 1
Aug 17 15:25:42 intranet-test1 pacemaker-fenced[1038]: notice: Watchdog will be used via SBD if fencing is required and stonith-watchdog-timeout is nonzero

On 11.08.2022 17:34, Miro Igov wrote:
> Hello,
>
> I am trying to create failover resource that would start if another
> resource is stopped and stop when the resource is started back.
>
> It is 4 node cluster (with qdevice) where nodes are virtual machines
> and two of them are hosted in a datacenter and the other 2 VMs in
> another datacenter.
>
> Names of the nodes are:
>
> nas-sync-test1
> intranet-test1
> nas-sync-test2
> intranet-test2
>
> The nodes ending with 1 are hosted in same datacenter and ending in 2
> are in the other datacenter.
>
> nas-sync-test* nodes are running NFS servers and exports:
>
> nfs_server_1, nfs_export_1 (running on nas-sync-test1)
> nfs_server_2, nfs_export_2 (running on nas-sync-test2)
>
> intranet-test1 is running NFS mount data_1 (mounting the nfs_export_1),
> intranet-test2 is running data_2 (mounting nfs_export_2).
>
> I created data_1_failover which is mounting the nfs_export_1 too and
> would like to be running on intranet-test2 ONLY if data_2 is down. So
> the idea is it mounts nfs_export_1 on intranet-test2 only when the
> local mount data_2 is stopped (note the nfs_server_1 runs on one
> datacenter and intranet-test2 in the another DC)
>
> Also created data_2_failover with the same purpose as data_1_failover.
>
> I would like to ask how to set the failover mounts automatically start
> when ordinary mounts stop?
>
> Current configuration of the constraints:
>
> tag all_mounts data_1 data_2 data_1_failover data_2_failover
> tag sync_1 nfs_server_1 nfs_export_1
> tag sync_2 nfs_server_2 nfs_export_2
> location deny_data_1 data_1 -inf: intranet-test2
> location deny_data_2 data_2 -inf: intranet-test1
> location deny_failover_1 data_1_failover -inf: intranet-test1
> location deny_failover_2 data_2_failover -inf: intranet-test2
> location deny_sync_1 sync_1 \
>     rule -inf: #uname ne nas-sync-test1
> location deny_sync_2 sync_2 \
>     rule -inf: #uname ne nas-sync-test2
> location mount_on_intranet all_mounts \
>     rule -inf: #uname eq nas-sync-test1 or #uname eq nas-sync-test2
>
> colocation nfs_1 inf: nfs_export_1 nfs_server_1
> colocation nfs_2 inf: nfs_export_2 nfs_server_2
>
> order nfs_server_export_1 Mandato
Re: [ClusterLabs] Start resource only if another resource is stopped
On 11.08.2022 17:34, Miro Igov wrote:
> Hello,
>
> I am trying to create failover resource that would start if another resource
> is stopped and stop when the resource is started back.
>
> It is 4 node cluster (with qdevice) where nodes are virtual machines and two
> of them are hosted in a datacenter and the other 2 VMs in another
> datacenter.
>
> Names of the nodes are:
>
> nas-sync-test1
> intranet-test1
> nas-sync-test2
> intranet-test2
>
> The nodes ending with 1 are hosted in same datacenter and ending in 2 are in
> the other datacenter.
>
> nas-sync-test* nodes are running NFS servers and exports:
>
> nfs_server_1, nfs_export_1 (running on nas-sync-test1)
> nfs_server_2, nfs_export_2 (running on nas-sync-test2)
>
> intranet-test1 is running NFS mount data_1 (mounting the nfs_export_1),
> intranet-test2 is running data_2 (mounting nfs_export_2).
>
> I created data_1_failover which is mounting the nfs_export_1 too and would
> like to be running on intranet-test2 ONLY if data_2 is down. So the idea is
> it mounts nfs_export_1 on intranet-test2 only when the local mount data_2 is
> stopped (note the nfs_server_1 runs on one datacenter and intranet-test2 in
> the another DC)
>
> Also created data_2_failover with the same purpose as data_1_failover.
>
> I would like to ask how to set the failover mounts automatically start when
> ordinary mounts stop?
>
> Current configuration of the constraints:
>
> tag all_mounts data_1 data_2 data_1_failover data_2_failover
> tag sync_1 nfs_server_1 nfs_export_1
> tag sync_2 nfs_server_2 nfs_export_2
> location deny_data_1 data_1 -inf: intranet-test2
> location deny_data_2 data_2 -inf: intranet-test1
> location deny_failover_1 data_1_failover -inf: intranet-test1
> location deny_failover_2 data_2_failover -inf: intranet-test2
> location deny_sync_1 sync_1 \
>     rule -inf: #uname ne nas-sync-test1
> location deny_sync_2 sync_2 \
>     rule -inf: #uname ne nas-sync-test2
> location mount_on_intranet all_mounts \
>     rule -inf: #uname eq nas-sync-test1 or #uname eq nas-sync-test2
>
> colocation nfs_1 inf: nfs_export_1 nfs_server_1
> colocation nfs_2 inf: nfs_export_2 nfs_server_2
>
> order nfs_server_export_1 Mandatory: nfs_server_1 nfs_export_1
> order nfs_server_export_2 Mandatory: nfs_server_2 nfs_export_2
> order mount_1 Mandatory: nfs_export_1 data_1
> order mount_1_failover Mandatory: nfs_export_1 data_1_failover
> order mount_2 Mandatory: nfs_export_2 data_2
> order mount_2_failover Mandatory: nfs_export_2 data_2_failover
>
> I tried adding following colocation:
>
>    colocation failover_1 -inf: data_2_failover data_1
>

This colocation does not say "start data_2_failover when data_1 is stopped". This colocation says "do not allocate data_2_failover to the same node where data_1 is already allocated". There is a difference between "resource A can run on node N" and "resource A is active on node N".

> and it is stopping data_2_failover when data_1 is started, also it starts
> data_2_failover when data_1 is stopped - exactly as needed!
>
> Full List of Resources:
>
>   * admin-ip (ocf::heartbeat:IPaddr2): Started intranet-test2
>   * stonith-sbd (stonith:external/sbd): Started intranet-test1
>   * nfs_export_1 (ocf::heartbeat:exportfs): Started nas-sync-test1
>   * nfs_server_1 (systemd:nfs-server): Started nas-sync-test1
>   * nfs_export_2 (ocf::heartbeat:exportfs): Started nas-sync-test2
>   * nfs_server_2 (systemd:nfs-server): Started nas-sync-test2
>   * data_1_failover (ocf::heartbeat:Filesystem): Started intranet-test2
>   * data_2_failover (ocf::heartbeat:Filesystem): Stopped
>   * data_2 (ocf::heartbeat:Filesystem): Started intranet-test2
>   * data_1 (ocf::heartbeat:Filesystem): Started intranet-test1
>

For the future - it is much better to simply copy and paste the actual commands you used, together with their output. While we may guess that you used "crm resource stop" or an equivalent command, it is just a guess. Any conclusion based on this guess will be wrong if we guessed wrong.

> Full List of Resources:
>
>   * admin-ip (ocf::heartbeat:IPaddr2): Started intranet-test2
>   * stonith-sbd (stonith:external/sbd): Started intranet-test1
>   * nfs_export_1 (ocf::heartbeat:exportfs): Started nas-sync-test1
>   * nfs_server_1 (systemd:nfs-server): Started nas-sync-test1
>   * nfs_export_2 (ocf::heartbeat:exportfs): Started nas-sync-test2
>   * nfs_server_2 (systemd:nfs-server): Started nas-sync-test2
>   * data_1_failover (ocf::heartbeat:Filesystem): Started intranet-test2
>   * data_2_failover (ocf::heartbeat:Filesystem): Started intranet-test1
>   * data_2 (ocf::h