On Tue, May 18, 2021 at 8:20 AM Eric Robinson <eric.robin...@psmnv.com> wrote: > > Okay, here is a test, starting with the initial cluster status... > > > [root@ha09a ~]# pcs status > Cluster name: ha09ab > Cluster Summary: > * Stack: corosync > * Current DC: ha09a (version 2.0.4-6.el8_3.2-2deceaa3ae) - partition with > quorum > * Last updated: Mon May 17 22:14:11 2021 > * Last change: Mon May 17 21:58:18 2021 by hacluster via crmd on ha09b > * 2 nodes configured > * 8 resource instances configured > > Node List: > * Online: [ ha09a ha09b ] > > Full List of Resources: > * Clone Set: p_drbd0-clone [p_drbd0] (promotable): > * Masters: [ ha09a ] > * Slaves: [ ha09b ] > * Clone Set: p_drbd1-clone [p_drbd1] (promotable): > * Masters: [ ha09a ] > * Slaves: [ ha09b ] > * p_vdo0 (lsb:vdo0): Started ha09a > * p_vdo1 (lsb:vdo1): Started ha09a > * p_fs_clust08 (ocf::heartbeat:Filesystem): Started ha09a > * p_fs_clust09 (ocf::heartbeat:Filesystem): Started ha09a > > Failed Resource Actions: > * p_vdo0_monitor_15000 on ha09a 'not running' (7): call=35, > status='complete', exitreason='', last-rc-change='2021-05-17 21:01:28 > -07:00', queued=0ms, exec=157ms > * p_vdo1_monitor_15000 on ha09a 'not running' (7): call=91, > status='complete', exitreason='', last-rc-change='2021-05-17 21:56:57 > -07:00', queued=0ms, exec=164ms > > Daemon Status: > corosync: active/disabled > pacemaker: active/disabled > pcsd: active/enabled > > > Here are the constraints... > > [root@ha09a ~]# pcs constraint --full > Location Constraints: > Ordering Constraints: > promote p_drbd0-clone then start p_vdo0 (kind:Mandatory) > (id:order-p_drbd0-clone-p_vdo0-mandatory) > promote p_drbd1-clone then start p_vdo1 (kind:Mandatory) > (id:order-p_drbd1-clone-p_vdo1-mandatory) > start p_vdo0 then start p_fs_clust08 (kind:Mandatory) > (id:order-p_vdo0-p_fs_clust08-mandatory) > start p_vdo1 then start p_fs_clust09 (kind:Mandatory) > (id:order-p_vdo1-p_fs_clust09-mandatory) > Colocation Constraints: > p_vdo0 with p_drbd0-clone (score:INFINITY) > (id:colocation-p_vdo0-p_drbd0-clone-INFINITY) > p_vdo1 with p_drbd1-clone (score:INFINITY) > (id:colocation-p_vdo1-p_drbd1-clone-INFINITY)
This is wrong. It says vdo can be active on every node where a clone instance is active. You need colocation with master. > p_fs_clust08 with p_vdo0 (score:INFINITY) > (id:colocation-p_fs_clust08-p_vdo0-INFINITY) > p_fs_clust09 with p_vdo1 (score:INFINITY) > (id:colocation-p_fs_clust09-p_vdo1-INFINITY) > Ticket Constraints: > > I will now try to move resource p_fs_clust08... > > [root@ha09a ~]# pcs resource move p_fs_clust08 > Warning: Creating location constraint 'cli-ban-p_fs_clust08-on-ha09a' with a > score of -INFINITY for resource p_fs_clust08 on ha09a. > This will prevent p_fs_clust08 from running on ha09a until the > constraint is removed > This will be the case even if ha09a is the last node in the cluster > [root@ha09a ~]# > [root@ha09a ~]# > > The resource fails to move and is now in a stopped state... > > [root@ha09a ~]# pcs status > Cluster name: ha09ab > Cluster Summary: > * Stack: corosync > * Current DC: ha09a (version 2.0.4-6.el8_3.2-2deceaa3ae) - partition with > quorum > * Last updated: Mon May 17 22:17:16 2021 > * Last change: Mon May 17 22:16:51 2021 by root via crm_resource on ha09a > * 2 nodes configured > * 8 resource instances configured > > Node List: > * Online: [ ha09a ha09b ] > > Full List of Resources: > * Clone Set: p_drbd0-clone [p_drbd0] (promotable): > * Masters: [ ha09b ] > * Slaves: [ ha09a ] > * Clone Set: p_drbd1-clone [p_drbd1] (promotable): > * Masters: [ ha09a ] > * Slaves: [ ha09b ] > * p_vdo0 (lsb:vdo0): Started ha09b > * p_vdo1 (lsb:vdo1): Started ha09a > * p_fs_clust08 (ocf::heartbeat:Filesystem): Stopped > * p_fs_clust09 (ocf::heartbeat:Filesystem): Started ha09a > > Failed Resource Actions: > * p_vdo0_monitor_15000 on ha09a 'not running' (7): call=35, > status='complete', exitreason='', last-rc-change='2021-05-17 21:01:28 > -07:00', queued=0ms, exec=157ms > * p_vdo1_monitor_15000 on ha09a 'not running' (7): call=91, > status='complete', exitreason='', last-rc-change='2021-05-17 21:56:57 > -07:00', queued=0ms, exec=164ms > * p_vdo0_monitor_15000 on ha09b 'not running' (7): call=35, > status='complete', exitreason='', last-rc-change='2021-05-17 22:16:53 > -07:00', queued=0ms, exec=170ms > * p_fs_clust08_start_0 on ha09b 'not installed' (5): call=36, > status='complete', exitreason='Couldn't find device [/dev/mapper/vdo0]. > Expected /dev/??? to exist', last-rc-change='2021-05-17 22:16:53 -07:00', > queued=0ms, exec=330ms > > Daemon Status: > corosync: active/disabled > pacemaker: active/disabled > pcsd: active/enabled > > Here are the logs from ha09a... > > May 17 22:16:51 ha09a pacemaker-controld[2657]: notice: State transition > S_IDLE -> S_POLICY_ENGINE > May 17 22:16:51 ha09a pacemaker-schedulerd[2656]: notice: On loss of quorum: > Ignore > May 17 22:16:51 ha09a pacemaker-schedulerd[2656]: warning: Unexpected result > (not running) was recorded for monitor of p_vdo0 on ha09a at May 17 21:01:28 > 2021 > May 17 22:16:51 ha09a pacemaker-schedulerd[2656]: warning: Unexpected result > (not running) was recorded for monitor of p_vdo1 on ha09a at May 17 21:56:57 > 2021 > May 17 22:16:51 ha09a pacemaker-schedulerd[2656]: notice: * Move > p_vdo0 ( ha09a -> ha09b ) > May 17 22:16:51 ha09a pacemaker-schedulerd[2656]: notice: * Move > p_fs_clust08 ( ha09a -> ha09b ) As you see, pacemaker tries to move resources to node with secondary DRBD instance instead of promoting DRBD first. 
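Something like this should work (a rough sketch using the pcs syntax on your pacemaker 2.0.4 / EL8 cluster; the constraint IDs below are copied from your "pcs constraint --full" output above, so verify them before removing anything):

pcs constraint remove colocation-p_vdo0-p_drbd0-clone-INFINITY
pcs constraint remove colocation-p_vdo1-p_drbd1-clone-INFINITY
pcs constraint colocation add p_vdo0 with master p_drbd0-clone INFINITY
pcs constraint colocation add p_vdo1 with master p_drbd1-clone INFINITY

With the colocation tied to the master role, p_vdo0 can only be placed where p_drbd0 is Primary, so moving p_fs_clust08 forces a demote on ha09a and a promote on ha09b before vdo0 and the filesystem are started there. Your existing "promote ... then start" ordering constraints can stay as they are.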
> May 17 22:16:51 ha09a pacemaker-schedulerd[2656]: notice: Calculated > transition 24, saving inputs in /var/lib/pacemaker/pengine/pe-input-459.bz2 > May 17 22:16:51 ha09a pacemaker-controld[2657]: notice: Initiating stop > operation p_fs_clust08_stop_0 locally on ha09a > May 17 22:16:51 ha09a Filesystem(p_fs_clust08)[50520]: INFO: Running stop for > /dev/mapper/vdo0 on /ha01_mysql > May 17 22:16:51 ha09a Filesystem(p_fs_clust08)[50520]: INFO: Trying to > unmount /ha01_mysql > May 17 22:16:51 ha09a systemd[1611]: ha01_mysql.mount: Succeeded. > May 17 22:16:51 ha09a systemd[2582]: ha01_mysql.mount: Succeeded. > May 17 22:16:51 ha09a systemd[1]: ha01_mysql.mount: Succeeded. > May 17 22:16:51 ha09a kernel: XFS (dm-5): Unmounting Filesystem > May 17 22:16:51 ha09a Filesystem(p_fs_clust08)[50520]: INFO: unmounted > /ha01_mysql successfully > May 17 22:16:51 ha09a pacemaker-controld[2657]: notice: Result of stop > operation for p_fs_clust08 on ha09a: ok > May 17 22:16:51 ha09a pacemaker-controld[2657]: notice: Initiating stop > operation p_vdo0_stop_0 locally on ha09a > May 17 22:16:52 ha09a lvm[4241]: No longer monitoring VDO pool vdo0. > May 17 22:16:52 ha09a UDS/vdodmeventd[50696]: INFO (vdodmeventd/50696) VDO > device vdo0 is now unregistered from dmeventd > May 17 22:16:52 ha09a kernel: kvdo3:dmsetup: suspending device 'vdo0' > May 17 22:16:52 ha09a kernel: kvdo3:packerQ: compression is disabled > May 17 22:16:52 ha09a kernel: kvdo3:packerQ: compression is enabled > May 17 22:16:52 ha09a kernel: uds: dmsetup: beginning save (vcn 85) > May 17 22:16:52 ha09a kernel: uds: dmsetup: finished save (vcn 85) > May 17 22:16:52 ha09a kernel: kvdo3:dmsetup: device 'vdo0' suspended > May 17 22:16:52 ha09a kernel: kvdo3:dmsetup: stopping device 'vdo0' > May 17 22:16:52 ha09a kernel: kvdo3:dmsetup: device 'vdo0' stopped > May 17 22:16:53 ha09a pacemaker-controld[2657]: notice: Result of stop > operation for p_vdo0 on ha09a: ok > May 17 22:16:53 ha09a pacemaker-controld[2657]: notice: Initiating start > operation p_vdo0_start_0 on ha09b > May 17 22:16:53 ha09a pacemaker-controld[2657]: notice: Initiating monitor > operation p_vdo0_monitor_15000 on ha09b > May 17 22:16:53 ha09a pacemaker-controld[2657]: notice: Initiating start > operation p_fs_clust08_start_0 on ha09b > May 17 22:16:53 ha09a pacemaker-controld[2657]: notice: Transition 24 aborted > by operation p_vdo0_monitor_15000 'create' on ha09b: Event failed > May 17 22:16:53 ha09a pacemaker-controld[2657]: notice: Transition 24 action > 69 (p_vdo0_monitor_15000 on ha09b): expected 'ok' but got 'not running' > May 17 22:16:53 ha09a pacemaker-attrd[2655]: notice: Setting > fail-count-p_vdo0#monitor_15000[ha09b]: (unset) -> 1 > May 17 22:16:53 ha09a pacemaker-attrd[2655]: notice: Setting > last-failure-p_vdo0#monitor_15000[ha09b]: (unset) -> 1621315013 > May 17 22:16:53 ha09a pacemaker-controld[2657]: notice: Transition 24 aborted > by status-2-fail-count-p_vdo0.monitor_15000 doing create > fail-count-p_vdo0#monitor_15000=1: Transient attribute change > May 17 22:16:53 ha09a pacemaker-controld[2657]: notice: Transition 24 action > 73 (p_fs_clust08_start_0 on ha09b): expected 'ok' but got 'not installed' > May 17 22:16:53 ha09a pacemaker-controld[2657]: notice: Transition 24 > (Complete=5, Pending=0, Fired=0, Skipped=0, Incomplete=1, > Source=/var/lib/pacemaker/pengine/pe-input-459.bz2): Complete > May 17 22:16:53 ha09a pacemaker-attrd[2655]: notice: Setting > fail-count-p_fs_clust08#start_0[ha09b]: (unset) -> INFINITY > May 17 22:16:53 ha09a 
pacemaker-attrd[2655]: notice: Setting > last-failure-p_fs_clust08#start_0[ha09b]: (unset) -> 1621315013 > May 17 22:16:53 ha09a pacemaker-schedulerd[2656]: notice: On loss of quorum: > Ignore > May 17 22:16:53 ha09a pacemaker-schedulerd[2656]: warning: Unexpected result > (not running) was recorded for monitor of p_vdo0 on ha09a at May 17 21:01:28 > 2021 > May 17 22:16:53 ha09a pacemaker-schedulerd[2656]: warning: Unexpected result > (not running) was recorded for monitor of p_vdo1 on ha09a at May 17 21:56:57 > 2021 > May 17 22:16:53 ha09a pacemaker-schedulerd[2656]: warning: Unexpected result > (not running) was recorded for monitor of p_vdo0 on ha09b at May 17 22:16:53 > 2021 > May 17 22:16:53 ha09a pacemaker-schedulerd[2656]: warning: Unexpected result > (not installed: Couldn't find device [/dev/mapper/vdo0]. Expected /dev/??? to > exist) was recorded for start of p_fs_clust08 on ha09b at May 17 22:16:53 2021 > May 17 22:16:53 ha09a pacemaker-schedulerd[2656]: notice: Preventing > p_fs_clust08 from restarting on ha09b because of hard failure (not installed: > Couldn't find device [/dev/mapper/vdo0]. Expected /dev/??? to exist) > May 17 22:16:53 ha09a pacemaker-schedulerd[2656]: warning: Unexpected result > (not installed: Couldn't find device [/dev/mapper/vdo0]. Expected /dev/??? to > exist) was recorded for start of p_fs_clust08 on ha09b at May 17 22:16:53 2021 > May 17 22:16:53 ha09a pacemaker-schedulerd[2656]: notice: Preventing > p_fs_clust08 from restarting on ha09b because of hard failure (not installed: > Couldn't find device [/dev/mapper/vdo0]. Expected /dev/??? to exist) > May 17 22:16:53 ha09a pacemaker-schedulerd[2656]: notice: * Demote > p_drbd0:0 ( Master -> Slave ha09a ) > May 17 22:16:53 ha09a pacemaker-schedulerd[2656]: notice: * Promote > p_drbd0:1 ( Slave -> Master ha09b ) > May 17 22:16:53 ha09a pacemaker-schedulerd[2656]: notice: * Recover > p_vdo0 ( ha09b ) > May 17 22:16:53 ha09a pacemaker-schedulerd[2656]: notice: * Stop > p_fs_clust08 ( ha09b ) due to node availability > May 17 22:16:53 ha09a pacemaker-schedulerd[2656]: notice: Calculated > transition 25, saving inputs in /var/lib/pacemaker/pengine/pe-input-460.bz2 > May 17 22:16:53 ha09a pacemaker-schedulerd[2656]: notice: On loss of quorum: > Ignore > May 17 22:16:53 ha09a pacemaker-schedulerd[2656]: warning: Unexpected result > (not running) was recorded for monitor of p_vdo0 on ha09a at May 17 21:01:28 > 2021 > May 17 22:16:53 ha09a pacemaker-schedulerd[2656]: warning: Unexpected result > (not running) was recorded for monitor of p_vdo1 on ha09a at May 17 21:56:57 > 2021 > May 17 22:16:53 ha09a pacemaker-schedulerd[2656]: warning: Unexpected result > (not running) was recorded for monitor of p_vdo0 on ha09b at May 17 22:16:53 > 2021 > May 17 22:16:53 ha09a pacemaker-schedulerd[2656]: warning: Unexpected result > (not installed: Couldn't find device [/dev/mapper/vdo0]. Expected /dev/??? to > exist) was recorded for start of p_fs_clust08 on ha09b at May 17 22:16:53 2021 > May 17 22:16:53 ha09a pacemaker-schedulerd[2656]: notice: Preventing > p_fs_clust08 from restarting on ha09b because of hard failure (not installed: > Couldn't find device [/dev/mapper/vdo0]. Expected /dev/??? to exist) > May 17 22:16:53 ha09a pacemaker-schedulerd[2656]: warning: Unexpected result > (not installed: Couldn't find device [/dev/mapper/vdo0]. Expected /dev/??? 
to > exist) was recorded for start of p_fs_clust08 on ha09b at May 17 22:16:53 2021 > May 17 22:16:53 ha09a pacemaker-schedulerd[2656]: notice: Preventing > p_fs_clust08 from restarting on ha09b because of hard failure (not installed: > Couldn't find device [/dev/mapper/vdo0]. Expected /dev/??? to exist) > May 17 22:16:53 ha09a pacemaker-schedulerd[2656]: warning: Forcing > p_fs_clust08 away from ha09b after 1000000 failures (max=1000000) > May 17 22:16:53 ha09a pacemaker-schedulerd[2656]: notice: * Demote > p_drbd0:0 ( Master -> Slave ha09a ) > May 17 22:16:53 ha09a pacemaker-schedulerd[2656]: notice: * Promote > p_drbd0:1 ( Slave -> Master ha09b ) > May 17 22:16:53 ha09a pacemaker-schedulerd[2656]: notice: * Recover > p_vdo0 ( ha09b ) > May 17 22:16:53 ha09a pacemaker-schedulerd[2656]: notice: * Stop > p_fs_clust08 ( ha09b ) due to node availability > May 17 22:16:53 ha09a pacemaker-schedulerd[2656]: notice: Calculated > transition 26, saving inputs in /var/lib/pacemaker/pengine/pe-input-461.bz2 > May 17 22:16:53 ha09a pacemaker-controld[2657]: notice: Initiating cancel > operation p_drbd0_monitor_60000 on ha09b > May 17 22:16:53 ha09a pacemaker-controld[2657]: notice: Initiating stop > operation p_fs_clust08_stop_0 on ha09b > May 17 22:16:53 ha09a pacemaker-controld[2657]: notice: Initiating notify > operation p_drbd0_pre_notify_demote_0 locally on ha09a > May 17 22:16:53 ha09a pacemaker-controld[2657]: notice: Initiating notify > operation p_drbd0_pre_notify_demote_0 on ha09b > May 17 22:16:53 ha09a pacemaker-controld[2657]: notice: Result of notify > operation for p_drbd0 on ha09a: ok > May 17 22:16:53 ha09a pacemaker-controld[2657]: notice: Initiating stop > operation p_vdo0_stop_0 on ha09b > May 17 22:16:54 ha09a pacemaker-controld[2657]: notice: Initiating demote > operation p_drbd0_demote_0 locally on ha09a > May 17 22:16:54 ha09a kernel: drbd ha01_mysql: role( Primary -> Secondary ) > May 17 22:16:54 ha09a pacemaker-controld[2657]: notice: Result of demote > operation for p_drbd0 on ha09a: ok > May 17 22:16:54 ha09a pacemaker-controld[2657]: notice: Initiating notify > operation p_drbd0_post_notify_demote_0 locally on ha09a > May 17 22:16:54 ha09a pacemaker-controld[2657]: notice: Initiating notify > operation p_drbd0_post_notify_demote_0 on ha09b > May 17 22:16:54 ha09a pacemaker-controld[2657]: notice: Result of notify > operation for p_drbd0 on ha09a: ok > May 17 22:16:54 ha09a pacemaker-controld[2657]: notice: Initiating notify > operation p_drbd0_pre_notify_promote_0 locally on ha09a > May 17 22:16:54 ha09a pacemaker-controld[2657]: notice: Initiating notify > operation p_drbd0_pre_notify_promote_0 on ha09b > May 17 22:16:54 ha09a pacemaker-controld[2657]: notice: Result of notify > operation for p_drbd0 on ha09a: ok > May 17 22:16:54 ha09a pacemaker-controld[2657]: notice: Initiating promote > operation p_drbd0_promote_0 on ha09b > May 17 22:16:54 ha09a kernel: drbd ha01_mysql ha09b: Preparing remote state > change 610633182 > May 17 22:16:54 ha09a kernel: drbd ha01_mysql ha09b: Committing remote state > change 610633182 (primary_nodes=1) > May 17 22:16:54 ha09a kernel: drbd ha01_mysql ha09b: peer( Secondary -> > Primary ) > May 17 22:16:54 ha09a pacemaker-controld[2657]: notice: Initiating notify > operation p_drbd0_post_notify_promote_0 locally on ha09a > May 17 22:16:54 ha09a pacemaker-controld[2657]: notice: Initiating notify > operation p_drbd0_post_notify_promote_0 on ha09b > May 17 22:16:54 ha09a pacemaker-controld[2657]: notice: Result of notify > operation for 
p_drbd0 on ha09a: ok > May 17 22:16:54 ha09a pacemaker-controld[2657]: notice: Initiating start > operation p_vdo0_start_0 on ha09b > May 17 22:16:54 ha09a pacemaker-controld[2657]: notice: Initiating monitor > operation p_drbd0_monitor_60000 locally on ha09a > May 17 22:16:54 ha09a pacemaker-controld[2657]: notice: Result of monitor > operation for p_drbd0 on ha09a: ok > May 17 22:16:56 ha09a pacemaker-controld[2657]: notice: Initiating monitor > operation p_vdo0_monitor_15000 on ha09b > May 17 22:16:57 ha09a pacemaker-controld[2657]: notice: Transition 26 > (Complete=28, Pending=0, Fired=0, Skipped=0, Incomplete=0, > Source=/var/lib/pacemaker/pengine/pe-input-461.bz2): Complete > May 17 22:16:57 ha09a pacemaker-controld[2657]: notice: State transition > S_TRANSITION_ENGINE -> S_IDLE > > Here are the logs from ha09b... > > May 17 22:16:53 ha09b UDS/vdodumpconfig[3494]: ERROR (vdodumpconfig/3494) > openFile(): failed opening /dev/drbd0 with file access: 4: Wrong medium type > (124) > May 17 22:16:53 ha09b vdo[3486]: ERROR - vdodumpconfig: Failed to make > FileLayer from '/dev/drbd0' with Wrong medium type > May 17 22:16:53 ha09b pacemaker-controld[2710]: notice: Result of start > operation for p_vdo0 on ha09b: ok > May 17 22:16:53 ha09b Filesystem(p_fs_clust08)[3496]: INFO: Running start for > /dev/mapper/vdo0 on /ha01_mysql > May 17 22:16:53 ha09b UDS/vdodumpconfig[3577]: ERROR (vdodumpconfig/3577) > openFile(): failed opening /dev/drbd0 with file access: 4: Wrong medium type > (124) > May 17 22:16:53 ha09b vdo[3503]: ERROR - vdodumpconfig: Failed to make > FileLayer from '/dev/drbd0' with Wrong medium type > May 17 22:16:53 ha09b pacemaker-controld[2710]: notice: Result of monitor > operation for p_vdo0 on ha09b: not running > May 17 22:16:53 ha09b pacemaker-controld[2710]: notice: > ha09b-p_vdo0_monitor_15000:35 [ error occurred checking vdo0 status on > ha09b\n ] > May 17 22:16:53 ha09b pacemaker-attrd[2708]: notice: Setting > fail-count-p_vdo0#monitor_15000[ha09b]: (unset) -> 1 > May 17 22:16:53 ha09b pacemaker-attrd[2708]: notice: Setting > last-failure-p_vdo0#monitor_15000[ha09b]: (unset) -> 1621315013 > May 17 22:16:53 ha09b Filesystem(p_fs_clust08)[3496]: ERROR: Couldn't find > device [/dev/mapper/vdo0]. Expected /dev/??? to exist > May 17 22:16:53 ha09b pacemaker-execd[2707]: notice: > p_fs_clust08_start_0[3496] error output [ ocf-exit-reason:Couldn't find > device [/dev/mapper/vdo0]. Expected /dev/??? to exist ] > May 17 22:16:53 ha09b pacemaker-controld[2710]: notice: Result of start > operation for p_fs_clust08 on ha09b: not installed > May 17 22:16:53 ha09b pacemaker-controld[2710]: notice: > ha09b-p_fs_clust08_start_0:36 [ ocf-exit-reason:Couldn't find device > [/dev/mapper/vdo0]. Expected /dev/??? to exist\n ] > May 17 22:16:53 ha09b pacemaker-attrd[2708]: notice: Setting > fail-count-p_fs_clust08#start_0[ha09b]: (unset) -> INFINITY > May 17 22:16:53 ha09b pacemaker-attrd[2708]: notice: Setting > last-failure-p_fs_clust08#start_0[ha09b]: (unset) -> 1621315013 > May 17 22:16:53 ha09b Filesystem(p_fs_clust08)[3609]: WARNING: Couldn't find > device [/dev/mapper/vdo0]. Expected /dev/??? 
to exist > May 17 22:16:53 ha09b pacemaker-controld[2710]: notice: Result of notify > operation for p_drbd0 on ha09b: ok > May 17 22:16:53 ha09b Filesystem(p_fs_clust08)[3609]: INFO: Running stop for > /dev/mapper/vdo0 on /ha01_mysql > May 17 22:16:53 ha09b pacemaker-execd[2707]: notice: > p_fs_clust08_stop_0[3609] error output [ blockdev: cannot open > /dev/mapper/vdo0: No such file or directory ] > May 17 22:16:53 ha09b pacemaker-controld[2710]: notice: Result of stop > operation for p_fs_clust08 on ha09b: ok > May 17 22:16:53 ha09b pacemaker-controld[2710]: notice: > ha09b-p_vdo0_monitor_15000:35 [ error occurred checking vdo0 status on > ha09b\n ] > May 17 22:16:54 ha09b UDS/vdodumpconfig[3705]: ERROR (vdodumpconfig/3705) > openFile(): failed opening /dev/drbd0 with file access: 4: Wrong medium type > (124) > May 17 22:16:54 ha09b vdo[3697]: ERROR - vdodumpconfig: Failed to make > FileLayer from '/dev/drbd0' with Wrong medium type > May 17 22:16:54 ha09b pacemaker-controld[2710]: notice: Result of stop > operation for p_vdo0 on ha09b: ok > May 17 22:16:54 ha09b kernel: drbd ha01_mysql ha09a: peer( Primary -> > Secondary ) > May 17 22:16:54 ha09b pacemaker-controld[2710]: notice: Result of notify > operation for p_drbd0 on ha09b: ok > May 17 22:16:54 ha09b pacemaker-controld[2710]: notice: Result of notify > operation for p_drbd0 on ha09b: ok > May 17 22:16:54 ha09b kernel: drbd ha01_mysql: Preparing cluster-wide state > change 610633182 (0->-1 3/1) > May 17 22:16:54 ha09b kernel: drbd ha01_mysql: State change 610633182: > primary_nodes=1, weak_nodes=FFFFFFFFFFFFFFFC > May 17 22:16:54 ha09b kernel: drbd ha01_mysql: Committing cluster-wide state > change 610633182 (1ms) > May 17 22:16:54 ha09b kernel: drbd ha01_mysql: role( Secondary -> Primary ) > May 17 22:16:54 ha09b pacemaker-controld[2710]: notice: Result of promote > operation for p_drbd0 on ha09b: ok > May 17 22:16:54 ha09b pacemaker-controld[2710]: notice: Result of notify > operation for p_drbd0 on ha09b: ok > May 17 22:16:55 ha09b kernel: uds: modprobe: loaded version 8.0.1.6 > May 17 22:16:55 ha09b kernel: kvdo: modprobe: loaded version 6.2.3.114 > May 17 22:16:55 ha09b kernel: kvdo0:dmsetup: underlying device, REQ_FLUSH: > supported, REQ_FUA: supported > May 17 22:16:55 ha09b kernel: kvdo0:dmsetup: Using write policy async > automatically. > May 17 22:16:55 ha09b kernel: kvdo0:dmsetup: loading device 'vdo0' > May 17 22:16:55 ha09b kernel: kvdo0:dmsetup: zones: 1 logical, 1 physical, 1 > hash; base threads: 5 > May 17 22:16:55 ha09b kernel: kvdo0:dmsetup: starting device 'vdo0' > May 17 22:16:55 ha09b kernel: kvdo0:journalQ: VDO commencing normal operation > May 17 22:16:55 ha09b kernel: kvdo0:dmsetup: Setting UDS index target state > to online > May 17 22:16:55 ha09b kernel: kvdo0:dmsetup: device 'vdo0' started > May 17 22:16:55 ha09b kernel: kvdo0:dmsetup: resuming device 'vdo0' > May 17 22:16:55 ha09b kernel: kvdo0:dmsetup: device 'vdo0' resumed > May 17 22:16:55 ha09b kernel: uds: kvdo0:dedupeQ: loading or rebuilding > index: dev=/dev/drbd0 offset=4096 size=2781704192 > May 17 22:16:55 ha09b kernel: uds: kvdo0:dedupeQ: Using 6 indexing zones for > concurrency. > May 17 22:16:55 ha09b kernel: kvdo0:packerQ: compression is enabled > May 17 22:16:55 ha09b systemd[1]: Started Device-mapper event daemon. > May 17 22:16:55 ha09b dmeventd[3931]: dmeventd ready for processing. 
> May 17 22:16:55 ha09b UDS/vdodmeventd[3930]: INFO (vdodmeventd/3930) VDO > device vdo0 is now registered with dmeventd for monitoring > May 17 22:16:55 ha09b lvm[3931]: Monitoring VDO pool vdo0. > May 17 22:16:56 ha09b kernel: uds: kvdo0:dedupeQ: loaded index from chapter 0 > through chapter 85 > May 17 22:16:56 ha09b pacemaker-controld[2710]: notice: Result of start > operation for p_vdo0 on ha09b: ok > May 17 22:16:57 ha09b pacemaker-controld[2710]: notice: Result of monitor > operation for p_vdo0 on ha09b: ok > > > > > -----Original Message----- > > From: Users <users-boun...@clusterlabs.org> On Behalf Of Eric Robinson > > Sent: Monday, May 17, 2021 9:49 PM > > To: Cluster Labs - All topics related to open-source clustering welcomed > > <users@clusterlabs.org> > > Subject: Re: [ClusterLabs] DRBD + VDO HowTo? > > > > Notice in that 'pcs status' shows errors for resource p_vdo0 on node ha09b, > > even after doing 'pcs resource cleanup p_vdo0'. > > > > [root@ha09a ~]# pcs status > > Cluster name: ha09ab > > Cluster Summary: > > * Stack: corosync > > * Current DC: ha09a (version 2.0.4-6.el8_3.2-2deceaa3ae) - partition with > > quorum > > * Last updated: Mon May 17 19:45:41 2021 > > * Last change: Mon May 17 19:45:37 2021 by hacluster via crmd on ha09b > > * 2 nodes configured > > * 6 resource instances configured > > > > Node List: > > * Online: [ ha09a ha09b ] > > > > Full List of Resources: > > * Clone Set: p_drbd0-clone [p_drbd0] (promotable): > > * Masters: [ ha09a ] > > * Slaves: [ ha09b ] > > * Clone Set: p_drbd1-clone [p_drbd1] (promotable): > > * Masters: [ ha09b ] > > * Slaves: [ ha09a ] > > * p_vdo0 (lsb:vdo0): Starting ha09a > > * p_vdo1 (lsb:vdo1): Started ha09b > > > > Failed Resource Actions: > > * p_vdo0_monitor_0 on ha09b 'error' (1): call=83, status='complete', > > exitreason='', last-rc-change='2021-05-17 19:45:38 -07:00', queued=0ms, > > exec=175ms > > > > Daemon Status: > > corosync: active/disabled > > pacemaker: active/disabled > > pcsd: active/enabled > > > > > > If I debug the monitor action on ha09b, it reports 'not installed,' which > > makes > > sense because the drbd disk is in standby. > > > > [root@ha09b drbd.d]# pcs resource debug-monitor p_vdo0 Operation > > monitor for p_vdo0 (lsb::vdo0) returned: 'not installed' (5) > stdout: > > error > > occurred checking vdo0 status on ha09b > > > > Should it report something else? > > > > > -----Original Message----- > > > From: Users <users-boun...@clusterlabs.org> On Behalf Of Eric Robinson > > > Sent: Monday, May 17, 2021 1:37 PM > > > To: Cluster Labs - All topics related to open-source clustering > > > welcomed <users@clusterlabs.org> > > > Subject: Re: [ClusterLabs] DRBD + VDO HowTo? > > > > > > Andrei -- > > > > > > To follow up, here is the Pacemaker config. Let's not talk about > > > fencing or quorum right now. I want to focus on the vdo issue at hand. 
> > > > > > [root@ha09a ~]# pcs config > > > Cluster Name: ha09ab > > > Corosync Nodes: > > > ha09a ha09b > > > Pacemaker Nodes: > > > ha09a ha09b > > > > > > Resources: > > > Clone: p_drbd0-clone > > > Meta Attrs: clone-max=2 clone-node-max=1 notify=true promotable=true > > > promoted-max=1 promoted-node-max=1 > > > Resource: p_drbd0 (class=ocf provider=linbit type=drbd) > > > Attributes: drbd_resource=ha01_mysql > > > Operations: demote interval=0s timeout=90 (p_drbd0-demote-interval- > > 0s) > > > monitor interval=60s (p_drbd0-monitor-interval-60s) > > > notify interval=0s timeout=90 (p_drbd0-notify-interval-0s) > > > promote interval=0s timeout=90 > > > (p_drbd0-promote-interval-0s) > > > reload interval=0s timeout=30 (p_drbd0-reload-interval-0s) > > > start interval=0s timeout=240 (p_drbd0-start-interval-0s) > > > stop interval=0s timeout=100 (p_drbd0-stop-interval-0s) > > > Clone: p_drbd1-clone > > > Meta Attrs: clone-max=2 clone-node-max=1 notify=true promotable=true > > > promoted-max=1 promoted-node-max=1 > > > Resource: p_drbd1 (class=ocf provider=linbit type=drbd) > > > Attributes: drbd_resource=ha02_mysql > > > Operations: demote interval=0s timeout=90 (p_drbd1-demote-interval- > > 0s) > > > monitor interval=60s (p_drbd1-monitor-interval-60s) > > > notify interval=0s timeout=90 (p_drbd1-notify-interval-0s) > > > promote interval=0s timeout=90 > > > (p_drbd1-promote-interval-0s) > > > reload interval=0s timeout=30 (p_drbd1-reload-interval-0s) > > > start interval=0s timeout=240 (p_drbd1-start-interval-0s) > > > stop interval=0s timeout=100 (p_drbd1-stop-interval-0s) > > > Resource: p_vdo0 (class=lsb type=vdo0) > > > Operations: force-reload interval=0s timeout=15 > > > (p_vdo0-force-reload- > > > interval-0s) > > > monitor interval=15 timeout=15 (p_vdo0-monitor-interval-15) > > > restart interval=0s timeout=15 (p_vdo0-restart-interval-0s) > > > start interval=0s timeout=15 (p_vdo0-start-interval-0s) > > > stop interval=0s timeout=15 (p_vdo0-stop-interval-0s) > > > Resource: p_vdo1 (class=lsb type=vdo1) > > > Operations: force-reload interval=0s timeout=15 > > > (p_vdo1-force-reload- > > > interval-0s) > > > monitor interval=15 timeout=15 (p_vdo1-monitor-interval-15) > > > restart interval=0s timeout=15 (p_vdo1-restart-interval-0s) > > > start interval=0s timeout=15 (p_vdo1-start-interval-0s) > > > stop interval=0s timeout=15 (p_vdo1-stop-interval-0s) > > > > > > Stonith Devices: > > > Fencing Levels: > > > > > > Location Constraints: > > > Ordering Constraints: > > > promote p_drbd0-clone then start p_vdo0 (kind:Mandatory) (id:order- > > > p_drbd0-clone-p_vdo0-mandatory) > > > promote p_drbd1-clone then start p_vdo1 (kind:Mandatory) (id:order- > > > p_drbd1-clone-p_vdo1-mandatory) > > > Colocation Constraints: > > > p_vdo0 with p_drbd0-clone (score:INFINITY) (id:colocation-p_vdo0- > > > p_drbd0-clone-INFINITY) > > > p_vdo1 with p_drbd1-clone (score:INFINITY) (id:colocation-p_vdo1- > > > p_drbd1-clone-INFINITY) > > > Ticket Constraints: > > > > > > Alerts: > > > No alerts defined > > > > > > Resources Defaults: > > > Meta Attrs: rsc_defaults-meta_attributes > > > resource-stickiness=100 > > > Operations Defaults: > > > Meta Attrs: op_defaults-meta_attributes > > > timeout=30s > > > > > > Cluster Properties: > > > cluster-infrastructure: corosync > > > cluster-name: ha09ab > > > dc-version: 2.0.4-6.el8_3.2-2deceaa3ae > > > have-watchdog: false > > > last-lrm-refresh: 1621198059 > > > maintenance-mode: false > > > no-quorum-policy: ignore > > > stonith-enabled: false > > > > > > 
Tags: > > > No tags defined > > > > > > Quorum: > > > Options: > > > > > > Here is the cluster status. Right now, node ha09a is primary for both > > > drbd disks. > > > > > > [root@ha09a ~]# pcs status > > > Cluster name: ha09ab > > > Cluster Summary: > > > * Stack: corosync > > > * Current DC: ha09a (version 2.0.4-6.el8_3.2-2deceaa3ae) - partition > > > with quorum > > > * Last updated: Mon May 17 11:35:34 2021 > > > * Last change: Mon May 17 11:34:24 2021 by hacluster via crmd on ha09a > > > * 2 nodes configured > > > * 6 resource instances configured (2 BLOCKED from further action due > > > to > > > failure) > > > > > > Node List: > > > * Online: [ ha09a ha09b ] > > > > > > Full List of Resources: > > > * Clone Set: p_drbd0-clone [p_drbd0] (promotable): > > > * Masters: [ ha09a ] > > > * Slaves: [ ha09b ] > > > * Clone Set: p_drbd1-clone [p_drbd1] (promotable): > > > * Masters: [ ha09a ] > > > * Slaves: [ ha09b ] > > > * p_vdo0 (lsb:vdo0): FAILED ha09a (blocked) > > > * p_vdo1 (lsb:vdo1): FAILED ha09a (blocked) > > > > > > Failed Resource Actions: > > > * p_vdo1_stop_0 on ha09a 'error' (1): call=21, status='Timed Out', > > > exitreason='', last-rc-change='2021-05-17 11:29:09 -07:00', > > > queued=0ms, exec=15001ms > > > * p_vdo0_stop_0 on ha09a 'error' (1): call=27, status='Timed Out', > > > exitreason='', last-rc-change='2021-05-17 11:34:26 -07:00', > > > queued=0ms, exec=15001ms > > > * p_vdo1_monitor_0 on ha09b 'error' (1): call=21, status='complete', > > > exitreason='', last-rc-change='2021-05-17 11:29:08 -07:00', > > > queued=0ms, exec=217ms > > > * p_vdo0_monitor_0 on ha09b 'error' (1): call=28, status='complete', > > > exitreason='', last-rc-change='2021-05-17 11:34:25 -07:00', > > > queued=0ms, exec=182ms > > > > > > Daemon Status: > > > corosync: active/disabled > > > pacemaker: active/disabled > > > pcsd: active/enabled > > > > > > The vdo devices are available... > > > > > > [root@ha09a ~]# vdo list > > > vdo0 > > > vdo1 > > > > > > > > > > -----Original Message----- > > > > From: Users <users-boun...@clusterlabs.org> On Behalf Of Eric > > > > Robinson > > > > Sent: Monday, May 17, 2021 1:28 PM > > > > To: Cluster Labs - All topics related to open-source clustering > > > > welcomed <users@clusterlabs.org> > > > > Subject: Re: [ClusterLabs] DRBD + VDO HowTo? > > > > > > > > Andrei -- > > > > > > > > Sorry for the novels. Sometimes it is hard to tell whether people > > > > want all the configs, logs, and scripts first, or if they want a > > > > description of the problem and what one is trying to accomplish first. > > > > I'll send whatever you want. I am very eager to get to the bottom of > > > > this. > > > > > > > > I'll start with my custom LSB RA. I can send the Pacemaker config a bit > > later. > > > > > > > > [root@ha09a init.d]# ll|grep vdo > > > > lrwxrwxrwx. 1 root root 9 May 16 10:28 vdo0 -> vdo_multi > > > > lrwxrwxrwx. 1 root root 9 May 16 10:28 vdo1 -> vdo_multi > > > > -rwx------. 1 root root 3623 May 16 13:21 vdo_multi > > > > > > > > [root@ha09a init.d]# cat vdo_multi > > > > #!/bin/bash > > > > > > > > #--custom script for managing vdo volumes > > > > > > > > #--functions > > > > function isActivated() { > > > > R=$(/usr/bin/vdo status -n $VOL 2>&1) > > > > if [ $? 
-ne 0 ]; then > > > > #--error occurred checking vdo status > > > > echo "$VOL: an error occurred checking activation > > > > status on $MY_HOSTNAME" > > > > return 1 > > > > fi > > > > R=$(/usr/bin/vdo status -n $VOL|grep Activate|awk > > > > '{$1=$1};1'|cut - > > d" > > > " > > > > -f2) > > > > echo "$R" > > > > return 0 > > > > } > > > > > > > > function isOnline() { > > > > R=$(/usr/bin/vdo status -n $VOL 2>&1) > > > > if [ $? -ne 0 ]; then > > > > #--error occurred checking vdo status > > > > echo "$VOL: an error occurred checking activation > > > > status on $MY_HOSTNAME" > > > > return 1 > > > > fi > > > > R=$(/usr/bin/vdo status -n $VOL|grep "Index status"|awk > > > > '{$1=$1};1'|cut -d" " -f3) > > > > echo "$R" > > > > return 0 > > > > } > > > > > > > > #--vars > > > > MY_HOSTNAME=$(hostname -s) > > > > > > > > #--get the volume name > > > > VOL=$(basename $0) > > > > > > > > #--get the action > > > > ACTION=$1 > > > > > > > > #--take the requested action > > > > case $ACTION in > > > > > > > > start) > > > > > > > > #--check current status > > > > R=$(isOnline "$VOL") > > > > if [ $? -ne 0 ]; then > > > > echo "error occurred checking $VOL status on > > > $MY_HOSTNAME" > > > > exit 0 > > > > fi > > > > if [ "$R" == "online" ]; then > > > > echo "running on $MY_HOSTNAME" > > > > exit 0 #--lsb: success > > > > fi > > > > > > > > #--enter activation loop > > > > ACTIVATED=no > > > > TIMER=15 > > > > while [ $TIMER -ge 0 ]; do > > > > R=$(isActivated "$VOL") > > > > if [ "$R" == "enabled" ]; then > > > > ACTIVATED=yes > > > > break > > > > fi > > > > sleep 1 > > > > TIMER=$(( TIMER-1 )) > > > > done > > > > if [ "$ACTIVATED" == "no" ]; then > > > > echo "$VOL: not activated on $MY_HOSTNAME" > > > > exit 5 #--lsb: not running > > > > fi > > > > > > > > #--enter start loop > > > > /usr/bin/vdo start -n $VOL > > > > ONLINE=no > > > > TIMER=15 > > > > while [ $TIMER -ge 0 ]; do > > > > R=$(isOnline "$VOL") > > > > if [ "$R" == "online" ]; then > > > > ONLINE=yes > > > > break > > > > fi > > > > sleep 1 > > > > TIMER=$(( TIMER-1 )) > > > > done > > > > if [ "$ONLINE" == "yes" ]; then > > > > echo "$VOL: started on $MY_HOSTNAME" > > > > exit 0 #--lsb: success > > > > else > > > > echo "$VOL: not started on $MY_HOSTNAME > > > > (unknown problem)" > > > > exit 0 #--lsb: unknown problem > > > > fi > > > > ;; > > > > stop) > > > > > > > > #--check current status > > > > R=$(isOnline "$VOL") > > > > if [ $? -ne 0 ]; then > > > > echo "error occurred checking $VOL status on > > > $MY_HOSTNAME" > > > > exit 0 > > > > fi > > > > > > > > if [ "$R" == "not" ]; then > > > > echo "not started on $MY_HOSTNAME" > > > > exit 0 #--lsb: success > > > > fi > > > > > > > > #--enter stop loop > > > > /usr/bin/vdo stop -n $VOL > > > > ONLINE=yes > > > > TIMER=15 > > > > while [ $TIMER -ge 0 ]; do > > > > R=$(isOnline "$VOL") > > > > if [ "$R" == "not" ]; then > > > > ONLINE=no > > > > break > > > > fi > > > > sleep 1 > > > > TIMER=$(( TIMER-1 )) > > > > done > > > > if [ "$ONLINE" == "no" ]; then > > > > echo "$VOL: stopped on $MY_HOSTNAME" > > > > exit 0 #--lsb:success > > > > else > > > > echo "$VOL: failed to stop on $MY_HOSTNAME > > > > (unknown problem)" > > > > exit 0 > > > > fi > > > > ;; > > > > status) > > > > R=$(isOnline "$VOL") > > > > if [ $? 
-ne 0 ]; then > > > > echo "error occurred checking $VOL status on > > > $MY_HOSTNAME" > > > > exit 5 > > > > fi > > > > if [ "$R" == "online" ]; then > > > > echo "$VOL started on $MY_HOSTNAME" > > > > exit 0 #--lsb: success > > > > else > > > > echo "$VOL not started on $MY_HOSTNAME" > > > > exit 3 #--lsb: not running > > > > fi > > > > ;; > > > > > > > > esac > > > > > > > > > > > > > > > > > -----Original Message----- > > > > > From: Users <users-boun...@clusterlabs.org> On Behalf Of Andrei > > > > > Borzenkov > > > > > Sent: Monday, May 17, 2021 12:49 PM > > > > > To: users@clusterlabs.org > > > > > Subject: Re: [ClusterLabs] DRBD + VDO HowTo? > > > > > > > > > > On 17.05.2021 18:18, Eric Robinson wrote: > > > > > > To Strahil and Klaus – > > > > > > > > > > > > I created the vdo devices using default parameters, so ‘auto’ > > > > > > mode was > > > > > selected by default. vdostatus shows that the current mode is async. > > > > > The underlying drbd devices are running protocol C, so I assume > > > > > that vdo should be changed to sync mode? > > > > > > > > > > > > The VDO service is disabled and is solely under the control of > > > > > > Pacemaker, > > > > > but I have been unable to get a resource agent to work reliably. I > > > > > have two nodes. Under normal operation, Node A is primary for disk > > > > > drbd0, and device > > > > > vdo0 rides on top of that. Node B is primary for disk drbd1 and > > > > > device > > > > > vdo1 rides on top of that. In the event of a node failure, the vdo > > > > > device and the underlying drbd disk should migrate to the other > > > > > node, and then that node will be primary for both drbd disks and > > > > > both vdo > > > > devices. > > > > > > > > > > > > The default systemd vdo service does not work because it uses > > > > > > the –all flag > > > > > and starts/stops all vdo devices. I noticed that there is also a > > > > > vdo-start-by- dev.service, but there is no documentation on how to > > > > > use it. I wrote my own vdo-by-dev system service, but that did not > > > > > work reliably either. Then I noticed that there is already an OCF > > > > > resource agent named vdo-vol, but that did not work either. I > > > > > finally tried writing my own OCF-compliant RA, and then I tried > > > > > writing an LSB-compliant script, but none of those worked very well. > > > > > > > > > > > > > > > > You continue to write novels instead of simply showing your > > > > > resource agent, your configuration and logs. > > > > > > > > > > > My big problem is that I don’t understand how Pacemaker uses the > > > > > monitor action. Pacemaker would often fail vdo resources because > > > > > the monitor action received an error when it ran on the standby node. > > > > > For example, when Node A is primary for disk drbd1 and device > > > > > vdo1, Pacemaker would fail device vdo1 because when it ran the > > > > > monitor action on Node B, the RA reported an error. But OF COURSE > > > > > it would report an error, because disk drbd1 is secondary on that > > > > > node, and is therefore inaccessible to the vdo driver. I DON’T > > UNDERSTAND. > > > > > > > > > > > > > > > > May be your definition of "error" does not match pacemaker > > > > > definition of "error". It is hard to comment without seeing code. 
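To the "Should it report something else?" question earlier in the thread: yes. For an LSB script Pacemaker expects status to exit 0 when the service is running and 3 when it is not; other codes (such as the 5 returned here, shown as 'not installed') are treated as hard failures and keep the resource off that node. A rough sketch of a friendlier status handler for this script, assuming that a failing "vdo status -n" on the standby node simply means the volume is not started there:

status)
        # If vdo cannot query the volume at all (e.g. the backing DRBD
        # device is Secondary on this node), report "not running" (LSB 3)
        # instead of a hard error, so probes on the standby node stay clean.
        if ! /usr/bin/vdo status -n "$VOL" >/dev/null 2>&1; then
                echo "$VOL not started on $MY_HOSTNAME"
                exit 3   #--lsb: not running
        fi
        R=$(isOnline "$VOL")
        if [ "$R" == "online" ]; then
                echo "$VOL started on $MY_HOSTNAME"
                exit 0   #--lsb: running
        else
                echo "$VOL not started on $MY_HOSTNAME"
                exit 3   #--lsb: not running
        fi
        ;;

An OCF agent would give you cleaner control over return codes than LSB, but the above keeps your current approach.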
> > > > > > > > > > > -Eric > > > > > > > > > > > > > > > > > > > > > > > > From: Strahil Nikolov <hunter86...@yahoo.com> > > > > > > Sent: Monday, May 17, 2021 5:09 AM > > > > > > To: kwenn...@redhat.com; Klaus Wenninger > > > <kwenn...@redhat.com>; > > > > > > Cluster Labs - All topics related to open-source clustering > > > > > > welcomed <users@clusterlabs.org>; Eric Robinson > > > > > > <eric.robin...@psmnv.com> > > > > > > Subject: Re: [ClusterLabs] DRBD + VDO HowTo? > > > > > > > > > > > > Have you tried to set VDO in async mode ? > > > > > > > > > > > > Best Regards, > > > > > > Strahil Nikolov > > > > > > On Mon, May 17, 2021 at 8:57, Klaus Wenninger > > > > > > <kwenn...@redhat.com<mailto:kwenn...@redhat.com>> wrote: > > > > > > Did you try VDO in sync-mode for the case the flush-fua stuff > > > > > > isn't working through the layers? > > > > > > Did you check that VDO-service is disabled and solely under > > > > > > pacemaker-control and that the dependencies are set correctly? > > > > > > > > > > > > Klaus > > > > > > > > > > > > On 5/17/21 6:17 AM, Eric Robinson wrote: > > > > > > > > > > > > Yes, DRBD is working fine. > > > > > > > > > > > > > > > > > > > > > > > > From: Strahil Nikolov > > > > > > <hunter86...@yahoo.com><mailto:hunter86...@yahoo.com> > > > > > > Sent: Sunday, May 16, 2021 6:06 PM > > > > > > To: Eric Robinson > > > > > > <eric.robin...@psmnv.com><mailto:eric.robin...@psmnv.com>; > > > > Cluster > > > > > > Labs - All topics related to open-source clustering welcomed > > > > > > <users@clusterlabs.org><mailto:users@clusterlabs.org> > > > > > > Subject: RE: [ClusterLabs] DRBD + VDO HowTo? > > > > > > > > > > > > > > > > > > > > > > > > Are you sure that the DRBD is working properly ? > > > > > > > > > > > > > > > > > > > > > > > > Best Regards, > > > > > > > > > > > > Strahil Nikolov > > > > > > > > > > > > On Mon, May 17, 2021 at 0:32, Eric Robinson > > > > > > > > > > > > <eric.robin...@psmnv.com<mailto:eric.robin...@psmnv.com>> > > > wrote: > > > > > > > > > > > > Okay, it turns out I was wrong. I thought I had it working, but > > > > > > I keep running > > > > > into problems. Sometimes when I demote a DRBD resource on Node A > > > and > > > > > promote it on Node B, and I try to mount the filesystem, the > > > > > system complains that it cannot read the superblock. But when I > > > > > move the DRBD primary back to Node A, the file system is mountable > > again. > > > > > Also, I have problems with filesystems not mounting because the > > > > > vdo devices are not present. All kinds of issues. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From: Users > > > > > > <users-boun...@clusterlabs.org<mailto:users- > > > > boun...@clusterlabs.org> > > > > > > > > > > > > > On Behalf Of Eric Robinson > > > > > > Sent: Friday, May 14, 2021 3:55 PM > > > > > > To: Strahil Nikolov > > > > > > <hunter86...@yahoo.com<mailto:hunter86...@yahoo.com>>; > > > Cluster > > > > > Labs - > > > > > > All topics related to open-source clustering welcomed > > > > > > <users@clusterlabs.org<mailto:users@clusterlabs.org>> > > > > > > Subject: Re: [ClusterLabs] DRBD + VDO HowTo? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Okay, I have it working now. The default systemd service > > > > > > definitions did > > > > > not work, so I created my own. 
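On the write policy question above: it may be worth pinning the policy explicitly rather than leaving it on auto while you test. A hedged example with the vdo manager CLI (vdo0 is one of the volumes from this thread; check the current policy first with "vdo status --name=vdo0"):

vdo changeWritePolicy --name=vdo0 --writePolicy=sync

Whether sync or async is the right choice depends on whether the flush/FUA path through DRBD can be trusted, as Klaus pointed out; the command above only shows how to change it.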
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From: Strahil Nikolov > > > > > > <hunter86...@yahoo.com<mailto:hunter86...@yahoo.com>> > > > > > > Sent: Friday, May 14, 2021 3:41 AM > > > > > > To: Eric Robinson > > > > > > <eric.robin...@psmnv.com<mailto:eric.robin...@psmnv.com>>; > > > > Cluster > > > > > > Labs - All topics related to open-source clustering welcomed > > > > > > <users@clusterlabs.org<mailto:users@clusterlabs.org>> > > > > > > Subject: RE: [ClusterLabs] DRBD + VDO HowTo? > > > > > > > > > > > > > > > > > > > > > > > > There is no VDO RA according to my knowledge, but you can use > > > > > > systemd > > > > > service as a resource. > > > > > > > > > > > > > > > > > > > > > > > > Yet, the VDO service that comes with thr OS is a generic one and > > > > > > controlls > > > > > all VDOs - so you need to create your own vdo service. > > > > > > > > > > > > > > > > > > > > > > > > Best Regards, > > > > > > > > > > > > Strahil Nikolov > > > > > > > > > > > > On Fri, May 14, 2021 at 6:55, Eric Robinson > > > > > > > > > > > > <eric.robin...@psmnv.com<mailto:eric.robin...@psmnv.com>> > > > wrote: > > > > > > > > > > > > I created the VDO volumes fine on the drbd devices, formatted > > > > > > them as xfs > > > > > filesystems, created cluster filesystem resources, and the cluster > > > > > us using them. But the cluster won’t fail over. Is there a VDO > > > > > cluster RA out there somewhere already? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From: Strahil Nikolov > > > > > > <hunter86...@yahoo.com<mailto:hunter86...@yahoo.com>> > > > > > > Sent: Thursday, May 13, 2021 10:07 PM > > > > > > To: Cluster Labs - All topics related to open-source clustering > > > > > > welcomed <users@clusterlabs.org<mailto:users@clusterlabs.org>>; > > > > > > Eric Robinson > > > > > <eric.robin...@psmnv.com<mailto:eric.robin...@psmnv.com>> > > > > > > Subject: Re: [ClusterLabs] DRBD + VDO HowTo? > > > > > > > > > > > > > > > > > > > > > > > > For DRBD there is enough info, so let's focus on VDO. > > > > > > > > > > > > There is a systemd service that starts all VDOs on the system. > > > > > > You can > > > > > create the VDO once drbs is open for writes and then you can > > > > > create your own systemd '.service' file which can be used as a cluster > > resource. > > > > > > > > > > > > > > > > > > Best Regards, > > > > > > > > > > > > Strahil Nikolov > > > > > > > > > > > > > > > > > > > > > > > > On Fri, May 14, 2021 at 2:33, Eric Robinson > > > > > > > > > > > > <eric.robin...@psmnv.com<mailto:eric.robin...@psmnv.com>> > > > wrote: > > > > > > > > > > > > Can anyone point to a document on how to use VDO de-duplication > > > > > > with > > > > > DRBD? Linbit has a blog page about it, but it was last updated 6 > > > > > years ago and the embedded links are dead. > > > > > > > > > > > > > > > > > > > > > > > > https://linbit.com/blog/albireo-virtual-data-optimizer-vdo-on-dr > > > > > > bd > > > > > > / > > > > > > > > > > > > > > > > > > > > > > > > -Eric > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Disclaimer : This email and any files transmitted with it are > > > > > > confidential and > > > > > intended solely for intended recipients. If you are not the named > > > > > addressee you should not disseminate, distribute, copy or alter > > > > > this email. 
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/