Re: [ClusterLabs] Major problem with iSCSITarget resource on top of DRBD M/S resource.
On 27/09/15 11:02 AM, Alex Crow wrote:
> On 27/09/15 15:54, Digimer wrote:
>> On 27/09/15 10:40 AM, Alex Crow wrote:
>>> Hi List,
>>>
>>> I'm trying to set up a failover iSCSI storage system for oVirt using a
>>> self-hosted engine. I've set up DRBD in Master-Slave for two iSCSI
>>> targets, one for the self-hosted engine and one for the VMs. I had this
>>> all working perfectly, then after trying to move the engine's LUN to the
>>> opposite host, all hell broke loose. The VMs LUN is still fine, starts
>>
>> I'm guessing no fencing?
>
> Hi Digimer,
>
> No, but I've tried turning off one machine and still no success as a
> single node :-(

You *must* have working fencing anyway, so now strikes me as a fantastic
time to add it. Turning off the node alone doesn't help.

>>> and migrates as it should. However, the engine LUN always seems to try to
>>> launch the target on the host that is *NOT* the master of the DRBD
>>> resource. My constraints look fine, and should be self-explanatory about
>>> which is which:
>>>
>>> [root@granby ~]# pcs constraint --full
>>> Location Constraints:
>>> Ordering Constraints:
>>>   promote drbd-vms-iscsi then start iscsi-vms-ip (kind:Mandatory)
>>>     (id:vm_iscsi_ip_after_drbd)
>>>   start iscsi-vms-target then start iscsi-vms-lun (kind:Mandatory)
>>>     (id:vms_lun_after_target)
>>>   promote drbd-vms-iscsi then start iscsi-vms-target (kind:Mandatory)
>>>     (id:vms_target_after_drbd)
>>>   promote drbd-engine-iscsi then start iscsi-engine-ip (kind:Mandatory)
>>>     (id:ip_after_drbd)
>>>   start iscsi-engine-target then start iscsi-engine-lun (kind:Mandatory)
>>>     (id:lun_after_target)
>>>   promote drbd-engine-iscsi then start iscsi-engine-target
>>>     (kind:Mandatory) (id:target_after_drbd)
>>> Colocation Constraints:
>>>   iscsi-vms-ip with drbd-vms-iscsi (score:INFINITY) (rsc-role:Started)
>>>     (with-rsc-role:Master) (id:vms_ip-with-drbd)
>>>   iscsi-vms-lun with drbd-vms-iscsi (score:INFINITY) (rsc-role:Started)
>>>     (with-rsc-role:Master) (id:vms_lun-with-drbd)
>>>   iscsi-vms-target with drbd-vms-iscsi (score:INFINITY)
>>>     (rsc-role:Started) (with-rsc-role:Master) (id:vms_target-with-drbd)
>>>   iscsi-engine-ip with drbd-engine-iscsi (score:INFINITY)
>>>     (rsc-role:Started) (with-rsc-role:Master) (id:ip-with-drbd)
>>>   iscsi-engine-lun with drbd-engine-iscsi (score:INFINITY)
>>>     (rsc-role:Started) (with-rsc-role:Master) (id:lun-with-drbd)
>>>   iscsi-engine-target with drbd-engine-iscsi (score:INFINITY)
>>>     (rsc-role:Started) (with-rsc-role:Master) (id:target-with-drbd)
>>>
>>> But see this from pcs status: the iSCSI target has FAILED on glenrock,
>>> but the DRBD master is on granby!:
>>>
>>> [root@granby ~]# pcs status
>>> Cluster name: storage
>>> Last updated: Sun Sep 27 15:30:08 2015
>>> Last change: Sun Sep 27 15:20:58 2015
>>> Stack: cman
>>> Current DC: glenrock - partition with quorum
>>> Version: 1.1.11-97629de
>>> 2 Nodes configured
>>> 10 Resources configured
>>>
>>> Online: [ glenrock granby ]
>>>
>>> Full list of resources:
>>>
>>>  Master/Slave Set: drbd-vms-iscsi [drbd-vms]
>>>      Masters: [ glenrock ]
>>>      Slaves: [ granby ]
>>>  iscsi-vms-target    (ocf::heartbeat:iSCSITarget):      Started glenrock
>>>  iscsi-vms-lun       (ocf::heartbeat:iSCSILogicalUnit): Started glenrock
>>>  iscsi-vms-ip        (ocf::heartbeat:IPaddr2):          Started glenrock
>>>  Master/Slave Set: drbd-engine-iscsi [drbd-engine]
>>>      Masters: [ granby ]
>>>      Slaves: [ glenrock ]
>>>  iscsi-engine-target (ocf::heartbeat:iSCSITarget):      FAILED glenrock (unmanaged)
>>>  iscsi-engine-ip     (ocf::heartbeat:IPaddr2):          Stopped
>>>  iscsi-engine-lun    (ocf::heartbeat:iSCSILogicalUnit): Stopped
>>>
>>> Failed actions:
>>>     iscsi-engine-target_stop_0 on glenrock 'unknown error' (1):
>>>       call=177, status=Timed Out, last-rc-change='Sun Sep 27 15:20:59 2015',
>>>       queued=0ms, exec=10003ms
>>>
>>> I have tried various combinations of pcs resource clear and cleanup, but
>>> they all result in the same outcome - apart from on some occasions when
>>> one or other of the two hosts suddenly reboots!
>>>
>>> Here is a log right after a "pcs resource cleanup" - first on the master
>>> for the DRBD m/s resource:
>>> [root@granby ~]# pcs resource cleanup; tail -f /var/log/messages
>>> All resources/stonith devices successfully cleaned up
>>> Sep 27 15:33:42 granby crmd[3358]: notice: process_lrm_event:
>>>   granby-drbd-engine_monitor_0:117 [ \n ]
>>> Sep 27 15:33:42 granby attrd[3356]: notice: attrd_trigger_update:
>>>   Sending flush op to all hosts for: probe_complete (true)
>>> Sep 27 15:33:42 granby attrd[3356]: notice: attrd_perform_update: Sent
>>>   update 54: probe_complete=true
>>> Sep 27 15:33:42 granby crmd[3358]: notice: process_lrm_event:
>>>   Operation drbd-engine_monitor_1:
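Digimer's fencing point can be made concrete. On a cman/Pacemaker stack managed with pcs, stonith devices are created as resources; below is a minimal sketch using the IPMI fence agent. The device names, BMC addresses, and credentials are all placeholders (nothing in the thread specifies them), so treat this as an illustration of the shape of the commands, not a drop-in config:

```shell
# Hypothetical sketch: one IPMI fence device per node.
# BMC IPs and credentials below are placeholders, not from the thread.
pcs stonith create fence_granby fence_ipmilan \
    pcmk_host_list=granby ipaddr=192.168.10.1 \
    login=admin passwd=secret op monitor interval=60s
pcs stonith create fence_glenrock fence_ipmilan \
    pcmk_host_list=glenrock ipaddr=192.168.10.2 \
    login=admin passwd=secret op monitor interval=60s
# Prefer running each fence device away from the node it fences:
pcs constraint location fence_granby avoids granby
pcs constraint location fence_glenrock avoids glenrock
pcs property set stonith-enabled=true
```

With stonith enabled, a failed stop (like the timed-out iscsi-engine-target_stop_0 above) gets the node fenced and the resource recovered elsewhere, instead of leaving it FAILED (unmanaged).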
Re: [ClusterLabs] Major problem with iSCSITarget resource on top of DRBD M/S resource.
On 27/09/15 10:40 AM, Alex Crow wrote:
> Hi List,
>
> I'm trying to set up a failover iSCSI storage system for oVirt using a
> self-hosted engine. I've set up DRBD in Master-Slave for two iSCSI
> targets, one for the self-hosted engine and one for the VMs. I had this
> all working perfectly, then after trying to move the engine's LUN to the
> opposite host, all hell broke loose. The VMs LUN is still fine, starts

I'm guessing no fencing?

> and migrates as it should. However, the engine LUN always seems to try to
> launch the target on the host that is *NOT* the master of the DRBD
> resource. My constraints look fine, and should be self-explanatory about
> which is which:
>
> [root@granby ~]# pcs constraint --full
> Location Constraints:
> Ordering Constraints:
>   promote drbd-vms-iscsi then start iscsi-vms-ip (kind:Mandatory)
>     (id:vm_iscsi_ip_after_drbd)
>   start iscsi-vms-target then start iscsi-vms-lun (kind:Mandatory)
>     (id:vms_lun_after_target)
>   promote drbd-vms-iscsi then start iscsi-vms-target (kind:Mandatory)
>     (id:vms_target_after_drbd)
>   promote drbd-engine-iscsi then start iscsi-engine-ip (kind:Mandatory)
>     (id:ip_after_drbd)
>   start iscsi-engine-target then start iscsi-engine-lun (kind:Mandatory)
>     (id:lun_after_target)
>   promote drbd-engine-iscsi then start iscsi-engine-target
>     (kind:Mandatory) (id:target_after_drbd)
> Colocation Constraints:
>   iscsi-vms-ip with drbd-vms-iscsi (score:INFINITY) (rsc-role:Started)
>     (with-rsc-role:Master) (id:vms_ip-with-drbd)
>   iscsi-vms-lun with drbd-vms-iscsi (score:INFINITY) (rsc-role:Started)
>     (with-rsc-role:Master) (id:vms_lun-with-drbd)
>   iscsi-vms-target with drbd-vms-iscsi (score:INFINITY)
>     (rsc-role:Started) (with-rsc-role:Master) (id:vms_target-with-drbd)
>   iscsi-engine-ip with drbd-engine-iscsi (score:INFINITY)
>     (rsc-role:Started) (with-rsc-role:Master) (id:ip-with-drbd)
>   iscsi-engine-lun with drbd-engine-iscsi (score:INFINITY)
>     (rsc-role:Started) (with-rsc-role:Master) (id:lun-with-drbd)
>   iscsi-engine-target with drbd-engine-iscsi (score:INFINITY)
>     (rsc-role:Started) (with-rsc-role:Master) (id:target-with-drbd)
>
> But see this from pcs status: the iSCSI target has FAILED on glenrock,
> but the DRBD master is on granby!:
>
> [root@granby ~]# pcs status
> Cluster name: storage
> Last updated: Sun Sep 27 15:30:08 2015
> Last change: Sun Sep 27 15:20:58 2015
> Stack: cman
> Current DC: glenrock - partition with quorum
> Version: 1.1.11-97629de
> 2 Nodes configured
> 10 Resources configured
>
> Online: [ glenrock granby ]
>
> Full list of resources:
>
>  Master/Slave Set: drbd-vms-iscsi [drbd-vms]
>      Masters: [ glenrock ]
>      Slaves: [ granby ]
>  iscsi-vms-target    (ocf::heartbeat:iSCSITarget):      Started glenrock
>  iscsi-vms-lun       (ocf::heartbeat:iSCSILogicalUnit): Started glenrock
>  iscsi-vms-ip        (ocf::heartbeat:IPaddr2):          Started glenrock
>  Master/Slave Set: drbd-engine-iscsi [drbd-engine]
>      Masters: [ granby ]
>      Slaves: [ glenrock ]
>  iscsi-engine-target (ocf::heartbeat:iSCSITarget):      FAILED glenrock (unmanaged)
>  iscsi-engine-ip     (ocf::heartbeat:IPaddr2):          Stopped
>  iscsi-engine-lun    (ocf::heartbeat:iSCSILogicalUnit): Stopped
>
> Failed actions:
>     iscsi-engine-target_stop_0 on glenrock 'unknown error' (1):
>       call=177, status=Timed Out, last-rc-change='Sun Sep 27 15:20:59 2015',
>       queued=0ms, exec=10003ms
>
> I have tried various combinations of pcs resource clear and cleanup, but
> they all result in the same outcome - apart from on some occasions when
> one or other of the two hosts suddenly reboots!
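The six pairwise constraints per iSCSI stack above are one place mistakes can hide. A common simplification is a resource group: members of a group start in listed order and are implicitly colocated, so each stack needs only one colocation and one ordering against its DRBD master. A sketch using the engine stack's resource names from the thread; this is a suggested restructuring, not something the poster ran:

```shell
# Hypothetical restructuring: group the engine iSCSI resources so they
# start in order (target -> lun -> ip) and always run on the same node.
pcs resource group add engine-iscsi-group \
    iscsi-engine-target iscsi-engine-lun iscsi-engine-ip

# One colocation and one ordering against the DRBD master replace the
# three separate colocations and three orderings per stack:
pcs constraint colocation add engine-iscsi-group with master drbd-engine-iscsi INFINITY
pcs constraint order promote drbd-engine-iscsi then start engine-iscsi-group
```

Putting the IP last means the portal address only comes up once the target and LUN exist, so initiators never see a portal with no LUN behind it.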
> Here is a log right after a "pcs resource cleanup" - first on the master
> for the DRBD m/s resource:
> [root@granby ~]# pcs resource cleanup; tail -f /var/log/messages
> All resources/stonith devices successfully cleaned up
> Sep 27 15:33:42 granby crmd[3358]: notice: process_lrm_event:
>   granby-drbd-engine_monitor_0:117 [ \n ]
> Sep 27 15:33:42 granby attrd[3356]: notice: attrd_trigger_update:
>   Sending flush op to all hosts for: probe_complete (true)
> Sep 27 15:33:42 granby attrd[3356]: notice: attrd_perform_update: Sent
>   update 54: probe_complete=true
> Sep 27 15:33:42 granby crmd[3358]: notice: process_lrm_event:
>   Operation drbd-engine_monitor_1: master (node=granby, call=131,
>   rc=8, cib-update=83, confirmed=false)
> Sep 27 15:33:42 granby crmd[3358]: notice: process_lrm_event:
>   granby-drbd-engine_monitor_1:131 [ \n ]
> Sep 27 15:33:42 granby crmd[3358]: notice: process_lrm_event:
>   Operation drbd-vms_monitor_2: ok (node=granby, call=130, rc=0,
>   cib-update=84, confirmed=false)
> Sep 27 15:34:46 granby crmd[3358]: notice: do_lrm_invoke: Forcing the
>   status of all resources to be redetected
> Sep 27 15:34:46 granby attrd[3356]: notice: attrd_trigger_update:
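The failed action itself points at the immediate problem: the stop of iscsi-engine-target timed out at ~10s (exec=10003ms), and in Pacemaker a failed stop with no stonith configured leaves the resource FAILED (unmanaged), which in turn blocks every resource ordered or colocated after it. While the root cause is investigated, one hedged mitigation (resource name from the thread, timeout value an assumption) is to give the stop operation more time and then clear the failure state:

```shell
# Hypothetical mitigation: the stop op timed out at the ~10s default,
# so allow the target teardown more time to complete.
pcs resource update iscsi-engine-target op stop timeout=60s

# After addressing the cause, clear the failcount so Pacemaker
# re-manages the resource and retries placement:
pcs resource cleanup iscsi-engine-target
```

This only buys time; if the target genuinely cannot tear down (e.g. an initiator still holds a session), only fencing can recover the node cleanly.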