Re: [ClusterLabs] Stupid DRBD/LVM Global Filter Question

2019-10-30 Thread Eric Robinson
Roger --

Thank you, sir. That does help.

-Original Message-
From: Roger Zhou 
Sent: Wednesday, October 30, 2019 2:56 AM
To: Cluster Labs - All topics related to open-source clustering welcomed 
; Eric Robinson 
Subject: Re: [ClusterLabs] Stupid DRBD/LVM Global Filter Question


On 10/30/19 6:17 AM, Eric Robinson wrote:
> If I have an LV as a backing device for a DRBD disk, can someone
> explain why I need an LVM filter? It seems to me that we would want
> the LV to be always active under both the primary and secondary DRBD
> devices, and there should be no need or desire to have the LV
> activated or deactivated by Pacemaker. What am I missing?

Your understanding is correct. No need to use the LVM resource agent from
Pacemaker in your case.

--Roger

>
> --Eric
>


Re: [ClusterLabs] fencing on iscsi device not working

2019-10-30 Thread Andrei Borzenkov
On 30.10.2019 15:46, RAM PRASAD TWISTED ILLUSIONS wrote:
> Hi everyone,
> 
> I am trying to set up a storage cluster with two nodes, both running Debian
> buster. The two nodes, called duke and miles, have a LUN residing on a SAN
> box as their shared storage device between them. As you can see in the
> output of pcs status, all the daemons are active and I can get the nodes
> online without any issues. However, I cannot get the fencing resources to
> start.
>
> These two nodes were running Debian jessie before and had access to the
> same LUN in a storage cluster configuration. Now, I am trying to recreate a
> similar setup with both nodes now running the latest Debian. I am not sure
> if this is relevant, but this LUN already has a shared VG with data on it. I
> am wondering if this could be the cause of the trouble. Should I be
> creating my stonith device on a different/fresh LUN?
> 
> ### pcs status
> Cluster name: jazz
> Stack: corosync
> Current DC: duke (version 2.0.1-9e909a5bdd) - partition with quorum
> Last updated: Wed Oct 30 11:58:19 2019
> Last change: Wed Oct 30 11:28:28 2019 by root via cibadmin on duke
> 
> 2 nodes configured
> 2 resources configured
> 
> Online: [ duke miles ]
> 
> Full list of resources:
> 
>  fence_duke    (stonith:fence_scsi):    Stopped
>  fence_miles   (stonith:fence_scsi):    Stopped
> 
> Failed Fencing Actions:
> * unfencing of duke failed: delegate=, client=pacemaker-controld.1703,
> origin=duke,
> last-failed='Wed Oct 30 11:43:29 2019'
> * unfencing of miles failed: delegate=, client=pacemaker-controld.1703,
> origin=duke,
> last-failed='Wed Oct 30 11:43:29 2019'
> 
> Daemon Status:
>   corosync: active/enabled
>   pacemaker: active/enabled
>   pcsd: active/enabled
> ###
> 
> I used the following commands to add the two fencing devices and set their
> location constraints.
> 
> ###
> sudo pcs cluster cib test_cib_cfg
> pcs -f test_cib_cfg stonith create fence_duke fence_scsi
> pcmk_host_list=duke pcmk_reboot_action="off"

According to the documentation, pcmk_host_list is used only if
pcmk_host_check=static-list, which is not the default. By default, Pacemaker
queries the agent for the nodes it can fence, and fence_scsi does not return
anything.
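
If you want to keep the static host lists, one possible fix (a sketch only,
assuming the pcs stonith update syntax available on Debian buster and the
device names used above) is to set pcmk_host_check explicitly:

# sketch: make pacemaker trust pcmk_host_list instead of querying the agent
pcs stonith update fence_duke pcmk_host_check=static-list
pcs stonith update fence_miles pcmk_host_check=static-list

The same option can also be given to pcs stonith create together with
pcmk_host_list.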


Re: [ClusterLabs] fencing on iscsi device not working

2019-10-30 Thread Ken Gaillot
On Wed, 2019-10-30 at 13:46 +0100, RAM PRASAD TWISTED ILLUSIONS wrote:
> Hi everyone, 
> 
> I am trying to set up a storage cluster with two nodes, both running
> Debian buster. The two nodes, called duke and miles, have a LUN
> residing on a SAN box as their shared storage device between them. As
> you can see in the output of pcs status, all the daemons are active
> and I can get the nodes online without any issues. However, I cannot
> get the fencing resources to start.
>
> These two nodes were running Debian jessie before and had access to
> the same LUN in a storage cluster configuration. Now, I am trying to
> recreate a similar setup with both nodes now running the latest
> Debian. I am not sure if this is relevant, but this LUN already has a
> shared VG with data on it. I am wondering if this could be the cause
> of the trouble. Should I be creating my stonith device on a
> different/fresh LUN?
> 
> ### pcs status 
> Cluster name: jazz 
> Stack: corosync 
> Current DC: duke (version 2.0.1-9e909a5bdd) - partition with quorum 
> Last updated: Wed Oct 30 11:58:19 2019 
> Last change: Wed Oct 30 11:28:28 2019 by root via cibadmin on duke 
> 
> 2 nodes configured 
> 2 resources configured 
> 
> Online: [ duke miles ] 
> 
> Full list of resources: 
> 
>  fence_duke    (stonith:fence_scsi):    Stopped
>  fence_miles   (stonith:fence_scsi):    Stopped
> 
> Failed Fencing Actions: 
> * unfencing of duke failed: delegate=, client=pacemaker-
> controld.1703, origin=duke, 
> last-failed='Wed Oct 30 11:43:29 2019' 
> * unfencing of miles failed: delegate=, client=pacemaker-
> controld.1703, origin=duke, 
> last-failed='Wed Oct 30 11:43:29 2019' 
> 
> Daemon Status: 
>   corosync: active/enabled 
>   pacemaker: active/enabled 
>   pcsd: active/enabled 
> ### 
> 
> I used the following commands to add the two fencing devices and set
> their location constraints.
> 
> ### 
> sudo pcs cluster cib test_cib_cfg 
> pcs -f test_cib_cfg stonith create fence_duke fence_scsi
> pcmk_host_list=duke pcmk_reboot_action="off" devices="/dev/disk/by-
> id/wwn-0x600c0ff0001e8e3c89601b580100" meta provides="unfencing" 
> pcs -f test_cib_cfg stonith create fence_miles fence_scsi
> pcmk_host_list=miles pcmk_reboot_action="off" devices="/dev/disk/by-
> id/wwn-0x600c0ff0001e8e3c89601b580100" delay=15 meta
> provides="unfencing" 
> pcs -f test_cib_cfg constraint location fence_duke avoids
> duke=INFINITY 
> pcs -f test_cib_cfg constraint location fence_miles avoids
> miles=INFINITY 

Use a score less than INFINITY. You want the devices to be able to run
on the target node if for some reason the other node is unable to (e.g.
if it's in standby).

I'm not sure whether that will fix the issue here (see below).
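
For example, something along these lines (a sketch only; the score of 1000 is
arbitrary, any finite value works):

# sketch: prefer, but do not force, keeping each device off its own target
pcs -f test_cib_cfg constraint location fence_duke avoids duke=1000
pcs -f test_cib_cfg constraint location fence_miles avoids miles=1000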

> pcs cluster cib-push test_cib_cfg 
> ### 
> 
> Here is the output in /var/log/pacemaker/pacemaker.log after adding
> the fencing resources 
> 
> Oct 30 12:06:02 duke pacemaker-schedulerd[1702]
> (determine_online_status_fencing)   info: Node miles is active 
> Oct 30 12:06:02 duke pacemaker-schedulerd[1702]
> (determine_online_status)   info: Node miles is online 
> Oct 30 12:06:02 duke pacemaker-schedulerd[1702]
> (determine_online_status_fencing)   info: Node duke is active 
> Oct 30 12:06:02 duke pacemaker-schedulerd[1702]
> (determine_online_status)   info: Node duke is online 
> Oct 30 12:06:02 duke pacemaker-schedulerd[1702]
> (unpack_node_loop)  info: Node 2 is already processed 
> Oct 30 12:06:02 duke pacemaker-schedulerd[1702]
> (unpack_node_loop)  info: Node 1 is already processed 
> Oct 30 12:06:02 duke pacemaker-schedulerd[1702]
> (unpack_node_loop)  info: Node 2 is already processed 
> Oct 30 12:06:02 duke pacemaker-schedulerd[1702]
> (unpack_node_loop)  info: Node 1 is already processed 
> Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (common_print) info:
> fence_duke(stonith:fence_scsi):   Stopped 
> Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (common_print) info:
> fence_miles   (stonith:fence_scsi):   Stopped 
> Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (RecurringOp) info: 
> Start recurring monitor (60s) for fence_duke on miles 
> Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (RecurringOp) info: 
> Start recurring monitor (60s) for fence_miles on duke 
> Oct 30 12:06:02 duke pacemaker-schedulerd[1702]
> (LogNodeActions)notice:  * Fence (on) miles 'required by
> fence_duke monitor' 
> Oct 30 12:06:02 duke pacemaker-schedulerd[1702]
> (LogNodeActions)notice:  * Fence (on) duke 'required by
> fence_duke monitor' 
> Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (LogAction) notice: 
> * Start  fence_duke ( miles ) 
> Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (LogAction) notice: 
> * Start  fence_miles(  duke ) 
> Oct 30 12:06:02 duke pacemaker-schedulerd[1702]
> (process_pe_message)notice: Calculated transition 63, saving
> inputs in /var/lib/pacemaker/pengine/pe-input-23.bz2 
> Oc

[ClusterLabs] fencing on iscsi device not working

2019-10-30 Thread Ramprasad

Hi everyone,

I am trying to set up a storage cluster with two nodes, both running
Debian buster. The two nodes, called duke and miles, have a LUN residing
on a SAN box as their shared storage device between them. As you can see
in the output of pcs status, all the daemons are active and I can get the
nodes online without any issues. However, I cannot get the fencing
resources to start.

These two nodes were running Debian jessie before and had access to the
same LUN in a storage cluster configuration. Now, I am trying to
recreate a similar setup with both nodes now running the latest Debian.



### pcs status
Cluster name: jazz
Stack: corosync
Current DC: duke (version 2.0.1-9e909a5bdd) - partition with quorum
Last updated: Wed Oct 30 11:58:19 2019
Last change: Wed Oct 30 11:28:28 2019 by root via cibadmin on duke

2 nodes configured
2 resources configured

Online: [ duke miles ]

Full list of resources:

 fence_duke    (stonith:fence_scsi):    Stopped
 fence_miles    (stonith:fence_scsi):    Stopped

Failed Fencing Actions:
* unfencing of duke failed: delegate=, client=pacemaker-controld.1703, 
origin=duke,

    last-failed='Wed Oct 30 11:43:29 2019'
* unfencing of miles failed: delegate=, client=pacemaker-controld.1703, 
origin=duke,

    last-failed='Wed Oct 30 11:43:29 2019'

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
###

I used the following commands to add the two fencing devices and set
their location constraints.


###
sudo pcs cluster cib test_cib_cfg
pcs -f test_cib_cfg stonith create fence_duke fence_scsi 
pcmk_host_list=duke pcmk_reboot_action="off" 
devices="/dev/disk/by-id/wwn-0x600c0ff0001e8e3c89601b580100" meta 
provides="unfencing"
pcs -f test_cib_cfg stonith create fence_miles fence_scsi 
pcmk_host_list=miles pcmk_reboot_action="off" 
devices="/dev/disk/by-id/wwn-0x600c0ff0001e8e3c89601b580100" 
delay=15 meta provides="unfencing"

pcs -f test_cib_cfg constraint location fence_duke avoids duke=INFINITY
pcs -f test_cib_cfg constraint location fence_miles avoids miles=INFINITY
pcs cluster cib-push test_cib_cfg
###

Here is the output in /var/log/pacemaker/pacemaker.log after adding the 
fencing resources


Oct 30 12:06:02 duke pacemaker-schedulerd[1702] 
(determine_online_status_fencing)   info: Node miles is active
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] 
(determine_online_status)   info: Node miles is online
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] 
(determine_online_status_fencing)   info: Node duke is active
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] 
(determine_online_status)   info: Node duke is online
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (unpack_node_loop)  
info: Node 2 is already processed
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (unpack_node_loop)  
info: Node 1 is already processed
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (unpack_node_loop)  
info: Node 2 is already processed
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (unpack_node_loop)  
info: Node 1 is already processed
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (common_print) info: 
fence_duke    (stonith:fence_scsi):   Stopped
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (common_print) info: 
fence_miles   (stonith:fence_scsi):   Stopped
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (RecurringOp) info:  
Start recurring monitor (60s) for fence_duke on miles
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (RecurringOp) info:  
Start recurring monitor (60s) for fence_miles on duke
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (LogNodeActions)    
notice:  * Fence (on) miles 'required by fence_duke monitor'
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (LogNodeActions)    
notice:  * Fence (on) duke 'required by fence_duke monitor'
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (LogAction) notice:  * 
Start  fence_duke ( miles )
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (LogAction) notice:  * 
Start  fence_miles    (  duke )
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (process_pe_message)    
notice: Calculated transition 63, saving inputs in 
/var/lib/pacemaker/pengine/pe-input-23.bz2
Oct 30 12:06:02 duke pacemaker-controld  [1703] (do_state_transition)   
info: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE | 
input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response
Oct 30 12:06:02 duke pacemaker-controld  [1703] (do_te_invoke) info: 
Processing graph 63 (ref=pe_calc-dc-1572433562-101) derived from 
/var/lib/pacemaker/pengine/pe-input-23.bz2
Oct 30 12:06:02 duke pacemaker-controld  [1703] (te_fence_node) 
notice: Requesting fencing (on) of node miles | action=5 timeout=6
Oct 30 12:06:02 duke pacemaker-controld  [1703] (te_fence_node) 
notice: Requesting fencing (on) of node duke | action=2 timeout=6
Oct 30 12:06:02 duke pacemaker-fenced    [1699] (handle_request)    
no

[ClusterLabs] Antw: Stupid DRBD/LVM Global Filter Question

2019-10-30 Thread Ulrich Windl
>>> Eric Robinson wrote on 29.10.2019 at 23:17 in message:

> If I have an LV as a backing device for a DRBD disk, can someone explain why
> I need an LVM filter? It seems to me that we would want the LV to be always
> active under both the primary and secondary DRBD devices, and there should be
> no need or desire to have the LV activated or deactivated by Pacemaker. What
> am I missing?

At least in the past there was also a performance reason: most LVM tools did
not cache device information and instead did O_DIRECT device access. In a SAN
environment with more than 100 disk paths, plus device-mapper devices,
partitions, etc., the delay and I/O impact was significant. So if you are
using LVM on only a few devices, help LVM locate them. Also, with multipath
you want LVM to use the multipath device instead of a single-path device, for
example.
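
For illustration only, a global_filter along those lines in /etc/lvm/lvm.conf
could look like the following (the device patterns are placeholders, not taken
from any setup in this thread):

devices {
    # accept multipath devices and the local system disk, reject everything else
    global_filter = [ "a|^/dev/mapper/mpath.*|", "a|^/dev/sda[0-9]*$|", "r|.*|" ]
}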

Regards,
Ulrich

>
> --Eric

[ClusterLabs] fencing on iscsi device not working

2019-10-30 Thread RAM PRASAD TWISTED ILLUSIONS
Hi everyone,

I am trying to set up a storage cluster with two nodes, both running Debian
buster. The two nodes, called duke and miles, have a LUN residing on a SAN
box as their shared storage device between them. As you can see in the
output of pcs status, all the daemons are active and I can get the nodes
online without any issues. However, I cannot get the fencing resources to
start.

These two nodes were running Debian jessie before and had access to the
same LUN in a storage cluster configuration. Now, I am trying to recreate a
similar setup with both nodes now running the latest Debian. I am not sure
if this is relevant, but this LUN already has a shared VG with data on it. I
am wondering if this could be the cause of the trouble. Should I be
creating my stonith device on a different/fresh LUN?

### pcs status
Cluster name: jazz
Stack: corosync
Current DC: duke (version 2.0.1-9e909a5bdd) - partition with quorum
Last updated: Wed Oct 30 11:58:19 2019
Last change: Wed Oct 30 11:28:28 2019 by root via cibadmin on duke

2 nodes configured
2 resources configured

Online: [ duke miles ]

Full list of resources:

 fence_duke     (stonith:fence_scsi):    Stopped
 fence_miles    (stonith:fence_scsi):    Stopped

Failed Fencing Actions:
* unfencing of duke failed: delegate=, client=pacemaker-controld.1703,
origin=duke,
last-failed='Wed Oct 30 11:43:29 2019'
* unfencing of miles failed: delegate=, client=pacemaker-controld.1703,
origin=duke,
last-failed='Wed Oct 30 11:43:29 2019'

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
###

I used the following commands to add the two fencing devices and set their
location constraints.

###
sudo pcs cluster cib test_cib_cfg
pcs -f test_cib_cfg stonith create fence_duke fence_scsi
pcmk_host_list=duke pcmk_reboot_action="off"
devices="/dev/disk/by-id/wwn-0x600c0ff0001e8e3c89601b580100" meta
provides="unfencing"
pcs -f test_cib_cfg stonith create fence_miles fence_scsi
pcmk_host_list=miles pcmk_reboot_action="off"
devices="/dev/disk/by-id/wwn-0x600c0ff0001e8e3c89601b580100" delay=15
meta provides="unfencing"
pcs -f test_cib_cfg constraint location fence_duke avoids duke=INFINITY
pcs -f test_cib_cfg constraint location fence_miles avoids miles=INFINITY
pcs cluster cib-push test_cib_cfg
###

Here is the output in /var/log/pacemaker/pacemaker.log after adding the
fencing resources

Oct 30 12:06:02 duke pacemaker-schedulerd[1702]
(determine_online_status_fencing)   info: Node miles is active
Oct 30 12:06:02 duke pacemaker-schedulerd[1702]
(determine_online_status)   info: Node miles is online
Oct 30 12:06:02 duke pacemaker-schedulerd[1702]
(determine_online_status_fencing)   info: Node duke is active
Oct 30 12:06:02 duke pacemaker-schedulerd[1702]
(determine_online_status)   info: Node duke is online
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (unpack_node_loop)
info: Node 2 is already processed
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (unpack_node_loop)
info: Node 1 is already processed
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (unpack_node_loop)
info: Node 2 is already processed
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (unpack_node_loop)
info: Node 1 is already processed
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (common_print) info:
fence_duke(stonith:fence_scsi):   Stopped
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (common_print) info:
fence_miles   (stonith:fence_scsi):   Stopped
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (RecurringOp) info:  Start
recurring monitor (60s) for fence_duke on miles
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (RecurringOp) info:  Start
recurring monitor (60s) for fence_miles on duke
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (LogNodeActions)
notice:  * Fence (on) miles 'required by fence_duke monitor'
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (LogNodeActions)
notice:  * Fence (on) duke 'required by fence_duke monitor'
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (LogAction) notice:  *
Start  fence_duke ( miles )
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (LogAction) notice:  *
Start  fence_miles(  duke )
Oct 30 12:06:02 duke pacemaker-schedulerd[1702] (process_pe_message)
notice: Calculated transition 63, saving inputs in
/var/lib/pacemaker/pengine/pe-input-23.bz2
Oct 30 12:06:02 duke pacemaker-controld  [1703] (do_state_transition)
info: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE |
input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response
Oct 30 12:06:02 duke pacemaker-controld  [1703] (do_te_invoke) info:
Processing graph 63 (ref=pe_calc-dc-1572433562-101) derived from
/var/lib/pacemaker/pengine/pe-input-23.bz2
Oct 30 12:06:02 duke pacemaker-controld  [1703] (te_fence_node)
notice: Requesting fencing (on) of node miles | action=5 timeout=6
Oct 30 12:06:02 duke pacemaker-controld  [1703] (te_fence_node)
notice: Requesting fencing (on) of node

Re: [ClusterLabs] Stupid DRBD/LVM Global Filter Question

2019-10-30 Thread Roger Zhou


On 10/30/19 6:17 AM, Eric Robinson wrote:
> If I have an LV as a backing device for a DRBD disk, can someone explain 
> why I need an LVM filter? It seems to me that we would want the LV to be 
> always active under both the primary and secondary DRBD devices, and 
> there should be no need or desire to have the LV activated or 
> deactivated by Pacemaker. What am I missing?

Your understanding is correct. No need to use the LVM resource agent from
Pacemaker in your case.
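
In other words, only DRBD and whatever sits on top of it need to be
cluster-managed; the backing LV can simply stay active on both nodes. A rough
sketch of such a stack (assuming pcs 0.10 syntax; resource and device names
are made up for illustration):

# sketch: DRBD + filesystem managed by the cluster; no LVM resource agent
# for the backing LV, which stays activated outside Pacemaker
pcs resource create drbd0 ocf:linbit:drbd drbd_resource=r0 \
    promotable promoted-max=1 promoted-node-max=1 clone-max=2 notify=true
pcs resource create fs0 ocf:heartbeat:Filesystem device=/dev/drbd0 \
    directory=/mnt/data fstype=ext4
pcs constraint colocation add fs0 with master drbd0-clone INFINITY
pcs constraint order promote drbd0-clone then start fs0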

--Roger

> 
> --Eric
> 
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/