Re: [ClusterLabs] stonith-ng - performing action 'monitor' timed out with signal 15

2019-09-11 Thread Marco Marino
Hi, are there any updates on this?
Thank you

On Wed, Sep 4, 2019 at 10:46, Marco Marino wrote:

> First of all, thank you for your support.
> Andrei: sure, I can reach the machines through IPMI.
> Here is a short "log":
>
> #From ld1 trying to contact ld1
> [root@ld1 ~]# ipmitool -I lanplus -H 192.168.254.250 -U root -P XX
> sdr elist all
> SEL  | 72h | ns  |  7.1 | No Reading
> Intrusion| 73h | ok  |  7.1 |
> iDRAC8   | 00h | ok  |  7.1 | Dynamic MC @ 20h
> ...
>
> #From ld1 trying to contact ld2
> ipmitool -I lanplus -H 192.168.254.251 -U root -P XX sdr elist all
> SEL  | 72h | ns  |  7.1 | No Reading
> Intrusion| 73h | ok  |  7.1 |
> iDRAC7   | 00h | ok  |  7.1 | Dynamic MC @ 20h
> ...
>
>
> #From ld2 trying to contact ld1:
> [root@ld2 ~]# ipmitool -I lanplus -H 192.168.254.250 -U root -P X sdr
> elist all
> SEL  | 72h | ns  |  7.1 | No Reading
> Intrusion| 73h | ok  |  7.1 |
> iDRAC8   | 00h | ok  |  7.1 | Dynamic MC @ 20h
> System Board | 00h | ns  |  7.1 | Logical FRU @00h
> .
>
> #From ld2 trying to contact ld2
> [root@ld2 ~]# ipmitool -I lanplus -H 192.168.254.251 -U root -P  sdr
> elist all
> SEL  | 72h | ns  |  7.1 | No Reading
> Intrusion| 73h | ok  |  7.1 |
> iDRAC7   | 00h | ok  |  7.1 | Dynamic MC @ 20h
> System Board | 00h | ns  |  7.1 | Logical FRU @00h
> 
>
> Jan: Actually, the cluster uses /etc/hosts to resolve names:
> 172.16.77.10    ld1.mydomain.it  ld1
> 172.16.77.11    ld2.mydomain.it  ld2
>
> Furthermore, I'm using IP addresses for the IPMI interfaces in the
> configuration:
> [root@ld1 ~]# pcs stonith show fence-node1
>  Resource: fence-node1 (class=stonith type=fence_ipmilan)
>   Attributes: ipaddr=192.168.254.250 lanplus=1 login=root passwd=X
> pcmk_host_check=static-list pcmk_host_list=ld1.mydomain.it
>   Operations: monitor interval=60s (fence-node1-monitor-interval-60s)
>
>
> Any ideas?
> How can I reset the state of the cluster without downtime? Is "pcs resource
> cleanup" enough?
> Thank you,
> Marco
>
>
> On Wed, Sep 4, 2019 at 10:29, Jan Pokorný wrote:
>
>> On 03/09/19 20:15 +0300, Andrei Borzenkov wrote:
>> > On 03.09.2019 11:09, Marco Marino wrote:
>> >> Hi, I have a problem with fencing on a two-node cluster. It seems that
>> >> randomly the cluster cannot complete the monitor operation for the fence
>> >> devices. In the log I see:
>> >> crmd[8206]:   error: Result of monitor operation for fence-node2 on
>> >> ld2.mydomain.it: Timed Out
>> >
>> > Can you actually access the IP addresses of your IPMI ports?
>>
>> [
>> Tangentially, an interesting aspect beyond that, applicable to any
>> non-IP cross-host reference and not mentioned anywhere so far, is the
>> risk of DNS resolution (where /etc/hosts falls short) running into
>> trouble: stale records, blocked ports, DNS server overload (DNSSEC,
>> etc.), parallel IPv4/IPv6 records that the software cannot handle
>> gracefully, and so on.  In any case, a single DNS server would be an
>> undesired SPOF, and it would be unfortunate to be unable to fence a
>> node because of that.
>>
>> I think the most robust approach is to use IP addresses whenever
>> possible, and unambiguous records in /etc/hosts when practical.
>> ]
>>
>> >> Attached are:
>> >> - /var/log/messages for node1 (only the important part)
>> >> - /var/log/messages for node2 (only the important part) <-- the problem
>> >>   starts here
>> >> - pcs status
>> >> - pcs stonith show (for both fence devices)
>> >>
>> >> I think it could be a timeout problem, so how can I see the timeout
>> >> value for the monitor operation on stonith devices?
>> >> Please, can someone help me with this problem?
>> >> Furthermore, how can I fix the state of the fence devices without downtime?
>>
>> --
>> Jan (Poki)
>> ___
>> Manage your subscription:
>> https://lists.clusterlabs.org/mailman/listinfo/users
>>
>> ClusterLabs home: https://www.clusterlabs.org/
>
>
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] stonith-ng - performing action 'monitor' timed out with signal 15

2019-09-04 Thread Marco Marino
First of all, thank you for your support.
Andrei: sure, I can reach the machines through IPMI.
Here is a short "log":

#From ld1 trying to contact ld1
[root@ld1 ~]# ipmitool -I lanplus -H 192.168.254.250 -U root -P XX sdr
elist all
SEL  | 72h | ns  |  7.1 | No Reading
Intrusion| 73h | ok  |  7.1 |
iDRAC8   | 00h | ok  |  7.1 | Dynamic MC @ 20h
...

#From ld1 trying to contact ld2
ipmitool -I lanplus -H 192.168.254.251 -U root -P XX sdr elist all
SEL  | 72h | ns  |  7.1 | No Reading
Intrusion| 73h | ok  |  7.1 |
iDRAC7   | 00h | ok  |  7.1 | Dynamic MC @ 20h
...


#From ld2 trying to contact ld1:
[root@ld2 ~]# ipmitool -I lanplus -H 192.168.254.250 -U root -P X sdr
elist all
SEL  | 72h | ns  |  7.1 | No Reading
Intrusion| 73h | ok  |  7.1 |
iDRAC8   | 00h | ok  |  7.1 | Dynamic MC @ 20h
System Board | 00h | ns  |  7.1 | Logical FRU @00h
.

#From ld2 trying to contact ld2
[root@ld2 ~]# ipmitool -I lanplus -H 192.168.254.251 -U root -P  sdr
elist all
SEL  | 72h | ns  |  7.1 | No Reading
Intrusion| 73h | ok  |  7.1 |
iDRAC7   | 00h | ok  |  7.1 | Dynamic MC @ 20h
System Board | 00h | ns  |  7.1 | Logical FRU @00h


Jan: Actually, the cluster uses /etc/hosts to resolve names:
172.16.77.10    ld1.mydomain.it  ld1
172.16.77.11    ld2.mydomain.it  ld2

Furthermore, I'm using IP addresses for the IPMI interfaces in the configuration:
[root@ld1 ~]# pcs stonith show fence-node1
 Resource: fence-node1 (class=stonith type=fence_ipmilan)
  Attributes: ipaddr=192.168.254.250 lanplus=1 login=root passwd=X
pcmk_host_check=static-list pcmk_host_list=ld1.mydomain.it
  Operations: monitor interval=60s (fence-node1-monitor-interval-60s)
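
If the IPMI interface is just slow to answer, one hedged possibility (not
something from the original configuration; the value is illustrative) is to
give the device a longer monitor timeout via the pcmk_monitor_timeout
attribute:

# Sketch: allow a slower IPMI interface more time for the monitor action
pcs stonith update fence-node1 pcmk_monitor_timeout=60s
# and verify the resulting device definition
pcs stonith show fence-node1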


Any ideas?
How can I reset the state of the cluster without downtime? Is "pcs resource
cleanup" enough?
Thank you,
Marco


On Wed, Sep 4, 2019 at 10:29, Jan Pokorný wrote:

> On 03/09/19 20:15 +0300, Andrei Borzenkov wrote:
> > On 03.09.2019 11:09, Marco Marino wrote:
> >> Hi, I have a problem with fencing on a two-node cluster. It seems that
> >> randomly the cluster cannot complete the monitor operation for the fence
> >> devices. In the log I see:
> >> crmd[8206]:   error: Result of monitor operation for fence-node2 on
> >> ld2.mydomain.it: Timed Out
> >
> > Can you actually access the IP addresses of your IPMI ports?
>
> [
> Tangentially, an interesting aspect beyond that, applicable to any
> non-IP cross-host reference and not mentioned anywhere so far, is the
> risk of DNS resolution (where /etc/hosts falls short) running into
> trouble: stale records, blocked ports, DNS server overload (DNSSEC,
> etc.), parallel IPv4/IPv6 records that the software cannot handle
> gracefully, and so on.  In any case, a single DNS server would be an
> undesired SPOF, and it would be unfortunate to be unable to fence a
> node because of that.
>
> I think the most robust approach is to use IP addresses whenever
> possible, and unambiguous records in /etc/hosts when practical.
> ]
>
> >> Attached are:
> >> - /var/log/messages for node1 (only the important part)
> >> - /var/log/messages for node2 (only the important part) <-- the problem
> >>   starts here
> >> - pcs status
> >> - pcs stonith show (for both fence devices)
> >>
> >> I think it could be a timeout problem, so how can I see the timeout
> >> value for the monitor operation on stonith devices?
> >> Please, can someone help me with this problem?
> >> Furthermore, how can I fix the state of the fence devices without downtime?
>
> --
> Jan (Poki)
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

[ClusterLabs] stonith-ng - performing action 'monitor' timed out with signal 15

2019-09-03 Thread Marco Marino
Hi, I have a problem with fencing on a two-node cluster. It seems that
randomly the cluster cannot complete the monitor operation for the fence
devices. In the log I see:
crmd[8206]:   error: Result of monitor operation for fence-node2 on
ld2.mydomain.it: Timed Out
Attached are:
- /var/log/messages for node1 (only the important part)
- /var/log/messages for node2 (only the important part) <-- the problem
  starts here
- pcs status
- pcs stonith show (for both fence devices)

I think it could be a timeout problem, so how can I see the timeout value
for the monitor operation on stonith devices?
Please, can someone help me with this problem?
Furthermore, how can I fix the state of the fence devices without downtime?
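
A hedged sketch of where those timeouts can usually be inspected (pcs 0.9
syntax assumed; not taken from the original post):

pcs stonith show fence-node2                 # per-operation timeout, if one was set explicitly
pcs resource op defaults                     # cluster-wide operation defaults
pcs property list --all | grep -i timeout    # e.g. stonith-timeout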

Thank you
### PCS STATUS ###

[root@ld1 ~]# pcs status
Cluster name: ldcluster
Stack: corosync
Current DC: ld1.mydomain.it (version 1.1.19-8.el7_6.4-c3c624ea3d) - partition 
with quorum
Last updated: Tue Sep  3 09:37:27 2019
Last change: Thu Jul  4 21:36:07 2019 by root via cibadmin on ld1.mydomain.it

2 nodes configured
10 resources configured

Online: [ ld1.mydomain.it ld2.mydomain.it ]

Full list of resources:

 fence-node1    (stonith:fence_ipmilan):        Stopped
 fence-node2    (stonith:fence_ipmilan):        Stopped
 Master/Slave Set: DrbdResClone [DrbdRes]
     Masters: [ ld1.mydomain.it ]
     Slaves: [ ld2.mydomain.it ]
 HALVM  (ocf::heartbeat:LVM):   Started ld1.mydomain.it
 PgsqlFs(ocf::heartbeat:Filesystem):Started ld1.mydomain.it
 PostgresqlD(systemd:postgresql-9.6.service):   Started ld1.mydomain.it
 LegaldocapiD   (systemd:legaldocapi.service):  Started ld1.mydomain.it
 PublicVIP  (ocf::heartbeat:IPaddr2):   Started ld1.mydomain.it
 DefaultRoute   (ocf::heartbeat:Route): Started ld1.mydomain.it

Failed Actions:
* fence-node1_start_0 on ld1.mydomain.it 'unknown error' (1): call=221, 
status=Timed Out, exitreason='',
last-rc-change='Wed Aug 21 12:49:00 2019', queued=0ms, exec=20006ms
* fence-node2_start_0 on ld1.mydomain.it 'unknown error' (1): call=222, 
status=Timed Out, exitreason='',
last-rc-change='Wed Aug 21 12:49:00 2019', queued=1ms, exec=20013ms
* fence-node1_start_0 on ld2.mydomain.it 'unknown error' (1): call=182, 
status=Timed Out, exitreason='',
last-rc-change='Wed Aug 21 14:26:09 2019', queued=0ms, exec=20006ms
* fence-node2_start_0 on ld2.mydomain.it 'unknown error' (1): call=176, 
status=Timed Out, exitreason='',
last-rc-change='Wed Aug 21 12:48:40 2019', queued=1ms, exec=20008ms


Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
[root@ld1 ~]#


### STONITH SHOW ###
[root@ld1 ~]# pcs stonith show fence-node1
 Resource: fence-node1 (class=stonith type=fence_ipmilan)
  Attributes: ipaddr=192.168.254.250 lanplus=1 login=root passwd=XXX 
pcmk_host_check=static-list pcmk_host_list=ld1.mydomain.it
  Operations: monitor interval=60s (fence-node1-monitor-interval-60s)
[root@ld1 ~]# pcs stonith show fence-node2
 Resource: fence-node2 (class=stonith type=fence_ipmilan)
  Attributes: ipaddr=192.168.254.251 lanplus=1 login=root passwd= 
pcmk_host_check=static-list pcmk_host_list=ld2.mydomain.it delay=12
  Operations: monitor interval=60s (fence-node2-monitor-interval-60s)
[root@ld1 ~]#


### NODE 2 /var/log/messages ###
Aug 21 12:48:40 ld2 stonith-ng[8202]:  notice: Child process 46006 performing 
action 'monitor' timed out with signal 15
Aug 21 12:48:40 ld2 stonith-ng[8202]:  notice: Operation 'monitor' [46006] for 
device 'fence-node2' returned: -62 (Timer expired)
Aug 21 12:48:40 ld2 crmd[8206]:   error: Result of monitor operation for 
fence-node2 on ld2.mydomain.it: Timed Out
Aug 21 12:48:40 ld2 stonith-ng[8202]:  notice: On loss of CCM Quorum: Ignore
Aug 21 12:48:40 ld2 crmd[8206]:  notice: Result of stop operation for 
fence-node2 on ld2.mydomain.it: 0 (ok)
Aug 21 12:48:40 ld2 stonith-ng[8202]:  notice: On loss of CCM Quorum: Ignore
Aug 21 12:48:40 ld2 stonith-ng[8202]:  notice: On loss of CCM Quorum: Ignore
Aug 21 12:48:40 ld2 stonith-ng[8202]:  notice: On loss of CCM Quorum: Ignore
Aug 21 12:48:59 ld2 stonith-ng[8202]:  notice: On loss of CCM Quorum: Ignore
Aug 21 12:49:00 ld2 stonith-ng[8202]:  notice: Child process 46053 performing 
action 'monitor' timed out with signal 15
Aug 21 12:49:00 ld2 stonith-ng[8202]:  notice: Operation 'monitor' [46053] for 
device 'fence-node2' returned: -62 (Timer expired)
Aug 21 12:49:00 ld2 crmd[8206]:   error: Result of start operation for 
fence-node2 on ld2.mydomain.it: Timed Out
Aug 21 12:49:00 ld2 stonith-ng[8202]:  notice: On loss of CCM Quorum: Ignore
Aug 21 12:49:00 ld2 stonith-ng[8202]:  notice: On loss of CCM Quorum: Ignore
Aug 21 12:49:00 ld2 stonith-ng[8202]:  notice: On loss of CCM Quorum: Ignore
Aug 21 12:49:00 ld2 stonith-ng[8202]:  notice: On loss of CCM Quorum: Ignore
Aug 21 12:49:00 ld2 crmd[8206]:  

Re: [ClusterLabs] HALVM monitor action fail on slave node. Possible bug?

2018-04-16 Thread Marco Marino
Hi Emmanuel, thank you for your support. I did a lot of checks over the
weekend and there are some updates:
- The main problem is that the ocf:heartbeat:LVM agent is old. The current
version on CentOS 7 is 3.9.5 (package resource-agents). More precisely, in
3.9.5 the monitor function makes one important assumption: the underlying
storage is shared between all nodes of the cluster, so the monitor function
checks for the presence of the volume group on all nodes. From version 3.9.6
this is no longer the default behaviour and the monitor function (LVM_status)
returns $OCF_NOT_RUNNING on slave nodes without errors. You can check this in
/usr/lib/ocf/resource.d/heartbeat/LVM, lines 340-351, which disappear in
version 3.9.6.
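
A quick way to check which of the two behaviours a node actually has installed
(a sketch; CentOS 7 paths assumed):

rpm -q resource-agents
grep -n OCF_NOT_RUNNING /usr/lib/ocf/resource.d/heartbeat/LVM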

Obviously this is not an error, but an important change in the cluster
architecture, because I would need to use DRBD in dual-primary mode when
version 3.9.5 is used. My personal opinion is that DRBD in dual-primary mode
with LVM is not a good idea here, since I don't need an active/active
cluster.
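
As Emmanuel suggests below, the agent can also be exercised outside the
cluster with ocf-tester; a sketch, with an illustrative resource name and the
volume group used in this thread:

ocf-tester -n halvm-test -o volgrpname=havolumegroup /usr/lib/ocf/resource.d/heartbeat/LVM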

Anyway, thank you for your time again
Marco

2018-04-13 15:54 GMT+02:00 emmanuel segura <emi2f...@gmail.com>:

> The first thing you need to configure is stonith, because you have this
> constraint: "constraint order promote DrbdResClone then start HALVM".
>
> To recover and promote drbd to master when you crash a node, configure
> the drbd fencing handler.
>
> Pacemaker executes monitor on both nodes, so this is normal; to test why
> the monitor fails, use ocf-tester.
>
> 2018-04-13 15:29 GMT+02:00 Marco Marino <marino@gmail.com>:
>
>> Hello, I'm trying to configure a simple 2-node cluster with drbd and
>> HALVM (ocf:heartbeat:LVM), but I have a problem that I'm not able to solve,
>> so I decided to write this long post. I really need to understand what I'm
>> doing and where I'm going wrong.
>> More precisely, I'm configuring a pacemaker cluster with 2 nodes and only
>> one drbd resource. Here are all the operations:
>>
>> - System configuration
>> hostnamectl set-hostname pcmk[12]
>> yum update -y
>> yum install vim wget git -y
>> vim /etc/sysconfig/selinux  -> permissive mode
>> systemctl disable firewalld
>> reboot
>>
>> - Network configuration
>> [pcmk1]
>> nmcli connection modify corosync ipv4.method manual ipv4.addresses
>> 192.168.198.201/24 ipv6.method ignore connection.autoconnect yes
>> nmcli connection modify replication ipv4.method manual ipv4.addresses
>> 192.168.199.201/24 ipv6.method ignore connection.autoconnect yes
>> [pcmk2]
>> nmcli connection modify corosync ipv4.method manual ipv4.addresses
>> 192.168.198.202/24 ipv6.method ignore connection.autoconnect yes
>> nmcli connection modify replication ipv4.method manual ipv4.addresses
>> 192.168.199.202/24 ipv6.method ignore connection.autoconnect yes
>>
>> ssh-keygen -t rsa
>> ssh-copy-id root@pcmk[12]
>> scp /etc/hosts root@pcmk2:/etc/hosts
>>
>> - Drbd Repo configuration and drbd installation
>> rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
>> rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm
>> yum update -y
>> yum install drbd84-utils kmod-drbd84 -y
>>
>> - Drbd Configuration:
>> Creating a new partition on top of /dev/vdb -> /dev/vdb1 of type
>> "Linux" (83)
>> [/etc/drbd.d/global_common.conf]
>> usage-count no;
>> [/etc/drbd.d/myres.res]
>> resource myres {
>> on pcmk1 {
>> device /dev/drbd0;
>> disk /dev/vdb1;
>> address 192.168.199.201:7789;
>> meta-disk internal;
>> }
>> on pcmk2 {
>> device /dev/drbd0;
>> disk /dev/vdb1;
>> address 192.168.199.202:7789;
>> meta-disk internal;
>> }
>> }
>>
>> scp /etc/drbd.d/myres.res root@pcmk2:/etc/drbd.d/myres.res
>> systemctl start drbd <-- only for test. The service is disabled at
>> boot!
>> drbdadm create-md myres
>> drbdadm up myres
>> drbdadm primary --force myres
>>
>> - LVM Configuration
>> [root@pcmk1 ~]# lsblk
>> NAME          MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
>> sr0            11:0    1 1024M  0 rom
>> vda           252:0    0   20G  0 disk
>> ├─vda1        252:1    0    1G  0 part /boot
>> └─vda2        252:2    0   19G  0 part
>>   ├─cl-root   253:0    0   17G  0 lvm  /
>>   └─cl-swap   253:1    0    2G  0 lvm  [SWAP]
>>

[ClusterLabs] HALVM monitor action fail on slave node. Possible bug?

2018-04-13 Thread Marco Marino
Hello, I'm trying to configure a simple 2-node cluster with drbd and HALVM
(ocf:heartbeat:LVM), but I have a problem that I'm not able to solve, so I
decided to write this long post. I really need to understand what I'm doing
and where I'm going wrong.
More precisely, I'm configuring a pacemaker cluster with 2 nodes and only
one drbd resource. Here are all the operations:

- System configuration
hostnamectl set-hostname pcmk[12]
yum update -y
yum install vim wget git -y
vim /etc/sysconfig/selinux  -> permissive mode
systemctl disable firewalld
reboot

- Network configuration
[pcmk1]
nmcli connection modify corosync ipv4.method manual ipv4.addresses
192.168.198.201/24 ipv6.method ignore connection.autoconnect yes
nmcli connection modify replication ipv4.method manual ipv4.addresses
192.168.199.201/24 ipv6.method ignore connection.autoconnect yes
[pcmk2]
nmcli connection modify corosync ipv4.method manual ipv4.addresses
192.168.198.202/24 ipv6.method ignore connection.autoconnect yes
nmcli connection modify replication ipv4.method manual ipv4.addresses
192.168.199.202/24 ipv6.method ignore connection.autoconnect yes

ssh-keygen -t rsa
ssh-copy-id root@pcmk[12]
scp /etc/hosts root@pcmk2:/etc/hosts

- Drbd Repo configuration and drbd installation
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
rpm -Uvh
http://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm
yum update -y
yum install drbd84-utils kmod-drbd84 -y

- Drbd Configuration:
Creating a new partition on top of /dev/vdb -> /dev/vdb1 of type
"Linux" (83)
[/etc/drbd.d/global_common.conf]
usage-count no;
[/etc/drbd.d/myres.res]
resource myres {
on pcmk1 {
device /dev/drbd0;
disk /dev/vdb1;
address 192.168.199.201:7789;
meta-disk internal;
}
on pcmk2 {
device /dev/drbd0;
disk /dev/vdb1;
address 192.168.199.202:7789;
meta-disk internal;
}
}
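
Since the cluster will later promote this resource, a DRBD fencing policy is
commonly added as well; a sketch, not part of the original configuration
(handler paths as shipped by drbd84-utils):

# inside "resource myres { ... }", or in the common section
disk {
    fencing resource-only;
}
handlers {
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
}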

scp /etc/drbd.d/myres.res root@pcmk2:/etc/drbd.d/myres.res
systemctl start drbd <-- only for test. The service is disabled at boot!
drbdadm create-md myres
drbdadm up myres
drbdadm primary --force myres

- LVM Configuration
[root@pcmk1 ~]# lsblk
NAME          MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sr0            11:0    1 1024M  0 rom
vda           252:0    0   20G  0 disk
├─vda1        252:1    0    1G  0 part /boot
└─vda2        252:2    0   19G  0 part
  ├─cl-root   253:0    0   17G  0 lvm  /
  └─cl-swap   253:1    0    2G  0 lvm  [SWAP]
vdb           252:16   0    8G  0 disk
└─vdb1        252:17   0    8G  0 part <--- /dev/vdb1 is the partition I'd like
                                            to use as the backing device for drbd
  └─drbd0     147:0    0    8G  0 disk

[/etc/lvm/lvm.conf]
write_cache_state = 0
use_lvmetad = 0
filter = [ "a|drbd.*|", "a|vda.*|", "r|.*|" ]
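
A quick sanity check that the new filter only scans the intended devices
(a sketch):

pvscan                    # only /dev/vda* (and, once created, /dev/drbd0) should be scanned
pvs -o pv_name,vg_name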

Disabling lvmetad service
systemctl disable lvm2-lvmetad.service
systemctl disable lvm2-lvmetad.socket
reboot

- Creating volume group and logical volume
systemctl start drbd (both nodes)
drbdadm primary myres
pvcreate /dev/drbd0
vgcreate havolumegroup /dev/drbd0
lvcreate -n c-vol1 -L1G havolumegroup
[root@pcmk1 ~]# lvs
  LV     VG            Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  root   cl            -wi-ao---- <17.00g
  swap   cl            -wi-ao----   2.00g
  c-vol1 havolumegroup -wi-a-----   1.00g


- Cluster Configuration
yum install pcs fence-agents-all -y
systemctl enable pcsd
systemctl start pcsd
echo redhat | passwd --stdin hacluster
pcs cluster auth pcmk1 pcmk2
pcs cluster setup --name ha_cluster pcmk1 pcmk2
pcs cluster start --all
pcs cluster enable --all
pcs property set stonith-enabled=false    <--- Just for test!!!
pcs property set no-quorum-policy=ignore

- Drbd resource configuration
pcs cluster cib drbd_cfg
pcs -f drbd_cfg resource create DrbdRes ocf:linbit:drbd
drbd_resource=myres op monitor interval=60s
pcs -f drbd_cfg resource master DrbdResClone DrbdRes master-max=1
master-node-max=1 clone-max=2 clone-node-max=1 notify=true
[root@pcmk1 ~]# pcs -f drbd_cfg resource show
 Master/Slave Set: DrbdResClone [DrbdRes]
 Stopped: [ pcmk1 pcmk2 ]
[root@pcmk1 ~]#
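
The staged CIB then still has to be pushed to the live configuration; a sketch
of the step implied here (pcs 0.9 syntax):

pcs cluster cib-push drbd_cfg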

Testing failover with a forced shutoff of pcmk1: when pcmk1 comes back
up, drbd is slave, but the logical volume is not active on pcmk2. So I need HALVM
[root@pcmk2 ~]# lvs
  LV     VG            Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  root   cl            -wi-ao---- <17.00g
  swap   cl            -wi-ao----   2.00g
  c-vol1 havolumegroup -wi-------   1.00g
[root@pcmk2 ~]#



- Lvm resource and constraints
pcs cluster cib lvm_cfg
pcs -f lvm_cfg resource create HALVM 
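
Purely as a hedged sketch, an HALVM resource with the ordering constraint
referenced elsewhere in this thread ("constraint order promote DrbdResClone
then start HALVM") would typically look something like this (option values
are assumptions, not the original commands):

pcs -f lvm_cfg resource create HALVM ocf:heartbeat:LVM volgrpname=havolumegroup exclusive=true
pcs -f lvm_cfg constraint colocation add HALVM with master DrbdResClone INFINITY
pcs -f lvm_cfg constraint order promote DrbdResClone then start HALVM
pcs cluster cib-push lvm_cfg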

Re: [ClusterLabs] HALVM problem with 2 nodes cluster

2017-01-18 Thread Marco Marino
Ferenc, regarding the flag use_lvmetad in
/usr/lib/ocf/resource.d/heartbeat/LVM I read:

"#lvmetad is a daemon that caches lvm metadata to improve the
# performance of LVM commands. This daemon should never be used when
# volume groups exist that are being managed by the cluster. The
lvmetad
# daemon introduces a response lag, where certain LVM commands look
like
# they have completed (like vg activation) when in fact the command
# is still in progress by the lvmetad.  This can cause reliability
issues
# when managing volume groups in the cluster.  For Example, if you
have a
# volume group that is a dependency for another application, it is
possible
# the cluster will think the volume group is activated and attempt
to start
# the application before volume group is really accesible...
lvmetad is bad."

in the function LVM_validate_all().
Anyway, it's only a warning, but there is a good reason for it. I'm not an
expert; I'm studying for a certification and I still have a lot of doubts.
Thank you for your help
Marco




2017-01-18 11:03 GMT+01:00 Ferenc Wágner <wf...@niif.hu>:

> Marco Marino <marino@gmail.com> writes:
>
> > I agree with you for
> > use_lvmetad = 0 (setting it = 1 in a clustered environment is an error)
>
> Where does this information come from?  AFAIK, if locking_type=3 (LVM
> uses internal clustered locking, that is, clvmd), lvmetad is not used
> anyway, even if it's running.  So it's best to disable it to avoid
> warning messages all around.  This is the case with active/active
> clustering in LVM itself, in which Pacemaker isn't involved.
>
> On the other hand, if you use Pacemaker to do active/passive clustering
> by appropriately activating/deactivating your VG, this isn't clustering
> from the LVM point of view, you don't set the clustered flag on your VG,
> don't run clvmd and use locking_type=1.  Lvmetad should be perfectly
> fine with this in principle (unless it caches metadata of inactive VGs,
> which would be stupid, but I never tested this).
>
> > but I think I have to set
> > locking_type = 3 only if I use clvm
>
> Right.
> --
> Feri
>
___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] HALVM problem with 2 nodes cluster

2017-01-18 Thread Marco Marino
Hi Bliu, thank you.
I agree with you on
use_lvmetad = 0 (setting it to 1 in a clustered environment is an error)
but I think I have to set
locking_type = 3 only if I use clvm.
In my case, I'm trying to use LVM, so I think that locking_type = 1 is ok.
What do you think?

Furthermore, I have an application (managed as a resource in the cluster)
that continuously creates and removes logical volumes in the cluster. Is this
a problem? The application uses a custom lvm.conf configuration file where
I have volume_list = [ "@pacemaker" ]
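
For context, entries starting with "@" in volume_list match LVM tags rather
than volume group names; a hedged sketch of how such a tag is applied (the VG
name here is illustrative):

vgchange --addtag pacemaker myvg    # "@pacemaker" in volume_list now matches this VG
vgs -o vg_name,vg_tags              # verify the tag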

Thank you




2017-01-18 10:12 GMT+01:00 bliu <b...@suse.com>:

> Hi, Marco
>
> On 01/18/2017 04:45 PM, Marco Marino wrote:
>
> Hi, I'm trying to build a cluster with 2 nodes that manages a volume
> group.
> Basically I have a SAN connected to both nodes that exposes 1 LUN, so both
> nodes see a disk /dev/sdb. From one node I did:
> fdisk /dev/sdb  <- Create a partition with type = 8e (LVM)
> pvcreate /dev/sdb1
> vgcreate myvg
>
> then
>
> pcs resource create halvm LVM volgrpname=myvg exclusive=true
>
> Last command fails with an error: "LVM: myvg did not activate correctly"
>
> Reading /usr/lib/ocf/resource.d/heartbeat/LVM, this happens because it
> seems that I need at least one logical volume inside the volume group
> before creating the resource. Is this correct?
>
> Yes, you need to create the pv and vg before you use the cluster to manage it.
>
> Furthermore, how can I set volume_list in lvm.conf? Actually in lvm.conf I
> have:
>
> Normally, clvm is used in a cluster with shared storage, with:
> locking_type = 3
> use_lvmetad = 0
>
> locking_type = 1
> use_lvmetad = 1
> volume_list = [ "vg-with-root-lv" ]
>
>
> Thank you
>
>
>
>
> ___
> Users mailing list: Users@clusterlabs.org
> http://lists.clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>
>
>
> ___
> Users mailing list: Users@clusterlabs.org
> http://lists.clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>
>
___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] HALVM problem with 2 nodes cluster

2017-01-18 Thread Marco Marino
Hi, I'm trying to build a cluster with 2 nodes that manages a volume
group.
Basically I have a SAN connected to both nodes that exposes 1 LUN, so both
nodes see a disk /dev/sdb. From one node I did:
fdisk /dev/sdb  <- Create a partition with type = 8e (LVM)
pvcreate /dev/sdb1
vgcreate myvg

then

pcs resource create halvm LVM volgrpname=myvg exclusive=true

Last command fails with an error: "LVM: myvg did not activate correctly"

Reading /usr/lib/ocf/resource.d/heartbeat/LVM, this happens because it
seems that I need at least one logical volume inside the volume group
before creating the resource. Is this correct?
Furthermore, how can I set volume_list in lvm.conf? Actually in lvm.conf I
have:
locking_type = 1
use_lvmetad = 1
volume_list = [ "vg-with-root-lv" ]
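
For reference, the usual HA-LVM arrangement is to list in volume_list only the
volume groups the node may activate on its own (the root VG, but not the
cluster-managed myvg) and then rebuild the initramfs so the setting is also
honoured at boot; a sketch (VG names as in this thread):

# /etc/lvm/lvm.conf: keep only locally activated VGs; omit the cluster-managed one
#   volume_list = [ "vg-with-root-lv" ]
dracut -H -f /boot/initramfs-$(uname -r).img $(uname -r)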


Thank you
___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] SAN with drbd and pacemaker

2015-09-21 Thread Marco Marino
"With 20 disk of 4TB you have a total capacity of 80TB. If you run all of
them as RAID6 then you have a total of 72TB."

And that's the point! I'm trying to understand whether I can create more RAID6
arrays and how my controller handles disk failures in that case. First, I
think we need to clarify the terminology used by MegaRAID Storage Manager,
so I attach two screenshots (physical drives ->
http://pasteboard.co/NC3O60x.png and logical drives ->
http://pasteboard.co/NC8DLcM.png ).
So, reading this ->
http://www.cisco.com/c/dam/en/us/td/docs/unified_computing/ucs/3rd-party/lsi/mrsas/userguide/LSI_MR_SAS_SW_UG.pdf
(page 41), I think a RAID array (RAID6 in my case) corresponds to a drive
group and a volume to a Virtual Drive. If this is right, I should find out how
many RAID arrays my controller supports. Actually I have 20 disks, but I can
add more. However, reducing the array rebuild time is my goal, so I think that
creating one virtual drive per drive group is the right way. Please give
me some advice.
Thanks
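
If the vendor CLI is available, the controller's drive-group and virtual-drive
limits can usually be read directly from it; a sketch (storcli syntax, with
the controller index assumed to be 0; adjust for MegaCli or a different setup):

storcli64 /c0 show        # summary: drive groups, virtual drives, physical drives
storcli64 /c0 show all    # includes controller capabilities and limits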



2015-09-18 13:02 GMT+02:00 Kai Dupke <kdu...@suse.com>:

> On 09/18/2015 09:28 AM, Marco Marino wrote:
> > Can you explain this to me? 16 volumes?
>
>
> With 20 disks of 4TB you have a total capacity of 80TB. If you run all of
> them as RAID6 then you have a total of 72TB.
>
> If you ask your controller to create an 8TB volume, this volume is spread
> across all 20 disks. As 2 stripes are used for parity, you have
> 20-2=18 data stripes per volume. This makes each stripe 444G big,
> leaving 3500G free for other volumes.
>
> If you fill up the remaining 3500G with volumes the same way, you get 8
> additional volumes (OK, the last volume is <8TB then).
>
> In total you have 9 volumes then, each disk has data/parity on all of
> these volumes.
>
> 9x8=72, voila!
>
> If a disk error appears and the controller marks the disk dead, then all 9
> volumes are affected.
>
> With 20 6TB/8TB drives, you just get more 8TB volumes the same way.
>
> What would of course reduce the risk is to always use <20 disks in one
> RAID6 volume, so that not every disk serves all volumes.
>
> Another issue is performance: not every RAID controller performs
> best with 20 drives. Adaptec recommends an odd number of drives; 7
> or 9 drives perform best, AFAIK.
>
> So you could make volume 1 on disks 1-9, volume 2 on disks 2-10, volume 3
> on disks 3-11, etc.
>
> Or consider using some combination of RAID6 and RAID1, but this gives
> you way less available disk size (and no, I have no calculation handy on
> the chance for failure for RAID6 vs. RAID15 vs. RAID16)
>
> greetings kai
>
>
> >
> > Thank you
> >
> >
> >
> > 2015-09-17 15:54 GMT+02:00 Kai Dupke <kdu...@suse.com>:
> >
> >> On 09/17/2015 09:44 AM, Marco Marino wrote:
> >>> Hi, I have 2 servers supermicro lsi 2108 with many disks (80TB) and I'm
> >>> trying to build a SAN with drbd and pacemaker. I'm studying, but I have
> >>> no experience with large arrays of disks with drbd and pacemaker, so I
> >>> have some questions:
> >>>
> >>> I'm using MegaRAID Storage Manager to create virtual drives. Each
> >>> virtual drive is a device on linux (e.g. /dev/sdb, /dev/sdc...), so my
> >>> first question is: is it a good idea to create virtual drives of 8 TB
> >>> (max)? I'm thinking of the array rebuild time in case of a disk failure
> >>> (about 1 day for 8
> >>
> >> It depends on your disks and RAID level. If one disk fails the content
> >> of this disk has to be recreated by either copying (all RAID levels with
> >> some RAID 1 included) or calculating (all with no RAID1 included); in
> >> the latter case all disks get really stressed.
> >>
> >> If you run 20x4TB disks as RAID6, then an 8TB volume is only ~500G per
> >> disk. However, if one disk fails, then all the other 15 volumes this
> >> disk handles are broken, too. (BTW, most RAID controllers can handle
> >> multiple stripes per disk, but usually only a handful.) In such a case the
> >> complete 4TB of the broken disk has to be recovered, affecting all 16
> >> volumes.
> >>
> >> On the other hand, if you use 4x5x4TB as 4x 12TB RAID6, a broken disk
> >> only affects one of 4 volumes - but at the cost of more disks needed.
> >>
> >> You can do the similar calculation based on RAID16/15.
> >>
> >> The only reason I see to create small slices is to make them fit on
> >> smaller replacement disks, which might be more easily available/payable
> >> 

Re: [ClusterLabs] SAN with drbd and pacemaker

2015-09-18 Thread Marco Marino
OK, first of all, thank you for your answer. This is a complicated task and
I cannot find many guides (if you have any, they are welcome).
I'm using RAID6 and I have 20 disks of 4TB each.
In RAID6 the space efficiency is 1-2/n, so a solution for small Virtual Drives
could be 4 or 5 disks. If I use 4 disks I will have (4*4) * (1-2/4) = 8 TB
of effective space. Instead, if I use 5 disks, I will have (5*4) * (1-2/5)
= 12 TB of effective space.
Space efficiency is not a primary goal for me; I'm trying to reduce the
rebuild time when a disk fails (and to improve performance!).
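
The same arithmetic, written out for quick checking (usable capacity =
n * disk_size * (1 - 2/n) for an n-disk RAID6):

echo "4 * 4 * (1 - 2/4)"   | bc -l    #  4-disk RAID6 ->  8 TB usable
echo "5 * 4 * (1 - 2/5)"   | bc -l    #  5-disk RAID6 -> 12 TB usable
echo "20 * 4 * (1 - 2/20)" | bc -l    # 20-disk RAID6 -> 72 TB usable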

"If you run 20x4TB disks as RAID6, then an 8TB volume is only ~500G per
disk. However, if one disk fails, then all the other 15 volumes this
disk handles are broken, too. (BTW, most RAID controllers can handle
multiple stripes per disk, but usually only a handful.) In such a case the
complete 4TB of the broken disk has to be recovered, affecting all 16
volumes."

Can you explain this to me? 16 volumes?

Thank you



2015-09-17 15:54 GMT+02:00 Kai Dupke <kdu...@suse.com>:

> On 09/17/2015 09:44 AM, Marco Marino wrote:
> > Hi, I have 2 servers supermicro lsi 2108 with many disks (80TB) and I'm
> > trying to build a SAN with drbd and pacemaker. I'm studying, but I have
> > no experience with large arrays of disks with drbd and pacemaker, so I
> > have some questions:
> >
> > I'm using MegaRAID Storage Manager to create virtual drives. Each virtual
> > drive is a device on linux (e.g. /dev/sdb, /dev/sdc...), so my first
> > question is: is it a good idea to create virtual drives of 8 TB (max)? I'm
> > thinking of the array rebuild time in case of a disk failure (about 1 day for 8
>
> It depends on your disks and RAID level. If one disk fails the content
> of this disk has to be recreated by either copying (all RAID levels with
> some RAID 1 included) or calculating (all with no RAID1 included); in
> the latter case all disks get really stressed.
>
> If you run 20x4TB disks as RAID6, then an 8TB volume is only ~500G per
> disk. However, if one disk fails, then all the other 15 volumes this
> disk handles are broken, too. (BTW, most RAID controllers can handle
> multiple stripes per disk, but usually only a handful.) In such a case the
> complete 4TB of the broken disk has to be recovered, affecting all 16
> volumes.
>
> On the other hand, if you use 4x5x4TB as 4x 12TB RAID6, a broken disk
> only affects one of 4 volumes - but at the cost of more disks needed.
>
> You can do the similar calculation based on RAID16/15.
>
> The only reason I see to create small slices is to make them fit on
> smaller replacement disks, which might be more easily available/payable
> at time of error (but now we are entering a more low cost area where
> usually SAN and DRBD do not take place).
>
> greetings
> Kai Dupke
> Senior Product Manager
> Server Product Line
> --
> Sell not virtue to purchase wealth, nor liberty to purchase power.
> Phone:  +49-(0)5102-9310828 Mail: kdu...@suse.com
> Mobile: +49-(0)173-5876766  WWW:  www.suse.com
>
> SUSE Linux GmbH - Maxfeldstr. 5 - 90409 Nuernberg (Germany)
> GF:Felix Imendörffer,Jane Smithard,Graham Norton,HRB 21284 (AG Nürnberg)
>
> ___
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org