Re: [ClusterLabs] unable to start fence_scsi on a new add node

2020-04-20 Thread Stefan Sabolowitsch
Oyvind,
>> Sounds like you need to increase the number of journals for your GFS2 
>> filesystem.

Thanks, that was the trick.
Thank you for the professional help here.

Stefan





Re: [ClusterLabs] unable to start fence_scsi on a new add node

2020-04-20 Thread Oyvind Albrigtsen

Sounds like you need to increase the number of journals for your GFS2
filesystem.

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/global_file_system_2/s1-manage-addjournalfs
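
A minimal sketch of what that looks like, assuming the /data-san mount point
used elsewhere in this thread and that gfs2-utils is installed (adjust the
journal count to the number of nodes that will mount the filesystem):

  # GFS2 needs one journal per node that mounts the filesystem;
  # gfs2_jadd operates on a mounted filesystem and adds journals to it
  gfs2_jadd -j 1 /data-san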


Oyvind

On 19/04/20 11:03 +, Stefan Sabolowitsch wrote:

Andrei,
I found this: if I try to mount the volume by hand, I get this error message:

[root@logger log]# mount /dev/mapper/vg_cluster-lv_cluster /data-san
mount: mount /dev/mapper/vg_cluster-lv_cluster on /data-san failed: too many users





Re: [ClusterLabs] unable to start fence_scsi on a new add node

2020-04-19 Thread Stefan Sabolowitsch
Andrei,
I found this: if I try to mount the volume by hand, I get this error message:

[root@logger log]# mount /dev/mapper/vg_cluster-lv_cluster /data-san
mount: mount /dev/mapper/vg_cluster-lv_cluster on /data-san failed: too many users
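
(This error typically means the GFS2 filesystem has fewer journals than nodes
trying to mount it. One way to check the current journal count, assuming
gfs2-utils is installed on the node:

  gfs2_edit -p jindex /dev/vg_cluster/lv_cluster | grep journal
)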


Re: [ClusterLabs] unable to start fence_scsi on a new add node

2020-04-19 Thread Stefan Sabolowitsch
Privjet / Hello Andrei (happy Easter to Russia),
thanks for the tip, which got me a bit further, but the volume is still not
mounted on the new node.

[root@elastic ~]# pcs status
Cluster name: cluster_elastic
Stack: corosync
Current DC: elastic-02 (version 1.1.20-5.el7_7.2-3c4c782f70) - partition with 
quorum
Last updated: Sun Apr 19 12:21:01 2020
Last change: Sun Apr 19 12:15:29 2020 by root via cibadmin on elastic-03

3 nodes configured
10 resources configured

Online: [ elastic-01 elastic-02 elastic-03 ]

Full list of resources:

 scsi   (stonith:fence_scsi):   Started elastic-01
 Clone Set: dlm-clone [dlm]
 Started: [ elastic-01 elastic-02 elastic-03 ]
 Clone Set: clvmd-clone [clvmd]
 Started: [ elastic-01 elastic-02 elastic-03 ]
 Clone Set: fs_gfs2-clone [fs_gfs2]
 Started: [ elastic-01 elastic-02 ]
 Stopped: [ elastic-03 ]

Failed Resource Actions:
* fs_gfs2_start_0 on elastic-03 'unknown error' (1): call=53, status=complete, 
exitreason='Couldn't mount device [/dev/vg_cluster/lv_cluster] as /data-san',
last-rc-change='Sun Apr 19 12:02:44 2020', queued=0ms, exec=1015ms

Failed Fencing Actions:
* unfencing of elastic-03 failed: delegate=, client=crmd.5149, 
origin=elastic-02,
last-failed='Sun Apr 19 11:32:59 2020'

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

my config:

Cluster Name: cluster_elastic
Corosync Nodes:
 elastic-01 elastic-02 elastic-03
Pacemaker Nodes:
 elastic-01 elastic-02 elastic-03

Resources:
 Clone: dlm-clone
  Meta Attrs: interleave=true ordered=true
  Resource: dlm (class=ocf provider=pacemaker type=controld)
   Operations: monitor interval=30s on-fail=fence (dlm-monitor-interval-30s)
   start interval=0s timeout=90 (dlm-start-interval-0s)
   stop interval=0s timeout=100 (dlm-stop-interval-0s)
 Clone: clvmd-clone
  Meta Attrs: interleave=true ordered=true
  Resource: clvmd (class=ocf provider=heartbeat type=clvm)
   Operations: monitor interval=30s on-fail=fence (clvmd-monitor-interval-30s)
   start interval=0s timeout=90s (clvmd-start-interval-0s)
   stop interval=0s timeout=90s (clvmd-stop-interval-0s)
 Clone: fs_gfs2-clone
  Meta Attrs: interleave=true
  Resource: fs_gfs2 (class=ocf provider=heartbeat type=Filesystem)
   Attributes: device=/dev/vg_cluster/lv_cluster directory=/data-san 
fstype=gfs2 options=noatime,nodiratime
   Operations: monitor interval=10s on-fail=fence (fs_gfs2-monitor-interval-10s)
   notify interval=0s timeout=60s (fs_gfs2-notify-interval-0s)
   start interval=0s timeout=60s (fs_gfs2-start-interval-0s)
   stop interval=0s timeout=60s (fs_gfs2-stop-interval-0s)

Stonith Devices:
 Resource: scsi (class=stonith type=fence_scsi)
  Attributes: pcmk_host_list="elastic-01 elastic-02 elastic-03" 
pcmk_monitor_action=metadata pcmk_reboot_action=off devices=/dev/mapper/mpatha 
verbose=true
  Meta Attrs: provides=unfencing
  Operations: monitor interval=60s (scsi-monitor-interval-60s)
Fencing Levels:

Location Constraints:
Ordering Constraints:
  start dlm-clone then start clvmd-clone (kind:Mandatory) 
(id:order-dlm-clone-clvmd-clone-mandatory)
  start clvmd-clone then start fs_gfs2-clone (kind:Mandatory) 
(id:order-clvmd-clone-fs_gfs2-clone-mandatory)
Colocation Constraints:
  clvmd-clone with dlm-clone (score:INFINITY) 
(id:colocation-clvmd-clone-dlm-clone-INFINITY)
  fs_gfs2-clone with clvmd-clone (score:INFINITY) 
(id:colocation-fs_gfs2-clone-clvmd-clone-INFINITY)
Ticket Constraints:

Alerts:
 No alerts defined

Resources Defaults:
 No defaults set
Operations Defaults:
 No defaults set

Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: cluster_elastic
 dc-version: 1.1.20-5.el7_7.2-3c4c782f70
 have-watchdog: false
 maintenance-mode: false
 no-quorum-policy: ignore

Quorum:
  Options:





Re: [ClusterLabs] unable to start fence_scsi on a new add node

2020-04-18 Thread Andrei Borzenkov
On 16.04.2020 18:58, Stefan Sabolowitsch wrote:
> Hi there,
> I have expanded a two-node cluster with an additional node "elastic-03".
> However, fence_scsi does not start on the new node.
> 
> pcs-status:
> [root@logger cluster]# pcs status
> Cluster name: cluster_elastic
> Stack: corosync
> Current DC: elastic-02 (version 1.1.20-5.el7_7.2-3c4c782f70) - partition with 
> quorum
> Last updated: Thu Apr 16 17:38:16 2020
> Last change: Thu Apr 16 17:23:43 2020 by root via cibadmin on elastic-03
> 
> 3 nodes configured
> 10 resources configured
> 
> Online: [ elastic-01 elastic-02 elastic-03 ]
> 
> Full list of resources:
> 
>  scsi (stonith:fence_scsi):   Stopped
>  Clone Set: dlm-clone [dlm]
>  Started: [ elastic-01 elastic-02 ]
>  Stopped: [ elastic-03 ]
>  Clone Set: clvmd-clone [clvmd]
>  Started: [ elastic-01 elastic-02 ]
>  Stopped: [ elastic-03 ]
>  Clone Set: fs_gfs2-clone [fs_gfs2]
>  Started: [ elastic-01 elastic-02 ]
>  Stopped: [ elastic-03 ]
> 
> Failed Fencing Actions:
> * unfencing of elastic-03 failed: delegate=, client=crmd.5149, 
> origin=elastic-02,
> last-failed='Thu Apr 16 17:23:43 2020'
> 
> Daemon Status:
>   corosync: active/enabled
>   pacemaker: active/enabled
>   pcsd: active/enabled
> 
> 
> corosync.log 
> Apr 16 17:27:10 [4572] logger stonith-ng:   notice: 
> can_fence_host_with_device:   scsi can fence (off) elastic-01 : 
> static-list
> Apr 16 17:27:12 [4572] logger stonith-ng:   notice: 
> can_fence_host_with_device:   scsi can fence (off) elastic-02 : 
> static-list
> Apr 16 17:27:13 [4572] logger stonith-ng:   notice: 
> can_fence_host_with_device:   scsi can not fence (off) elastic-03: 
> static-list
> Apr 16 17:38:43 [4572] logger stonith-ng:   notice: 
> can_fence_host_with_device:   scsi can not fence (on) elastic-03: 
> static-list

You probably need to update your stonith resource to include the new node.
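
A hedged example of what that could look like with pcs, assuming the stonith
resource id "scsi" from your status output (not verified against your
configuration):

  pcs stonith update scsi pcmk_host_list="elastic-01 elastic-02 elastic-03"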

> Apr 16 17:38:43 [4572] logger stonith-ng:   notice: remote_op_done:   
> Operation on of elastic-03 by  for crmd.5149@elastic-02.4b624305: No 
> such device
> Apr 16 17:38:43 [4576] logger.feltengroup.local   crmd:error: 
> tengine_stonith_notify:   Unfencing of elastic-03 by  failed: No such 
> device (-19)
> 
> [root@logger cluster]# stonith_admin -L
>  scsi
> 1 devices found
> 
> [root@logger cluster]# stonith_admin -l elastic-03
> No devices found
> 
> Thanks for any help here.
> Stefan
> 



[ClusterLabs] unable to start fence_scsi on a new add node

2020-04-16 Thread Stefan Sabolowitsch
Hi there,
I have expanded a two-node cluster with an additional node "elastic-03".
However, fence_scsi does not start on the new node.

pcs-status:
[root@logger cluster]# pcs status
Cluster name: cluster_elastic
Stack: corosync
Current DC: elastic-02 (version 1.1.20-5.el7_7.2-3c4c782f70) - partition with 
quorum
Last updated: Thu Apr 16 17:38:16 2020
Last change: Thu Apr 16 17:23:43 2020 by root via cibadmin on elastic-03

3 nodes configured
10 resources configured

Online: [ elastic-01 elastic-02 elastic-03 ]

Full list of resources:

 scsi   (stonith:fence_scsi):   Stopped
 Clone Set: dlm-clone [dlm]
 Started: [ elastic-01 elastic-02 ]
 Stopped: [ elastic-03 ]
 Clone Set: clvmd-clone [clvmd]
 Started: [ elastic-01 elastic-02 ]
 Stopped: [ elastic-03 ]
 Clone Set: fs_gfs2-clone [fs_gfs2]
 Started: [ elastic-01 elastic-02 ]
 Stopped: [ elastic-03 ]

Failed Fencing Actions:
* unfencing of elastic-03 failed: delegate=, client=crmd.5149, 
origin=elastic-02,
last-failed='Thu Apr 16 17:23:43 2020'

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled


corosync.log 
Apr 16 17:27:10 [4572] logger stonith-ng:   notice: can_fence_host_with_device: 
  scsi can fence (off) elastic-01 : static-list
Apr 16 17:27:12 [4572] logger stonith-ng:   notice: can_fence_host_with_device: 
  scsi can fence (off) elastic-02 : static-list
Apr 16 17:27:13 [4572] logger stonith-ng:   notice: can_fence_host_with_device: 
  scsi can not fence (off) elastic-03: static-list
Apr 16 17:38:43 [4572] logger stonith-ng:   notice: can_fence_host_with_device: 
  scsi can not fence (on) elastic-03: static-list
Apr 16 17:38:43 [4572] logger stonith-ng:   notice: remote_op_done:   Operation 
on of elastic-03 by  for crmd.5149@elastic-02.4b624305: No such device
Apr 16 17:38:43 [4576] logger.feltengroup.local   crmd:error: 
tengine_stonith_notify:   Unfencing of elastic-03 by  failed: No such 
device (-19)

[root@logger cluster]# stonith_admin -L
 scsi
1 devices found

[root@logger cluster]# stonith_admin -l elastic-03
No devices found
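
For what it's worth, the SCSI registrations actually present on the shared
device can be listed with sg_persist (from sg3_utils; /dev/mapper/mpatha is
the device from the stonith configuration):

  sg_persist --in --read-keys --device=/dev/mapper/mpatha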

Thanks for any help here.
Stefan

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/