Re: [ClusterLabs] Antw: Re: Antw: Re: Antw: [EXT] Re: Q: What is lvmlockd locking?

2021-01-22 Thread Roger Zhou



On 1/22/21 5:45 PM, Ulrich Windl wrote:

Roger Zhou wrote on 22.01.2021 at 10:18 in message:


I guess the naming of lvmlockd and virtlockd may have misled you.


I agree that there is a "virtlockd" name in the resources that actually refers to 
lvmlockd. That is confusing.
But: Isn't virtlockd trying to lock the VM images used? Those are located on a 
different OCFS2 filesystem here.


Right. virtlockd works together with libvirt to provide locking for virtual 
machine disk images.
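
For context, a minimal sketch of how this is usually wired up on the libvirt 
side (assuming the "lockd" plugin rather than sanlock; not taken from your 
hosts):

# /etc/libvirt/qemu.conf
lock_manager = "lockd"

With automatic disk leases enabled, libvirtd asks virtlockd to acquire a lease 
on every disk image before a guest is started, so two hosts cannot start the 
same VM concurrently.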


And I thought virtlockd is using lvmlockd to lock those images. Maybe I'm just 
confused.
Even after reading the manual page of virtlockd I could not find out how it 
actually performs locking.

lsof suggests it uses files like this:
/var/lib/libvirt/lockd/files/f9d587c61002c7480f8b86116eb4f7dfa210e52af7e944762f58c2c2f89a6865


This file lock indicates that the VM backing file is a qemu image. If the VM 
backing storage is SCSI or LVM, the directory structure changes:


/var/lib/libvirt/lockd/scsi
/var/lib/libvirt/lockd/lvm
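
Those directories come from /etc/libvirt/qemu-lockd.conf. As a hedged sketch, 
the relevant keys look like this (the values mirror the examples shipped in 
that file and may differ from your setup):

# /etc/libvirt/qemu-lockd.conf
auto_disk_leases = 1
file_lockspace_dir = "/var/lib/libvirt/lockd/files"
lvm_lockspace_dir = "/var/lib/libvirt/lockd/lvm"
scsi_lockspace_dir = "/var/lib/libvirt/lockd/scsi"

With file_lockspace_dir set, virtlockd names each lock file after a hash of the 
disk identity, which matches the 64-hex-character name you saw in lsof.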

Some years ago a draft patch set was sent to the libvirt community to add an 
alternative that lets virtlockd use DLM locks directly, so no shared filesystem 
(NFS, OCFS2, or GFS2(?)) would be needed for "/var/lib/libvirt/lockd". However, 
the libvirt community was not very motivated to move it forward.




That filesystem is OCFS2:
h18:~ # df /var/lib/libvirt/lockd/files
Filesystem     1K-blocks   Used  Available Use% Mounted on
/dev/md10         261120  99120     162000  38% /var/lib/libvirt/lockd


Could part of the problem be that systemd controls virtlockd, but the 
filesystem it needs is controlled by the cluster?

Do I have to mess with those systemd resources in the cluster?:
systemd:virtlockd  systemd:virtlockd-admin.socket  systemd:virtlockd.socket



Doing so would make the cluster configuration more complete and solid. That 
said, I think it could also work to let libvirtd and virtlockd run outside the 
cluster stack, as long as the whole system does not become too complex to 
manage. Anyway, testing will tell.
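
If you do put them under cluster control, a rough sketch in crm syntax could 
look like the following (the primitive/clone/constraint names are made up here; 
adjust to your naming scheme and test first):

primitive prm_virtlockd systemd:virtlockd \
        op monitor interval=30 timeout=100
clone cln_virtlockd prm_virtlockd
colocation col_virtlockd__lockspace inf: cln_virtlockd cln_lockspace_ocfs2
order ord_lockspace__virtlockd Mandatory: cln_lockspace_ocfs2 cln_virtlockd

That way virtlockd only runs where /var/lib/libvirt/lockd is already mounted 
and is stopped before the filesystem goes away. The virtlockd socket units can 
be handled the same way or simply left to systemd.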


BR,
Roger




Anyway, two more tweaks are needed in your CIB:

colocation col_vm__virtlockd inf: ( prm_xen_test-jeos1 prm_xen_test-jeos2
prm_xen_test-jeos3 prm_xen_test-jeos4 ) cln_lockspace_ocfs2

order ord_virtlockd__vm Mandatory: cln_lockspace_ocfs2 ( prm_xen_test-jeos1
prm_xen_test-jeos2 prm_xen_test-jeos3 prm_xen_test-jeos4 )


I'm still trying to understand all that. Thanks for helping so far.

Regards,
Ulrich




BR,
Roger









[ClusterLabs] Antw: Re: Antw: Re: Antw: [EXT] Re: Q: What is lvmlockd locking?

2021-01-22 Thread Ulrich Windl
>>> Gang He wrote on 22.01.2021 at 09:44 in message:

> 
> On 2021/1/22 16:17, Ulrich Windl wrote:
> Gang He wrote on 22.01.2021 at 09:13 in message
>> <1fd1c07d-d12c-fea9-4b17-90a977fe7...@suse.com>:
>>> Hi Ulrich,
>>>
>>> I reviewed the crm configuration file; here are some comments:
>>> 1) The lvmlockd resource is used for shared VGs. If you do not plan to add
>>> any shared VG to your cluster, I suggest dropping this resource and its clone.
>>> 2) The lvmlockd service depends on the DLM service; it creates "lvm_xxx"
>>> lock spaces when a shared VG is created/activated. Other resources, e.g.
>>> clustered MD and ocfs2, also depend on DLM to create lock spaces to avoid
>>> race conditions. The file system resource should therefore start later than
>>> the lvm2 (lvmlockd) related resources. That means this order is wrong:
>>> order ord_lockspace_fs__lvmlockd Mandatory: cln_lockspace_ocfs2 cln_lvmlockd
>> 
>> But cln_lockspace_ocfs2 provides the shared filesystem that lvmlockd uses. I
>> thought that for locking in a cluster it needs a cluster-wide filesystem.
> 
> The ocfs2 file system resource only depends on the DLM resource if you use a 
> shared raw disk (e.g. /dev/vdb3), e.g.:
> primitive dlm ocf:pacemaker:controld \
>  op start interval=0 timeout=90 \
>  op stop interval=0 timeout=100 \
>  op monitor interval=20 timeout=600
> primitive ocfs2-2 Filesystem \
>  params device="/dev/vdb3" directory="/mnt/shared" fstype=ocfs2 \
>  op monitor interval=20 timeout=40
> group base-group dlm ocfs2-2
> clone base-clone base-group
> 
> If you use an ocfs2 file system on top of a shared VG (e.g. /dev/vg1/lv1), you 
> need to add lvmlockd/LVM-activate resources before the ocfs2 file system, e.g.:
> primitive dlm ocf:pacemaker:controld \
> op monitor interval=60 timeout=60
> primitive lvmlockd lvmlockd \
> op start timeout=90 interval=0 \
> op stop timeout=100 interval=0 \
> op monitor interval=30 timeout=90
> primitive ocfs2-2 Filesystem \
> params device="/dev/vg1/lv1" directory="/mnt/shared" fstype=ocfs2 \
> op monitor interval=20 timeout=40
> primitive vg1 LVM-activate \
> params vgname=vg1 vg_access_mode=lvmlockd activation_mode=shared \
> op start timeout=90s interval=0 \
> op stop timeout=90s interval=0 \
> op monitor interval=30s timeout=90s
> group base-group dlm lvmlockd vg1 ocfs2-2
> clone base-clone base-group

Hi!

I don't see the problem:
As said before, the OCFS2 used for the lockspace does not use LVM itself; it uses
a clustered MD (prm_lockspace_ocfs2 Filesystem, cln_lockspace_ocfs2).
That is co-located with DLM and the RAID (cln_lockspace_raid_md10), and also with
cln_lvmlockd.
Ordering is somewhat redundant, as the clustered RAID needs DLM, and OCFS2 needs
DLM and the RAID.

lvmlockd (prm_lvmlockd, cln_lvmlockd) is co-located with DLM (hmm... does that
mean it uses DLM and maybe does NOT need a shared filesystem?) and with
cln_lockspace_ocfs2.
Accordingly, the ordering is that lvmlockd starts after DLM (cln_DLM) and after
OCFS2 (cln_lockspace_ocfs2).
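
In crm syntax that should amount to something like this (a reconstruction from
the description above and the resource names below, not a copy of the actual
CIB):

colocation col_lvmlockd__DLM inf: cln_lvmlockd cln_DLM
colocation col_lvmlockd__lockspace inf: cln_lvmlockd cln_lockspace_ocfs2
order ord_DLM__lvmlockd Mandatory: cln_DLM cln_lvmlockd
order ord_lockspace__lvmlockd Mandatory: cln_lockspace_ocfs2 cln_lvmlockd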

To summarize the related resources:
Node List:
  * Online: [ h16 h18 h19 ]

Full List of Resources:
  * Clone Set: cln_DLM [prm_DLM]:
    * Started: [ h16 h18 h19 ]
  * Clone Set: cln_lvmlockd [prm_lvmlockd]:
    * Started: [ h16 h18 h19 ]
  * Clone Set: cln_lockspace_raid_md10 [prm_lockspace_raid_md10]:
    * Started: [ h16 h18 h19 ]
  * Clone Set: cln_lockspace_ocfs2 [prm_lockspace_ocfs2]:
    * Started: [ h16 h18 h19 ]

Regards,
Ulrich

> 
> Thanks
> Gang
> 
> 
>> 
>>>
>>>
>>> Thanks
>>> Gang
>>>
>>> On 2021/1/21 20:08, Ulrich Windl wrote:
Gang He wrote on 21.01.2021 at 11:30 in message
 <59b543ee-0824-6b91-d0af-48f66922b...@suse.com>:
> Hi Ulrich,
>
> Is the problem reproduced stably? Could you share your pacemaker crm
> configuration and OS/lvm2/resource-agents version information?

 OK, the problem occurred on every node, so I guess it's reproducible.
 OS is SLES15 SP2 with all current updates (lvm2-2.03.05-8.18.1.x86_64,
 pacemaker-2.0.4+20200616.2deceaa3a-3.3.1.x86_64,
 resource-agents-4.4.0+git57.70549516-3.12.1.x86_64).

 The configuration (somewhat trimmed) is attached.

 The only VG the cluster node sees is:
 ph16:~ # vgs
 VG  #PV #LV #SN Attr   VSize   VFree
 sys   1   3   0 wz--n- 222.50g     0

 Regards,
 Ulrich

> I feel the problem was probably caused by the lvmlockd resource agent script,
> which did not handle this corner case correctly.
>
> Thanks
> Gang
>
>
> On 2021/1/21 17:53, Ulrich Windl wrote:
>> Hi!
>>
>> I have a problem: For tests I had configured lvmlockd. Now that the tests
>> have ended, no LVM is used for cluster resources any more, but lvmlockd is
>> still configured.
>> Unfortunately I ran into this problem:
>> One OCFS2 mount was unmounted successfully, another holding the lockspace
>> for