Hi Krutika,

I am saying that I am facing this issue with 4K drives. I never encountered this issue with 512 drives.
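The sector sizes can be confirmed from the OS; a minimal check, assuming the RAID device appears as /dev/sda (adjust the device name as needed):

# logical and physical sector size as seen by the kernel
blockdev --getss --getpbsz /dev/sda
# the same values via sysfs
cat /sys/block/sda/queue/logical_block_size
cat /sys/block/sda/queue/physical_block_size

A 512e drive reports 512/4096 (logical/physical), while a 4K-native drive reports 4096/4096; the mkfs error later in the thread ("hw sector is 4096") points to the latter.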
Alex

On Jun 5, 2017 14:26, "Krutika Dhananjay" <kdhan...@redhat.com> wrote:

This seems like a case of O_DIRECT reads and writes gone wrong, judging by the 'Invalid argument' errors.

The two operations that have failed on the gluster bricks are:

[2017-06-05 09:40:39.428979] E [MSGID: 113072] [posix.c:3453:posix_writev] 0-engine-posix: write failed: offset 0, [Invalid argument]
[2017-06-05 09:41:00.865760] E [MSGID: 113040] [posix.c:3178:posix_readv] 0-engine-posix: read failed on gfid=8c94f658-ac3c-4e3a-b368-8c038513a914, fd=0x7f408584c06c, offset=127488 size=512, buf=0x7f4083c0b000 [Invalid argument]

But then, both the write and the read have a 512-byte-aligned offset, size and buf address (which is correct).
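A quick way to test this outside of gluster is dd with direct I/O. This is only a sketch, and the brick mount path /gluster/engine is an assumption; on a 4K-native device the 512-byte direct write should fail with the same 'Invalid argument':

# 512-byte direct write: expected to fail with EINVAL on a 4K logical-sector device
dd if=/dev/zero of=/gluster/engine/ddtest bs=512 count=8 oflag=direct
# 4096-byte direct write: expected to succeed
dd if=/dev/zero of=/gluster/engine/ddtest bs=4096 count=1 oflag=direct
rm -f /gluster/engine/ddtest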
Are you saying you don't see this issue with 4K block-size?

-Krutika

On Mon, Jun 5, 2017 at 3:21 PM, Abi Askushi <rightkickt...@gmail.com> wrote:

Hi Sahina,

Attached are the logs. Let me know if something else is needed.

I have 5 disks (with 4K physical sector) in RAID5. The RAID has 64K stripe size at the moment.
I have prepared the storage as below:

pvcreate --dataalignment 256K /dev/sda4
vgcreate --physicalextentsize 256K gluster /dev/sda4

lvcreate -n engine --size 120G gluster
mkfs.xfs -f -i size=512 /dev/gluster/engine
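To double-check that the requested alignment actually took effect, a non-destructive sketch (pe_start is LVM2's name for the data start offset; the -N flag makes mkfs.xfs print the geometry it would use without writing anything):

# PV data start should be a multiple of the 256K alignment requested above
pvs -o +pe_start --units k /dev/sda4
# dry run: print the geometry mkfs.xfs would pick, without touching the LV
mkfs.xfs -N -f -i size=512 /dev/gluster/engine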
Thanx,
Alex

On Mon, Jun 5, 2017 at 12:14 PM, Sahina Bose <sab...@redhat.com> wrote:

Can we have the gluster mount logs and brick logs to check if it's the same issue?

On Sun, Jun 4, 2017 at 11:21 PM, Abi Askushi <rightkickt...@gmail.com> wrote:

I clean installed everything and ran into the same issue.
I then ran gdeploy and encountered the same issue when deploying the engine.
It seems that gluster (?) doesn't like 4K sector drives. I am not sure if it has to do with alignment. The weird thing is that the gluster volumes are all OK, replicating normally, and no split brain is reported.

The solution to the mentioned bug (1386443 <https://bugzilla.redhat.com/show_bug.cgi?id=1386443>) was to format with a 512 sector size, which in my case is not an option:

mkfs.xfs -f -i size=512 -s size=512 /dev/gluster/engine
illegal sector size 512; hw sector is 4096
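That failure is expected: mkfs.xfs cannot use a sector size smaller than the device's logical sector size, so 4096 is the floor on this stack. A sketch of the constraint (note the mkfs line is destructive, and this alone does not address the sanlock failure):

# the device's logical sector size sets the floor for mkfs.xfs -s
blockdev --getss /dev/gluster/engine
# so "-s size=512" is rejected; the smallest accepted value here is 4096
# (destructive: reformats the LV)
mkfs.xfs -f -i size=512 -s size=4096 /dev/gluster/engine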
Is there any workaround to address this?

Thanx,
Alex

On Sun, Jun 4, 2017 at 5:48 PM, Abi Askushi <rightkickt...@gmail.com> wrote:

Hi Maor,

My disks have a 4K block size, and from this bug it seems that gluster replica needs a 512B block size.
Is there a way to make gluster function with 4K drives?

Thank you!

On Sun, Jun 4, 2017 at 2:34 PM, Maor Lipchuk <mlipc...@redhat.com> wrote:

Hi Alex,

I saw a bug that might be related to the issue you encountered at https://bugzilla.redhat.com/show_bug.cgi?id=1386443

Sahina, do you have any advice? Do you think that BZ 1386443 is related?

Regards,
Maor

On Sat, Jun 3, 2017 at 8:45 PM, Abi Askushi <rightkickt...@gmail.com> wrote:

Hi All,

I have successfully installed oVirt (version 4.1) with 3 nodes on top of glusterfs several times.

This time, when trying to configure the same setup, I am facing the following issue which doesn't seem to go away. During installation I get the error:

Failed to execute stage 'Misc configuration': Cannot acquire host id: (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))

The only difference in this setup is that instead of standard partitioning I have GPT partitioning, and the disks have a 4K block size instead of 512.

The /var/log/sanlock.log has the following lines:

2017-06-03 19:21:15+0200 23450 [943]: s9 lockspace ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047:250:/rhev/data-center/mnt/_var_lib_ovirt-hosted-engine-setup_tmptjkIDI/ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047/dom_md/ids:0
2017-06-03 19:21:36+0200 23471 [944]: s9:r5 resource ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047:SDM:/rhev/data-center/mnt/_var_lib_ovirt-hosted-engine-setup_tmptjkIDI/ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047/dom_md/leases:1048576 for 2,9,23040
2017-06-03 19:21:36+0200 23471 [943]: s10 lockspace a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922:250:/rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922/dom_md/ids:0
2017-06-03 19:21:36+0200 23471 [23522]: a5a6b0e7 aio collect RD 0x7f59b00008c0:0x7f59b00008d0:0x7f59b0101000 result -22:0 match res
2017-06-03 19:21:36+0200 23471 [23522]: read_sectors delta_leader offset 127488 rv -22 /rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922/dom_md/ids
2017-06-03 19:21:37+0200 23472 [930]: s9 host 250 1 23450 88c2244c-a782-40ed-9560-6cfa4d46f853.v0.neptune
2017-06-03 19:21:37+0200 23472 [943]: s10 add_lockspace fail result -22
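For reference, rv -22 here is -EINVAL, the same 'Invalid argument' seen in the brick logs; a one-liner to decode it:

# errno 22 is EINVAL ("Invalid argument")
python -c "import errno, os; print(errno.errorcode[22] + ': ' + os.strerror(22))"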
And /var/log/vdsm/vdsm.log says:

2017-06-03 19:19:38,176+0200 WARN (jsonrpc/3) [storage.StorageServer.MountConnection] Using user specified backup-volfile-servers option (storageServer:253)
2017-06-03 19:21:12,379+0200 WARN (periodic/1) [throttled] MOM not available. (throttledlog:105)
2017-06-03 19:21:12,380+0200 WARN (periodic/1) [throttled] MOM not available, KSM stats will be missing. (throttledlog:105)
2017-06-03 19:21:14,714+0200 WARN (jsonrpc/1) [storage.StorageServer.MountConnection] Using user specified backup-volfile-servers option (storageServer:253)
2017-06-03 19:21:15,515+0200 ERROR (jsonrpc/4) [storage.initSANLock] Cannot initialize SANLock for domain a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922 (clusterlock:238)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 234, in initSANLock
    sanlock.init_lockspace(sdUUID, idsPath)
SanlockException: (107, 'Sanlock lockspace init failure', 'Transport endpoint is not connected')
2017-06-03 19:21:15,515+0200 WARN (jsonrpc/4) [storage.StorageDomainManifest] lease did not initialize successfully (sd:557)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sd.py", line 552, in initDomainLock
    self._domainLock.initLock(self.getDomainLease())
  File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 271, in initLock
    initSANLock(self._sdUUID, self._idsPath, lease)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 239, in initSANLock
    raise se.ClusterLockInitError()
ClusterLockInitError: Could not initialize cluster lock: ()
2017-06-03 19:21:37,867+0200 ERROR (jsonrpc/2) [storage.StoragePool] Create pool hosted_datacenter canceled (sp:655)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sp.py", line 652, in create
    self.attachSD(sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 971, in attachSD
    dom.acquireHostId(self.id)
  File "/usr/share/vdsm/storage/sd.py", line 790, in acquireHostId
    self._manifest.acquireHostId(hostId, async)
  File "/usr/share/vdsm/storage/sd.py", line 449, in acquireHostId
    self._domainLock.acquireHostId(hostId, async)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 297, in acquireHostId
    raise se.AcquireHostIdFailure(self._sdUUID, e)
AcquireHostIdFailure: Cannot acquire host id: (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))
2017-06-03 19:21:37,870+0200 ERROR (jsonrpc/2) [storage.StoragePool] Domain ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047 detach from MSD ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047 Ver 1 failed. (sp:528)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sp.py", line 525, in __cleanupDomains
    self.detachSD(sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1046, in detachSD
    raise se.CannotDetachMasterStorageDomain(sdUUID)
CannotDetachMasterStorageDomain: Illegal action: (u'ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047',)
2017-06-03 19:21:37,872+0200 ERROR (jsonrpc/2) [storage.StoragePool] Domain a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922 detach from MSD ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047 Ver 1 failed. (sp:528)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sp.py", line 525, in __cleanupDomains
    self.detachSD(sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1043, in detachSD
    self.validateAttachedDomain(dom)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 542, in validateAttachedDomain
    self.validatePoolSD(dom.sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 535, in validatePoolSD
    raise se.StorageDomainNotMemberOfPool(self.spUUID, sdUUID)
StorageDomainNotMemberOfPool: Domain is not member in pool: u'pool=a1e7e9dd-0cf4-41ae-ba13-36297ed66309, domain=a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922'
2017-06-03 19:21:40,063+0200 ERROR (jsonrpc/2) [storage.TaskManager.Task] (Task='a2476a33-26f8-4ebd-876d-02fe5d13ef78') Unexpected error (task:870)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 877, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 52, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 959, in createStoragePool
    leaseParams)
  File "/usr/share/vdsm/storage/sp.py", line 652, in create
    self.attachSD(sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 971, in attachSD
    dom.acquireHostId(self.id)
  File "/usr/share/vdsm/storage/sd.py", line 790, in acquireHostId
    self._manifest.acquireHostId(hostId, async)
  File "/usr/share/vdsm/storage/sd.py", line 449, in acquireHostId
    self._domainLock.acquireHostId(hostId, async)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 297, in acquireHostId
    raise se.AcquireHostIdFailure(self._sdUUID, e)
AcquireHostIdFailure: Cannot acquire host id: (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))
2017-06-03 19:21:40,067+0200 ERROR (jsonrpc/2) [storage.Dispatcher] {'status': {'message': "Cannot acquire host id: (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))", 'code': 661}} (dispatcher:77)

The gluster volume prepared for engine storage is online and no split brain is reported. I don't understand what needs to be done to overcome this. Any idea will be appreciated.
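For completeness, the checks behind the "online and no split brain" statement; the volume name engine is taken from the mount path above:

# volume and brick status
gluster volume status engine
# pending heals / split-brain entries (should be zero)
gluster volume heal engine info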
Thank you,
Alex

_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users