This seems like a case of O_DIRECT reads and writes gone wrong, judging by the 'Invalid argument' errors.
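For reference, a minimal probe of the alignment rule in play (a sketch only: the device path is a placeholder, and the assumption that the bricks sit on drives with a 4096-byte logical sector is not something the logs confirm directly). With O_DIRECT, the offset, length and buffer address all have to be multiples of the logical sector size of the underlying device, so a 512-byte-aligned read like the one quoted below from the brick log gets EINVAL ('Invalid argument') on a 4Kn disk, even though the same I/O is fine on a 512-byte-sector disk.

/* o_direct_probe.c -- illustrate the O_DIRECT alignment rule.
 * Build: gcc -o o_direct_probe o_direct_probe.c
 * Usage: ./o_direct_probe /dev/sdX   (placeholder device; needs read access, e.g. run as root)
 */
#define _GNU_SOURCE          /* for O_DIRECT */
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    const char *path = argc > 1 ? argv[1] : "/dev/sdX";   /* placeholder */
    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    int logical = 0, physical = 0;
    /* These ioctls work on block devices; they report the sector sizes the kernel sees. */
    ioctl(fd, BLKSSZGET, &logical);    /* logical sector size O_DIRECT is checked against */
    ioctl(fd, BLKPBSZGET, &physical);  /* physical sector size (4096 on 4Kn drives) */
    printf("logical sector: %d, physical sector: %d\n", logical, physical);

    void *buf;
    if (posix_memalign(&buf, 4096, 4096) != 0) {
        fprintf(stderr, "posix_memalign failed\n");
        return 1;
    }

    /* Same shape as the failing brick read: 512 bytes at offset 127488.
     * Both are 512-byte aligned but not 4096-byte aligned. */
    if (pread(fd, buf, 512, 127488) < 0)
        printf("512-byte read at 127488: %s\n", strerror(errno));   /* EINVAL expected on 4Kn */
    else
        printf("512-byte read at 127488: ok\n");

    /* A fully 4096-aligned read should succeed on the same device (126976 = 31 * 4096). */
    if (pread(fd, buf, 4096, 126976) < 0)
        printf("4096-byte read at 126976: %s\n", strerror(errno));
    else
        printf("4096-byte read at 126976: ok\n");

    free(buf);
    close(fd);
    return 0;
}

On a 512-byte-sector disk both reads should succeed; on a 4Kn disk the first one is expected to fail exactly like the posix_readv in the brick log below.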
The two operations that have failed on gluster bricks are:

[2017-06-05 09:40:39.428979] E [MSGID: 113072] [posix.c:3453:posix_writev] 0-engine-posix: write failed: offset 0, [Invalid argument]
[2017-06-05 09:41:00.865760] E [MSGID: 113040] [posix.c:3178:posix_readv] 0-engine-posix: read failed on gfid=8c94f658-ac3c-4e3a-b368-8c038513a914, fd=0x7f408584c06c, offset=127488 size=512, buf=0x7f4083c0b000 [Invalid argument]

But then, both the write and the read have a 512-byte-aligned offset, size and buf address (which is correct). Are you saying you don't see this issue with a 4K block size?

-Krutika

On Mon, Jun 5, 2017 at 3:21 PM, Abi Askushi <rightkickt...@gmail.com> wrote:

> Hi Sahina,
>
> Attached are the logs. Let me know if something else is needed.
>
> I have 5 disks (with 4K physical sectors) in RAID5. The RAID has a 64K stripe
> size at the moment.
> I have prepared the storage as below:
>
> pvcreate --dataalignment 256K /dev/sda4
> vgcreate --physicalextentsize 256K gluster /dev/sda4
>
> lvcreate -n engine --size 120G gluster
> mkfs.xfs -f -i size=512 /dev/gluster/engine
>
> Thanx,
> Alex
>
> On Mon, Jun 5, 2017 at 12:14 PM, Sahina Bose <sab...@redhat.com> wrote:
>
>> Can we have the gluster mount logs and brick logs to check if it's the
>> same issue?
>>
>> On Sun, Jun 4, 2017 at 11:21 PM, Abi Askushi <rightkickt...@gmail.com> wrote:
>>
>>> I clean-installed everything and ran into the same issue.
>>> I then ran gdeploy and encountered the same issue when deploying the engine.
>>> It seems that gluster (?) doesn't like 4K-sector drives. I am not sure if
>>> it has to do with alignment. The weird thing is that the gluster volumes are
>>> all OK, replicating normally, and no split brain is reported.
>>>
>>> The solution to the mentioned bug (1386443
>>> <https://bugzilla.redhat.com/show_bug.cgi?id=1386443>) was to format
>>> with a 512-byte sector size, which in my case is not an option:
>>>
>>> mkfs.xfs -f -i size=512 -s size=512 /dev/gluster/engine
>>> illegal sector size 512; hw sector is 4096
>>>
>>> Is there any workaround to address this?
>>>
>>> Thanx,
>>> Alex
>>>
>>> On Sun, Jun 4, 2017 at 5:48 PM, Abi Askushi <rightkickt...@gmail.com> wrote:
>>>
>>>> Hi Maor,
>>>>
>>>> My disks have a 4K block size, and from this bug it seems that gluster
>>>> replica needs a 512B block size.
>>>> Is there a way to make gluster function with 4K drives?
>>>>
>>>> Thank you!
>>>>
>>>> On Sun, Jun 4, 2017 at 2:34 PM, Maor Lipchuk <mlipc...@redhat.com> wrote:
>>>>
>>>>> Hi Alex,
>>>>>
>>>>> I saw a bug that might be related to the issue you encountered at
>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1386443
>>>>>
>>>>> Sahina, maybe you have some advice? Do you think that BZ 1386443 is
>>>>> related?
>>>>>
>>>>> Regards,
>>>>> Maor
>>>>>
>>>>> On Sat, Jun 3, 2017 at 8:45 PM, Abi Askushi <rightkickt...@gmail.com> wrote:
>>>>> > Hi All,
>>>>> >
>>>>> > I have successfully installed oVirt (version 4.1) with 3 nodes on top of
>>>>> > glusterfs several times.
>>>>> >
>>>>> > This time, when trying to configure the same setup, I am facing the
>>>>> > following issue, which doesn't seem to go away. During installation I get
>>>>> > the error:
>>>>> >
>>>>> > Failed to execute stage 'Misc configuration': Cannot acquire host id:
>>>>> > (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, 'Sanlock
>>>>> > lockspace add failure', 'Invalid argument'))
>>>>> >
>>>>> > The only difference in this setup is that instead of standard partitioning
>>>>> > I have GPT partitioning, and the disks have a 4K block size instead of 512.
>>>>> >
>>>>> > /var/log/sanlock.log has the following lines:
>>>>> >
>>>>> > 2017-06-03 19:21:15+0200 23450 [943]: s9 lockspace ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047:250:/rhev/data-center/mnt/_var_lib_ovirt-hosted-engine-setup_tmptjkIDI/ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047/dom_md/ids:0
>>>>> > 2017-06-03 19:21:36+0200 23471 [944]: s9:r5 resource ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047:SDM:/rhev/data-center/mnt/_var_lib_ovirt-hosted-engine-setup_tmptjkIDI/ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047/dom_md/leases:1048576 for 2,9,23040
>>>>> > 2017-06-03 19:21:36+0200 23471 [943]: s10 lockspace a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922:250:/rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922/dom_md/ids:0
>>>>> > 2017-06-03 19:21:36+0200 23471 [23522]: a5a6b0e7 aio collect RD 0x7f59b00008c0:0x7f59b00008d0:0x7f59b0101000 result -22:0 match res
>>>>> > 2017-06-03 19:21:36+0200 23471 [23522]: read_sectors delta_leader offset 127488 rv -22 /rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922/dom_md/ids
>>>>> > 2017-06-03 19:21:37+0200 23472 [930]: s9 host 250 1 23450 88c2244c-a782-40ed-9560-6cfa4d46f853.v0.neptune
>>>>> > 2017-06-03 19:21:37+0200 23472 [943]: s10 add_lockspace fail result -22
>>>>> >
>>>>> > And /var/log/vdsm/vdsm.log says:
>>>>> >
>>>>> > 2017-06-03 19:19:38,176+0200 WARN (jsonrpc/3) [storage.StorageServer.MountConnection] Using user specified backup-volfile-servers option (storageServer:253)
>>>>> > 2017-06-03 19:21:12,379+0200 WARN (periodic/1) [throttled] MOM not available. (throttledlog:105)
>>>>> > 2017-06-03 19:21:12,380+0200 WARN (periodic/1) [throttled] MOM not available, KSM stats will be missing. (throttledlog:105)
>>>>> > 2017-06-03 19:21:14,714+0200 WARN (jsonrpc/1) [storage.StorageServer.MountConnection] Using user specified backup-volfile-servers option (storageServer:253)
>>>>> > 2017-06-03 19:21:15,515+0200 ERROR (jsonrpc/4) [storage.initSANLock] Cannot initialize SANLock for domain a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922 (clusterlock:238)
>>>>> > Traceback (most recent call last):
>>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 234, in initSANLock
>>>>> >     sanlock.init_lockspace(sdUUID, idsPath)
>>>>> > SanlockException: (107, 'Sanlock lockspace init failure', 'Transport endpoint is not connected')
>>>>> > 2017-06-03 19:21:15,515+0200 WARN (jsonrpc/4) [storage.StorageDomainManifest] lease did not initialize successfully (sd:557)
>>>>> > Traceback (most recent call last):
>>>>> >   File "/usr/share/vdsm/storage/sd.py", line 552, in initDomainLock
>>>>> >     self._domainLock.initLock(self.getDomainLease())
>>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 271, in initLock
>>>>> >     initSANLock(self._sdUUID, self._idsPath, lease)
>>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 239, in initSANLock
>>>>> >     raise se.ClusterLockInitError()
>>>>> > ClusterLockInitError: Could not initialize cluster lock: ()
>>>>> > 2017-06-03 19:21:37,867+0200 ERROR (jsonrpc/2) [storage.StoragePool] Create pool hosted_datacenter canceled (sp:655)
>>>>> > Traceback (most recent call last):
>>>>> >   File "/usr/share/vdsm/storage/sp.py", line 652, in create
>>>>> >     self.attachSD(sdUUID)
>>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
>>>>> >     return method(self, *args, **kwargs)
>>>>> >   File "/usr/share/vdsm/storage/sp.py", line 971, in attachSD
>>>>> >     dom.acquireHostId(self.id)
>>>>> >   File "/usr/share/vdsm/storage/sd.py", line 790, in acquireHostId
>>>>> >     self._manifest.acquireHostId(hostId, async)
>>>>> >   File "/usr/share/vdsm/storage/sd.py", line 449, in acquireHostId
>>>>> >     self._domainLock.acquireHostId(hostId, async)
>>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 297, in acquireHostId
>>>>> >     raise se.AcquireHostIdFailure(self._sdUUID, e)
>>>>> > AcquireHostIdFailure: Cannot acquire host id: (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))
>>>>> > 2017-06-03 19:21:37,870+0200 ERROR (jsonrpc/2) [storage.StoragePool] Domain ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047 detach from MSD ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047 Ver 1 failed. (sp:528)
>>>>> > Traceback (most recent call last):
>>>>> >   File "/usr/share/vdsm/storage/sp.py", line 525, in __cleanupDomains
>>>>> >     self.detachSD(sdUUID)
>>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
>>>>> >     return method(self, *args, **kwargs)
>>>>> >   File "/usr/share/vdsm/storage/sp.py", line 1046, in detachSD
>>>>> >     raise se.CannotDetachMasterStorageDomain(sdUUID)
>>>>> > CannotDetachMasterStorageDomain: Illegal action: (u'ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047',)
>>>>> > 2017-06-03 19:21:37,872+0200 ERROR (jsonrpc/2) [storage.StoragePool] Domain a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922 detach from MSD ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047 Ver 1 failed. (sp:528)
>>>>> > Traceback (most recent call last):
>>>>> >   File "/usr/share/vdsm/storage/sp.py", line 525, in __cleanupDomains
>>>>> >     self.detachSD(sdUUID)
>>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
>>>>> >     return method(self, *args, **kwargs)
>>>>> >   File "/usr/share/vdsm/storage/sp.py", line 1043, in detachSD
>>>>> >     self.validateAttachedDomain(dom)
>>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
>>>>> >     return method(self, *args, **kwargs)
>>>>> >   File "/usr/share/vdsm/storage/sp.py", line 542, in validateAttachedDomain
>>>>> >     self.validatePoolSD(dom.sdUUID)
>>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
>>>>> >     return method(self, *args, **kwargs)
>>>>> >   File "/usr/share/vdsm/storage/sp.py", line 535, in validatePoolSD
>>>>> >     raise se.StorageDomainNotMemberOfPool(self.spUUID, sdUUID)
>>>>> > StorageDomainNotMemberOfPool: Domain is not member in pool: u'pool=a1e7e9dd-0cf4-41ae-ba13-36297ed66309, domain=a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922'
>>>>> > 2017-06-03 19:21:40,063+0200 ERROR (jsonrpc/2) [storage.TaskManager.Task] (Task='a2476a33-26f8-4ebd-876d-02fe5d13ef78') Unexpected error (task:870)
>>>>> > Traceback (most recent call last):
>>>>> >   File "/usr/share/vdsm/storage/task.py", line 877, in _run
>>>>> >     return fn(*args, **kargs)
>>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 52, in wrapper
>>>>> >     res = f(*args, **kwargs)
>>>>> >   File "/usr/share/vdsm/storage/hsm.py", line 959, in createStoragePool
>>>>> >     leaseParams)
>>>>> >   File "/usr/share/vdsm/storage/sp.py", line 652, in create
>>>>> >     self.attachSD(sdUUID)
>>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
>>>>> >     return method(self, *args, **kwargs)
>>>>> >   File "/usr/share/vdsm/storage/sp.py", line 971, in attachSD
>>>>> >     dom.acquireHostId(self.id)
>>>>> >   File "/usr/share/vdsm/storage/sd.py", line 790, in acquireHostId
>>>>> >     self._manifest.acquireHostId(hostId, async)
>>>>> >   File "/usr/share/vdsm/storage/sd.py", line 449, in acquireHostId
>>>>> >     self._domainLock.acquireHostId(hostId, async)
>>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 297, in acquireHostId
>>>>> >     raise se.AcquireHostIdFailure(self._sdUUID, e)
>>>>> > AcquireHostIdFailure: Cannot acquire host id: (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))
>>>>> > 2017-06-03 19:21:40,067+0200 ERROR (jsonrpc/2) [storage.Dispatcher] {'status': {'message': "Cannot acquire host id: (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))", 'code': 661}} (dispatcher:77)
>>>>> >
>>>>> > The gluster volume prepared for engine storage is online and no split brain
>>>>> > is reported. I don't understand what needs to be done to overcome this. Any
>>>>> > idea would be appreciated.
>>>>> >
>>>>> > Thank you,
>>>>> > Alex
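Following up on the question above about making gluster work on 4K-sector drives: one way to confirm what granularity the brick filesystem itself enforces for direct I/O is XFS's DIOINFO ioctl. A minimal sketch follows; it assumes the xfsprogs development headers are installed, and the brick file path is only illustrative. On an XFS filesystem created on 4Kn disks, d_miniosz would be expected to report 4096.

/* xfs_dioinfo.c -- ask XFS what granularity it requires for O_DIRECT on a file.
 * Build: gcc -o xfs_dioinfo xfs_dioinfo.c   (needs the xfsprogs development headers)
 * Usage: ./xfs_dioinfo /gluster/engine/brick/some-file   (illustrative path)
 */
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <xfs/xfs.h>          /* XFS_IOC_DIOINFO, struct dioattr */

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <file on the XFS brick>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct dioattr da;
    if (ioctl(fd, XFS_IOC_DIOINFO, &da) < 0) { perror("XFS_IOC_DIOINFO"); return 1; }

    /* d_miniosz is the smallest offset/size unit XFS accepts for direct I/O on
     * this file; d_mem is the required memory alignment of the buffer. */
    printf("memory alignment: %u, min I/O size: %u, max I/O size: %u\n",
           da.d_mem, da.d_miniosz, da.d_maxiosz);

    close(fd);
    return 0;
}

A 4096 result there would point at the 512-byte direct I/O seen in the sanlock and brick log excerpts above, rather than at gluster replication itself.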
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users