I stand corrected. Just realised the strace command I gave was wrong.
Here's what you would actually need to execute: strace -y -ff -o <path-where-you-want-your-output-saved> <dd command here> (There is a concrete example at the very bottom of this mail, below the quoted thread.) -Krutika On Tue, Jun 6, 2017 at 3:20 PM, Krutika Dhananjay <kdhan...@redhat.com> wrote: > OK. > > So for the 'Transport endpoint is not connected' issue, could you share > the mount and brick logs? > > Hmmm.. 'Invalid argument' error even on the root partition. What if you > change bs to 4096 and run? > > The logs I showed in my earlier mail shows that gluster is merely > returning the error it got from the disk file system where the > brick is hosted. But you're right about the fact that the offset 127488 is > not 4K-aligned. > > If the dd on /root worked for you with bs=4096, could you try the same > directly on gluster mount point on a dummy file and capture the strace > output of dd? > You can perhaps reuse your existing gluster volume by mounting it at > another location and doing the dd. > Here's what you need to execute: > > strace -ff -T -p <pid-of-mount-process> -o > <path-to-the-file-where-you-want-the-output-saved>` > > FWIW, here's something I found in man(2) open: > > > > > *Under Linux 2.4, transfer sizes, and the alignment of the user > buffer and the file offset must all be multiples of the logical block size > of the filesystem. Since Linux 2.6.0, alignment to the logical block size > of the underlying storage (typically 512 bytes) suffices. The > logical block size can be determined using the ioctl(2) BLKSSZGET operation > or from the shell using the command: blockdev --getss* > > > -Krutika > > > On Tue, Jun 6, 2017 at 1:18 AM, Abi Askushi <rightkickt...@gmail.com> > wrote: > >> Also when testing with dd i get the following: >> >> *Testing on the gluster mount: * >> dd if=/dev/zero >> of=/rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/test2.img >> oflag=direct bs=512 count=1 >> dd: error writing ‘/rhev/data-center/mnt/glusterSD/10.100.100.1: >> _engine/test2.img’: *Transport endpoint is not connected* >> 1+0 records in >> 0+0 records out >> 0 bytes (0 B) copied, 0.00336755 s, 0.0 kB/s >> >> *Testing on the /root directory (XFS): * >> dd if=/dev/zero of=/test2.img oflag=direct bs=512 count=1 >> dd: error writing ‘/test2.img’:* Invalid argument* >> 1+0 records in >> 0+0 records out >> 0 bytes (0 B) copied, 0.000321239 s, 0.0 kB/s >> >> Seems that the gluster is trying to do the same and fails. >> >> >> >> On Mon, Jun 5, 2017 at 10:10 PM, Abi Askushi <rightkickt...@gmail.com> >> wrote: >> >>> The question that rises is what is needed to make gluster aware of the >>> 4K physical sectors presented to it (the logical sector is also 4K). The >>> offset (127488) at the log does not seem aligned at 4K. >>> >>> Alex >>> >>> On Mon, Jun 5, 2017 at 2:47 PM, Abi Askushi <rightkickt...@gmail.com> >>> wrote: >>> >>>> Hi Krutika, >>>> >>>> I am saying that I am facing this issue with 4k drives. I never >>>> encountered this issue with 512 drives. >>>> >>>> Alex >>>> >>>> On Jun 5, 2017 14:26, "Krutika Dhananjay" <kdhan...@redhat.com> wrote: >>>> >>>>> This seems like a case of O_DIRECT reads and writes gone wrong, >>>>> judging by the 'Invalid argument' errors. 
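(That 'Invalid argument' is what a misaligned O_DIRECT transfer gets, per the open(2) excerpt above. A quick way to check the sizes in play on your setup, using /dev/sda4 from your pvcreate step as the example device and a placeholder for the brick mount point:

blockdev --getss /dev/sda4      # logical sector size
blockdev --getpbsz /dev/sda4    # physical sector size
xfs_info <brick-mountpoint> | grep sectsz

If --getss already reports 4096, then a 512-byte O_DIRECT write is expected to fail with EINVAL, which would match your dd output above.)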
>>>>> >>>>> The two operations that have failed on gluster bricks are: >>>>> >>>>> [2017-06-05 09:40:39.428979] E [MSGID: 113072] >>>>> [posix.c:3453:posix_writev] 0-engine-posix: write failed: offset 0, >>>>> [Invalid argument] >>>>> [2017-06-05 09:41:00.865760] E [MSGID: 113040] >>>>> [posix.c:3178:posix_readv] 0-engine-posix: read failed on >>>>> gfid=8c94f658-ac3c-4e3a-b368-8c038513a914, fd=0x7f408584c06c, >>>>> offset=127488 size=512, buf=0x7f4083c0b000 [Invalid argument] >>>>> >>>>> But then, both the write and the read have 512byte-aligned offset, >>>>> size and buf address (which is correct). >>>>> >>>>> Are you saying you don't see this issue with 4K block-size? >>>>> >>>>> -Krutika >>>>> >>>>> On Mon, Jun 5, 2017 at 3:21 PM, Abi Askushi <rightkickt...@gmail.com> >>>>> wrote: >>>>> >>>>>> Hi Sahina, >>>>>> >>>>>> Attached are the logs. Let me know if sth else is needed. >>>>>> >>>>>> I have 5 disks (with 4K physical sector) in RAID5. The RAID has 64K >>>>>> stripe size at the moment. >>>>>> I have prepared the storage as below: >>>>>> >>>>>> pvcreate --dataalignment 256K /dev/sda4 >>>>>> vgcreate --physicalextentsize 256K gluster /dev/sda4 >>>>>> >>>>>> lvcreate -n engine --size 120G gluster >>>>>> mkfs.xfs -f -i size=512 /dev/gluster/engine >>>>>> >>>>>> Thanx, >>>>>> Alex >>>>>> >>>>>> On Mon, Jun 5, 2017 at 12:14 PM, Sahina Bose <sab...@redhat.com> >>>>>> wrote: >>>>>> >>>>>>> Can we have the gluster mount logs and brick logs to check if it's >>>>>>> the same issue? >>>>>>> >>>>>>> On Sun, Jun 4, 2017 at 11:21 PM, Abi Askushi < >>>>>>> rightkickt...@gmail.com> wrote: >>>>>>> >>>>>>>> I clean installed everything and ran into the same. >>>>>>>> I then ran gdeploy and encountered the same issue when deploying >>>>>>>> engine. >>>>>>>> Seems that gluster (?) doesn't like 4K sector drives. I am not sure >>>>>>>> if it has to do with alignment. The weird thing is that gluster >>>>>>>> volumes are >>>>>>>> all ok, replicating normally and no split brain is reported. >>>>>>>> >>>>>>>> The solution to the mentioned bug (1386443 >>>>>>>> <https://bugzilla.redhat.com/show_bug.cgi?id=1386443>) was to >>>>>>>> format with 512 sector size, which for my case is not an option: >>>>>>>> >>>>>>>> mkfs.xfs -f -i size=512 -s size=512 /dev/gluster/engine >>>>>>>> illegal sector size 512; hw sector is 4096 >>>>>>>> >>>>>>>> Is there any workaround to address this? >>>>>>>> >>>>>>>> Thanx, >>>>>>>> Alex >>>>>>>> >>>>>>>> >>>>>>>> On Sun, Jun 4, 2017 at 5:48 PM, Abi Askushi < >>>>>>>> rightkickt...@gmail.com> wrote: >>>>>>>> >>>>>>>>> Hi Maor, >>>>>>>>> >>>>>>>>> My disk are of 4K block size and from this bug seems that gluster >>>>>>>>> replica needs 512B block size. >>>>>>>>> Is there a way to make gluster function with 4K drives? >>>>>>>>> >>>>>>>>> Thank you! >>>>>>>>> >>>>>>>>> On Sun, Jun 4, 2017 at 2:34 PM, Maor Lipchuk <mlipc...@redhat.com> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Hi Alex, >>>>>>>>>> >>>>>>>>>> I saw a bug that might be related to the issue you encountered at >>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1386443 >>>>>>>>>> >>>>>>>>>> Sahina, maybe you have any advise? Do you think that BZ1386443is >>>>>>>>>> related? >>>>>>>>>> >>>>>>>>>> Regards, >>>>>>>>>> Maor >>>>>>>>>> >>>>>>>>>> On Sat, Jun 3, 2017 at 8:45 PM, Abi Askushi < >>>>>>>>>> rightkickt...@gmail.com> wrote: >>>>>>>>>> > Hi All, >>>>>>>>>> > >>>>>>>>>> > I have installed successfully several times oVirt (version 4.1) >>>>>>>>>> with 3 nodes >>>>>>>>>> > on top glusterfs. 
>>>>>>>>>> > >>>>>>>>>> > This time, when trying to configure the same setup, I am facing >>>>>>>>>> the >>>>>>>>>> > following issue which doesn't seem to go away. During >>>>>>>>>> installation i get the >>>>>>>>>> > error: >>>>>>>>>> > >>>>>>>>>> > Failed to execute stage 'Misc configuration': Cannot acquire >>>>>>>>>> host id: >>>>>>>>>> > (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, >>>>>>>>>> 'Sanlock >>>>>>>>>> > lockspace add failure', 'Invalid argument')) >>>>>>>>>> > >>>>>>>>>> > The only different in this setup is that instead of standard >>>>>>>>>> partitioning i >>>>>>>>>> > have GPT partitioning and the disks have 4K block size instead >>>>>>>>>> of 512. >>>>>>>>>> > >>>>>>>>>> > The /var/log/sanlock.log has the following lines: >>>>>>>>>> > >>>>>>>>>> > 2017-06-03 19:21:15+0200 23450 [943]: s9 lockspace >>>>>>>>>> > ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047:250:/rhev/data-center/m >>>>>>>>>> nt/_var_lib_ovirt-hosted-engin-setup_tmptjkIDI/ba6bd862-c2b8 >>>>>>>>>> -46e7-b2c8-91e4a5bb2047/dom_md/ids:0 >>>>>>>>>> > 2017-06-03 19:21:36+0200 23471 [944]: s9:r5 resource >>>>>>>>>> > ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047:SDM:/rhev/data-center/m >>>>>>>>>> nt/_var_lib_ovirt-hosted-engine-setup_tmptjkIDI/ba6bd862-c2b >>>>>>>>>> 8-46e7-b2c8-91e4a5bb2047/dom_md/leases:1048576 >>>>>>>>>> > for 2,9,23040 >>>>>>>>>> > 2017-06-03 19:21:36+0200 23471 [943]: s10 lockspace >>>>>>>>>> > a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922:250:/rhev/data-center/m >>>>>>>>>> nt/glusterSD/10.100.100.1:_engine/a5a6b0e7-fc3f-4838-8e26-c8 >>>>>>>>>> b4d5e5e922/dom_md/ids:0 >>>>>>>>>> > 2017-06-03 19:21:36+0200 23471 [23522]: a5a6b0e7 aio collect RD >>>>>>>>>> > 0x7f59b00008c0:0x7f59b00008d0:0x7f59b0101000 result -22:0 >>>>>>>>>> match res >>>>>>>>>> > 2017-06-03 19:21:36+0200 23471 [23522]: read_sectors >>>>>>>>>> delta_leader offset >>>>>>>>>> > 127488 rv -22 >>>>>>>>>> > /rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/a5a6b0e >>>>>>>>>> 7-fc3f-4838-8e26-c8b4d5e5e922/dom_md/ids >>>>>>>>>> > 2017-06-03 19:21:37+0200 23472 [930]: s9 host 250 1 23450 >>>>>>>>>> > 88c2244c-a782-40ed-9560-6cfa4d46f853.v0.neptune >>>>>>>>>> > 2017-06-03 19:21:37+0200 23472 [943]: s10 add_lockspace fail >>>>>>>>>> result -22 >>>>>>>>>> > >>>>>>>>>> > And /var/log/vdsm/vdsm.log says: >>>>>>>>>> > >>>>>>>>>> > 2017-06-03 19:19:38,176+0200 WARN (jsonrpc/3) >>>>>>>>>> > [storage.StorageServer.MountConnection] Using user specified >>>>>>>>>> > backup-volfile-servers option (storageServer:253) >>>>>>>>>> > 2017-06-03 19:21:12,379+0200 WARN (periodic/1) [throttled] MOM >>>>>>>>>> not >>>>>>>>>> > available. (throttledlog:105) >>>>>>>>>> > 2017-06-03 19:21:12,380+0200 WARN (periodic/1) [throttled] MOM >>>>>>>>>> not >>>>>>>>>> > available, KSM stats will be missing. 
(throttledlog:105) >>>>>>>>>> > 2017-06-03 19:21:14,714+0200 WARN (jsonrpc/1) >>>>>>>>>> > [storage.StorageServer.MountConnection] Using user specified >>>>>>>>>> > backup-volfile-servers option (storageServer:253) >>>>>>>>>> > 2017-06-03 19:21:15,515+0200 ERROR (jsonrpc/4) >>>>>>>>>> [storage.initSANLock] Cannot >>>>>>>>>> > initialize SANLock for domain a5a6b0e7-fc3f-4838-8e26-c8b4d5 >>>>>>>>>> e5e922 >>>>>>>>>> > (clusterlock:238) >>>>>>>>>> > Traceback (most recent call last): >>>>>>>>>> > File "/usr/lib/python2.7/site-packa >>>>>>>>>> ges/vdsm/storage/clusterlock.py", line >>>>>>>>>> > 234, in initSANLock >>>>>>>>>> > sanlock.init_lockspace(sdUUID, idsPath) >>>>>>>>>> > SanlockException: (107, 'Sanlock lockspace init failure', >>>>>>>>>> 'Transport >>>>>>>>>> > endpoint is not connected') >>>>>>>>>> > 2017-06-03 19:21:15,515+0200 WARN (jsonrpc/4) >>>>>>>>>> > [storage.StorageDomainManifest] lease did not initialize >>>>>>>>>> successfully >>>>>>>>>> > (sd:557) >>>>>>>>>> > Traceback (most recent call last): >>>>>>>>>> > File "/usr/share/vdsm/storage/sd.py", line 552, in >>>>>>>>>> initDomainLock >>>>>>>>>> > self._domainLock.initLock(self.getDomainLease()) >>>>>>>>>> > File "/usr/lib/python2.7/site-packa >>>>>>>>>> ges/vdsm/storage/clusterlock.py", line >>>>>>>>>> > 271, in initLock >>>>>>>>>> > initSANLock(self._sdUUID, self._idsPath, lease) >>>>>>>>>> > File "/usr/lib/python2.7/site-packa >>>>>>>>>> ges/vdsm/storage/clusterlock.py", line >>>>>>>>>> > 239, in initSANLock >>>>>>>>>> > raise se.ClusterLockInitError() >>>>>>>>>> > ClusterLockInitError: Could not initialize cluster lock: () >>>>>>>>>> > 2017-06-03 19:21:37,867+0200 ERROR (jsonrpc/2) >>>>>>>>>> [storage.StoragePool] Create >>>>>>>>>> > pool hosted_datacenter canceled (sp:655) >>>>>>>>>> > Traceback (most recent call last): >>>>>>>>>> > File "/usr/share/vdsm/storage/sp.py", line 652, in create >>>>>>>>>> > self.attachSD(sdUUID) >>>>>>>>>> > File "/usr/lib/python2.7/site-packa >>>>>>>>>> ges/vdsm/storage/securable.py", line >>>>>>>>>> > 79, in wrapper >>>>>>>>>> > return method(self, *args, **kwargs) >>>>>>>>>> > File "/usr/share/vdsm/storage/sp.py", line 971, in attachSD >>>>>>>>>> > dom.acquireHostId(self.id) >>>>>>>>>> > File "/usr/share/vdsm/storage/sd.py", line 790, in >>>>>>>>>> acquireHostId >>>>>>>>>> > self._manifest.acquireHostId(hostId, async) >>>>>>>>>> > File "/usr/share/vdsm/storage/sd.py", line 449, in >>>>>>>>>> acquireHostId >>>>>>>>>> > self._domainLock.acquireHostId(hostId, async) >>>>>>>>>> > File "/usr/lib/python2.7/site-packa >>>>>>>>>> ges/vdsm/storage/clusterlock.py", line >>>>>>>>>> > 297, in acquireHostId >>>>>>>>>> > raise se.AcquireHostIdFailure(self._sdUUID, e) >>>>>>>>>> > AcquireHostIdFailure: Cannot acquire host id: >>>>>>>>>> > (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, >>>>>>>>>> 'Sanlock >>>>>>>>>> > lockspace add failure', 'Invalid argument')) >>>>>>>>>> > 2017-06-03 19:21:37,870+0200 ERROR (jsonrpc/2) >>>>>>>>>> [storage.StoragePool] Domain >>>>>>>>>> > ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047 detach from MSD >>>>>>>>>> > ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047 Ver 1 failed. 
(sp:528) >>>>>>>>>> > Traceback (most recent call last): >>>>>>>>>> > File "/usr/share/vdsm/storage/sp.py", line 525, in >>>>>>>>>> __cleanupDomains >>>>>>>>>> > self.detachSD(sdUUID) >>>>>>>>>> > File "/usr/lib/python2.7/site-packa >>>>>>>>>> ges/vdsm/storage/securable.py", line >>>>>>>>>> > 79, in wrapper >>>>>>>>>> > return method(self, *args, **kwargs) >>>>>>>>>> > File "/usr/share/vdsm/storage/sp.py", line 1046, in detachSD >>>>>>>>>> > raise se.CannotDetachMasterStorageDomain(sdUUID) >>>>>>>>>> > CannotDetachMasterStorageDomain: Illegal action: >>>>>>>>>> > (u'ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047',) >>>>>>>>>> > 2017-06-03 19:21:37,872+0200 ERROR (jsonrpc/2) >>>>>>>>>> [storage.StoragePool] Domain >>>>>>>>>> > a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922 detach from MSD >>>>>>>>>> > ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047 Ver 1 failed. (sp:528) >>>>>>>>>> > Traceback (most recent call last): >>>>>>>>>> > File "/usr/share/vdsm/storage/sp.py", line 525, in >>>>>>>>>> __cleanupDomains >>>>>>>>>> > self.detachSD(sdUUID) >>>>>>>>>> > File "/usr/lib/python2.7/site-packa >>>>>>>>>> ges/vdsm/storage/securable.py", line >>>>>>>>>> > 79, in wrapper >>>>>>>>>> > return method(self, *args, **kwargs) >>>>>>>>>> > File "/usr/share/vdsm/storage/sp.py", line 1043, in detachSD >>>>>>>>>> > self.validateAttachedDomain(dom) >>>>>>>>>> > File "/usr/lib/python2.7/site-packa >>>>>>>>>> ges/vdsm/storage/securable.py", line >>>>>>>>>> > 79, in wrapper >>>>>>>>>> > return method(self, *args, **kwargs) >>>>>>>>>> > File "/usr/share/vdsm/storage/sp.py", line 542, in >>>>>>>>>> validateAttachedDomain >>>>>>>>>> > self.validatePoolSD(dom.sdUUID) >>>>>>>>>> > File "/usr/lib/python2.7/site-packa >>>>>>>>>> ges/vdsm/storage/securable.py", line >>>>>>>>>> > 79, in wrapper >>>>>>>>>> > return method(self, *args, **kwargs) >>>>>>>>>> > File "/usr/share/vdsm/storage/sp.py", line 535, in >>>>>>>>>> validatePoolSD >>>>>>>>>> > raise se.StorageDomainNotMemberOfPool(self.spUUID, sdUUID) >>>>>>>>>> > StorageDomainNotMemberOfPool: Domain is not member in pool: >>>>>>>>>> > u'pool=a1e7e9dd-0cf4-41ae-ba13-36297ed66309, >>>>>>>>>> > domain=a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922' >>>>>>>>>> > 2017-06-03 19:21:40,063+0200 ERROR (jsonrpc/2) >>>>>>>>>> [storage.TaskManager.Task] >>>>>>>>>> > (Task='a2476a33-26f8-4ebd-876d-02fe5d13ef78') Unexpected error >>>>>>>>>> (task:870) >>>>>>>>>> > Traceback (most recent call last): >>>>>>>>>> > File "/usr/share/vdsm/storage/task.py", line 877, in _run >>>>>>>>>> > return fn(*args, **kargs) >>>>>>>>>> > File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", >>>>>>>>>> line 52, in >>>>>>>>>> > wrapper >>>>>>>>>> > res = f(*args, **kwargs) >>>>>>>>>> > File "/usr/share/vdsm/storage/hsm.py", line 959, in >>>>>>>>>> createStoragePool >>>>>>>>>> > leaseParams) >>>>>>>>>> > File "/usr/share/vdsm/storage/sp.py", line 652, in create >>>>>>>>>> > self.attachSD(sdUUID) >>>>>>>>>> > File "/usr/lib/python2.7/site-packa >>>>>>>>>> ges/vdsm/storage/securable.py", line >>>>>>>>>> > 79, in wrapper >>>>>>>>>> > return method(self, *args, **kwargs) >>>>>>>>>> > File "/usr/share/vdsm/storage/sp.py", line 971, in attachSD >>>>>>>>>> > dom.acquireHostId(self.id) >>>>>>>>>> > File "/usr/share/vdsm/storage/sd.py", line 790, in >>>>>>>>>> acquireHostId >>>>>>>>>> > self._manifest.acquireHostId(hostId, async) >>>>>>>>>> > File "/usr/share/vdsm/storage/sd.py", line 449, in >>>>>>>>>> acquireHostId >>>>>>>>>> > self._domainLock.acquireHostId(hostId, async) >>>>>>>>>> > File "/usr/lib/python2.7/site-packa >>>>>>>>>> 
ges/vdsm/storage/clusterlock.py", line >>>>>>>>>> > 297, in acquireHostId >>>>>>>>>> > raise se.AcquireHostIdFailure(self._sdUUID, e) >>>>>>>>>> > AcquireHostIdFailure: Cannot acquire host id: >>>>>>>>>> > (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, >>>>>>>>>> 'Sanlock >>>>>>>>>> > lockspace add failure', 'Invalid argument')) >>>>>>>>>> > 2017-06-03 19:21:40,067+0200 ERROR (jsonrpc/2) >>>>>>>>>> [storage.Dispatcher] >>>>>>>>>> > {'status': {'message': "Cannot acquire host id: >>>>>>>>>> > (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, >>>>>>>>>> 'Sanlock >>>>>>>>>> > lockspace add failure', 'Invalid argument'))", 'code': 661}} >>>>>>>>>> (dispatcher:77) >>>>>>>>>> > >>>>>>>>>> > The gluster volume prepared for engine storage is online and no >>>>>>>>>> split brain >>>>>>>>>> > is reported. I don't understand what needs to be done to >>>>>>>>>> overcome this. Any >>>>>>>>>> > idea will be appreciated. >>>>>>>>>> > >>>>>>>>>> > Thank you, >>>>>>>>>> > Alex >>>>>>>>>> > >>>>>>>>>> > _______________________________________________ >>>>>>>>>> > Users mailing list >>>>>>>>>> > Users@ovirt.org >>>>>>>>>> > http://lists.ovirt.org/mailman/listinfo/users >>>>>>>>>> > >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> Users mailing list >>>>>>>> Users@ovirt.org >>>>>>>> http://lists.ovirt.org/mailman/listinfo/users >>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> Users mailing list >>>>>> Users@ovirt.org >>>>>> http://lists.ovirt.org/mailman/listinfo/users >>>>>> >>>>>> >>>>> >>> >> >
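To make the strace suggestion at the top of this mail concrete, here is roughly the invocation I have in mind, assuming the volume is re-mounted at /mnt/enginetest for the test (the mount point, file name and output prefix are only examples, adjust them to your setup):

strace -y -ff -o /tmp/dd-trace dd if=/dev/zero of=/mnt/enginetest/test2.img oflag=direct bs=4096 count=1

With -ff and -o this leaves one /tmp/dd-trace.<pid> file per process; please attach those along with the mount and brick logs. Running the same command again with bs=512 and comparing the two traces should show exactly which write() is returning EINVAL.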
_______________________________________________ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users