Re: [Gluster-users] problem with Peer Rejected

2022-02-04 Thread Jiří Sléžka

Well, I tried downgrading to 8.6 on node gluster07. It didn't help.

Fortunately, I remembered an old post from Strahil on the ovirt list which 
suggested switching


gluster volume set <volname> cluster.lookup-optimize off

when expanding a cluster. As the nodes were rejected due to a cksum 
mismatch on only one volume, I switched lookup-optimize off on that 
volume, then restarted glusterd on both nodes, and it helped.


The problem seems to be solved...

Cheers,

Jiri


On 2/4/22 15:45, Jiří Sléžka wrote:

Hello,

I have a glusterfs cluster running version 8.6: 6 nodes plus 1 arbiter 
node in a distributed-replicated setup with arbiter (Number of Bricks: 
3 x (2 + 1) = 9).


Yesterday I added two new nodes. Because I plan to upgrade to gluster 9, 
I installed them with Rocky Linux 8 and glusterfs 9 (from the CentOS 
Stream repo). After adding these two nodes I got this setup:


Volume Name: samba
Type: Distributed-Replicate
Volume ID: a96ea622-7abb-4213-a39b-8a23a3035a5d
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x (2 + 1) = 12
Transport-type: tcp
Bricks:
Brick1: 10.10.102.91:/gluster/samba
Brick2: 10.10.100.92:/gluster/samba
Brick3: 10.10.100.90:/gluster/samba/brick1 (arbiter)
Brick4: 10.10.100.93:/gluster/samba
Brick5: 10.10.100.94:/gluster/samba
Brick6: 10.10.100.90:/gluster/samba/brick2 (arbiter)
Brick7: 10.10.100.95:/gluster/samba
Brick8: 10.10.100.96:/gluster/samba
Brick9: 10.10.100.90:/gluster/samba/brick3 (arbiter)
Brick10: 10.10.100.97:/gluster/samba
Brick11: 10.10.100.98:/gluster/samba
Brick12: 10.10.100.90:/gluster/samba/brick4 (arbiter)
Options Reconfigured:
auth.allow: xxx
cluster.self-heal-daemon: on
cluster.entry-self-heal: on
cluster.metadata-self-heal: on
cluster.data-self-heal: on
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
performance.readdir-ahead: on
features.shard: on
features.shard-block-size: 512MB
cluster.quorum-type: auto
cluster.server-quorum-type: server
cluster.lookup-optimize: off


op-version is still 8

It worked well and I rebalanced one of the volumes, but today I noticed 
that the two new nodes are in the Peer Rejected state (from gluster02's view):


gluster peer status
Number of Peers: 8

Hostname: 10.10.100.91
Uuid: 6d9e6170-2386-4b40-8fb5-7aeaef3d3122
State: Peer in Cluster (Connected)

Hostname: 10.224.102.93
Uuid: 4f74741e-7fee-41d0-a8db-916458f7280e
State: Peer in Cluster (Connected)

Hostname: 10.10.100.94
Uuid: cda31067-5bd9-44ea-816d-7c9dd947d78a
State: Peer in Cluster (Connected)

Hostname: 10.10.100.95
Uuid: 3c904f48-1ff3-4669-891b-27d4296ccf0e
State: Peer in Cluster (Connected)

Hostname: 10.10.100.96
Uuid: 0105494d-d5b4-40fb-ad31-c531efd818bb
State: Peer in Cluster (Connected)

Hostname: 10.10.100.90
Uuid: 291b7afd-3090-4733-a97f-20f8585adad2
State: Peer in Cluster (Connected)

Hostname: 10.10.100.97
Uuid: 82ac9abf-1678-43c9-a92f-94d0d472b2fe
State: Peer Rejected (Disconnected)

Hostname: 10.10.100.98
Uuid: 0f9e4891-250a-45b5-bdd3-e6a61aa49a29
State: Peer Rejected (Connected)

From the new node (gluster08), all nodes are Peer Rejected.

There are log lines in /var/log/glusterfs/glusterd.log like this:

[2022-02-04 14:36:49.805753 +0000] E [MSGID: 106010] 
[glusterd-utils.c:3851:glusterd_compare_friend_volume] 0-management: 
Version of Cksums samba differ. local cksum = 3146523269, remote cksum = 
2206743689 on peer 10.10.100.97


There is documentation for this particular problem...

https://docs.gluster.org/en/latest/Troubleshooting/troubleshooting-glusterd/#common-issues-and-how-to-resolve-them

...but

gluster volume get all cluster.max-op-version

is still 8

and I cannot set the cluster op-version to a lower or equal value:

gluster volume set all cluster.op-version 8
volume set: failed: Required op-version (8) should not be equal or 
lower than current cluster op-version (8).


Unfortunately, the cluster seems broken on the client side. Any hints on 
how I can recover?


Thanks in advance,

Jiri


Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] problem with Peer Rejected

2022-02-04 Thread Jiří Sléžka

Hello,

I have a glusterfs cluster running version 8.6: 6 nodes plus 1 arbiter 
node in a distributed-replicated setup with arbiter (Number of Bricks: 
3 x (2 + 1) = 9).


Yesterday I added two new nodes. Because I plan to upgrade to gluster 9, 
I installed them with Rocky Linux 8 and glusterfs 9 (from the CentOS 
Stream repo). After adding these two nodes I got this setup:


Volume Name: samba
Type: Distributed-Replicate
Volume ID: a96ea622-7abb-4213-a39b-8a23a3035a5d
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x (2 + 1) = 12
Transport-type: tcp
Bricks:
Brick1: 10.10.102.91:/gluster/samba
Brick2: 10.10.100.92:/gluster/samba
Brick3: 10.10.100.90:/gluster/samba/brick1 (arbiter)
Brick4: 10.10.100.93:/gluster/samba
Brick5: 10.10.100.94:/gluster/samba
Brick6: 10.10.100.90:/gluster/samba/brick2 (arbiter)
Brick7: 10.10.100.95:/gluster/samba
Brick8: 10.10.100.96:/gluster/samba
Brick9: 10.10.100.90:/gluster/samba/brick3 (arbiter)
Brick10: 10.10.100.97:/gluster/samba
Brick11: 10.10.100.98:/gluster/samba
Brick12: 10.10.100.90:/gluster/samba/brick4 (arbiter)
Options Reconfigured:
auth.allow: xxx
cluster.self-heal-daemon: on
cluster.entry-self-heal: on
cluster.metadata-self-heal: on
cluster.data-self-heal: on
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
performance.readdir-ahead: on
features.shard: on
features.shard-block-size: 512MB
cluster.quorum-type: auto
cluster.server-quorum-type: server
cluster.lookup-optimize: off


op-version is still 8

It worked well and I rebalanced one of the volumes, but today I noticed 
that the two new nodes are in the Peer Rejected state (from gluster02's view):


gluster peer status
Number of Peers: 8

Hostname: 10.10.100.91
Uuid: 6d9e6170-2386-4b40-8fb5-7aeaef3d3122
State: Peer in Cluster (Connected)

Hostname: 10.224.102.93
Uuid: 4f74741e-7fee-41d0-a8db-916458f7280e
State: Peer in Cluster (Connected)

Hostname: 10.10.100.94
Uuid: cda31067-5bd9-44ea-816d-7c9dd947d78a
State: Peer in Cluster (Connected)

Hostname: 10.10.100.95
Uuid: 3c904f48-1ff3-4669-891b-27d4296ccf0e
State: Peer in Cluster (Connected)

Hostname: 10.10.100.96
Uuid: 0105494d-d5b4-40fb-ad31-c531efd818bb
State: Peer in Cluster (Connected)

Hostname: 10.10.100.90
Uuid: 291b7afd-3090-4733-a97f-20f8585adad2
State: Peer in Cluster (Connected)

Hostname: 10.10.100.97
Uuid: 82ac9abf-1678-43c9-a92f-94d0d472b2fe
State: Peer Rejected (Disconnected)

Hostname: 10.10.100.98
Uuid: 0f9e4891-250a-45b5-bdd3-e6a61aa49a29
State: Peer Rejected (Connected)

From the new node (gluster08), all nodes are Peer Rejected.

There are log lines in /var/log/glusterfs/glusterd.log like this:

[2022-02-04 14:36:49.805753 +0000] E [MSGID: 106010] 
[glusterd-utils.c:3851:glusterd_compare_friend_volume] 0-management: 
Version of Cksums samba differ. local cksum = 3146523269, remote cksum = 
2206743689 on peer 10.10.100.97


There is documentation for this particular problem...

https://docs.gluster.org/en/latest/Troubleshooting/troubleshooting-glusterd/#common-issues-and-how-to-resolve-them

...but

gluster volume get all cluster.max-op-version

is still 8

and I cannot set the cluster op-version to a lower or equal value:

gluster volume set all cluster.op-version 8
volume set: failed: Required op-version (8) should not be equal or 
lower than current cluster op-version (8).
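One way to double-check that every peer really agrees on the operating 
version is to read it straight from glusterd's state file on each node 
(a sketch; the path follows the standard glusterd layout):

# run on every node and compare the values
grep operating-version /var/lib/glusterd/glusterd.info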


Unfortunately, the cluster seems broken on the client side. Any hints on 
how I can recover?
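For reference, the troubleshooting guide linked above resolves a rejected 
peer by resyncing its configuration from the cluster; a rough sketch of 
that procedure as I understand it (run on the rejected node only, after 
backing up /var/lib/glusterd):

systemctl stop glusterd
cd /var/lib/glusterd
# keep glusterd.info (the node's UUID and op-version), drop the cached config
find . -mindepth 1 ! -name glusterd.info -delete
systemctl start glusterd
gluster peer probe <good-node>     # any healthy peer
systemctl restart glusterd
gluster peer status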


Thanks in advance,

Jiri



Re: [Gluster-users] glusterfs health-check failed, (brick) going down

2021-07-08 Thread Jiří Sléžka

Hi Olaf,

thanks for reply.

On 7/8/21 3:29 PM, Olaf Buitelaar wrote:

Hi Jiri,

your problem looks pretty similar to mine, see
https://lists.gluster.org/pipermail/gluster-users/2021-February/039134.html

Any chance you also see the xfs errors in the brick logs?


Yes, I can see these log lines related to the "health-check failed" items:

[root@ovirt-hci02 ~]# grep "aio_read" /var/log/glusterfs/bricks/*
/var/log/glusterfs/bricks/gluster_bricks-vms2-vms2.log:[2021-07-07 
07:13:37.408010] W [MSGID: 113075] 
[posix-helpers.c:2135:posix_fs_health_check] 0-vms-posix: 
aio_read_cmp_buf() on /gluster_bricks/vms2/vms2/.glusterfs/health_check 
returned ret is -1 error is Structure needs cleaning
/var/log/glusterfs/bricks/gluster_bricks-vms2-vms2.log:[2021-07-07 
16:11:14.518844] W [MSGID: 113075] 
[posix-helpers.c:2135:posix_fs_health_check] 0-vms-posix: 
aio_read_cmp_buf() on /gluster_bricks/vms2/vms2/.glusterfs/health_check 
returned ret is -1 error is Structure needs cleaning


[root@ovirt-hci01 ~]# grep "aio_read" /var/log/glusterfs/bricks/*
/var/log/glusterfs/bricks/gluster_bricks-engine-engine.log:[2021-07-05 
13:15:51.982938] W [MSGID: 113075] 
[posix-helpers.c:2135:posix_fs_health_check] 0-engine-posix: 
aio_read_cmp_buf() on 
/gluster_bricks/engine/engine/.glusterfs/health_check returned ret is -1 
error is Structure needs cleaning
/var/log/glusterfs/bricks/gluster_bricks-vms2-vms2.log:[2021-07-05 
01:53:35.768534] W [MSGID: 113075] 
[posix-helpers.c:2135:posix_fs_health_check] 0-vms-posix: 
aio_read_cmp_buf() on /gluster_bricks/vms2/vms2/.glusterfs/health_check 
returned ret is -1 error is Structure needs cleaning


It looks very similar to your issue, but in my case I don't use LVM cache 
and the brick disks are JBOD (though connected through a Broadcom / LSI 
MegaRAID SAS-3 3008 [Fury] (rev 02)).
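For what it's worth, "Structure needs cleaning" is the strerror text for 
EUCLEAN, which XFS returns when it detects on-disk corruption, so a 
read-only filesystem check might be worth trying; a sketch (device name 
hypothetical, and the brick must be stopped and unmounted first):

umount /gluster_bricks/vms2
xfs_repair -n /dev/sdX1    # -n = no-modify mode, only reports problems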


For me the situation improved once I disabled brick multiplexing, but I 
don't see that in your volume configuration.


Probably important is your note...


When I kill the brick process and start it with "gluster v start x force" the
issue seems much less likely to occur, but when started from a fresh
reboot, or when killing the process and letting it be started by glusterd
(e.g. service glusterd start), the error seems to arise after a couple of
minutes.


...because on the ovirt list Jayme replied with this:

https://lists.ovirt.org/archives/list/us...@ovirt.org/message/BZRONK53OGWSOPUSGQ76GIXUM7J6HHMJ/

and it looks to me like something you are also observing.
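The distinction, then, seems to be whether glusterd respawns the brick or 
it is force-started by hand; the manual variant would look roughly like 
this (a sketch for the vms volume; the PID placeholder is illustrative):

gluster volume status vms          # note the PID of the affected brick
kill <brick-pid>
gluster volume start vms force     # respawns only the bricks that are down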

Cheers, Jiri



Cheers Olaf

On Thu, 8 Jul 2021 at 12:28, Jiří Sléžka <jiri.sle...@slu.cz> wrote:


Hello gluster community,

I am new to this list but have been using glusterfs for a long time as our
SDS solution for storing 80+TiB of data. I'm also using glusterfs for a
small 3-node HCI cluster with oVirt 4.4.6 and CentOS 8 (not Stream yet).
The glusterfs version here is 8.5-2.el8.x86_64.

From time to time a (I believe) random brick on a random host goes down
because of the health-check. It looks like:

[root@ovirt-hci02 ~]# grep "posix_health_check"
/var/log/glusterfs/bricks/*
/var/log/glusterfs/bricks/gluster_bricks-vms2-vms2.log:[2021-07-07
07:13:37.408184] M [MSGID: 113075]
[posix-helpers.c:2214:posix_health_check_thread_proc] 0-vms-posix:
health-check failed, going down
/var/log/glusterfs/bricks/gluster_bricks-vms2-vms2.log:[2021-07-07
07:13:37.408407] M [MSGID: 113075]
[posix-helpers.c:2232:posix_health_check_thread_proc] 0-vms-posix:
still
alive! -> SIGTERM
/var/log/glusterfs/bricks/gluster_bricks-vms2-vms2.log:[2021-07-07
16:11:14.518971] M [MSGID: 113075]
[posix-helpers.c:2214:posix_health_check_thread_proc] 0-vms-posix:
health-check failed, going down
/var/log/glusterfs/bricks/gluster_bricks-vms2-vms2.log:[2021-07-07
16:11:14.519200] M [MSGID: 113075]
[posix-helpers.c:2232:posix_health_check_thread_proc] 0-vms-posix:
still
alive! -> SIGTERM

On the other host:

[root@ovirt-hci01 ~]# grep "posix_health_check"
/var/log/glusterfs/bricks/*
/var/log/glusterfs/bricks/gluster_bricks-engine-engine.log:[2021-07-05
13:15:51.983327] M [MSGID: 113075]
[posix-helpers.c:2214:posix_health_check_thread_proc] 0-engine-posix:
health-check failed, going down
/var/log/glusterfs/bricks/gluster_bricks-engine-engine.log:[2021-07-05
13:15:51.983728] M [MSGID: 113075]
[posix-helpers.c:2232:posix_health_check_thread_proc] 0-engine-posix:
still alive! -> SIGTERM
/var/log/glusterfs/bricks/gluster_bricks-vms2-vms2.log:[2021-07-05
01:53:35.769129] M [MSGID: 113075]
[posix-helpers.c:2214:posix_health_check_thread_proc] 0-vms-posix:
health-check failed, going down
/var/log/glusterfs/bricks/gluster_bricks-vms2-vms2.log:[2021-07-05
01:53:35.769819] M [MSGID: 113075]
[posix-helpers.c:2232

[Gluster-users] glusterfs health-check failed, (brick) going down

2021-07-08 Thread Jiří Sléžka

Hello gluster community,

I am new to this list but have been using glusterfs for a long time as our 
SDS solution for storing 80+TiB of data. I'm also using glusterfs for a 
small 3-node HCI cluster with oVirt 4.4.6 and CentOS 8 (not Stream yet). 
The glusterfs version here is 8.5-2.el8.x86_64.


From time to time a (I believe) random brick on a random host goes down 
because of the health-check. It looks like:


[root@ovirt-hci02 ~]# grep "posix_health_check" /var/log/glusterfs/bricks/*
/var/log/glusterfs/bricks/gluster_bricks-vms2-vms2.log:[2021-07-07 
07:13:37.408184] M [MSGID: 113075] 
[posix-helpers.c:2214:posix_health_check_thread_proc] 0-vms-posix: 
health-check failed, going down
/var/log/glusterfs/bricks/gluster_bricks-vms2-vms2.log:[2021-07-07 
07:13:37.408407] M [MSGID: 113075] 
[posix-helpers.c:2232:posix_health_check_thread_proc] 0-vms-posix: still 
alive! -> SIGTERM
/var/log/glusterfs/bricks/gluster_bricks-vms2-vms2.log:[2021-07-07 
16:11:14.518971] M [MSGID: 113075] 
[posix-helpers.c:2214:posix_health_check_thread_proc] 0-vms-posix: 
health-check failed, going down
/var/log/glusterfs/bricks/gluster_bricks-vms2-vms2.log:[2021-07-07 
16:11:14.519200] M [MSGID: 113075] 
[posix-helpers.c:2232:posix_health_check_thread_proc] 0-vms-posix: still 
alive! -> SIGTERM


On the other host:

[root@ovirt-hci01 ~]# grep "posix_health_check" /var/log/glusterfs/bricks/*
/var/log/glusterfs/bricks/gluster_bricks-engine-engine.log:[2021-07-05 
13:15:51.983327] M [MSGID: 113075] 
[posix-helpers.c:2214:posix_health_check_thread_proc] 0-engine-posix: 
health-check failed, going down
/var/log/glusterfs/bricks/gluster_bricks-engine-engine.log:[2021-07-05 
13:15:51.983728] M [MSGID: 113075] 
[posix-helpers.c:2232:posix_health_check_thread_proc] 0-engine-posix: 
still alive! -> SIGTERM
/var/log/glusterfs/bricks/gluster_bricks-vms2-vms2.log:[2021-07-05 
01:53:35.769129] M [MSGID: 113075] 
[posix-helpers.c:2214:posix_health_check_thread_proc] 0-vms-posix: 
health-check failed, going down
/var/log/glusterfs/bricks/gluster_bricks-vms2-vms2.log:[2021-07-05 
01:53:35.769819] M [MSGID: 113075] 
[posix-helpers.c:2232:posix_health_check_thread_proc] 0-vms-posix: still 
alive! -> SIGTERM


I cannot link these errors to any storage/fs issue (in dmesg or 
/var/log/messages), and the brick devices look healthy (smartd).


I can force-start the brick with

gluster volume start vms|engine force

and after some healing everything works fine for a few days.
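After a force start, the heal progress can be followed with the usual 
commands (a sketch for the vms volume):

gluster volume status vms     # the brick should show as Online again
gluster volume heal vms info  # pending heal entries should drain to zero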

Has anybody observed this behavior?

The vms volume has this structure (two bricks per host, each a separate 
JBOD SSD disk); the engine volume has one brick on each host...


gluster volume info vms

Volume Name: vms
Type: Distributed-Replicate
Volume ID: 52032ec6-99d4-4210-8fb8-ffbd7a1e0bf7
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: 10.0.4.11:/gluster_bricks/vms/vms
Brick2: 10.0.4.13:/gluster_bricks/vms/vms
Brick3: 10.0.4.12:/gluster_bricks/vms/vms
Brick4: 10.0.4.11:/gluster_bricks/vms2/vms2
Brick5: 10.0.4.13:/gluster_bricks/vms2/vms2
Brick6: 10.0.4.12:/gluster_bricks/vms2/vms2
Options Reconfigured:
cluster.granular-entry-heal: enable
performance.stat-prefetch: off
cluster.eager-lock: enable
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
user.cifs: off
network.ping-timeout: 30
network.remote-dio: off
performance.strict-o-direct: on
performance.low-prio-threads: 32
features.shard: on
storage.owner-gid: 36
storage.owner-uid: 36
transport.address-family: inet
storage.fips-mode-rchecksum: on
nfs.disable: on
performance.client-io-threads: off


Cheers,

Jiri





Re: [Gluster-users] [ovirt-users] slow performance with export storage on glusterfs

2017-11-29 Thread Jiří Sléžka
Hello,

> 
> If you use Gluster as FUSE mount it's always slower than you expect it
> to be.
> If you want to get better performance out of your oVirt/Gluster storage,
> try the following: 
> 
> - create a Linux VM in your oVirt environment, assign 4/8/12 virtual
> disks (Virtual disks are located on your Gluster storage volume).
> - Boot/configure the VM, then use LVM to create VG/LV with 4 stripes
> (lvcreate -i 4) and use all 4/8/12 virtual disks as PVs.
> - then install an NFS server and export the LV you created in the previous
> step, then use the NFS export as an export domain in oVirt/RHEV.
> 
> You should get wire speed when you use multiple stripes on Gluster
> storage; the FUSE mount on the oVirt host will fan out requests to all 4 servers.
> Gluster is very good at distributed/parallel workloads, but when you use
> direct Gluster FUSE mount for Export domain you only have one data
> stream, which is fragmented even more by multiple writes/reads that
> Gluster needs to do to save your data on all member servers.

Thanks for the explanation, it is an interesting solution.
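For reference, inside such a helper VM the quoted recipe might look roughly 
like this (a sketch; all device and volume names are hypothetical):

pvcreate /dev/vdb /dev/vdc /dev/vdd /dev/vde
vgcreate vg_export /dev/vdb /dev/vdc /dev/vdd /dev/vde
lvcreate -i 4 -I 256 -l 100%FREE -n lv_export vg_export  # 4 stripes, 256k stripe size
mkfs.xfs /dev/vg_export/lv_export
# mount it, export it via /etc/exports, and use the NFS export
# as the export domain in oVirt/RHEV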

Cheers,

Jiri

> On Mon, Nov 27, 2017 at 8:41 PM, Donny Davis <do...@fortnebula.com> wrote:
> 
>     What about mounting over nfs instead of the fuse client? Or maybe
>     libgfapi. Is that available for export domains?
> 
>     On Fri, Nov 24, 2017 at 3:48 AM Jiří Sléžka <jiri.sle...@slu.cz> wrote:
> 
>         On 11/24/2017 06:41 AM, Sahina Bose wrote:
>         > On Thu, Nov 23, 2017 at 4:56 PM, Jiří Sléžka <jiri.sle...@slu.cz> wrote:
>         >
>         >     Hi,
>         >
>         >     On 11/22/2017 07:30 PM, Nir Soffer wrote:
>         >     > On Mon, Nov 20, 2017 at 5:22 PM Jiří Sléžka <jiri.sle...@slu.cz> wrote:
>         >     >
>         >     >     Hi,
>         >     >
>         >     >     I am trying to realize why exporting a vm to export storage on
>         >     >     glusterfs is so slow.
>         >     >
>         >     >     I am using oVirt and RHV, both installations on version 4.1.7.
>         >     >
>         >     >     Hosts have dedicated nics for the rhevm network - 1gbps; the data
>         >     >     storage itself is on FC.
>         >     >
>         >     >     The GlusterFS cluster lives separately on 4 dedicated hosts. It has
>         >     >     slow disks but I can achieve about 200-400mbit throughput in other
>         >     >     applications (we are using it for "cold" data, backups mostly).
>         >     >
>         >     >     I am using this glusterfs cluster as backend for the export
>         >     >     storage. When I am exporting a vm I can see only about 60-80mbit
>         >     >     throughput.
>         >     >
>         >     >     What could be the bottleneck here?
>         >     >
>         >     >     Could it be the qemu-img utility?
>         >     >
>         >     >     vdsm      97739  0.3  0.0 354212 29148 ?        S<l  15:43   0:06
>         >     >     /usr/bin/qemu-img convert -p -t none -T none -f raw
>         >     >     /rhev/data-center/2ff6d0ee-a10b-473d-b77c-be9149945f5f/ff3cd56a-1005-4426-8137-8f422c0b47c1/images/ba42cbcc-c068-4df8-af3d-00f2077b1e27/c57acd5f-d6cf-48cc-ad0c-4a7d979c0c1e
>         >     >     -O raw
>         >     >     /rhev/data-center/mnt/glusterSD/10.20.30.41:_rhv__export/81094499-a392-4ea2-b081-7c6288fbb636/images/ba42cbcc-c068-4df8-af3d-00f2077b1e27/c57acd5f-d6cf-48cc-ad0c-4a7d979c0c1e
>         >     >
>         >     >     Any idea how to make it work faster or what throughput should I
>         >     >     expect?
>         >     >
>         >     > gluster storage operations are using fuse mount - so every write:
>         >     > - travel to the kernel

Re: [Gluster-users] [ovirt-users] slow performance with export storage on glusterfs

2017-11-26 Thread Jiří Sléžka
On 11/24/2017 06:41 AM, Sahina Bose wrote:
> 
> 
> On Thu, Nov 23, 2017 at 4:56 PM, Jiří Sléžka <jiri.sle...@slu.cz> wrote:
> 
> Hi,
> 
> On 11/22/2017 07:30 PM, Nir Soffer wrote:
> > On Mon, Nov 20, 2017 at 5:22 PM Jiří Sléžka <jiri.sle...@slu.cz> wrote:
> >
> >     Hi,
> >
> >     I am trying to realize why exporting a vm to export storage on
> >     glusterfs is so slow.
> >
> >     I am using oVirt and RHV, both installations on version 4.1.7.
> >
> >     Hosts have dedicated nics for the rhevm network - 1gbps; the data
> >     storage itself is on FC.
> >
> >     The GlusterFS cluster lives separately on 4 dedicated hosts. It has
> >     slow disks but I can achieve about 200-400mbit throughput in other
> >     applications (we are using it for "cold" data, backups mostly).
> >
> >     I am using this glusterfs cluster as backend for the export
> >     storage. When I am exporting a vm I can see only about 60-80mbit
> >     throughput.
> >
> >     What could be the bottleneck here?
> >
> >     Could it be the qemu-img utility?
> >
> >     vdsm      97739  0.3  0.0 354212 29148 ?        S<l  15:43   0:06
> >     /usr/bin/qemu-img convert -p -t none -T none -f raw
> >     /rhev/data-center/2ff6d0ee-a10b-473d-b77c-be9149945f5f/ff3cd56a-1005-4426-8137-8f422c0b47c1/images/ba42cbcc-c068-4df8-af3d-00f2077b1e27/c57acd5f-d6cf-48cc-ad0c-4a7d979c0c1e
> >     -O raw
> >     /rhev/data-center/mnt/glusterSD/10.20.30.41:_rhv__export/81094499-a392-4ea2-b081-7c6288fbb636/images/ba42cbcc-c068-4df8-af3d-00f2077b1e27/c57acd5f-d6cf-48cc-ad0c-4a7d979c0c1e
> >
> >     Any idea how to make it work faster or what throughput should I
> >     expect?
> >
> >
> > gluster storage operations are using fuse mount - so every write:
> > - travel to the kernel
> > - travel back to the gluster fuse helper process
> > - travel to all 3 replicas - replication is done on client side
> > - return to kernel when all writes succeeded
> > - return to caller
> >
> > So gluster will never set any speed record.
> >
> > Additionally, you are copying from a raw lv on FC - qemu-img cannot do
> > anything smart and avoid copying unused clusters. Instead it copies
> > gigabytes of zeros from FC.
> 
> ok, it does make sense
> 
> > However 7.5-10 MiB/s sounds too slow.
> >
> > I would try to test with dd - how much time it takes to copy
> > the same image from FC to your gluster storage?
> >
> > dd
> > if=/rhev/data-center/2ff6d0ee-a10b-473d-b77c-be9149945f5f/ff3cd56a-1005-4426-8137-8f422c0b47c1/images/ba42cbcc-c068-4df8-af3d-00f2077b1e27/c57acd5f-d6cf-48cc-ad0c-4a7d979c0c1e
> > of=/rhev/data-center/mnt/glusterSD/10.20.30.41:_rhv__export/81094499-a392-4ea2-b081-7c6288fbb636/__test__
> > bs=8M oflag=direct status=progress
> 
> Unfortunately dd performs the same:
> 
> 1778384896 bytes (1.8 GB) copied, 198.565265 s, 9.0 MB/s
> 
> 
> > If dd can do this faster, please ask on qemu-discuss mailing list:
> > https://lists.nongnu.org/mailman/listinfo/qemu-discuss
> >
> > If both give similar results, I think asking in gluster mailing list
> > about this can help. Maybe your gluster setup can be optimized.
> 
> OK, this is definitely on the gluster side. Thanks for your guidance.
> 
> I will investigate the gluster side and will also try an export domain on
> an NFS share.
> 
> 
> [Adding gluster users ml]
> 
> Please provide "gluster volume info" output for the rhv_export gluster
> volume and also volume profile details (refer to earlier mail from Shani
> on how to run this) while performing the dd operation above.

You can find all this output at https://pastebin.com/sBK01VS8

As mentioned in other posts, the gluster cluster uses really slow (green)
disks, but without direct I/O it can achieve throughput of around 400mbit/s.

This storage is used mostly for backup purposes. It is not used as vm
storage.
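For anyone comparing, the buffered vs. direct behavior can be reproduced 
with dd against the fuse mount (a sketch; the mount point is hypothetical):

# buffered writes - on this cluster roughly 400mbit/s
dd if=/dev/zero of=/mnt/rhv_export/testfile bs=8M count=256 status=progress
# direct I/O - closer to what the export path does
dd if=/dev/zero of=/mnt/rhv_export/testfile bs=8M count=256 oflag=direct status=progress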

In my case it would be nice not to use direct I/O for the export case, but
I understand why it might not be wise.

Cheers,

Jiri
