Re: [Gluster-users] GFS performance under heavy traffic

2020-01-07 Thread Strahil
As your issue is network-related, consider changing the MTU if the infrastructure
allows it.
The tuned profiles are also important, as they control the ratios for flushing data 
held in memory out to disk (in this case, gluster over the network). You want to avoid 
keeping a lot of data in the client's memory (in this case the gluster server is the 
client), only to unleash it over the network in one burst.
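For example, something along these lines (illustrative only - check which tuned profiles your distro ships and pick what fits your workload):
sysctl vm.dirty_ratio vm.dirty_background_ratio   # current write-back thresholds
tuned-adm list                                    # profiles available on this node
tuned-adm profile throughput-performance          # applying a profile takes effect online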

These two changes can be made online and I do not expect any issues.

The brick filesystem also matters: the faster the bricks can absorb data, the faster 
gluster can take more.


Of course, you need to reproduce it in a test environment.

Also consider checking if there is any kind of backup running on the 
bricks. I have seen too many 'miracles' :D

Best Regards,
Strahil Nikolov

On Jan 8, 2020 01:03, David Cunningham wrote:
>
> Hi Strahil,
>
> Thanks for that. The queue/scheduler file for the relevant disk reports "noop 
> [deadline] cfq", so deadline is being used. It is using ext4, and I've 
> verified that the MTU is 1500.
>
> We could change the filesystem from ext4 to xfs, but in this case we're not 
> looking to tinker around the edges and get a small performance improvement - 
> we need a very large improvement on the 114MBps of network traffic to make it 
> usable.
>
> I think what we really need to do first is to reproduce the problem in 
> testing, and then come back to possible solutions.
>
>
> On Tue, 7 Jan 2020 at 22:15, Strahil Nikolov  wrote:
>>
>> To find the scheduler, find all PVs of the LV that provides your storage:
>>
>> [root@ovirt1 ~]# df -Th /gluster_bricks/data_fast
>> Filesystem                                        Type  Size  Used  Avail  Use%  Mounted on
>> /dev/mapper/gluster_vg_nvme-gluster_lv_data_fast  xfs   100G  39G   62G    39%   /gluster_bricks/data_fast
>>
>>
>> [root@ovirt1 ~]# pvs | grep gluster_vg_nvme
>>   /dev/mapper/vdo_nvme gluster_vg_nvme lvm2 a--  <1024.00g    0
>>
>> [root@ovirt1 ~]# cat /etc/vdoconf.yml
>> 
>> # THIS FILE IS MACHINE GENERATED. DO NOT EDIT THIS FILE BY HAND.
>> 
>> config: !Configuration
>>   vdos:
>>    vdo_nvme: !VDOService
>>   device: /dev/disk/by-id/nvme-ADATA_SX8200PNP_2J1120011596
>>
>>
>> [root@ovirt1 ~]# ll /dev/disk/by-id/nvme-ADATA_SX8200PNP_2J1120011596
>> lrwxrwxrwx. 1 root root 13 Dec 17 20:21 
>> /dev/disk/by-id/nvme-ADATA_SX8200PNP_2J1120011596 -> ../../nvme0n1
>> [root@ovirt1 ~]# cat /sys/block/nvme0n1/queue/scheduler
>> [none] mq-deadline kyber
>>
>> Note: If device is under multipath , you need to check all paths (you can 
>> get them from 'multipath -ll' command).
>> The only scheduler you should avoid is "cfq" which was default for RHEL 6 & 
>> SLES 11.
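>> If you ever do need to switch schedulers, a quick (non-persistent) change would be, for example:
>> echo mq-deadline > /sys/block/nvme0n1/queue/scheduler
>> cat /sys/block/nvme0n1/queue/scheduler
>> To make it permanent you would normally use a udev rule or a tuned profile rather than doing it by hand - the exact mechanism depends on your distro.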
>>
>> XFS has better performance than ext-based filesystems.
>>
>> Another tuning option is to use Red Hat's tuned profiles for Gluster. You can 
>> extract them from the following SRPM (or a newer one if you find it): 
>> ftp://ftp.redhat.com/redhat/linux/enterprise/7Server/en/RHS/SRPMS/redhat-storage-server-3.4.2.0-1.el7rhgs.src.rpm
>>
>>
>> About MTU - a larger MTU reduces the number of packets the kernel has to 
>> process, but the infrastructure has to support it too. You can test by 
>> setting the MTU on both sides to 9000 and then running 'tracepath remote-ip'. Also 
>> run a ping with a large payload and the do-not-fragment flag set -> 'ping -M do -s 
>> 8900 remote-ip'. If the ping comes back, you are good to go.
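>> A minimal end-to-end check could look like this (interface name and target address are placeholders):
>> ip link set dev eth0 mtu 9000     # on both hosts; the switches in between must allow jumbo frames too
>> tracepath remote-ip               # reports the path MTU hop by hop
>> ping -M do -s 8972 remote-ip      # 8972 bytes of payload + 28 bytes of IP/ICMP headers = 9000; replies mean jumbo frames pass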

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] GFS performance under heavy traffic

2020-01-07 Thread David Cunningham
Hi Strahil,

Thanks for that. The queue/scheduler file for the relevant disk reports
"noop [deadline] cfq", so deadline is being used. It is using ext4, and
I've verified that the MTU is 1500.

We could change the filesystem from ext4 to xfs, but in this case we're not
looking to tinker around the edges and get a small performance improvement
- we need a very large improvement on the 114MBps of network traffic to
make it usable.

I think what we really need to do first is to reproduce the problem in
testing, and then come back to possible solutions.


On Tue, 7 Jan 2020 at 22:15, Strahil Nikolov  wrote:

> To find the scheduler, find all PVs of the LV that provides your storage:
>
> [root@ovirt1 ~]# df -Th /gluster_bricks/data_fast
> Filesystem                                        Type  Size  Used  Avail  Use%  Mounted on
> /dev/mapper/gluster_vg_nvme-gluster_lv_data_fast  xfs   100G  39G   62G    39%   /gluster_bricks/data_fast
>
>
> [root@ovirt1 ~]# pvs | grep gluster_vg_nvme
>   /dev/mapper/vdo_nvme gluster_vg_nvme lvm2 a--  <1024.00g    0
>
> [root@ovirt1 ~]# cat /etc/vdoconf.yml
> 
> # THIS FILE IS MACHINE GENERATED. DO NOT EDIT THIS FILE BY HAND.
> 
> config: !Configuration
>   vdos:
>vdo_nvme: !VDOService
>   device: /dev/disk/by-id/nvme-ADATA_SX8200PNP_2J1120011596
>
>
> [root@ovirt1 ~]# ll /dev/disk/by-id/nvme-ADATA_SX8200PNP_2J1120011596
> lrwxrwxrwx. 1 root root 13 Dec 17 20:21
> /dev/disk/by-id/nvme-ADATA_SX8200PNP_2J1120011596 -> ../../nvme0n1
> [root@ovirt1 ~]# cat /sys/block/nvme0n1/queue/scheduler
> [none] mq-deadline kyber
>
> Note: If device is under multipath , you need to check all paths (you can
> get them from 'multipath -ll' command).
> The only scheduler you should avoid is "cfq" which was default for RHEL 6
> & SLES 11.
>
> XFS has better performance than ext-based filesystems.
>
> Another tuning option is to use Red Hat's tuned profiles for Gluster. You can
> extract them from the following SRPM (or a newer one if you find it):
> ftp://ftp.redhat.com/redhat/linux/enterprise/7Server/en/RHS/SRPMS/redhat-storage-server-3.4.2.0-1.el7rhgs.src.rpm
>
>
> About MTU - a larger MTU reduces the number of packets the kernel has to
> process, but the infrastructure has to support it too. You can test by
> setting the MTU on both sides to 9000 and then running 'tracepath remote-ip'. Also
> run a ping with a large payload and the do-not-fragment flag set -> 'ping -M do
> -s 8900 remote-ip'. If the ping comes back, you are good to go.
>
>
> Best Regards,
> Strahil Nikolov
>
> On Tuesday, 7 January 2020 at 03:00:23 GMT-5, David Cunningham <
> dcunning...@voisonics.com> wrote:
>
>
> Hi Strahil,
>
> I believe we are using the standard MTU of 1500 (would need to check with
> the network people to be sure). Does it make a difference?
>
> I'm afraid I don't know about the scheduler - where do I find that?
>
> Thank you for the suggestions about turning off performance.read-ahead and
> performance.readdir-ahead.
>
>
> On Tue, 7 Jan 2020 at 18:08, Strahil  wrote:
>
> Hi David,
>
> It's difficult to find anything structured (but that's the same for Linux
> and other tech). I use Red Hat's documentation, guides found online (cross-checking
> the options with the official documentation) and experience shared on the
> mailing list.
>
> I don't see anything (in /var/lib/gluster/groups) that will match your
> profile, but I think that you should try setting performance.read-ahead and
> performance.readdir-ahead to 'off'. I have also found a bug (I didn't read the
> whole thing) that might be interesting for you:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1601166
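> Turning those two off would just be, for example (gvol0 being your volume name):
> gluster volume set gvol0 performance.read-ahead off
> gluster volume set gvol0 performance.readdir-ahead off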
>
> Also, the arbiter is very important in order to avoid split-brain situations
> (though based on my experience, issues can still occur), and it is best for the
> arbiter's brick to be an SSD, as it needs to process the metadata as fast as
> possible. With v7, there is an option for the client to have an arbiter even
> in the cloud (remote arbiter) that is used only when one data brick is down.
>
> Please report the issue with the cache - it should not behave like that.
>
> Are you using jumbo frames (MTU 9000)?
> What is your brick's I/O scheduler?
>
> Best Regards,
> Strahil Nikolov
> On Jan 7, 2020 01:34, David Cunningham  wrote:
>
> Hi Strahil,
>
> We may have had a heal since the GFS arbiter node wasn't accessible from
> the GFS clients, only from the other GFS servers. Unfortunately we haven't
> been able to produce the problem seen in production while testing so are
> unsure whether making the GFS arbiter node directly available to clients
> has fixed the issue.
>
> The load on GFS is mainly:
> 1. There are a small number of files around 5MB in size which are read
> often and change infrequently.
> 2. There are a large number of directories which are opened for reading to
> read the list of contents frequently.
> 3. There are a large number of new files around 5MB in size written
> 

Re: [Gluster-users] GFS performance under heavy traffic

2020-01-07 Thread David Cunningham
Hi Strahil,

I believe we are using the standard MTU of 1500 (would need to check with
the network people to be sure). Does it make a difference?

I'm afraid I don't know about the scheduler - where do I find that?

Thank you for the suggestions about turning off performance.read-ahead and
performance.readdir-ahead.


On Tue, 7 Jan 2020 at 18:08, Strahil  wrote:

> Hi David,
>
> It's difficult to find anything structured (but that's the same for Linux
> and other tech). I use Red Hat's documentation, guides found online (cross-checking
> the options with the official documentation) and experience shared on the
> mailing list.
>
> I don't see anything (in /var/lib/gluster/groups) that will match your
> profile, but I think that you should try setting performance.read-ahead and
> performance.readdir-ahead to 'off'. I have also found a bug (I didn't read the
> whole thing) that might be interesting for you:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1601166
>
> Also, the arbiter is very important in order to avoid split-brain situations
> (though based on my experience, issues can still occur), and it is best for the
> arbiter's brick to be an SSD, as it needs to process the metadata as fast as
> possible. With v7, there is an option for the client to have an arbiter even
> in the cloud (remote arbiter) that is used only when one data brick is down.
>
> Please report the issue with the cache - it should not behave like that.
>
> Are you using jumbo frames (MTU 9000)?
> What is your brick's I/O scheduler?
>
> Best Regards,
> Strahil Nikolov
> On Jan 7, 2020 01:34, David Cunningham  wrote:
>
> Hi Strahil,
>
> We may have had a heal since the GFS arbiter node wasn't accessible from
> the GFS clients, only from the other GFS servers. Unfortunately we haven't
> been able to produce the problem seen in production while testing so are
> unsure whether making the GFS arbiter node directly available to clients
> has fixed the issue.
>
> The load on GFS is mainly:
> 1. There are a small number of files around 5MB in size which are read
> often and change infrequently.
> 2. There are a large number of directories which are opened for reading to
> read the list of contents frequently.
> 3. There are a large number of new files around 5MB in size written
> frequently and read infrequently.
>
> We haven't touched the tuning options as we don't really feel qualified to
> tell what needs changed from the default. Do you know of any suitable
> guides to get started?
>
> For some reason performance.cache-size is reported as both 32MB and 128MB.
> Is it worth reporting even for version 5.6?
>
> Here is the "gluster volume info" taken on the first node. Note that the
> third node (the arbiter) is currently taken out of the cluster:
> Volume Name: gvol0
> Type: Replicate
> Volume ID: fb5af69e-1c3e-4164-8b23-c1d7bec9b1b6
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: gfs1:/nodirectwritedata/gluster/gvol0
> Brick2: gfs2:/nodirectwritedata/gluster/gvol0
> Options Reconfigured:
> diagnostics.client-log-level: INFO
> performance.client-io-threads: off
> nfs.disable: on
> transport.address-family: inet
>
> Thanks for your help and advice.
>
>
> On Sat, 28 Dec 2019 at 17:46, Strahil  wrote:
>
> Hi David,
>
> It seems that I have misread your quorum options, so just ignore that from
> my previous e-mail.
>
> Best Regards,
> Strahil Nikolov
> On Dec 27, 2019 15:38, Strahil  wrote:
>
> Hi David,
>
> Gluster supports live rolling upgrade, so there is no need to redeploy at
> all - but the migration notes should be checked as some features must be
> disabled first.
> Also, the gluster client should remount in order to bump the gluster
> op-version.
>
> What kind of workload do you have ?
> I'm asking as there  are predefined (and recommended) settings located at
> /var/lib/gluster/groups .
> You can check the options for each group and cross-check the options
> meaning in the docs before  activating a setting.
>
> I still have a vague feeling  that ,during that high-peak of network
> bandwidth, there was  a  heal  going on. Have you checked that ?
>
> Also, sharding is very useful , when you work with large files and the
> heal is reduced to the size of the shard.
>
> N.B.: Once sharding is enabled, DO NOT DISABLE it - as you will lose
> your data.
>
> Using GLUSTER v7.1 (soon on CentOS  & Debian) allows using latest
> features  and optimizations while support from gluster Dev community is
> quite active.
>
> P.S: I'm wondering how 'performance.cache-size' can both be 32 MB and 128
> MB. Please double-check this (maybe I'm reading it wrong on my smartphone)
> and if needed raise a bug on bugzilla.redhat.com
>
> P.S2: Please provide 'gluster volume info', as 'cluster.quorum-type' ->
> 'none' is not normal for replicated volumes (arbiters are used in replica
> volumes).
>
> According to the docs (https://
> docs.gluster.org/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/)
> :
>
> 

Re: [Gluster-users] GFS performance under heavy traffic

2020-01-06 Thread David Cunningham
Hi Strahil,

We may have had a heal since the GFS arbiter node wasn't accessible from
the GFS clients, only from the other GFS servers. Unfortunately we haven't
been able to produce the problem seen in production while testing so are
unsure whether making the GFS arbiter node directly available to clients
has fixed the issue.

The load on GFS is mainly:
1. There are a small number of files around 5MB in size which are read
often and change infrequently.
2. There are a large number of directories which are opened for reading to
read the list of contents frequently.
3. There are a large number of new files around 5MB in size written
frequently and read infrequently.

We haven't touched the tuning options as we don't really feel qualified to
tell what needs changed from the default. Do you know of any suitable
guides to get started?

For some reason performance.cache-size is reported as both 32MB and 128MB.
Is it worth reporting even for version 5.6?

Here is the "gluster volume info" taken on the first node. Note that the
third node (the arbiter) is currently taken out of the cluster:
Volume Name: gvol0
Type: Replicate
Volume ID: fb5af69e-1c3e-4164-8b23-c1d7bec9b1b6
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: gfs1:/nodirectwritedata/gluster/gvol0
Brick2: gfs2:/nodirectwritedata/gluster/gvol0
Options Reconfigured:
diagnostics.client-log-level: INFO
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet

Thanks for your help and advice.


On Sat, 28 Dec 2019 at 17:46, Strahil  wrote:

> Hi David,
>
> It seems that I have misread your quorum options, so just ignore that from
> my previous e-mail.
>
> Best Regards,
> Strahil Nikolov
> On Dec 27, 2019 15:38, Strahil  wrote:
>
> Hi David,
>
> Gluster supports live rolling upgrade, so there is no need to redeploy at
> all - but the migration notes should be checked as some features must be
> disabled first.
> Also, the gluster client should remount in order to bump the gluster
> op-version.
>
> What kind of workload do you have ?
> I'm asking as there  are predefined (and recommended) settings located at
> /var/lib/gluster/groups .
> You can check the options for each group and cross-check the options
> meaning in the docs before  activating a setting.
>
> I still have a vague feeling  that ,during that high-peak of network
> bandwidth, there was  a  heal  going on. Have you checked that ?
>
> Also, sharding is very useful , when you work with large files and the
> heal is reduced to the size of the shard.
>
> N.B.: Once sharding is enabled, DO NOT DISABLE it - as you will lose
> your data.
>
> Using GLUSTER v7.1 (soon on CentOS  & Debian) allows using latest
> features  and optimizations while support from gluster Dev community is
> quite active.
>
> P.S: I'm wondering how 'performance.cache-size' can both be 32 MB and 128
> MB. Please double-check this (maybe I'm reading it wrong on my smartphone)
> and if needed raise a bug on bugzilla.redhat.com
>
> P.S2: Please provide 'gluster volume info', as 'cluster.quorum-type' ->
> 'none' is not normal for replicated volumes (arbiters are used in replica
> volumes).
>
> According to the docs (https://
> docs.gluster.org/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/)
> :
>
> Note: Enabling the arbiter feature automatically configures client-quorum
> to 'auto'. This setting is not to be changed.
>
> Here is my output (Hyperconverged Virtualization Cluster -> oVirt):
> # gluster volume info engine |  grep quorum
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
>
> Changing quorum is riskier than other options, so you need to take the
> necessary precautions. I think we all know what will happen if the
> cluster is out of quorum and you change the quorum settings to more
> stringent ones :D
>
> P.S3: If you decide to reset your gluster volume to the defaults, you can
> create a new volume (same type as the current one), then get the options for
> that volume, put them in a file, and bulk deploy via 'gluster volume
> set <volname> group custom-group', where the file is located
> on every gluster server in the '/var/lib/gluster/groups' directory.
> Last, get rid of the sample volume.
>
> Best Regards,
> Strahil Nikolov
> On Dec 27, 2019 03:22, David Cunningham  wrote:
>
> Hi Strahil,
>
> Our volume options are as below. Thanks for the suggestion to upgrade to
> version 6 or 7. We could do that by simply removing the current
> installation and installing the new one (since it's not live right now). We
> might have to convince the customer that it's likely to succeed though, as
> at the moment I think they believe that GFS is not going to work for them.
>
> Option  Value
>
> --  -
>
> cluster.lookup-unhashed on
>
> cluster.lookup-optimize on
>
> cluster.min-free-disk   

Re: [Gluster-users] GFS performance under heavy traffic

2019-12-27 Thread Strahil
Hi David,

Gluster supports live rolling upgrade, so there is no need to redeploy at all - 
but the migration notes should be checked as some features must be disabled 
first.
Also, the gluster clients should be remounted in order to bump the gluster op-version.

What kind of workload do you have ?
I'm asking as there  are predefined (and recommended) settings located at 
/var/lib/gluster/groups .
You can check the options for each group and cross-check the options meaning in 
the docs before  activating a setting.

I still have a vague feeling that, during that high peak of network bandwidth, 
there was a heal going on. Have you checked that?
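You can check from any of the servers with something like (gvol0 being your volume):
gluster volume heal gvol0 info summary          # per-brick counts of entries pending heal
gluster volume heal gvol0 statistics heal-count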

Also, sharding is very useful when you work with large files, as the heal is 
then reduced to the size of the shard.

N.B.: Once sharding is enabled, DO NOT DISABLE it - as you will lose your 
data.

Using Gluster v7.1 (soon on CentOS & Debian) gives you the latest features and 
optimizations, and support from the Gluster dev community is quite active.

P.S: I'm wondering how 'performance.cache-size' can be both 32 MB and 128 MB. 
Please double-check this (maybe I'm reading it wrong on my smartphone) and if 
needed raise a bug on bugzilla.redhat.com

P.S2: Please provide 'gluster volume info', as 'cluster.quorum-type' -> 
'none' is not normal for replicated volumes (arbiters are used in replica 
volumes).

According to the docs 
(https://docs.gluster.org/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/)
 :

Note: Enabling the arbiter feature automatically configures client-quorum to 
'auto'. This setting is not to be changed.


Here is my output (Hyperconverged Virtualization Cluster -> oVirt):
# gluster volume info engine |  grep quorum
cluster.quorum-type: auto
cluster.server-quorum-type: server

Changing quorum is riskier than other options, so you need to take the 
necessary precautions. I think we all know what will happen if the cluster 
is out of quorum and you change the quorum settings to more stringent ones :D


P.S3: If you decide to reset your gluster volume to the defaults, you can 
create a new volume (same type as the current one), then get the options for that 
volume, put them in a file, and bulk deploy via 'gluster volume set 
<volname> group custom-group', where the file is located on every 
gluster server in the '/var/lib/gluster/groups' directory.
Last, get rid of the sample volume.
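Roughly, that procedure could look like this (a sketch only - verify the groups directory, which may be /var/lib/glusterd/groups rather than /var/lib/gluster/groups depending on the package, and the option=value file format for your version first):
gluster volume create sample replica 3 srv1:/bricks/sample srv2:/bricks/sample srv3:/bricks/sample
gluster volume get sample all | awk 'NR>2 {print $1"="$2}' > /var/lib/glusterd/groups/custom-group
# copy custom-group to the same directory on every server, then:
gluster volume set gvol0 group custom-group
gluster volume stop sample && gluster volume delete sample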


Best Regards,
Strahil Nikolov

On Dec 27, 2019 03:22, David Cunningham wrote:
>
> Hi Strahil,
>
> Our volume options are as below. Thanks for the suggestion to upgrade to 
> version 6 or 7. We could do that by simply removing the current installation 
> and installing the new one (since it's not live right now). We might have to 
> convince the customer that it's likely to succeed though, as at the moment I 
> think they believe that GFS is not going to work for them.
>
> Option                                  Value                                 
>   
> --                                  -                                 
>   
> cluster.lookup-unhashed                 on                                    
>   
> cluster.lookup-optimize                 on                                    
>   
> cluster.min-free-disk                   10%                                   
>   
> cluster.min-free-inodes                 5%                                    
>   
> cluster.rebalance-stats                 off                                   
>   
> cluster.subvols-per-directory           (null)                                
>   
> cluster.readdir-optimize                off                                   
>   
> cluster.rsync-hash-regex                (null)                                
>   
> cluster.extra-hash-regex                (null)                                
>   
> cluster.dht-xattr-name                  trusted.glusterfs.dht                 
>   
> cluster.randomize-hash-range-by-gfid    off                                   
>   
> cluster.rebal-throttle                  normal                                
>   
> cluster.lock-migration                  off                                   
>   
> cluster.force-migration                 off                                   
>   
> cluster.local-volume-name               (null)                                
>   
> cluster.weighted-rebalance              on                                    
>   
> cluster.switch-pattern                  (null)                                
>   
> cluster.entry-change-log                on                                    
>   
> cluster.read-subvolume                  (null)                                
>   
> cluster.read-subvolume-index            -1                                    
>   
> cluster.read-hash-mode                  1                                     
>   
> cluster.background-self-heal-count      8                                     
>   
> cluster.metadata-self-heal              on      

Re: [Gluster-users] GFS performance under heavy traffic

2019-12-26 Thread David Cunningham
Oh and I see that the op-version is slightly less than the max-op-version:

[root@gfs1 ~]# gluster volume get all cluster.max-op-version
Option  Value

--  -

cluster.max-op-version  50400


[root@gfs1 ~]# gluster volume get all cluster.op-version
Option  Value

--  -

cluster.op-version  5
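
If we wanted to close that gap, it looks like it would just be the following (per the gluster docs; we haven't actually run it here yet):
gluster volume set all cluster.op-version 50400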



On Fri, 27 Dec 2019 at 14:22, David Cunningham 
wrote:

> Hi Strahil,
>
> Our volume options are as below. Thanks for the suggestion to upgrade to
> version 6 or 7. We could do that by simply removing the current
> installation and installing the new one (since it's not live right now). We
> might have to convince the customer that it's likely to succeed though, as
> at the moment I think they believe that GFS is not going to work for them.
>
> Option  Value
>
> --  -
>
> cluster.lookup-unhashed on
>
> cluster.lookup-optimize on
>
> cluster.min-free-disk   10%
>
> cluster.min-free-inodes 5%
>
> cluster.rebalance-stats off
>
> cluster.subvols-per-directory   (null)
>
> cluster.readdir-optimizeoff
>
> cluster.rsync-hash-regex(null)
>
> cluster.extra-hash-regex(null)
>
> cluster.dht-xattr-name  trusted.glusterfs.dht
>
> cluster.randomize-hash-range-by-gfidoff
>
> cluster.rebal-throttle  normal
>
> cluster.lock-migration  off
>
> cluster.force-migration off
>
> cluster.local-volume-name   (null)
>
> cluster.weighted-rebalance  on
>
> cluster.switch-pattern  (null)
>
> cluster.entry-change-logon
>
> cluster.read-subvolume  (null)
>
> cluster.read-subvolume-index-1
>
> cluster.read-hash-mode  1
>
> cluster.background-self-heal-count  8
>
> cluster.metadata-self-heal  on
>
> cluster.data-self-heal  on
>
> cluster.entry-self-heal on
>
> cluster.self-heal-daemonon
>
> cluster.heal-timeout600
>
> cluster.self-heal-window-size   1
>
> cluster.data-change-log on
>
> cluster.metadata-change-log on
>
> cluster.data-self-heal-algorithm(null)
>
> cluster.eager-lock  on
>
> disperse.eager-lock on
>
> disperse.other-eager-lock   on
>
> disperse.eager-lock-timeout 1
>
> disperse.other-eager-lock-timeout   1
>
> cluster.quorum-type none
>
> cluster.quorum-count(null)
>
> cluster.choose-localtrue
>
> cluster.self-heal-readdir-size  1KB
>
> cluster.post-op-delay-secs  1
>
> cluster.ensure-durability   on
>
> cluster.consistent-metadata no
>
> cluster.heal-wait-queue-length  128
>
> cluster.favorite-child-policy   none
>
> cluster.full-lock   yes
>
> cluster.stripe-block-size   128KB
>
> cluster.stripe-coalesce true
>
> diagnostics.latency-measurement off
>
> diagnostics.dump-fd-stats   off
>
> diagnostics.count-fop-hits  off
>
> diagnostics.brick-log-level INFO
>
> diagnostics.client-log-levelINFO
>
> diagnostics.brick-sys-log-level CRITICAL
>
> diagnostics.client-sys-log-levelCRITICAL
>
> diagnostics.brick-logger(null)
>
> diagnostics.client-logger   (null)
>
> diagnostics.brick-log-format(null)
>
> diagnostics.client-log-format   (null)
>
> diagnostics.brick-log-buf-size  5
>
> diagnostics.client-log-buf-size 5
>
> diagnostics.brick-log-flush-timeout 120
>
> diagnostics.client-log-flush-timeout120
>
> diagnostics.stats-dump-interval 0
>
> diagnostics.fop-sample-interval 0
>
> diagnostics.stats-dump-format   json
>
> diagnostics.fop-sample-buf-size 65535
>
> diagnostics.stats-dnscache-ttl-sec  86400
>
> performance.cache-max-file-size 0
>
> performance.cache-min-file-size 0
>
> performance.cache-refresh-timeout   1
>
> performance.cache-priority
>
> performance.cache-size  32MB
>
> performance.io-thread-count 16
>
> performance.high-prio-threads   16
>
> performance.normal-prio-threads 16
>
> performance.low-prio-threads16
>
> performance.least-prio-threads  1
>
> performance.enable-least-priority   on
>
> performance.iot-watchdog-secs   (null)
>
> performance.iot-cleanup-disconnected-reqsoff
>
> performance.iot-pass-throughfalse
>
> 

Re: [Gluster-users] GFS performance under heavy traffic

2019-12-26 Thread David Cunningham
Hi Strahil,

Our volume options are as below. Thanks for the suggestion to upgrade to
version 6 or 7. We could do that by simply removing the current
installation and installing the new one (since it's not live right now). We
might have to convince the customer that it's likely to succeed though, as
at the moment I think they believe that GFS is not going to work for them.

Option  Value

--  -

cluster.lookup-unhashed on

cluster.lookup-optimize on

cluster.min-free-disk   10%

cluster.min-free-inodes 5%

cluster.rebalance-stats off

cluster.subvols-per-directory   (null)

cluster.readdir-optimize                off

cluster.rsync-hash-regex                (null)

cluster.extra-hash-regex                (null)

cluster.dht-xattr-name                  trusted.glusterfs.dht

cluster.randomize-hash-range-by-gfid    off

cluster.rebal-throttle  normal

cluster.lock-migration  off

cluster.force-migration off

cluster.local-volume-name   (null)

cluster.weighted-rebalance  on

cluster.switch-pattern  (null)

cluster.entry-change-log                on

cluster.read-subvolume                  (null)

cluster.read-subvolume-index            -1

cluster.read-hash-mode  1

cluster.background-self-heal-count  8

cluster.metadata-self-heal  on

cluster.data-self-heal  on

cluster.entry-self-heal on

cluster.self-heal-daemon                on

cluster.heal-timeout                    600

cluster.self-heal-window-size   1

cluster.data-change-log on

cluster.metadata-change-log on

cluster.data-self-heal-algorithm        (null)

cluster.eager-lock  on

disperse.eager-lock on

disperse.other-eager-lock   on

disperse.eager-lock-timeout 1

disperse.other-eager-lock-timeout   1

cluster.quorum-type none

cluster.quorum-count                    (null)

cluster.choose-local                    true

cluster.self-heal-readdir-size  1KB

cluster.post-op-delay-secs  1

cluster.ensure-durability   on

cluster.consistent-metadata no

cluster.heal-wait-queue-length  128

cluster.favorite-child-policy   none

cluster.full-lock   yes

cluster.stripe-block-size   128KB

cluster.stripe-coalesce true

diagnostics.latency-measurement off

diagnostics.dump-fd-stats   off

diagnostics.count-fop-hits  off

diagnostics.brick-log-level INFO

diagnostics.client-log-level            INFO

diagnostics.brick-sys-log-level CRITICAL

diagnostics.client-sys-log-level        CRITICAL

diagnostics.brick-logger                (null)

diagnostics.client-logger               (null)

diagnostics.brick-log-format            (null)

diagnostics.client-log-format   (null)

diagnostics.brick-log-buf-size  5

diagnostics.client-log-buf-size 5

diagnostics.brick-log-flush-timeout 120

diagnostics.client-log-flush-timeout    120

diagnostics.stats-dump-interval 0

diagnostics.fop-sample-interval 0

diagnostics.stats-dump-format   json

diagnostics.fop-sample-buf-size 65535

diagnostics.stats-dnscache-ttl-sec  86400

performance.cache-max-file-size 0

performance.cache-min-file-size 0

performance.cache-refresh-timeout   1

performance.cache-priority

performance.cache-size  32MB

performance.io-thread-count 16

performance.high-prio-threads   16

performance.normal-prio-threads 16

performance.low-prio-threads16

performance.least-prio-threads  1

performance.enable-least-priority   on

performance.iot-watchdog-secs   (null)

performance.iot-cleanup-disconnected-reqs    off

performance.iot-pass-through            false

performance.io-cache-pass-through   false

performance.cache-size  128MB

performance.qr-cache-timeout            1

performance.cache-invalidation  false

performance.ctime-invalidation  false

performance.flush-behind                on

performance.nfs.flush-behind            on

performance.write-behind-window-size    1MB

performance.resync-failed-syncs-after-fsync    off

performance.nfs.write-behind-window-size    1MB

performance.strict-o-direct off

performance.nfs.strict-o-direct off

performance.strict-write-ordering   off

performance.nfs.strict-write-ordering   off

performance.write-behind-trickling-writes    on

performance.aggregate-size              128KB

performance.nfs.write-behind-trickling-writes    on

performance.lazy-open   

Re: [Gluster-users] GFS performance under heavy traffic

2019-12-23 Thread David Cunningham
Hello,

In testing we found that actually the GFS client having access to all 3
nodes made no difference to performance. Perhaps that's because the 3rd
node that wasn't accessible from the client before was the arbiter node?

Presumably we shouldn't have an arbiter node listed under
backupvolfile-server when mounting the filesystem? Since it doesn't store
all the data surely it can't be used to serve the data.

We did have direct-io-mode=disable already as well, so that wasn't a factor
in the performance problems.

Thanks again for any advice.



On Mon, 23 Dec 2019 at 13:09, David Cunningham 
wrote:

> Hi Strahil,
>
> Thanks for that. We do have one backup server specified, but will add the
> second backup as well.
>
>
> On Sat, 21 Dec 2019 at 11:26, Strahil  wrote:
>
>> Hi David,
>>
>> Also consider using the mount option to specify backup servers via
>> 'backupvolfile-server=server2:server3' (you can define more, but I don't
>> think replica volumes greater than 3 are useful, except maybe in some special
>> cases).
>>
>> That way, when the primary is lost, your client can reach a backup one
>> without disruption.
>>
>> P.S.: The client may 'hang' if the primary server is rebooted ungracefully,
>> as the connection must time out before FUSE moves on to the next server.
>> There is a special script for killing gluster processes in
>> '/usr/share/gluster/scripts' which can be used to set up a systemd
>> service that does that for you on shutdown.
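>> A rough sketch of such a unit (the script path and name are from memory - check what your glusterfs package actually installs):
>> # /etc/systemd/system/stop-gluster-on-shutdown.service
>> [Unit]
>> Description=Kill gluster processes cleanly at shutdown
>> [Service]
>> Type=oneshot
>> RemainAfterExit=yes
>> ExecStart=/bin/true
>> ExecStop=/usr/share/glusterfs/scripts/stop-all-gluster-processes.sh
>> [Install]
>> WantedBy=multi-user.target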
>>
>> Best Regards,
>> Strahil Nikolov
>> On Dec 20, 2019 23:49, David Cunningham 
>> wrote:
>>
>> Hi Stahil,
>>
>> Ah, that is an important point. One of the nodes is not accessible from
>> the client, and we assumed that it only needed to reach the GFS node that
>> was mounted so didn't think anything of it.
>>
>> We will try making all nodes accessible, as well as
>> "direct-io-mode=disable".
>>
>> Thank you.
>>
>>
>> On Sat, 21 Dec 2019 at 10:29, Strahil Nikolov 
>> wrote:
>>
>> Actually I haven't clarified myself.
>> A FUSE mount on the client side connects directly to all of the bricks
>> that make up the volume.
>> If for some reason (bad routing, a firewall block) the client can reach
>> only 2 out of 3 bricks, this can constantly trigger healing (as one of
>> the bricks is never updated), which will degrade performance and cause
>> excessive network usage.
>> As your attachment is from one of the gluster nodes, this could be the
>> case.
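>> One quick way to check that from the client, for example (24007 is the management port; the per-brick ports come from running 'gluster volume status gvol0' on a server):
>> nc -zv gfs1 24007
>> nc -zv gfs2 24007
>> nc -zv gfs1 49152     # repeat for each brick port the status command reports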
>>
>> Best Regards,
>> Strahil Nikolov
>>
>> On Friday, 20 December 2019 at 01:49:56 GMT+2, David Cunningham <
>> dcunning...@voisonics.com> wrote:
>>
>>
>> Hi Strahil,
>>
>> The chart attached to my original email is taken from the GFS server.
>>
>> I'm not sure what you mean by accessing all bricks simultaneously. We've
>> mounted it from the client like this:
>> gfs1:/gvol0 /mnt/glusterfs/ glusterfs
>> defaults,direct-io-mode=disable,_netdev,backupvolfile-server=gfs2,fetch-attempts=10
>> 0 0
>>
>> Should we do something different to access all bricks simultaneously?
>>
>> Thanks for your help!
>>
>>
>> On Fri, 20 Dec 2019 at 11:47, Strahil Nikolov 
>> wrote:
>>
>> I'm not sure if you did measure the traffic from client side (tcpdump on
>> a client machine) or from Server side.
>>
>> In both cases , please verify that the client accesses all bricks
>> simultaneously, as this can cause unnecessary heals.
>>
>> Have you thought about upgrading to v6? There are some enhancements in v6
>> which could be beneficial.
>>
>> Yet, it is indeed strange that so much traffic is generated with FUSE.
>>
>> Another approach is to test with NFS-Ganesha, which supports pNFS and can
>> natively speak with Gluster; that can bring you closer to the previous
>> setup and also provide some extra performance.
>>
>>
>> Best Regards,
>> Strahil Nikolov
>>
>>
>>
>>
>
> --
> David Cunningham, Voisonics Limited
> http://voisonics.com/
> USA: +1 213 221 1092
> New Zealand: +64 (0)28 2558 3782
>


-- 
David Cunningham, Voisonics Limited
http://voisonics.com/
USA: +1 213 221 1092
New Zealand: +64 (0)28 2558 3782




Re: [Gluster-users] GFS performance under heavy traffic

2019-12-22 Thread David Cunningham
Hi Strahil,

Thanks for that. We do have one backup server specified, but will add the
second backup as well.
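Presumably the fstab line then becomes something like the following (with gfs3 standing in for the third node's hostname):
gfs1:/gvol0 /mnt/glusterfs/ glusterfs defaults,direct-io-mode=disable,_netdev,backupvolfile-server=gfs2:gfs3,fetch-attempts=10 0 0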


On Sat, 21 Dec 2019 at 11:26, Strahil  wrote:

> Hi David,
>
> Also consider using the mount option to specify backup servers via
> 'backupvolfile-server=server2:server3' (you can define more, but I don't
> think replica volumes greater than 3 are useful, except maybe in some special
> cases).
>
> That way, when the primary is lost, your client can reach a backup one
> without disruption.
>
> P.S.: Client may 'hang' - if the primary server got rebooted ungracefully
> - as the communication must timeout before FUSE addresses the next server.
> There is a special script for  killing gluster processes in
> '/usr/share/gluster/scripts' which can be used  for  setting up a systemd
> service to do that for you on shutdown.
>
> Best Regards,
> Strahil Nikolov
> On Dec 20, 2019 23:49, David Cunningham  wrote:
>
> Hi Stahil,
>
> Ah, that is an important point. One of the nodes is not accessible from
> the client, and we assumed that it only needed to reach the GFS node that
> was mounted so didn't think anything of it.
>
> We will try making all nodes accessible, as well as
> "direct-io-mode=disable".
>
> Thank you.
>
>
> On Sat, 21 Dec 2019 at 10:29, Strahil Nikolov 
> wrote:
>
> Actually I haven't clarified myself.
> FUSE mounts on the client side is connecting directly to all bricks
> consisted of the volume.
> If for some reason (bad routing, firewall blocked) there could be cases
> where the client can reach 2 out of 3 bricks and this can constantly cause
> healing to happen (as one of the bricks is never updated) which will
> degrade the performance and cause excessive network usage.
> As your attachment is from one of the gluster nodes, this could be the
> case.
>
> Best Regards,
> Strahil Nikolov
>
> On Friday, 20 December 2019 at 01:49:56 GMT+2, David Cunningham <
> dcunning...@voisonics.com> wrote:
>
>
> Hi Strahil,
>
> The chart attached to my original email is taken from the GFS server.
>
> I'm not sure what you mean by accessing all bricks simultaneously. We've
> mounted it from the client like this:
> gfs1:/gvol0 /mnt/glusterfs/ glusterfs
> defaults,direct-io-mode=disable,_netdev,backupvolfile-server=gfs2,fetch-attempts=10
> 0 0
>
> Should we do something different to access all bricks simultaneously?
>
> Thanks for your help!
>
>
> On Fri, 20 Dec 2019 at 11:47, Strahil Nikolov 
> wrote:
>
> I'm not sure if you did measure the traffic from client side (tcpdump on a
> client machine) or from Server side.
>
> In both cases , please verify that the client accesses all bricks
> simultaneously, as this can cause unnecessary heals.
>
> Have you thought about upgrading to v6? There are some enhancements in v6
> which could be beneficial.
>
> Yet, it is indeed strange that so much traffic is generated with FUSE.
>
> Another approach is to test with NFS-Ganesha, which supports pNFS and can
> natively speak with Gluster; that can bring you closer to the previous
> setup and also provide some extra performance.
>
>
> Best Regards,
> Strahil Nikolov
>
>
>
>

-- 
David Cunningham, Voisonics Limited
http://voisonics.com/
USA: +1 213 221 1092
New Zealand: +64 (0)28 2558 3782




Re: [Gluster-users] GFS performance under heavy traffic

2019-12-20 Thread David Cunningham
Hi Stahil,

Ah, that is an important point. One of the nodes is not accessible from the
client, and we assumed that it only needed to reach the GFS node that was
mounted so didn't think anything of it.

We will try making all nodes accessible, as well as
"direct-io-mode=disable".

Thank you.


On Sat, 21 Dec 2019 at 10:29, Strahil Nikolov  wrote:

> Actually I haven't clarified myself.
> FUSE mounts on the client side is connecting directly to all bricks
> consisted of the volume.
> If for some reason (bad routing, firewall blocked) there could be cases
> where the client can reach 2 out of 3 bricks and this can constantly cause
> healing to happen (as one of the bricks is never updated) which will
> degrade the performance and cause excessive network usage.
> As your attachment is from one of the gluster nodes, this could be the
> case.
>
> Best Regards,
> Strahil Nikolov
>
> On Friday, 20 December 2019 at 01:49:56 GMT+2, David Cunningham <
> dcunning...@voisonics.com> wrote:
>
>
> Hi Strahil,
>
> The chart attached to my original email is taken from the GFS server.
>
> I'm not sure what you mean by accessing all bricks simultaneously. We've
> mounted it from the client like this:
> gfs1:/gvol0 /mnt/glusterfs/ glusterfs
> defaults,direct-io-mode=disable,_netdev,backupvolfile-server=gfs2,fetch-attempts=10
> 0 0
>
> Should we do something different to access all bricks simultaneously?
>
> Thanks for your help!
>
>
> On Fri, 20 Dec 2019 at 11:47, Strahil Nikolov 
> wrote:
>
> I'm not sure if you did measure the traffic from client side (tcpdump on a
> client machine) or from Server side.
>
> In both cases , please verify that the client accesses all bricks
> simultaneously, as this can cause unnecessary heals.
>
> Have you thought about upgrading to v6? There are some enhancements in v6
> which could be beneficial.
>
> Yet, it is indeed strange that so much traffic is generated with FUSE.
>
> Another approach is to test with NFS-Ganesha, which supports pNFS and can
> natively speak with Gluster; that can bring you closer to the previous
> setup and also provide some extra performance.
>
>
> Best Regards,
> Strahil Nikolov
>
>
>
> On Thursday, 19 December 2019 at 02:28:55 GMT+2, David Cunningham <
> dcunning...@voisonics.com> wrote:
>
>
> Hi Raghavendra and Strahil,
>
> We are using GFS version 5.6-1.el7 from the CentOS repository.
> Unfortunately we can't modify the application and it expects to read and
> write from a normal filesystem.
>
> There's around 25GB of data being written during a business day, so over
> 10 hours that's around 0.7 MBps, which has me mystified as to how it can
> generate 114MBps of network traffic. Granted we have read traffic as well,
> but still. The chart shows much more inbound traffic to the GFS server than
> outbound, suggesting the problem is with data writes.
>
> Is it possible with GFS to not check with the other nodes when reading?
> Our data is mostly static and we don't require 100% guarantee that the data
> is up-to-date when reading.
>
> Thanks for any assistance.
>
>
> On Wed, 18 Dec 2019 at 16:39, Raghavendra Gowdappa 
> wrote:
>
> What version of Glusterfs are you using? Though, not sure what's the root
> cause of your problem, just wanted to point out a bug with read-ahead which
> would cause read-amplification over network [1][2], which should be fixed
> in recent versions.
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1214489
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1393419
>
> On Wed, Dec 18, 2019 at 2:50 AM David Cunningham <
> dcunning...@voisonics.com> wrote:
>
> Hello,
>
> We switched a production system to using GFS instead of NFS at the
> weekend, however it didn't go well on Monday when full load hit. The
> application started crashing regularly and we had to revert to NFS. It
> seems that the problem was high network traffic used by GFS.
>
> We've two GFS nodes plus one arbiter node, each about 1.3ms latency from
> each other. Attached is a chart of network traffic on one of the GFS nodes.
> We see that it saturated the 1Gbps link before we reverted to NFS at 15:10.
>
> The question is, why does GFS use so much network traffic and is there
> anything we can do about it? NFS traffic doesn't exceed 4MBps, so 120MBps
> for GFS seems awfully high.
>
> It would also be good to have faster read performance from GFS, but that's
> another issue.
>
> Thanks in advance for any assistance.
>
> --
> David Cunningham, Voisonics Limited
> http://voisonics.com/
> USA: +1 213 221 1092
> New Zealand: +64 (0)28 2558 3782
> 
>
> Community Meeting Calendar:
>
> APAC Schedule -
> Every 2nd and 4th Tuesday at 11:30 AM IST
> Bridge: https://bluejeans.com/441850968
>
> NA/EMEA Schedule -
> Every 1st and 3rd Tuesday at 01:00 PM EDT
> Bridge: https://bluejeans.com/441850968
>
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>
>
>
> --
> David Cunningham, Voisonics Limited
> 

Re: [Gluster-users] GFS performance under heavy traffic

2019-12-19 Thread David Cunningham
Hi Strahil,

The chart attached to my original email is taken from the GFS server.

I'm not sure what you mean by accessing all bricks simultaneously. We've
mounted it from the client like this:
gfs1:/gvol0 /mnt/glusterfs/ glusterfs
defaults,direct-io-mode=disable,_netdev,backupvolfile-server=gfs2,fetch-attempts=10
0 0

Should we do something different to access all bricks simultaneously?

Thanks for your help!


On Fri, 20 Dec 2019 at 11:47, Strahil Nikolov  wrote:

> I'm not sure if you did measure the traffic from client side (tcpdump on a
> client machine) or from Server side.
>
> In both cases , please verify that the client accesses all bricks
> simultaneously, as this can cause unnecessary heals.
>
> Have you thought about upgrading to v6? There are some enhancements in v6
> which could be beneficial.
>
> Yet, it is indeed strange that so much traffic is generated with FUSE.
>
> Another approach is to test with NFS-Ganesha, which supports pNFS and can
> natively speak with Gluster; that can bring you closer to the previous
> setup and also provide some extra performance.
>
>
> Best Regards,
> Strahil Nikolov
>
>
>
> On Thursday, 19 December 2019 at 02:28:55 GMT+2, David Cunningham <
> dcunning...@voisonics.com> wrote:
>
>
> Hi Raghavendra and Strahil,
>
> We are using GFS version 5.6-1.el7 from the CentOS repository.
> Unfortunately we can't modify the application and it expects to read and
> write from a normal filesystem.
>
> There's around 25GB of data being written during a business day, so over
> 10 hours that's around 0.7 MBps, which has me mystified as to how it can
> generate 114MBps of network traffic. Granted we have read traffic as well,
> but still. The chart shows much more inbound traffic to the GFS server than
> outbound, suggesting the problem is with data writes.
>
> Is it possible with GFS to not check with the other nodes when reading?
> Our data is mostly static and we don't require 100% guarantee that the data
> is up-to-date when reading.
>
> Thanks for any assistance.
>
>
> On Wed, 18 Dec 2019 at 16:39, Raghavendra Gowdappa 
> wrote:
>
> What version of Glusterfs are you using? Though, not sure what's the root
> cause of your problem, just wanted to point out a bug with read-ahead which
> would cause read-amplification over network [1][2], which should be fixed
> in recent versions.
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1214489
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1393419
>
> On Wed, Dec 18, 2019 at 2:50 AM David Cunningham <
> dcunning...@voisonics.com> wrote:
>
> Hello,
>
> We switched a production system to using GFS instead of NFS at the
> weekend, however it didn't go well on Monday when full load hit. The
> application started crashing regularly and we had to revert to NFS. It
> seems that the problem was high network traffic used by GFS.
>
> We've two GFS nodes plus one arbiter node, each about 1.3ms latency from
> each other. Attached is a chart of network traffic on one of the GFS nodes.
> We see that it saturated the 1Gbps link before we reverted to NFS at 15:10.
>
> The question is, why does GFS use so much network traffic and is there
> anything we can do about it? NFS traffic doesn't exceed 4MBps, so 120MBps
> for GFS seems awfully high.
>
> It would also be good to have faster read performance from GFS, but that's
> another issue.
>
> Thanks in advance for any assistance.
>
> --
> David Cunningham, Voisonics Limited
> http://voisonics.com/
> USA: +1 213 221 1092
> New Zealand: +64 (0)28 2558 3782
> 
>
> Community Meeting Calendar:
>
> APAC Schedule -
> Every 2nd and 4th Tuesday at 11:30 AM IST
> Bridge: https://bluejeans.com/441850968
>
> NA/EMEA Schedule -
> Every 1st and 3rd Tuesday at 01:00 PM EDT
> Bridge: https://bluejeans.com/441850968
>
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>
>
>
> --
> David Cunningham, Voisonics Limited
> http://voisonics.com/
> USA: +1 213 221 1092
> New Zealand: +64 (0)28 2558 3782
> 
>
> Community Meeting Calendar:
>
> APAC Schedule -
> Every 2nd and 4th Tuesday at 11:30 AM IST
> Bridge: https://bluejeans.com/441850968
>
> NA/EMEA Schedule -
> Every 1st and 3rd Tuesday at 01:00 PM EDT
> Bridge: https://bluejeans.com/441850968
>
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>


-- 
David Cunningham, Voisonics Limited
http://voisonics.com/
USA: +1 213 221 1092
New Zealand: +64 (0)28 2558 3782




Re: [Gluster-users] GFS performance under heavy traffic

2019-12-19 Thread Strahil Nikolov
I'm not sure whether you measured the traffic from the client side (tcpdump on a 
client machine) or from the server side.
In either case, please verify that the client accesses all bricks 
simultaneously, as a client that cannot do so can cause unnecessary heals.
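For example, something like this on the client would show where the bytes actually go (the ports shown are the usual gluster defaults - adjust to what 'gluster volume status' reports):
tcpdump -i eth0 -nn 'port 24007 or portrange 49152-49251' -w /tmp/gluster-client.pcap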
Have you thought about upgrading to v6? There are some enhancements in v6 which 
could be beneficial.
Yet, it is indeed strange that so much traffic is generated with FUSE.
Another approach is to test with NFS-Ganesha, which supports pNFS and can natively 
speak with Gluster; that can bring you closer to the previous setup and also 
provide some extra performance.

Best Regards,
Strahil Nikolov


On Thursday, 19 December 2019 at 02:28:55 GMT+2, David Cunningham wrote:

Hi Raghavendra and Strahil,
We are using GFS version 5.6-1.el7 from the CentOS repository. Unfortunately we 
can't modify the application and it expects to read and write from a normal 
filesystem.
There's around 25GB of data being written during a business day, so over 10 
hours that's around 0.7 MBps, which has me mystified as to how it can generate 
114MBps of network traffic. Granted we have read traffic as well, but still. 
The chart shows much more inbound traffic to the GFS server than outbound, 
suggesting the problem is with data writes.

Is it possible with GFS to not check with the other nodes when reading? Our 
data is mostly static and we don't require 100% guarantee that the data is 
up-to-date when reading.
Thanks for any assistance.

On Wed, 18 Dec 2019 at 16:39, Raghavendra Gowdappa  wrote:

What version of Glusterfs are you using? Though, not sure what's the root cause 
of your problem, just wanted to point out a bug with read-ahead which would 
cause read-amplification over network [1][2], which should be fixed in recent 
versions.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1214489[2] 
https://bugzilla.redhat.com/show_bug.cgi?id=1393419
On Wed, Dec 18, 2019 at 2:50 AM David Cunningham  
wrote:

Hello,
We switched a production system to using GFS instead of NFS at the weekend, 
however it didn't go well on Monday when full load hit. The application started 
crashing regularly and we had to revert to NFS. It seems that the problem was 
high network traffic used by GFS.

We've two GFS nodes plus one arbiter node, each about 1.3ms latency from each 
other. Attached is a chart of network traffic on one of the GFS nodes. We see 
that it saturated the 1Gbps link before we reverted to NFS at 15:10.
The question is, why does GFS use so much network traffic and is there anything 
we can do about it? NFS traffic doesn't exceed 4MBps, so 120MBps for GFS seems 
awfully high.
It would also be good to have faster read performance from GFS, but that's 
another issue.

Thanks in advance for any assistance.

-- 
David Cunningham, Voisonics Limited
http://voisonics.com/
USA: +1 213 221 1092
New Zealand: +64 (0)28 2558 3782

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users




-- 
David Cunningham, Voisonics Limited
http://voisonics.com/
USA: +1 213 221 1092
New Zealand: +64 (0)28 2558 3782



Re: [Gluster-users] GFS performance under heavy traffic

2019-12-19 Thread David Cunningham
Hi Jorick,

Thank you for that, we will try direct-io-mode=disable.

Would anyone have any other advice on the cause of 114MBps of network
traffic with GFS?


On Fri, 20 Dec 2019 at 03:00, Jorick Astrego  wrote:

> Hi David,
>
> Did you try setting "direct-io-mode=disable" on the client mounts? As it
> is mostly static content it would help to use the kernel caching and
> read-ahead mechanisms.
>
> I think the default is enabled.
>
> Regards,
>
> Jorick Astrego
> On 12/19/19 1:28 AM, David Cunningham wrote:
>
> Hi Raghavendra and Strahil,
>
> We are using GFS version 5.6-1.el7 from the CentOS repository.
> Unfortunately we can't modify the application and it expects to read and
> write from a normal filesystem.
>
> There's around 25GB of data being written during a business day, so over
> 10 hours that's around 0.7 MBps, which has me mystified as to how it can
> generate 114MBps of network traffic. Granted we have read traffic as well,
> but still. The chart shows much more inbound traffic to the GFS server than
> outbound, suggesting the problem is with data writes.
>
> Is it possible with GFS to not check with the other nodes when reading?
> Our data is mostly static and we don't require 100% guarantee that the data
> is up-to-date when reading.
>
> Thanks for any assistance.
>
>
> On Wed, 18 Dec 2019 at 16:39, Raghavendra Gowdappa 
> wrote:
>
>> What version of Glusterfs are you using? Though, not sure what's the root
>> cause of your problem, just wanted to point out a bug with read-ahead which
>> would cause read-amplification over network [1][2], which should be fixed
>> in recent versions.
>>
>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1214489
>> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1393419
>>
>> On Wed, Dec 18, 2019 at 2:50 AM David Cunningham <
>> dcunning...@voisonics.com> wrote:
>>
>>> Hello,
>>>
>>> We switched a production system to using GFS instead of NFS at the
>>> weekend, however it didn't go well on Monday when full load hit. The
>>> application started crashing regularly and we had to revert to NFS. It
>>> seems that the problem was high network traffic used by GFS.
>>>
>>> We've two GFS nodes plus one arbiter node, each about 1.3ms latency from
>>> each other. Attached is a chart of network traffic on one of the GFS nodes.
>>> We see that it saturated the 1Gbps link before we reverted to NFS at 15:10.
>>>
>>> The question is, why does GFS use so much network traffic and is there
>>> anything we can do about it? NFS traffic doesn't exceed 4MBps, so 120MBps
>>> for GFS seems awfully high.
>>>
>>> It would also be good to have faster read performance from GFS, but
>>> that's another issue.
>>>
>>> Thanks in advance for any assistance.
>>>
>>> --
>>> David Cunningham, Voisonics Limited
>>> http://voisonics.com/
>>> USA: +1 213 221 1092
>>> New Zealand: +64 (0)28 2558 3782
>>> 
>>>
>>> Community Meeting Calendar:
>>>
>>> APAC Schedule -
>>> Every 2nd and 4th Tuesday at 11:30 AM IST
>>> Bridge: https://bluejeans.com/441850968
>>>
>>> NA/EMEA Schedule -
>>> Every 1st and 3rd Tuesday at 01:00 PM EDT
>>> Bridge: https://bluejeans.com/441850968
>>>
>>> Gluster-users mailing list
>>> Gluster-users@gluster.org
>>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>>
>>
>
> --
> David Cunningham, Voisonics Limited
> http://voisonics.com/
> USA: +1 213 221 1092
> New Zealand: +64 (0)28 2558 3782
>
> 
>
> Community Meeting Calendar:
>
> APAC Schedule -
> Every 2nd and 4th Tuesday at 11:30 AM IST
> Bridge: https://bluejeans.com/441850968
>
> NA/EMEA Schedule -
> Every 1st and 3rd Tuesday at 01:00 PM EDT
> Bridge: https://bluejeans.com/441850968
>
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>
>
>
>
>
> Met vriendelijke groet, With kind regards,
>
> Jorick Astrego
>
> *Netbulae Virtualization Experts *
> --
> Tel: 053 20 30 270 i...@netbulae.eu Staalsteden 4-3A KvK 08198180
> Fax: 053 20 30 271 www.netbulae.eu 7547 TA Enschede BTW NL821234584B01
> --
>
> 
>
> Community Meeting Calendar:
>
> APAC Schedule -
> Every 2nd and 4th Tuesday at 11:30 AM IST
> Bridge: https://bluejeans.com/441850968
>
> NA/EMEA Schedule -
> Every 1st and 3rd Tuesday at 01:00 PM EDT
> Bridge: https://bluejeans.com/441850968
>
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>


-- 
David Cunningham, Voisonics Limited
http://voisonics.com/
USA: +1 213 221 1092
New Zealand: +64 (0)28 2558 3782


Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] GFS performance under heavy traffic

2019-12-19 Thread Jorick Astrego
Hi David,

Did you try setting "direct-io-mode=disable" on the client mounts? As it
is mostly static content it would help to use the kernel caching and
read-ahead mechanisms.

I think the default is enabled.
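
A rough way to check whether the kernel page cache is actually being used on a
client is to read the same file twice and compare the timings (and the network
counters); with direct I/O disabled the second pass should be served from cache
and barely touch the network. The path below is a placeholder:

dd if=/mnt/gvol0/static/somefile of=/dev/null bs=1M   # cold read, goes over the network
dd if=/mnt/gvol0/static/somefile of=/dev/null bs=1M   # repeat, should hit the page cache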

Regards,

Jorick Astrego

On 12/19/19 1:28 AM, David Cunningham wrote:
> Hi Raghavendra and Strahil,
>
> We are using GFS version 5.6-1.el7 from the CentOS repository.
> Unfortunately we can't modify the application and it expects to read
> and write from a normal filesystem.
>
> There's around 25GB of data being written during a business day, so
> over 10 hours that's around 0.7 MBps, which has me mystified as to how
> it can generate 114MBps of network traffic. Granted we have read
> traffic as well, but still. The chart shows much more inbound traffic
> to the GFS server than outbound, suggesting the problem is with data
> writes.
>
> Is it possible with GFS to not check with the other nodes when
> reading? Our data is mostly static and we don't require 100% guarantee
> that the data is up-to-date when reading.
>
> Thanks for any assistance.
>
>
> On Wed, 18 Dec 2019 at 16:39, Raghavendra Gowdappa
> <rgowd...@redhat.com> wrote:
>
> What version of Glusterfs are you using? Though, not sure what's
> the root cause of your problem, just wanted to point out a bug
> with read-ahead which would cause read-amplification over network
> [1][2], which should be fixed in recent versions.
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1214489
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1393419
>
> On Wed, Dec 18, 2019 at 2:50 AM David Cunningham
> <dcunning...@voisonics.com> wrote:
>
> Hello,
>
> We switched a production system to using GFS instead of NFS at
> the weekend, however it didn't go well on Monday when full
> load hit. The application started crashing regularly and we
> had to revert to NFS. It seems that the problem was high
> network traffic used by GFS.
>
> We've two GFS nodes plus one arbiter node, each about 1.3ms
> latency from each other. Attached is a chart of network
> traffic on one of the GFS nodes. We see that it saturated the
> 1Gbps link before we reverted to NFS at 15:10.
>
> The question is, why does GFS use so much network traffic and
> is there anything we can do about it? NFS traffic doesn't
> exceed 4MBps, so 120MBps for GFS seems awfully high.
>
> It would also be good to have faster read performance from
> GFS, but that's another issue.
>
> Thanks in advance for any assistance.
>
> -- 
> David Cunningham, Voisonics Limited
> http://voisonics.com/
> USA: +1 213 221 1092
> New Zealand: +64 (0)28 2558 3782
> 
>
> Community Meeting Calendar:
>
> APAC Schedule -
> Every 2nd and 4th Tuesday at 11:30 AM IST
> Bridge: https://bluejeans.com/441850968
>
> NA/EMEA Schedule -
> Every 1st and 3rd Tuesday at 01:00 PM EDT
> Bridge: https://bluejeans.com/441850968
>
> Gluster-users mailing list
> Gluster-users@gluster.org 
> https://lists.gluster.org/mailman/listinfo/gluster-users
>
>
>
> -- 
> David Cunningham, Voisonics Limited
> http://voisonics.com/
> USA: +1 213 221 1092
> New Zealand: +64 (0)28 2558 3782
>
> 
>
> Community Meeting Calendar:
>
> APAC Schedule -
> Every 2nd and 4th Tuesday at 11:30 AM IST
> Bridge: https://bluejeans.com/441850968
>
> NA/EMEA Schedule -
> Every 1st and 3rd Tuesday at 01:00 PM EDT
> Bridge: https://bluejeans.com/441850968
>
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users




Met vriendelijke groet, With kind regards,

Jorick Astrego

Netbulae Virtualization Experts 



Tel: 053 20 30 270  i...@netbulae.eu  Staalsteden 4-3A  KvK 08198180
Fax: 053 20 30 271  www.netbulae.eu  7547 TA Enschede  BTW NL821234584B01





Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] GFS performance under heavy traffic

2019-12-18 Thread David Cunningham
Hi Raghavendra and Strahil,

We are using GFS version 5.6-1.el7 from the CentOS repository.
Unfortunately we can't modify the application and it expects to read and
write from a normal filesystem.

There's around 25GB of data being written during a business day, so over 10
hours that's around 0.7 MBps, which has me mystified as to how it can
generate 114MBps of network traffic. Granted we have read traffic as well,
but still. The chart shows much more inbound traffic to the GFS server than
outbound, suggesting the problem is with data writes.
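
As a sanity check on those numbers: 25 GB over a 10-hour day is 25,000 MB /
36,000 s, i.e. roughly 0.7 MBps of average write throughput, so the observed
114 MBps is in the region of 160 times the application's raw write rate.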

Is it possible with GFS to not check with the other nodes when reading? Our
data is mostly static and we don't require 100% guarantee that the data is
up-to-date when reading.
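
For anyone reading this in the archive: the read-consistency and client-side
caching trade-offs are controlled through volume options rather than mount
options. A hedged sketch of the kind of settings involved, assuming a replica
volume named gvol0 (the names and values are illustrative only and would need
testing against the specific Gluster release in use):

gluster volume set gvol0 performance.stat-prefetch on
gluster volume set gvol0 performance.md-cache-timeout 10
gluster volume set gvol0 performance.quick-read on
gluster volume set gvol0 cluster.choose-local on   # only useful where clients run on the storage nodes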

Thanks for any assistance.


On Wed, 18 Dec 2019 at 16:39, Raghavendra Gowdappa 
wrote:

> What version of Glusterfs are you using? Though, not sure what's the root
> cause of your problem, just wanted to point out a bug with read-ahead which
> would cause read-amplification over network [1][2], which should be fixed
> in recent versions.
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1214489
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1393419
>
> On Wed, Dec 18, 2019 at 2:50 AM David Cunningham <
> dcunning...@voisonics.com> wrote:
>
>> Hello,
>>
>> We switched a production system to using GFS instead of NFS at the
>> weekend, however it didn't go well on Monday when full load hit. The
>> application started crashing regularly and we had to revert to NFS. It
>> seems that the problem was high network traffic used by GFS.
>>
>> We've two GFS nodes plus one arbiter node, each about 1.3ms latency from
>> each other. Attached is a chart of network traffic on one of the GFS nodes.
>> We see that it saturated the 1Gbps link before we reverted to NFS at 15:10.
>>
>> The question is, why does GFS use so much network traffic and is there
>> anything we can do about it? NFS traffic doesn't exceed 4MBps, so 120MBps
>> for GFS seems awfully high.
>>
>> It would also be good to have faster read performance from GFS, but
>> that's another issue.
>>
>> Thanks in advance for any assistance.
>>
>> --
>> David Cunningham, Voisonics Limited
>> http://voisonics.com/
>> USA: +1 213 221 1092
>> New Zealand: +64 (0)28 2558 3782
>> 
>>
>> Community Meeting Calendar:
>>
>> APAC Schedule -
>> Every 2nd and 4th Tuesday at 11:30 AM IST
>> Bridge: https://bluejeans.com/441850968
>>
>> NA/EMEA Schedule -
>> Every 1st and 3rd Tuesday at 01:00 PM EDT
>> Bridge: https://bluejeans.com/441850968
>>
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>
>

-- 
David Cunningham, Voisonics Limited
http://voisonics.com/
USA: +1 213 221 1092
New Zealand: +64 (0)28 2558 3782


Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] GFS performance under heavy traffic

2019-12-17 Thread Raghavendra Gowdappa
What version of GlusterFS are you using? Though I'm not sure what the root cause
of your problem is, I wanted to point out a bug with read-ahead which can cause
read amplification over the network [1][2]; it should be fixed in recent
versions.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1214489
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1393419
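
For what it's worth, a quick way to rule the read-ahead translator in or out on
an affected version is to disable it per volume and watch whether the traffic
drops (the volume name below is a placeholder, and this is meant as a test knob
rather than a permanent recommendation):

gluster --version
gluster volume set gvol0 performance.read-ahead off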

On Wed, Dec 18, 2019 at 2:50 AM David Cunningham 
wrote:

> Hello,
>
> We switched a production system to using GFS instead of NFS at the
> weekend, however it didn't go well on Monday when full load hit. The
> application started crashing regularly and we had to revert to NFS. It
> seems that the problem was high network traffic used by GFS.
>
> We've two GFS nodes plus one arbiter node, each about 1.3ms latency from
> each other. Attached is a chart of network traffic on one of the GFS nodes.
> We see that it saturated the 1Gbps link before we reverted to NFS at 15:10.
>
> The question is, why does GFS use so much network traffic and is there
> anything we can do about it? NFS traffic doesn't exceed 4MBps, so 120MBps
> for GFS seems awfully high.
>
> It would also be good to have faster read performance from GFS, but that's
> another issue.
>
> Thanks in advance for any assistance.
>
> --
> David Cunningham, Voisonics Limited
> http://voisonics.com/
> USA: +1 213 221 1092
> New Zealand: +64 (0)28 2558 3782
> 
>
> Community Meeting Calendar:
>
> APAC Schedule -
> Every 2nd and 4th Tuesday at 11:30 AM IST
> Bridge: https://bluejeans.com/441850968
>
> NA/EMEA Schedule -
> Every 1st and 3rd Tuesday at 01:00 PM EDT
> Bridge: https://bluejeans.com/441850968
>
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>


Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users