Hi Strahil, I believe we are using the standard MTU of 1500 (would need to check with the network people to be sure). Does it make a difference?
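For what it's worth, here is a rough sketch of how we could check this locally rather than waiting on the network people (assuming "eth0" is the interface carrying the Gluster traffic and "gfs2" is one of the peer servers - both names are placeholders):

# Show the MTU currently set on the storage interface
ip link show eth0 | grep -o 'mtu [0-9]*'

# Jumbo frames (MTU 9000) only help if every NIC and switch on the path
# supports them; a non-fragmenting ping verifies the path end to end.
# 8972 = 9000 bytes minus 28 bytes of IP + ICMP headers.
ping -M do -c 3 -s 8972 gfs2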
I'm afraid I don't know about the scheduler - where do I find that? Thank you for the suggestions about turning off performance.read-ahead and performance.readdir-ahead. On Tue, 7 Jan 2020 at 18:08, Strahil <hunter86...@yahoo.com> wrote: > Hi David, > > It's difficult to find anything structured (but it's the same for Linux > and other tech). I use Red Hat's documentation, guides online (cross-checking > the options with the official documentation) and experience shared on the > mailing list. > > I don't see anything (in /var/lib/gluster/groups) that will match your > profile, but I think that you should try with performance.read-ahead and > performance.readdir-ahead 'off'. I have found a bug report (I didn't read the > whole thing) that might be interesting for you: > > https://bugzilla.redhat.com/show_bug.cgi?id=1601166 > > Also, the Arbiter is very important in order to avoid split brain situations > (but based on my experience, issues can still occur), and it's best for the > Arbiter's brick to be an SSD, as it needs to process the metadata as fast as > possible. With v7, there is an option to have an Arbiter even > in the cloud (remote arbiter) that is used only when one data brick is down. > > Please report the issue with the cache - it should not be like that. > > Are you using Jumbo frames (MTU 9000)? > What is your brick's I/O scheduler? > > Best Regards, > Strahil Nikolov > On Jan 7, 2020 01:34, David Cunningham <dcunning...@voisonics.com> wrote: > > Hi Strahil, > > We may have had a heal since the GFS arbiter node wasn't accessible from > the GFS clients, only from the other GFS servers. Unfortunately we haven't > been able to reproduce the problem seen in production while testing, so we are > unsure whether making the GFS arbiter node directly available to clients > has fixed the issue. > > The load on GFS is mainly: > 1. There are a small number of files around 5MB in size which are read > often and change infrequently. > 2. There are a large number of directories which are opened frequently to > read the list of contents. > 3. There are a large number of new files around 5MB in size written > frequently and read infrequently. > > We haven't touched the tuning options as we don't really feel qualified to > tell what needs changing from the defaults. Do you know of any suitable > guides to get started? > > For some reason performance.cache-size is reported as both 32MB and 128MB. > Is it worth reporting even for version 5.6? > > Here is the "gluster volume info" taken on the first node. Note that the > third node (the arbiter) is currently taken out of the cluster: > Volume Name: gvol0 > Type: Replicate > Volume ID: fb5af69e-1c3e-4164-8b23-c1d7bec9b1b6 > Status: Started > Snapshot Count: 0 > Number of Bricks: 1 x 2 = 2 > Transport-type: tcp > Bricks: > Brick1: gfs1:/nodirectwritedata/gluster/gvol0 > Brick2: gfs2:/nodirectwritedata/gluster/gvol0 > Options Reconfigured: > diagnostics.client-log-level: INFO > performance.client-io-threads: off > nfs.disable: on > transport.address-family: inet > > Thanks for your help and advice. > > > On Sat, 28 Dec 2019 at 17:46, Strahil <hunter86...@yahoo.com> wrote: > > Hi David, > > It seems that I have misread your quorum options, so just ignore that from > my previous e-mail. 
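As a minimal sketch of the scheduler check and the read-ahead changes discussed above (assuming the brick sits on a block device named /dev/sdb - a placeholder - and the volume is gvol0, as in the info output quoted above):

# The active I/O scheduler of the brick's block device is shown in brackets
cat /sys/block/sdb/queue/scheduler
# e.g. output: noop [deadline] cfq

# Turn off the two read-ahead translators suggested above
gluster volume set gvol0 performance.read-ahead off
gluster volume set gvol0 performance.readdir-ahead off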
> > Best Regards, > Strahil Nikolov > On Dec 27, 2019 15:38, Strahil <hunter86...@yahoo.com> wrote: > > Hi David, > > Gluster supports live rolling upgrade, so there is no need to redeploy at > all - but the migration notes should be checked as some features must be > disabled first. > Also, the gluster client should remount in order to bump the gluster > op-version. > > What kind of workload do you have? > I'm asking as there are predefined (and recommended) settings located at > /var/lib/gluster/groups. > You can check the options for each group and cross-check the meaning of the > options in the docs before activating a setting. > > I still have a vague feeling that, during that peak of network > bandwidth, there was a heal going on. Have you checked that? > > Also, sharding is very useful when you work with large files, as the > heal is reduced to the size of the shard. > > N.B.: Once sharding is enabled, DO NOT DISABLE it - as you will lose > your data. > > Using Gluster v7.1 (soon on CentOS & Debian) allows using the latest > features and optimizations, while support from the Gluster dev community is > quite active. > > P.S: I'm wondering how 'performance.cache-size' can be both 32 MB and 128 > MB. Please double-check this (maybe I'm reading it wrong on my smartphone) > and if needed raise a bug on bugzilla.redhat.com > > P.S2: Please provide 'gluster volume info', as 'cluster.quorum-type' -> > 'none' is not normal for replicated volumes (arbiters are used in replica > volumes). > > According to the documentation (https://docs.gluster.org/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/): > > *Note:* Enabling the arbiter feature *automatically* configures client-quorum > to 'auto'. This setting is *not* to be changed. > > Here is my output (Hyperconverged Virtualization Cluster -> oVirt): > # gluster volume info engine | grep quorum > cluster.quorum-type: auto > cluster.server-quorum-type: server > > Changing quorum is riskier than other options, so you need to take the > necessary precautions. I think we all know what will happen if the > cluster is out of quorum and you change the quorum settings to more > stringent ones :D > > P.S3: If you decide to reset your gluster volume to the defaults, you can > create a new volume (same type as the current one), then get the options for > that volume, put them in a file, and bulk apply them via 'gluster volume > set <Original Volume> group custom-group', where the file is located > on every gluster server in the '/var/lib/gluster/groups' directory. > Last, get rid of the sample volume. > > Best Regards, > Strahil Nikolov > On Dec 27, 2019 03:22, David Cunningham <dcunning...@voisonics.com> wrote: > > Hi Strahil, > > Our volume options are as below. Thanks for the suggestion to upgrade to > version 6 or 7. We could do that by simply removing the current > installation and installing the new one (since it's not live right now). We > might have to convince the customer that it's likely to succeed though, as > at the moment I think they believe that GFS is not going to work for them. 
> > Option Value > > ------ ----- > > cluster.lookup-unhashed on > > cluster.lookup-optimize on > > cluster.min-free-disk 10% > > cluster.min-free-inodes 5% > > cluster.rebalance-stats off > > cluster.subvols-per-directory (null) > > cluster.readdir-optimize off > > cluster.rsync-hash-regex (null) > > cluster.extra-hash-regex (null) > > cluster.dht-xattr-name trusted.glusterfs.dht > > cluster.randomize-hash-range-by-gfid off > > cluster.rebal-throttle normal > > cluster.lock-migration off > > cluster.force-migration off > > cluster.local-volume-name (null) > > cluster.weighted-rebalance on > > cluster.switch-pattern (null) > > cluster.entry-change-log on > > cluster.read-subvolume (null) > > cluster.read-subvolume-index -1 > > cluster.read-hash-mode 1 > > cluster.background-self-heal-count 8 > > cluster.metadata-self-heal on > > cluster.data-self-heal on > > cluster.entry-self-heal on > > cluster.self-heal-daemon on > > cluster.heal-timeout 600 > > cluster.self-heal-window-size 1 > > cluster.data-change-log on > > cluster.metadata-change-log on > > cluster.data-self-heal-algorithm (null) > > cluster.eager-lock on > > disperse.eager-lock on > > disperse.other-eager-lock on > > disperse.eager-lock-timeout 1 > > disperse.other-eager-lock-timeout 1 > > cluster.quorum-type none > > cluster.quorum-count (null) > > cluster.choose-local true > > cluster.self-heal-readdir-size 1KB > > cluster.post-op-delay-secs 1 > > cluster.ensure-durability on > > cluster.consistent-metadata no > > cluster.heal-wait-queue-length 128 > > cluster.favorite-child-policy none > > cluster.full-lock yes > > cluster.stripe-block-size 128KB > > cluster.stripe-coalesce true > > diagnostics.latency-measurement off > > diagnostics.dump-fd-stats off > > diagnostics.count-fop-hits off > > diagnostics.brick-log-level INFO > > diagnostics.client-log-level INFO > > diagnostics.brick-sys-log-level CRITICAL > > diagnostics.client-sys-log-level CRITICAL > > diagnostics.brick-logger (null) > > diagnostics.client-logger (null) > > diagnostics.brick-log-format (null) > > diagnostics.client-log-format (null) > > diagnostics.brick-log-buf-size 5 > > diagnostics.client-log-buf-size 5 > > diagnostics.brick-log-flush-timeout 120 > > diagnostics.client-log-flush-timeout 120 > > diagnostics.stats-dump-interval 0 > > diagnostics.fop-sample-interval 0 > > diagnostics.stats-dump-format json > > diagnostics.fop-sample-buf-size 65535 > > diagnostics.stats-dnscache-ttl-sec 86400 > > performance.cache-max-file-size 0 > > performance.cache-min-file-size 0 > > performance.cache-refresh-timeout 1 > > performance.cache-priority > > performance.cache-size 32MB > > performance.io-thread-count 16 > > performance.high-prio-threads 16 > > performance.normal-prio-threads 16 > > performance.low-prio-threads 16 > > performance.least-prio-threads 1 > > performance.enable-least-priority on > > performance.iot-watchdog-secs (null) > > performance.iot-cleanup-disconnected-reqsoff > > performance.iot-pass-through false > > performance.io-cache-pass-through false > > performance.cache-size 128MB > > performance.qr-cache-timeout 1 > > performance.cache-invalidation false > > performance.ctime-invalidation false > > performance.flush-behind on > > performance.nfs.flush-behind on > > performance.write-behind-window-size 1MB > > performance.resync-failed-syncs-after-fsyncoff > > performance.nfs.write-behind-window-size1MB > > performance.strict-o-direct off > > performance.nfs.strict-o-direct off > > performance.strict-write-ordering off > > 
performance.nfs.strict-write-ordering off > > performance.write-behind-trickling-writeson > > performance.aggregate-size 128KB > > performance.nfs.write-behind-trickling-writeson > > performance.lazy-open yes > > performance.read-after-open yes > > performance.open-behind-pass-through false > > performance.read-ahead-page-count 4 > > performance.read-ahead-pass-through false > > performance.readdir-ahead-pass-through false > > performance.md-cache-pass-through false > > performance.md-cache-timeout 1 > > performance.cache-swift-metadata true > > performance.cache-samba-metadata false > > performance.cache-capability-xattrs true > > performance.cache-ima-xattrs true > > performance.md-cache-statfs off > > performance.xattr-cache-list > > performance.nl-cache-pass-through false > > features.encryption off > > encryption.master-key (null) > > encryption.data-key-size 256 > > encryption.block-size 4096 > > network.frame-timeout 1800 > > network.ping-timeout 42 > > network.tcp-window-size (null) > > network.remote-dio disable > > client.event-threads 2 > > client.tcp-user-timeout 0 > > client.keepalive-time 20 > > client.keepalive-interval 2 > > client.keepalive-count 9 > > network.tcp-window-size (null) > > network.inode-lru-limit 16384 > > auth.allow * > > auth.reject (null) > > transport.keepalive 1 > > server.allow-insecure on > > server.root-squash off > > server.anonuid 65534 > > server.anongid 65534 > > server.statedump-path /var/run/gluster > > server.outstanding-rpc-limit 64 > > server.ssl (null) > > auth.ssl-allow * > > server.manage-gids off > > server.dynamic-auth on > > client.send-gids on > > server.gid-timeout 300 > > server.own-thread (null) > > server.event-threads 1 > > server.tcp-user-timeout 0 > > server.keepalive-time 20 > > server.keepalive-interval 2 > > server.keepalive-count 9 > > transport.listen-backlog 1024 > > ssl.own-cert (null) > > ssl.private-key (null) > > ssl.ca-list (null) > > ssl.crl-path (null) > > ssl.certificate-depth (null) > > ssl.cipher-list (null) > > ssl.dh-param (null) > > ssl.ec-curve (null) > > transport.address-family inet > > performance.write-behind on > > performance.read-ahead on > > performance.readdir-ahead on > > performance.io-cache on > > performance.quick-read on > > performance.open-behind on > > performance.nl-cache off > > performance.stat-prefetch on > > performance.client-io-threads off > > performance.nfs.write-behind on > > performance.nfs.read-ahead off > > performance.nfs.io-cache off > > performance.nfs.quick-read off > > performance.nfs.stat-prefetch off > > performance.nfs.io-threads off > > performance.force-readdirp true > > performance.cache-invalidation false > > features.uss off > > features.snapshot-directory .snaps > > features.show-snapshot-directory off > > features.tag-namespaces off > > network.compression off > > network.compression.window-size -15 > > network.compression.mem-level 8 > > network.compression.min-size 0 > > network.compression.compression-level -1 > > network.compression.debug false > > features.default-soft-limit 80% > > features.soft-timeout 60 > > features.hard-timeout 5 > > features.alert-time 86400 > > features.quota-deem-statfs off > > geo-replication.indexing off > > geo-replication.indexing off > > geo-replication.ignore-pid-check off > > geo-replication.ignore-pid-check off > > features.quota off > > features.inode-quota off > > features.bitrot disable > > debug.trace off > > debug.log-history no > > debug.log-file no > > debug.exclude-ops (null) > > debug.include-ops (null) > > 
debug.error-gen off > > debug.error-failure (null) > > debug.error-number (null) > > debug.random-failure off > > debug.error-fops (null) > > nfs.disable on > > features.read-only off > > features.worm off > > features.worm-file-level off > > features.worm-files-deletable on > > features.default-retention-period 120 > > features.retention-mode relax > > features.auto-commit-period 180 > > storage.linux-aio off > > storage.batch-fsync-mode reverse-fsync > > storage.batch-fsync-delay-usec 0 > > storage.owner-uid -1 > > storage.owner-gid -1 > > storage.node-uuid-pathinfo off > > storage.health-check-interval 30 > > storage.build-pgfid off > > storage.gfid2path on > > storage.gfid2path-separator : > > storage.reserve 1 > > storage.health-check-timeout 10 > > storage.fips-mode-rchecksum off > > storage.force-create-mode 0000 > > storage.force-directory-mode 0000 > > storage.create-mask 0777 > > storage.create-directory-mask 0777 > > storage.max-hardlinks 100 > > storage.ctime off > > storage.bd-aio off > > config.gfproxyd off > > cluster.server-quorum-type off > > cluster.server-quorum-ratio 0 > > changelog.changelog off > > changelog.changelog-dir {{ brick.path > }}/.glusterfs/changelogs > changelog.encoding ascii > > changelog.rollover-time 15 > > changelog.fsync-interval 5 > > changelog.changelog-barrier-timeout 120 > > changelog.capture-del-path off > > features.barrier disable > > features.barrier-timeout 120 > > features.trash off > > features.trash-dir .trashcan > > features.trash-eliminate-path (null) > > features.trash-max-filesize 5MB > > features.trash-internal-op off > > cluster.enable-shared-storage disable > > cluster.write-freq-threshold 0 > > cluster.read-freq-threshold 0 > > cluster.tier-pause off > > cluster.tier-promote-frequency 120 > > cluster.tier-demote-frequency 3600 > > cluster.watermark-hi 90 > > cluster.watermark-low 75 > > cluster.tier-mode cache > > cluster.tier-max-promote-file-size 0 > > cluster.tier-max-mb 4000 > > cluster.tier-max-files 10000 > > cluster.tier-query-limit 100 > > cluster.tier-compact on > > cluster.tier-hot-compact-frequency 604800 > > cluster.tier-cold-compact-frequency 604800 > > features.ctr-enabled off > > features.record-counters off > > features.ctr-record-metadata-heat off > > features.ctr_link_consistency off > > features.ctr_lookupheal_link_timeout 300 > > features.ctr_lookupheal_inode_timeout 300 > > features.ctr-sql-db-cachesize 12500 > > features.ctr-sql-db-wal-autocheckpoint 25000 > > features.selinux on > > locks.trace off > > locks.mandatory-locking off > > cluster.disperse-self-heal-daemon enable > > cluster.quorum-reads no > > client.bind-insecure (null) > > features.shard off > > features.shard-block-size 64MB > > features.shard-lru-limit 16384 > > features.shard-deletion-rate 100 > > features.scrub-throttle lazy > > features.scrub-freq biweekly > > features.scrub false > > features.expiry-time 120 > > features.cache-invalidation off > > features.cache-invalidation-timeout 60 > > features.leases off > > features.lease-lock-recall-timeout 60 > > disperse.background-heals 8 > > disperse.heal-wait-qlength 128 > > cluster.heal-timeout 600 > > dht.force-readdirp on > > disperse.read-policy gfid-hash > > cluster.shd-max-threads 1 > > cluster.shd-wait-qlength 1024 > > cluster.locking-scheme full > > cluster.granular-entry-heal no > > features.locks-revocation-secs 0 > > features.locks-revocation-clear-all false > > features.locks-revocation-max-blocked 0 > > features.locks-monkey-unlocking false > > features.locks-notify-contention no 
> > features.locks-notify-contention-delay 5 > > disperse.shd-max-threads 1 > > disperse.shd-wait-qlength 1024 > > disperse.cpu-extensions auto > > disperse.self-heal-window-size 1 > > cluster.use-compound-fops off > > performance.parallel-readdir off > > performance.rda-request-size 131072 > > performance.rda-low-wmark 4096 > > performance.rda-high-wmark 128KB > > performance.rda-cache-limit 10MB > > performance.nl-cache-positive-entry false > > performance.nl-cache-limit 10MB > > performance.nl-cache-timeout 60 > > cluster.brick-multiplex off > > cluster.max-bricks-per-process 0 > > disperse.optimistic-change-log on > > disperse.stripe-cache 4 > > cluster.halo-enabled False > > cluster.halo-shd-max-latency 99999 > > cluster.halo-nfsd-max-latency 5 > > cluster.halo-max-latency 5 > > cluster.halo-max-replicas 99999 > > cluster.halo-min-replicas 2 > > cluster.daemon-log-level INFO > > debug.delay-gen off > > delay-gen.delay-percentage 10% > > delay-gen.delay-duration 100000 > > delay-gen.enable > > disperse.parallel-writes on > > features.sdfs on > > features.cloudsync off > > features.utime off > > ctime.noatime on > > feature.cloudsync-storetype (null) > > > Thanks again. > > > On Wed, 25 Dec 2019 at 05:51, Strahil <hunter86...@yahoo.com> wrote: > > Hi David, > > On Dec 24, 2019 02:47, David Cunningham <dcunning...@voisonics.com> wrote: > > > > Hello, > > > > In testing we found that actually the GFS client having access to all 3 > nodes made no difference to performance. Perhaps that's because the 3rd > node that wasn't accessible from the client before was the arbiter node? > It makes sense, as no data is being generated towards the arbiter. > > Presumably we shouldn't have an arbiter node listed under > backupvolfile-server when mounting the filesystem? Since it doesn't store > all the data surely it can't be used to serve the data. > > I have my arbiter defined as the last backup and no issues so far. At least > the admin can easily identify the bricks from the mount options. > > > We did have direct-io-mode=disable already as well, so that wasn't a > factor in the performance problems. > > Have you checked that the client version is not too old? > Also, you can check the cluster's operation version: > # gluster volume get all cluster.max-op-version > # gluster volume get all cluster.op-version > > The cluster's op-version should be at the max-op-version. > > Two options come to mind: > A) Upgrade to the latest Gluster v6 or even v7 (I know it won't be easy) and > then set the op-version to the highest possible. > # gluster volume get all cluster.max-op-version > # gluster volume get all cluster.op-version > > B) Deploy an NFS Ganesha server and connect the client over NFS v4.2 (and > control the parallel connections from Ganesha). > > Can you provide your Gluster volume's options? > 'gluster volume get <VOLNAME> all' > > > Thanks again for any advice. > > > > > > > > On Mon, 23 Dec 2019 at 13:09, David Cunningham <dcunning...@voisonics.com> wrote: > >> > >> Hi Strahil, > >> > >> Thanks for that. We do have one backup server specified, but will add > the second backup as well. > >> > >> > >> On Sat, 21 Dec 2019 at 11:26, Strahil <hunter86...@yahoo.com> wrote: > >>> > >>> Hi David, > >>> > >>> Also consider using the mount option to specify backup servers via > 'backupvolfile-server=server2:server3' (you can define more, but I don't > think replica volumes greater than 3 are useful, maybe in some special > cases). 
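Following that suggestion, a sketch of what the client's fstab entry might look like with both remaining servers listed after the primary (the existing options come from the mount line quoted later in the thread; "gfs3" being the arbiter node's hostname, and it being reachable from the client, are assumptions):

# /etc/fstab - primary server plus both backup volfile servers
gfs1:/gvol0 /mnt/glusterfs/ glusterfs defaults,direct-io-mode=disable,_netdev,backupvolfile-server=gfs2:gfs3,fetch-attempts=10 0 0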
> >>> > In such a way, when the primary is lost, your client can reach a backup > one without disruption. > >>> > P.S.: The client may 'hang' - if the primary server got rebooted > ungracefully - as the communication must time out before FUSE addresses the > next server. There is a special script for killing gluster processes in > '/usr/share/gluster/scripts' which can be used for setting up a systemd > service to do that for you on shutdown. > >>> > Best Regards, > >>> Strahil Nikolov > >>> > >>> On Dec 20, 2019 23:49, David Cunningham <dcunning...@voisonics.com> > wrote: > >>>> > >>>> Hi Strahil, > >>>> > >>>> Ah, that is an important point. One of the nodes is not accessible > from the client, and we assumed that it only needed to reach the GFS node > that was mounted, so we didn't think anything of it. > >>>> > >>>> We will try making all nodes accessible, as well as > "direct-io-mode=disable". > >>>> > >>>> Thank you. > >>>> > >>>> > >>>> On Sat, 21 Dec 2019 at 10:29, Strahil Nikolov <hunter86...@yahoo.com> > wrote: > >>>>> > >>>>> Actually I haven't clarified myself. > >>>>> FUSE mounts on the client side connect directly to all bricks > that make up the volume. > >>>>> If for some reason (bad routing, a firewall block) the client can > reach only 2 out of 3 bricks, this can constantly > cause healing to happen (as one of the bricks is never updated), which will > degrade performance and cause excessive network usage. > >>>>> As your attachment is from one of the gluster nodes, this could be > the case. > >>>>> > >>>>> Best Regards, > >>>>> Strahil Nikolov > >>>>> > >>>>> On Friday, 20 December 2019 at 01:49:56 GMT+2, David > Cunningham <dcunning...@voisonics.com> wrote: > >>>>> > >>>>> > >>>>> Hi Strahil, > >>>>> > >>>>> The chart attached to my original email is taken from the GFS server. > >>>>> > >>>>> I'm not sure what you mean by accessing all bricks simultaneously. > We've mounted it from the client like this: > >>>>> gfs1:/gvol0 /mnt/glusterfs/ glusterfs > defaults,direct-io-mode=disable,_netdev,backupvolfile-server=gfs2,fetch-attempts=10 > 0 0 > >>>>> > >>>>> Should we do something different to access all bricks simultaneously? > >>>>> > >>>>> Thanks for your help! > >>>>> > >>>>> > >>>>> On Fri, 20 Dec 2019 at 11:47, Strahil Nikolov <hunter86...@yahoo.com> > wrote: > >>>>>> > >>>>>> I'm not sure if you measured the traffic from the client side > (tcpdump on a client machine) or from the server side. > >>>>>> > >>>>>> In both cases, please verify that the client accesses all bricks > simultaneously, as not doing so can cause unnecessary heals. > >>>>>> > >>>>>> Have you thought about upgrading to v6? There are some enhancements > in v6 which could be beneficial. > >>>>>> > >>>>>> Yet, it is indeed strange that so much traffic is generated with > FUSE. > >>>>>> > >>>>>> Another approach is to test with NFS-Ganesha, which supports pNFS and > can natively speak with Gluster; that can bring you closer to the > previous setup and also provide some extra performance. 
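For reference, a sketch of the client-side traffic capture described above (assuming "eth0" is the client's interface and gfs1/gfs2/gfs3 are the Gluster servers; adjust the names to match the real setup):

# Capture only Gluster-related traffic on the client for later inspection
tcpdump -i eth0 -nn -w /tmp/gluster-client.pcap 'host gfs1 or host gfs2 or host gfs3'

# A quick per-source packet count can then be pulled from the capture:
tcpdump -nn -r /tmp/gluster-client.pcap | awk '{print $3}' | sort | uniq -c | sort -rn | head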
> >>>>>> > >>>>>> > >>>>>> Best Regards, > >>>>>> Strahil Nikolov > >>>>>> > >>>>>> > >>>>>> > >> > >> > >> -- > >> David Cunningham, Voisonics Limited > >> http://voisonics.com/ > >> USA: +1 213 221 1092 > >> New Zealand: +64 (0)28 2558 3782 > > > > > > > > -- > > David Cunningham, Voisonics Limited > > http://voisonics.com/ > > USA: +1 213 221 1092 > > New Zealand: +64 (0)28 2558 3782 > > Best Regards, > Strahil Nikolov > > > > -- > David Cunningham, Voisonics Limited > http://voisonics.com/ > USA: +1 213 221 1092 > New Zealand: +64 (0)28 2558 3782 > > > > -- > David Cunningham, Voisonics Limited > http://voisonics.com/ > USA: +1 213 221 1092 > New Zealand: +64 (0)28 2558 3782 > > -- David Cunningham, Voisonics Limited http://voisonics.com/ USA: +1 213 221 1092 New Zealand: +64 (0)28 2558 3782