Hello,

>4 cores is quite low, especially when healing.
The 4 cores (and, by default, 8GB RAM) are a standard offering in our 
environment.  It is up to our end-users to see whether that is enough for 
their specific usage (most deployed glusters in our environment average 5% 
total usage, so it does seem to be enough).  Even this particular gluster 
hardly ever goes above 10-15%, except when rebalancing after adding bricks 
(then it shoots to 80% for the several hours the rebalance takes).

>Why not FUSE? Ganesha is suitable for UNIX and BSD systems that do not 
>support FUSE.
When we designed our offering we did have a hard time choosing the default: 
FUSE vs NFS.  Since we have a (very) large environment, loads of network 
segments, layer-7 firewalls across subnets, and a variety of possible clients 
(Windows, AIX, Solaris, Linux), we opted for NFS (via Ganesha).  Every OS can 
handle NFS, and by sticking to NFSv4.0 over TCP, firewalls only need to be 
opened for TCP/2049, which is a lot simpler (otherwise all of the different 
ports needed for the brick connections plus glusterfsd itself would have to 
be opened).
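
A minimal sketch, in Python, of how a client could check that this single port 
is reachable through the firewalls.  The function name and the hostname in the 
usage note are hypothetical placeholders, and a successful TCP connect only 
shows that the network path is open, not that the export itself works:

```python
import socket

NFS_PORT = 2049  # the single TCP port mentioned above for NFSv4.0


def tcp_port_open(host: str, port: int = NFS_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, tcp_port_open("gluster-vip.example.com") would be expected to 
return True once the firewall rule for TCP/2049 is in place.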

>Consider increasing 'token' and 'consensus' to more meaningful values; 
>start with a 10s token, for example.
That is actually something we have not looked at yet, thanks for the 
suggestion.  We would need to test this, but it sounds like a good 
recommendation (currently they are at Red Hat's defaults).
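
As a rough sketch of what such a change involves, in Python.  Assumptions: the 
values are in milliseconds as in corosync.conf, and, going by the 
corosync.conf(5) man page, consensus must be at least 1.2 x token and is 
derived that way when left unset.  The helper names are made up for 
illustration, and only the two timeout keys are rendered:

```python
from typing import Dict, Optional


def totem_timeouts(token_ms: int, consensus_ms: Optional[int] = None) -> Dict[str, int]:
    """Return token/consensus timeouts (ms) for the corosync totem section.

    Assumption (from corosync.conf(5)): consensus must be >= 1.2 * token,
    and corosync uses 1.2 * token when consensus is not set explicitly.
    """
    if token_ms <= 0:
        raise ValueError("token must be a positive number of milliseconds")
    # Integer arithmetic, rounded up, so the minimum is never undershot.
    minimum = (token_ms * 12 + 9) // 10
    if consensus_ms is None:
        consensus_ms = minimum
    elif consensus_ms < minimum:
        raise ValueError(
            f"consensus {consensus_ms} ms is below 1.2 * token ({minimum} ms)"
        )
    return {"token": token_ms, "consensus": consensus_ms}


def render_totem_fragment(timeouts: Dict[str, int]) -> str:
    """Render only the two timeout keys; a real totem section has more settings."""
    lines = ["totem {"]
    lines += [f"    {key}: {value}" for key, value in timeouts.items()]
    lines.append("}")
    return "\n".join(lines)


print(render_totem_fragment(totem_timeouts(10_000)))
```

With the suggested 10s token this gives token: 10000 and consensus: 12000.  
Whether those values suit a given cluster is something that would still need 
testing.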

>For performance improvements, I would add some SSDs to the mix (tier 1+ 
>storage) and use the SSD-based LUNs for LVM caching.
As much as we'd like to, that is unfortunately not possible in our 
environment.  We use a 'private cloud' (which is not even a cloud, just a 
beefy VMware environment), and each tenant/consumer gets the same type of 
resources.
Problems of a large (and often sluggish) financial company...
It currently hosts almost 20,000 VM instances in total (80% RHEL based), 
among them approximately 55 gluster clusters.

Raising the corosync timeouts to somewhat larger values does sound like it 
could help in this case (less busy glusters seem to cope well), thanks for 
this suggestion!

Regards,
Nico


----- Original message -----
From: "Strahil Nikolov" <hunter86...@yahoo.com>
To: "gluster-users" <gluster-users@gluster.org>, "Nico van Royen" 
<n...@van-royen.nl>
Sent: Monday 19 October 2020 05:56:20
Subject: Re: [Gluster-users] Setup recommendations

>Size is not that big, 600GB space with around half of that actually used.  
>GlusterFS servers themselves each have 4 cores and 12GB memory.  It might also 
>be important to note that these are VMware hosted nodes that make use of  SAN 
>storage for the datastores.

4 cores is quite low, especially when healing.

>Connected to that NFS (ganesha) exported share are just over 100 clients, all 
>RHEL6 and RHEL7, some spanning 10 network hops away.  All of those clients are 
>(currently) using the same virtual-IP, so all end up on the same server.

Why not FUSE? Ganesha is suitable for UNIX and BSD systems that do not support 
FUSE.

>Note that I mentioned 'should', since at times it had anywhere between 250,000 
>and 1 million files in it (which of course is not advised).  Using some kind 
>of hashing (subfolders spread per day/hour etc) was also already advised.
If you have multiple subvolumes (going from replicate to 
distributed-replicate), you can also spread the load, yet 'find' won't be 
faster :)
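
The per-day/hour spreading mentioned above could be sketched like this in 
Python.  The base path and file name are hypothetical, and the SHA-1 variant 
is just one alternative for files without a useful timestamp, not something 
taken from this thread:

```python
import hashlib
from datetime import datetime
from pathlib import PurePosixPath


def dated_path(base: str, filename: str, when: datetime) -> PurePosixPath:
    """Place the file under base/YYYY-MM-DD/HH/ based on its timestamp."""
    return (
        PurePosixPath(base)
        / when.strftime("%Y-%m-%d")
        / when.strftime("%H")
        / filename
    )


def hashed_path(base: str, filename: str, levels: int = 2) -> PurePosixPath:
    """Place the file under `levels` directories taken from the hex SHA-1 of its name."""
    digest = hashlib.sha1(filename.encode("utf-8")).hexdigest()
    # Each level uses two hex characters, i.e. 256 possible directories.
    parts = [digest[2 * i : 2 * i + 2] for i in range(levels)]
    return PurePosixPath(base).joinpath(*parts, filename)


# Hypothetical share path and file name, for illustration only.
print(dated_path("/mnt/share/inbox", "a.dat", datetime(2020, 10, 19, 5, 56, 20)))
print(hashed_path("/mnt/share/inbox", "a.dat"))
```

With two levels of two hex characters there are 256 x 256 = 65,536 leaf 
directories, so 1 million files would average roughly 15 per directory 
instead of sitting in one folder.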


>Problems that are often seen:
>- Any kind of operation on VMware such as a vMotion, creating a VM snapshot 
>etc. on the node that has these 100+ clients connected causes such a temporary 
>pause that pacemaker decides to switch the resources (causing a failover of 
>the virtual IP address, thus clients connected suffer delay).  
RH corosync defaults are not suitable for VMs. I prefer SUSE's defaults.
Consider increasing 'token' and 'consensus' to more meaningful values; 
start with a 10s token, for example.

>One would expect this to last just shy of a minute, after which clients would 
>happily continue.  However, connected clients are stuck with a non-working 
>mountpoint (commands such as df, ls, find etc. simply hang; they go into an 
>uninterruptible sleep).
In regular HA NFS, there is a "notify" resource that notifies the clients about 
the failover. The stale mounts happen because your IP is brought up before the 
NFS export is ready. As you haven't provided HA details, I can't help much 
there.

>Mounts are 'hard' mounts to ensure guaranteed writes.
That's good. It is also needed for the HA to work properly.

>- Once the number of files is over the 100,000 mark (again in a single, 
>unhashed folder), any operation on that share becomes very sluggish (even a 
>df on a client would take 20-30 seconds, and a find command would take minutes 
>to complete).
I think it's expected...

>Can anyone spot any ideas for improvement?
I would first try to switch to 'replica 3 arbiter 1', as the current setup is 
wasting storage, and next switch the clients to FUSE.
For performance improvements, I would add some SSDs to the mix (tier 1+ 
storage) and use the SSD-based LUNs for LVM caching.

Best Regards,
Strahil Nikolov