Re: [Gluster-users] Sudden, dramatic performance drops with Glusterfs

2019-11-09 Thread Strahil
There are options that can help a little bit with the ls/find.

Still, the devs will need to know your settings, so your 'gluster volume info'
output is very important.

Try the 'noatime,nodiratime' mount options (if ZFS supports them).
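
For ZFS, atime is a dataset property rather than a plain mount option, so
something like this should have the same effect (the pool/dataset name here is
just an example):

  zfs set atime=off tank/brick
  zfs get atime tank/brick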
Also, as this is a new cluster, you can try to set up XFS and verify whether the
issue is the same.
Red Hat provides an XFS options calculator, but it requires some kind of
subscription (even a dev subscription is enough).
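
If you do try XFS, the usual recommendation for Gluster bricks is an inode
size of 512 bytes; a rough sketch (device and mount point are placeholders):

  mkfs.xfs -f -i size=512 /dev/sdX1
  mount -o noatime,nodiratime /dev/sdX1 /bricks/test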


P.S.: As this is a new cluster, I would recommend switching to Gluster v6.6,
as v7 is too new (for my taste).

If the issue cannot be reproduced on XFS, the problem is either in ZFS or in
the kernel tunables (sysctl).

I'm not sure which I/O scheduler is most suitable for ZFS, so you should
check that too.
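
To check or change the scheduler on a data disk, something like this works
(the device name is just an example; ZFS is often run with 'noop'/'none'
since it does its own I/O scheduling, but verify against the ZoL docs):

  cat /sys/block/sdb/queue/scheduler
  echo noop > /sys/block/sdb/queue/scheduler   # 'none' on multi-queue kernels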



Edit: What kind of workload do you expect (size and number of files, read:write
ratio, etc.)?

Best Regards,
Strahil Nikolov
On Nov 8, 2019 10:32, Michael Rightmire  wrote:
>
> Hi  Strahil, 
>
> Thanks for the reply. See below. 
>
> Also, as an aside, I tested by installing a single CentOS 7 machine with the 
> ZBOD, installing gluster and ZFSonLinux as recommended at:
>  
> https://staged-gluster-docs.readthedocs.io/en/release3.7.0beta1/Administrator%20Guide/Gluster%20On%20ZFS/
>
> And created a gluster volume consisting of one brick made up of a local ZFS 
> raidz2, copied about 4 TB of data to it, and am having the same issue. 
>
> The biggest part of the issue is with things like "ls" and "find". If I read 
> or write a single file, it works great. But if I run rsync 
> (which does a lot of listing, writing, renaming, etc.) it is slow as garbage. 
> I.e. a find command that finishes in 30 seconds when run directly on the 
> underlying ZFS directory takes about an hour through gluster. 
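>
> To illustrate, the comparison I mean is roughly this (the brick path is
> hypothetical and the exact find arguments vary; the gluster mount point is
> the one from my fstab below):
>
>   # directly on the underlying ZFS directory: ~30 seconds
>   time find /zpool1/homes -type f | wc -l
>   # through the gluster FUSE mount: roughly an hour
>   time find /glusterfs/homes -type f | wc -l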
>
>  
> Strahil wrote on 08-Nov-19 05:39:
>
>> Hi Michael,
>>
>> What is your 'gluster volume info' showing?
>
> I've been playing with the install (since it's a fresh machine) so I can't 
> give you verbatim output. However, it was showing two bricks, one on each 
> server, started, and apparently healthy. 
>>
>> How full is your zpool? Usually when it gets too full, ZFS 
>> performance drops seriously.
>
> The zpool is only at about 30% usage. It's a new server setup.
> We have about 10TB of data on a 30TB volume (made up of two 30TB ZFS raidz2 
> bricks, each residing on a different server, connected via a dedicated 10Gb 
> Ethernet link). 
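>
> For reference, I'm checking usage with something like the following (the
> pool name is just an example):
>
>   zpool list tank
>   zfs list tank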
>>
>> Try to rsync a file directly to one of the bricks, then to the other brick 
>> (don't forget to remove the files after that, as gluster will not know about 
>> them).
>
> If I rsync manually, or scp a file, directly to the zpool bricks (outside of 
> gluster), I get 30-100 MB/s (depending on what I'm copying).
> If I rsync THROUGH gluster (via the glusterfs mounts) I get 1-5 MB/s.
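>
> In other words, roughly this comparison (the brick path is hypothetical,
> and the test files get removed afterwards as you suggested):
>
>   # directly to the brick, bypassing gluster: 30-100 MB/s
>   rsync -a --progress testfile /zpool1/homes/
>   # through the gluster FUSE mount: 1-5 MB/s
>   rsync -a --progress testfile /glusterfs/homes/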
>>
>> What are your mounting options ? Usually 'noatime,nodiratime' are a good 
>> start.
>
> I'll try these. Currently using (mounting TO serverA):
>   serverA:/homes /glusterfs/homes glusterfs defaults,_netdev 0 0
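>
> With noatime added, that would presumably become (I still need to confirm
> the FUSE client is happy with these options):
>
>   serverA:/homes /glusterfs/homes glusterfs defaults,_netdev,noatime,nodiratime 0 0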
>>
>> Are you using ZFS provided by Ubuntu packages or directly from the ZOL 
>> project?
>
> ZFS provided by Ubuntu 18 repo...
>   libzfs2linux/bionic-updates,now 0.7.5-1ubuntu16.6 amd64 
> [installed,automatic]
>   zfs-dkms/bionic-updates,bionic-updates,now 0.7.5-1ubuntu16.6 all [installed]
>   zfs-zed/bionic-updates,now 0.7.5-1ubuntu16.6 amd64 [installed,automatic]
>   zfsutils-linux/bionic-updates,now 0.7.5-1ubuntu16.6 amd64 [installed]
>
> Gluster provided by "add-apt-repository ppa:gluster/glusterfs-5" ...
>   glusterfs 5.10
>   Repository revision: git://git.gluster.org/glusterfs.git
>
>   
>>
>> Best Regards,
>> Strahil Nikolov
>>
>> On Nov 6, 2019 12:50, Michael Rightmire  wrote:
>>>
>>> Hello list!
>>>
>>> I'm new to Glusterfs in general. We have chosen to use it as our 
>>> distributed file system on a new set of HA file servers. 
>>>
>>> The setup is: 
>>> 2 SUPERMICRO SuperStorage Server 6049PE1CR36L with 24 x 4TB spinning disks 
>>> and NVMe for cache and slog.
>>> HBA, not a RAID card
>>> Ubuntu 18.04 server (on both systems)
>>> ZFS file storage
>>> Glusterfs 5.10
>>>
>>> Step one was to install Ubuntu, ZFS, and gluster. This all went without 
>>> issue. 
>>> We have 3 identical ZFS raidz2 pools on both servers
>>> We have three glusterfs mirrored volumes - 1 attached to each raidz <

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/118564314

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/118564314

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] hook script question related to ctdb, shared storage, and bind mounts

2019-11-09 Thread Erik Jacobson
> Here is what the setup was:

I thought I'd share an update in case it helps others. Your ideas
inspired me to try a different approach.

We support 4 main distros (and 2 variants of some). We try not to
provide our own versions of distro-supported packages like CTDB where
possible. So a concern for me in modifying services is that they could
be replaced in package updates. There are ways to mitigate that, but
that thought combined with your ideas led me to try this:

- Be sure ctdb service is disabled
- Added a systemd service of my own, oneshot, that runs a helper script
  (a rough sketch of the unit and script follows this list)
- The helper script first ensures the gluster volumes show up
  (I use localhost in my case; besides, in our environment we don't
  want CTDB to have a public IP anyway until NFS can be served, so this
  helps there too)
- Even with the gluster volume reporting healthy, the first attempts to
  mount gluster volumes during init startup fail. So the helper script
  keeps looping until they work. They seem to work on the 2nd try (after
  a 3s sleep on failure).
- Once the mounts are confirmed working and mounted, then my helper
  starts the ctdb service.
- Awkward CTDB problems (where the lock check sometimes fails to detect
  a lock problem) are avoided since we won't start CTDB until we're 100%
  sure the gluster lock is mounted and pointing at gluster.
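
For the curious, the shape of it is roughly this (the unit name, volume
name, and mount point are made up for illustration):

  # /etc/systemd/system/gluster-ctdb-prep.service
  [Unit]
  Description=Mount gluster volumes, then start CTDB
  After=network-online.target glusterd.service
  Wants=network-online.target

  [Service]
  Type=oneshot
  RemainAfterExit=yes
  ExecStart=/usr/local/sbin/gluster-ctdb-prep.sh

  [Install]
  WantedBy=multi-user.target

  # /usr/local/sbin/gluster-ctdb-prep.sh (simplified)
  #!/bin/bash
  # Wait until the gluster volume holding the CTDB lock reports Started.
  until gluster volume info ctdb_lock | grep -q 'Status: Started'; do
      sleep 3
  done
  # First mount attempts during init tend to fail, so keep retrying.
  until mountpoint -q /gluster/lock; do
      mount -t glusterfs localhost:/ctdb_lock /gluster/lock || sleep 3
  done
  # Only now is it safe to bring up CTDB.
  systemctl start ctdb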

The above is working in prototype form so I'm going to start adding
my bind mounts to the equation.

I think I have a solution that will work now and I thank you so much for
the ideas.

I'm taking things from prototype form now on to something we can provide
people.


With regard to pacemaker: there are a few pacemaker solutions that I've
touched, and one I even helped implement. Now, it could be that I'm not
an expert at writing rules, but pacemaker seems to have often given us
more trouble than the problems it solves. I believe this is due to the
complexity and power of the software. I am not knocking
pacemaker. However, a person really has to be a pacemaker expert
to avoid making a mistake that could cause downtime. So I have attempted
to avoid pacemaker in the new solution. I know there are down sides --
fencing is there for a reason -- but as far as I can tell the decision
has been right for us. CTDB is less complicated, even if it does not provide
100% full HA capabilities. That said, in the solution, I've been
careful to future-proof a move to pacemaker. For example, on the gluster
servers/NFS servers, I bring up IP aliases (interfaces) on the network where the
BMCs reside, so we can seamlessly switch to pacemaker with
IPMI/BMC/Redfish fencing later if needed without causing too much pain in
the field with deployed servers.

I do realize there are tools to help configure pacemaker for you. Some
that I've tried have given me mixed results, perhaps due to the
complexity of networking setup in the solutions we have.

As we start to deploy this to more locations, I'll get a feel for whether
a move to pacemaker is right or not. I just share this in the interest
of learning. I'm always willing to learn and improve if I've overlooked
something.

Erik

