Re: [ceph-users] (yet another) multi active mds advise needed

2018-05-19 Thread Webert de Souza Lima
Hi Daniel,

Thanks for clarifying.
I'll have a look at the dirfrag option.

Regards,
Webert Lima

On Sat, May 19, 2018 at 1:18 AM, Daniel Baumann 
wrote:

> On 05/19/2018 01:13 AM, Webert de Souza Lima wrote:
> > New question: will it make any difference to the balancing if, instead of
> > having the MAIL directory in the root of cephfs and the domains'
> > subtrees inside it, I discard the parent dir and put all the subtrees
> > right in the cephfs root?
>
> The balancing between the MDSs is influenced by which directories are
> accessed: the currently accessed directory trees are divided between the
> MDSs (also check the dirfrag option in the docs). Assuming you have the
> same access pattern, the "fragmentation" between the MDSs happens at
> these "target directories", so it doesn't matter whether these directories
> are further up or down in the same filesystem tree.
>
> In the multi-MDS scenario where the MDS serving rank 0 fails, the
> effects at the moment of the failure for any cephfs client accessing a
> directory/file are the same (as described in an earlier mail),
> regardless of which level the directory/file is at within the filesystem.
>
> Regards,
> Daniel


Re: [ceph-users] (yet another) multi active mds advise needed

2018-05-18 Thread Daniel Baumann
On 05/19/2018 01:13 AM, Webert de Souza Lima wrote:
> New question: will it make any difference to the balancing if, instead of
> having the MAIL directory in the root of cephfs and the domains'
> subtrees inside it, I discard the parent dir and put all the subtrees right
> in the cephfs root?

The balancing between the MDSs is influenced by which directories are
accessed: the currently accessed directory trees are divided between the
MDSs (also check the dirfrag option in the docs). Assuming you have the
same access pattern, the "fragmentation" between the MDSs happens at
these "target directories", so it doesn't matter whether these directories
are further up or down in the same filesystem tree.
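
For reference, a quick way to look at the fragmentation-related knobs on a
running MDS is via its admin socket; a rough sketch (the daemon name is just
a placeholder, run it on the host where that MDS lives):

    ceph daemon mds.mds-a config show | grep mds_bal
    ceph daemon mds.mds-a config get mds_bal_split_size

The mds_bal_split_size / mds_bal_merge_size / mds_bal_split_rd /
mds_bal_split_wr options control when a directory fragment is split or
merged. On Luminous, directory fragmentation should be enabled by default
for new filesystems; if your filesystem predates that, you may first need
"ceph fs set <fs> allow_dirfrags true".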

In the multi-MDS scenario where the MDS serving rank 0 fails, the
effects at the moment of the failure for any cephfs client accessing a
directory/file are the same (as described in an earlier mail),
regardless of which level the directory/file is at within the filesystem.
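
(You can always check which daemon currently holds which rank, and which
standbys are available, with e.g.:

    ceph fs status
    ceph mds stat

the exact output format differs a bit between releases.)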

Regards,
Daniel


Re: [ceph-users] (yet another) multi active mds advise needed

2018-05-18 Thread Webert de Souza Lima
Hi Patrick

On Fri, May 18, 2018 at 6:20 PM Patrick Donnelly 
wrote:

> Each MDS may have multiple subtrees they are authoritative for. Each
> MDS may also replicate metadata from another MDS as a form of load
> balancing.


OK, it's good to know that it actually does some load balancing. Thanks.
New question: will it make any difference to the balancing if, instead of
having the MAIL directory in the root of cephfs and the domains' subtrees
inside it, I discard the parent dir and put all the subtrees right in the
cephfs root?


> standby-replay daemons are not available to take over for ranks other
> than the one they follow. So, you would want to have a standby-replay
> daemon for each rank or just have normal standbys. It will likely
> depend on the size of your MDS (cache size) and available hardware.
>
> It's best if you see if the normal balancer (especially in v12.2.6
> [1]) can handle the load for you without trying to micromanage things
> via pins. You can use pinning to isolate metadata load from other
> ranks as a stop-gap measure.
>

OK, I will start with the simplest setup. This can be changed after
deployment if it turns out to be necessary.

On Fri, May 18, 2018 at 6:38 PM Daniel Baumann 
wrote:

> jftr, having 3 active MDS and 3 standby-replay resulted in a longer
> downtime for us in May 2017, due to http://tracker.ceph.com/issues/21749
>
> we're not using standby-replay MDSs anymore but only "normal" standbys,
> and haven't had any problems since (we were running kraken then, upgraded
> to luminous last fall).
>

Thank you very much for your feedback, Daniel. I'll go with the regular
standby daemons, then.

Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*


Re: [ceph-users] (yet another) multi active mds advise needed

2018-05-18 Thread Daniel Baumann
On 05/18/2018 11:19 PM, Patrick Donnelly wrote:
> So, you would want to have a standby-replay
> daemon for each rank or just have normal standbys. It will likely
> depend on the size of your MDS (cache size) and available hardware.

jftr, having 3 active MDS and 3 standby-replay resulted in a longer
downtime for us in May 2017, due to http://tracker.ceph.com/issues/21749

(http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-October/thread.html#21390
- thanks again for the help back then, still much appreciated)

we're not using standby-replay MDSs anymore but only "normal" standbys,
and haven't had any problems since (we were running kraken then, upgraded
to luminous last fall).

Regards,
Daniel


Re: [ceph-users] (yet another) multi active mds advise needed

2018-05-18 Thread Patrick Donnelly
Hello Webert,

On Fri, May 18, 2018 at 1:10 PM, Webert de Souza Lima
 wrote:
> Hi,
>
> We're migrating from a Jewel / filestore based cephfs architecture to a
> Luminous / bluestore based one.
>
> One MUST HAVE is multiple active MDS daemons. I'm still lacking knowledge of
> how this actually works.
> After reading the docs and the ML, we learned that they work by dividing up
> the responsibilities, each with its own exclusive directory subtree (please
> correct me if I'm wrong).

Each MDS may have multiple subtrees they are authoritative for. Each
MDS may also replicate metadata from another MDS as a form of load
balancing.
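
If you are curious which subtrees a given MDS is currently authoritative
for, you can ask it through its admin socket on the host it runs on, e.g.
(the daemon name is a placeholder):

    ceph daemon mds.mds-a get subtrees

This should dump a JSON list of the subtrees that MDS knows about, including
which rank is authoritative for each.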

> Question 1: I'd like to know if it is viable to have 4 MDS daemons, with 3
> active and 1 standby (or standby-replay, if that's still possible with
> multi-MDS).

standby-replay daemons are not available to take over for ranks other
than the one they follow. So, you would want to have a standby-replay
daemon for each rank or just have normal standbys. It will likely
depend on the size of your MDS (cache size) and available hardware.
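
For reference, on Luminous standby-replay is configured per daemon in
ceph.conf; a minimal sketch (daemon names are placeholders) could look like:

    # one standby-replay follower per active rank
    [mds.mds-a]
    mds_standby_replay = true
    mds_standby_for_rank = 0

    [mds.mds-b]
    mds_standby_replay = true
    mds_standby_for_rank = 1

A daemon without these options acts as a normal standby and can take over
any failed rank.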

> Basically, what we have is 2 subtrees used by dovecot: INDEX and MAIL.
> Their trees are almost identical, but INDEX stores all dovecot metadata with
> heavy IO going on, and MAIL stores the actual email files, with many more
> writes than reads.
>
> I don't yet know which one is more likely to bottleneck the MDS servers, so
> I wonder if I can collect metrics on MDS usage per pool once it's deployed.
> Question 2: If the metadata workloads are very different, I wonder if I can
> isolate them, like pinning MDS servers X and Y to one of the directories.

It's best if you see if the normal balancer (especially in v12.2.6
[1]) can handle the load for you without trying to micromanage things
via pins. You can use pinning to isolate metadata load from other
ranks as a stop-gap measure.

[1] https://github.com/ceph/ceph/pull/21412
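
If you do end up needing pins, the export pin is just an extended attribute
set on the directory from a client mount, e.g. (the mount point and rank
below are placeholders):

    setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/INDEX   # pin this subtree to rank 1
    setfattr -n ceph.dir.pin -v -1 /mnt/cephfs/INDEX  # back to normal balancing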

-- 
Patrick Donnelly