Re: [ceph-users] Question: CephFS + Bluestore

Webert de Souza Lima Wed, 09 May 2018 10:25:02 -0700

Hey John!

On Wed, May 9, 2018 at 12:11 PM, John Spray <jsp...@redhat.com> wrote:


> It depends on the metadata intensity of your workload.  It might be
> quite interesting to gather some drive stats on how many IOPS are
> currently hitting your metadata pool over a week of normal activity.
>

Is there any built-in Ceph tool for this? Maybe ceph daemonperf (although I'm
not sure what I should be looking at).
My current SSD disks have 2 partitions.
 - One is used for cephfs cache tier pool,
 - The other is used for both the cephfs metadata pool and cephfs data-ssd
(an additional cephfs data pool on SSDs only, with a file layout so that
a specific directory uses it)

Because of this, iostat shows me peaks of 12k IOPS in the metadata
partition, but this could definitely be IO for the data-ssd pool.
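
Since iostat only sees the partition, a per-pool view would separate the two.
A minimal sketch, assuming `ceph osd pool stats --format json` returns a list
of pools, each with a `client_io_rate` object holding `read_op_per_sec` and
`write_op_per_sec` (these field names are an assumption based on Luminous-era
output and should be checked against the cluster's actual JSON; the pool name
in the usage note is hypothetical):

```python
import json
import subprocess


def pool_client_ops(stats_json, pool_name):
    """Return (read_ops, write_ops) per second for one pool.

    Assumed field names: client_io_rate, read_op_per_sec, write_op_per_sec.
    Idle pools may omit the counters, so missing keys count as 0.
    """
    for pool in json.loads(stats_json):
        if pool.get("pool_name") == pool_name:
            rate = pool.get("client_io_rate", {})
            return (rate.get("read_op_per_sec", 0),
                    rate.get("write_op_per_sec", 0))
    raise KeyError(pool_name)


def sample_pool(pool_name):
    """Run 'ceph osd pool stats' once and extract the ops for pool_name."""
    out = subprocess.check_output(
        ["ceph", "osd", "pool", "stats", pool_name, "--format", "json"])
    return pool_client_ops(out, pool_name)
```

Calling something like `sample_pool("cephfs_metadata")` from cron every minute
and logging the result for a week would give a rough picture of the metadata
pool on its own.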


> If you are doing large file workloads, and the metadata mostly fits in
> RAM, then the number of IOPS from the MDS can be very, very low.  On
> the other hand, if you're doing random metadata reads from a small
> file workload where the metadata does not fit in RAM, almost every
> client read could generate a read operation, and each MDS could easily
> generate thousands of ops per second.
>

I have yet to measure it the right way, but I'd assume my metadata fits in
RAM (only a few hundred MB).

This is an email hosting cluster with tens of thousands of users, so there
are a lot of random reads and writes, but not too many small files.
Email messages are concatenated together in files up to 4MB in size (when a
rotation happens).
Most user operations are dovecot's INDEX operations, and I will keep the index
directory in an SSD-dedicated pool.
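
For reference, pinning a directory to a pool is done through the
`ceph.dir.layout.pool` extended attribute. A minimal sketch, assuming a
Linux client with CephFS mounted and a data pool already added to the
filesystem (the path and pool name in the usage note are hypothetical):

```python
import os


def layout_pool_xattr(pool_name):
    """Return the (name, value) pair for the directory layout pool xattr."""
    return "ceph.dir.layout.pool", pool_name.encode("utf-8")


def pin_directory_to_pool(directory, pool_name):
    """Set the layout on a mounted CephFS directory (Linux only).

    Only files created after this call pick up the new layout;
    existing files stay in the pool they were written to.
    """
    name, value = layout_pool_xattr(pool_name)
    os.setxattr(directory, name, value)
```

For example, `pin_directory_to_pool("/mnt/cephfs/index", "cephfs_data_ssd")`
would be the equivalent of running setfattr on that directory.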



> Isolating metadata OSDs is useful if the data OSDs are going to be
> completely saturated: metadata performance will be protected even if
> clients are hitting the data OSDs hard.
>

This seems to be the case.


> If "heavy write" means completely saturating the cluster, then sharing
> the OSDs is risky.  If "heavy write" just means that there are more
> writes than reads, then it may be fine if the metadata workload is not
> heavy enough to make good use of SSDs.
>

Saturation will only happen during peak workloads, which is not often. By
heavy write I mean there are many more writes than reads, yes.
So I think I can start by sharing the OSDs. If I find this is impacting
performance, I can just change the ruleset and move metadata to an SSD-only
pool, right?
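
A rough sketch of what that switch could look like, assuming a Luminous or
later cluster where the SSD OSDs carry the `ssd` device class. The rule name,
CRUSH root, failure domain and pool name are all hypothetical placeholders;
the helper only builds the command lines so they can be reviewed before
running them:

```python
def move_pool_to_ssd_commands(pool, rule="ssd-only", root="default",
                              failure_domain="host"):
    """Build the two ceph commands that move a pool onto SSD-class OSDs.

    1. Create a replicated CRUSH rule restricted to device class 'ssd'.
    2. Point the pool at that rule (this triggers data migration).
    """
    return [
        ["ceph", "osd", "crush", "rule", "create-replicated",
         rule, root, failure_domain, "ssd"],
        ["ceph", "osd", "pool", "set", pool, "crush_rule", rule],
    ]


if __name__ == "__main__":
    # Print the commands instead of executing them.
    for cmd in move_pool_to_ssd_commands("cephfs_metadata"):
        print(" ".join(cmd))
```

Changing the rule makes Ceph backfill the pool's data onto the new OSDs, so
it is probably best done outside peak hours.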


> The way I'd summarise this is: in the general case, dedicated SSDs are
> the safe way to go -- they're intrinsically better suited to metadata.
> However, in some quite common special cases, the overall number of
> metadata ops is so low that the device doesn't matter.



Thank you very much John!
Webert Lima
DevOps Engineer at MAV Tecnologia
Belo Horizonte - Brasil
IRC NICK - WebertRLZ

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
