Thanks everyone.

So the 3/30/300GB restriction no longer exists in Octopus; does that mean I
can make the DB partition 10GB and it will actually use all 10GB?

Is there a migration strategy that allows me to set up the DB on the OSD,
see how much metadata my 25TB is using, make a partition on the Optane of,
say, quadruple that size, and then move the DB to the Optane?

Or maybe the best strategy would be to start with a small logical volume on
the Optane, copy over my 25TB of existing data, and extend the volume later
if required?

The bluefs-bdev-migrate and bluefs-bdev-expand commands seem to be the
ticket.
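
Roughly what I have in mind, if I'm reading the ceph-bluestore-tool docs
right (OSD id, paths and LV names below are just placeholders, untested on
my nodes):

    # stop the OSD before touching its devices
    systemctl stop ceph-osd@0

    # attach a new, empty DB device on the Optane (an LV created beforehand)
    ceph-bluestore-tool bluefs-bdev-new-db \
        --path /var/lib/ceph/osd/ceph-0 \
        --dev-target /dev/optane-vg/osd0-db

    # move the existing RocksDB/BlueFS data off the slow device to the new
    # DB device
    ceph-bluestore-tool bluefs-bdev-migrate \
        --path /var/lib/ceph/osd/ceph-0 \
        --devs-source /var/lib/ceph/osd/ceph-0/block \
        --dev-target /var/lib/ceph/osd/ceph-0/block.db

    # later, after growing the LV (lvextend), let BlueFS see the extra space
    ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-0

    systemctl start ceph-osd@0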



On 27 Nov 2020 at 6:19:06 am, Christian Wuerdig <christian.wuer...@gmail.com>
wrote:

> Sorry, I replied to the wrong email thread before, so reposting this:
> I think it's time to start pointing out that the 3/30/300 logic no longer
> really holds true post-Octopus:
>
> https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/CKRCB3HUR7UDRLHQGC7XXZPWCWNJSBNT/
> Although I suppose in a way this makes it even harder to provide a sizing
> recommendation.
>
> On Fri, 27 Nov 2020 at 04:49, Burkhard Linke <
> burkhard.li...@computational.bio.uni-giessen.de> wrote:
>
> Hi,
>
>
> On 11/26/20 12:45 PM, Richard Thornton wrote:
>
> > Hi,
> >
> > Sorry to bother you all.
> >
> > It’s a home server setup.
> >
> > Three nodes (ODROID-H2+ with 32GB RAM and dual 2.5Gbit NICs), two 14TB
> > 7200rpm SATA drives and an Optane 118GB NVMe in each node (OS boots from
> > eMMC).
>
> *snipsnap*
>
> > Is there a rough CephFS calculation (each file uses x bytes of
> > metadata), I think I should be safe with 30GB, now I read I should
> > double that (you should allocate twice the size of the biggest layer
> > to allow for compaction) but I only have 118GB and two OSDs so I will
> > have to go for 59GB (or whatever will fit)?
>
> The recommended size of 30 GB is due to the level design of rocksdb:
> data is stored in different levels with increasing level sizes. 30 GB is
> a kind of sweet spot between 3 GB and 300 GB (too small / way too large
> for most use cases). The recommendation to double the size for
> compaction is OK, but you will waste capacity most of the time.
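
(If I have the level math right, with rocksdb's default
max_bytes_for_level_base of 256 MB and a 10x level multiplier, the levels
work out to roughly:

    L1 ~ 256 MB
    L2 ~ 2.56 GB
    L3 ~ 25.6 GB
    L4 ~ 256 GB

and a DB partition only really benefits from the levels it can hold in
full, which is where the old 3/30/300 GB figures come from.)
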
> In our cephfs instance we have ~115,000,000 files. Metadata is stored on
> 18 SSD-based OSDs. About 30-35 GB of raw capacity is currently in use,
> almost exclusively for metadata, omap and other stuff. You might be able
> to scale this down to your use case. Our average file size is approx.
> 5 MB, so you can also put a little bit on top in your case.
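
(Back-of-the-envelope from those figures, with a made-up file count just
to illustrate the scaling:

    35 GB / 115,000,000 files ~ 300 bytes of raw metadata per file
    e.g. 10,000,000 files x 300 bytes ~ 3 GB

plus whatever headroom you want for compaction.)
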
> If your working set (files accessed in a time span) is rather small, you
> also have the option to use the SSD as a block device caching layer like
> bcache or dm-cache. In this setup the whole capacity will be used, and
> data operations on the OSDs will also benefit from the faster SSDs. Your
> failure domain will be the same: if the SSD dies, your data disks will
> be useless.
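
If I went the caching route, I think it would look something like this
per data disk (device names are placeholders, not tested on these nodes):

    # caching device on an Optane partition, backing device on the HDD
    make-bcache -C /dev/nvme0n1p2
    make-bcache -B /dev/sda

    # attach the backing device to the cache set (UUID from bcache-super-show)
    echo <cset-uuid> > /sys/block/bcache0/bcache/attach

    # the OSD would then be created on top of /dev/bcache0
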
> Otherwise I would recommend using DB partitions of the recommended size
> (do not forget to include some extra space for the WAL), and using the
> remaining capacity for extra SSD-based OSDs, similar to our setup. This
> will ensure that metadata access will be fast[tm].
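
If I went with that layout instead, I assume each node would look roughly
like this with ceph-volume (LV names made up for illustration):

    # HDD OSD with its DB/WAL on a fixed-size LV on the Optane
    ceph-volume lvm create --bluestore \
        --data /dev/sda \
        --block.db optane-vg/osd0-db

    # small SSD-only OSD on the remaining Optane capacity (and point the
    # cephfs metadata pool at the SSD device class via a CRUSH rule)
    ceph-volume lvm create --bluestore --data optane-vg/osd-meta
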
> Regards,
>
>
> Burkhard
>
>
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
