Multiple filesystems (or volumes) can be the right choice; it really
depends on your requirements. But you need to be aware that each CephFS
needs at least two pools (metadata and data), plus one standby daemon
for each active daemon. For a single multi-active FS, one or two
standby daemons in total could be sufficient, because a standby
automatically takes over a failed rank. As an example, with 8
filesystems you need at least 16 pools (maybe more if you want to use
EC), and the number of pools can be limited by the number of OSDs you
have available. You also need 16 MDS daemons (one active and one
standby for each FS). In a single-FS scenario with 8 active MDS
daemons, 9 or 10 daemons in total could be sufficient, and you need
fewer pools.
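
To make the arithmetic above easier to replay with your own numbers,
here is a minimal Python sketch. It is only a back-of-the-envelope
calculation and does not talk to Ceph; the function names and the
defaults (two pools per FS, one standby per FS, one or two shared
standbys) are my assumptions taken from the example, not anything Ceph
enforces.

```python
from dataclasses import dataclass


@dataclass
class Footprint:
    """Rough resource count for a CephFS layout (illustrative only)."""
    pools: int
    mds_daemons: int


def multi_fs_footprint(num_fs: int, pools_per_fs: int = 2,
                       standby_per_fs: int = 1) -> Footprint:
    # Assumption: every file system has one active MDS, its own standby
    # daemon(s), and its own set of pools (metadata + data by default).
    return Footprint(
        pools=num_fs * pools_per_fs,
        mds_daemons=num_fs * (1 + standby_per_fs),
    )


def single_fs_footprint(active_mds: int, shared_standbys: int = 1,
                        pools: int = 2) -> Footprint:
    # Assumption: one multi-active file system whose standbys are shared
    # by all ranks, so only a few standby daemons are needed in total.
    return Footprint(pools=pools, mds_daemons=active_mds + shared_standbys)


if __name__ == "__main__":
    print("8 filesystems:       ", multi_fs_footprint(8))
    print("1 FS, 8 active ranks:", single_fs_footprint(8, shared_standbys=2))
```

Raising pools_per_fs (for example, to account for an additional EC data
pool) shows how quickly the pool count grows in the multi-FS case.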
If your setup is rather static, meaning you don't have to create a new
FS every other week and the total number of filesystems stays the same,
multiple filesystems might be the better approach for you.
So I can't really recommend one over the other; you'll need to figure
out which scenario you need to cover. But Ceph is quite flexible, so
you can start with one approach and develop from there.
Quoting Sophonet <[email protected]>:
> Hi,
>
> thanks for the information - it seems that by pinning
> subvolumes/directories you can distribute the load across different
> MDS daemons. But in that case, what would be the difference compared
> to setting up different top-level volumes and attaching them to
> different MDS daemons? What I am not clear about is whether setting
> up one fs volume and pinning subvolumes to different MDS daemons is
> basically equivalent to using multiple fs volumes and attaching them
> to different MDS daemons. Quotas, auth caps etc. can be set for
> volumes as well as for subvolumes.
>
> The only recommendation I have found on [0] says
>
> „...it is recommended to consolidate file system workloads onto a
> single CephFS file system, when possible. Consolidate the workloads
> to avoid over-allocating resources to MDS servers that can be
> underutilized.“
>
> Is there a workload difference when using multiple fs volumes vs. a
> single one and subvolumes? Intuitively I would think that multiple
> fs volumes might provide some more error resilience in case of
> failures - in which case only one fs (of several) would fail instead
> of the whole cluster (if there is just a single volume and
> subvolumes are used).
>
> Any insights? Thanks,
>
> Sophonet
>
> [0] https://www.ibm.com/docs/en/storage-ceph/8.1.0?topic=systems-cephfs-volumes-subvolumes-subvolume-groups
>
>> On 23.09.2025 at 15:45, Eugen Block <[email protected]> wrote:
>>
>> Hi,
>>
>> with multiple active MDS daemons you can use pinning. This allows
>> you to pin specific directories (or subvolumes) to a specific rank
>> to spread the load. You can find the relevant docs here [0].
>>
>> Note that during an upgrade, max_mds is reduced to 1 (automatically
>> if you use the orchestrator) [1], which can have a significant
>> impact because all the load previously spread across multiple
>> daemons is now moved onto a single node. This can crash a file
>> system, just so you're aware.
>>
>> So there are several options: you can run two or three "fat" MDS
>> nodes in active/standby mode which can handle all the load. Or you
>> have more "fat" nodes, each of which could handle all the load
>> during an upgrade, and spread the load again after the upgrade is
>> finished. Or you have multiple "not so fat" nodes to spread the
>> workload, but with a higher risk of an issue during an upgrade.
>>
>> Regards,
>> Eugen
>>
>> [0] https://docs.ceph.com/en/latest/cephfs/multimds/
>> [1] https://docs.ceph.com/en/latest/cephfs/upgrading/#upgrading-the-mds-cluster
>>
>> Quoting Sophonet <[email protected]>:
>>
>>> Hi list,
>>>
>>> for multiple project-level file shares (with individual access
>>> rights) I am planning to use CephFS.
>>>
>>> Technically this can be implemented both with multiple toplevel
>>> cephfs or with a single cephfs in the cluster and subvolumes.
>>>
>>> What is the preferred choice? I have not found any guidance in
>>> http://docs.ceph.com. The only location that suggests using
>>> subvolumes is
>>> https://www.ibm.com/docs/en/storage-ceph/8.1.0?topic=systems-cephfs-volumes-subvolumes-subvolume-groups.
>>> However, how can I avoid that only one MDS is responsible for
>>> serving all subvolumes? Is there some current literature (books or
>>> web docs) that contains recommendations and examples? A couple of
>>> Ceph-related books are available in well-known online book stores,
>>> but many of them are rather old (6 years or even more).
>>>
>>> Thanks a lot,
>>>
>>> Sophonet
>>>
>>> _______________________________________________
>>> ceph-users mailing list -- [email protected]
>>> To unsubscribe send an email to [email protected]