[ceph-users] Re: Crushmap rule for multi-datacenter erasure coding

2023-04-04 Thread Frank Schilder
Hi Michel, I don't have experience with LRC profiles. They may reduce cross-site traffic at the expense of extra overhead, but this might actually be unproblematic with EC profiles that have a large m anyway. If you do experiments with this, please let the list know. I would like to add here

[ceph-users] Re: CephFS thrashing through the page cache

2023-04-04 Thread Xiubo Li
Hi Ashu, Yeah, please see https://patchwork.kernel.org/project/ceph-devel/list/?series=733010. Sorry, I forgot to reply here. - Xiubo On 4/4/23 13:58, Ashu Pachauri wrote: Hi Xiubo, Did you get a chance to work on this? I am curious to test out the improvements. Thanks and Regards, As

[ceph-users] Re: Recently deployed cluster showing 9Tb of raw usage without any load deployed

2023-04-04 Thread Igor Fedotov
Do you have standalone DB volumes for your OSDs? If so, then RAW usage is most likely that high because the DB volume space is already counted as in-use. Could you please share "ceph osd df tree" output to confirm that? Thanks, Igor On 4/4/2023 4:25 AM, Work Ceph wrote: Hello guys! We
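
For reference, both commands mentioned in this thread are standard Ceph CLI; a minimal sketch of how they would be run (output layout varies slightly between releases):

```
# Per-OSD capacity and usage, grouped by the CRUSH tree (root/host/OSD)
ceph osd df tree

# Cluster-wide RAW STORAGE summary plus per-pool statistics
ceph df detail
```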

[ceph-users] Re: Recently deployed cluster showing 9Tb of raw usage without any load deployed

2023-04-04 Thread Igor Fedotov
Please also note that total cluster size reported below as SIZE apparently includes DB volumes: # ceph df --- RAW STORAGE --- CLASS SIZE AVAIL USED RAW USED %RAW USED hdd 373 TiB 364 TiB 9.3 TiB 9.3 TiB 2.50 On 4/4/2023 12:22 PM, Igor Fedotov wrote: Do you have standa

[ceph-users] Re: Recently deployed cluster showing 9Tb of raw usage without any load deployed

2023-04-04 Thread Work Ceph
Thank you guys for your replies. The "used space" there is exactly that. It is the accounting for RocksDB and WAL. ``` RAW USED: The sum of USED space and the space allocated to the db and wal BlueStore partitions. ``` There is one detail I do not understand. We are off-loading WAL and RocksDB to

[ceph-users] Re: Recently deployed cluster showing 9Tb of raw usage without any load deployed

2023-04-04 Thread Igor Fedotov
Originally you mentioned 14TB HDDs, not 15TB. Could this be a trick? If not, please share "ceph osd df tree" output? On 4/4/2023 2:18 PM, Work Ceph wrote: Thank you guys for your replies. The "used space" there is exactly that. It is the accounting for RocksDB and WAL. ``` RAW USED: The sum

[ceph-users] Re: Read and write performance on distributed filesystem

2023-04-04 Thread Xiubo Li
On 4/4/23 07:59, David Cunningham wrote: Hello, We are considering CephFS as an alternative to GlusterFS, and have some questions about performance. Is anyone able to advise us please? This would be for file systems between 100GB and 2TB in size, average file size around 5MB, and a mixture of

[ceph-users] Re: Recently deployed cluster showing 9Tb of raw usage without any load deployed

2023-04-04 Thread Work Ceph
The disks are 14.9 TB, but their exact size does not matter much in this context. We figured out the issue. The raw used space accounts for the RocksDB and WAL space. Therefore, as we dedicated an NVMe device in each host for them, Ceph is showing that space as used space already. It is funny t
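
As a purely illustrative back-of-the-envelope check (the OSD count and DB volume size below are made up, not taken from this thread): dedicated DB/WAL volumes are counted as allocated from the start, so a freshly deployed cluster already reports their total as RAW USED:

```
# Hypothetical numbers: 32 OSDs, each with a ~300 GiB DB/WAL volume on NVMe
echo $(( 32 * 300 )) GiB   # 9600 GiB, i.e. roughly 9.4 TiB shown as RAW USED before any client data
```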

[ceph-users] Re: Crushmap rule for multi-datacenter erasure coding

2023-04-04 Thread Michel Jouvin
Hi Frank, Thanks for this additional information. Currently, I'd like to experiment with LRC, which provides a "natural" way to implement the multi-step OSD allocation that ensures distribution across datacenters without tweaking the crushmap rule. Configuration of the LRC plugin is far from obvio
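
For anyone following along, this is roughly what the simple (non-layered) form of an LRC profile looks like; the profile and pool names are placeholders and the k/m/l values are only an example, not a recommendation:

```
ceph osd erasure-code-profile set lrc_example \
    plugin=lrc \
    k=4 m=2 l=3 \
    crush-locality=datacenter \
    crush-failure-domain=host
ceph osd pool create lrc_pool erasure lrc_example
```

With crush-locality set, the plugin generates a crush rule that is intended to keep each group of l chunks plus its local parity inside one datacenter, which is the "natural" multi-step allocation referred to above.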

[ceph-users] Help needed to configure erasure coding LRC plugin

2023-04-04 Thread Michel Jouvin
Hi, As discussed in another thread (Crushmap rule for multi-datacenter erasure coding), I'm trying to create an EC pool spanning 3 datacenters (datacenters are present in the crushmap), with the objective of being resilient to 1 DC down, at least keeping read-only access to the pool and if po

[ceph-users] Re: Help needed to configure erasure coding LRC plugin

2023-04-04 Thread Michel Jouvin
Answering myself, I found the reason for 2147483647: it's documented as a failure to find enough OSDs (missing OSDs). And it is normal, as I selected different hosts for the 15 OSDs but I have only 12 hosts! I'm still interested in an "expert" confirming that the LRC k=9, m=3, l=4 configuration
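
A quick sanity check of the shard count for that layout, assuming the usual LRC accounting of one local parity chunk per group of l chunks:

```
# k=9 data + m=3 coding = 12 chunks, split into (k+m)/l = 12/4 = 3 locality groups
# each group adds 1 local parity chunk -> total shards = 9 + 3 + 3 = 15
# with crush-failure-domain=host that requires 15 distinct hosts; only 12 exist,
# so some shards stay unmapped and show up as 2147483647 ("none") in the PG mapping
```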

[ceph-users] Crushmap rule for multi-datacenter erasure coding

2023-04-04 Thread Michel Jouvin
Hi, We have a 3-site Ceph cluster and would like to create a 4+2 EC pool with 2 chunks per datacenter, to maximise the resilience in case of 1 datacenter being down. I have not found a way to create an EC profile with this 2-level allocation strategy. I created an EC profile with a failure do
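
For comparison, a plain 4+2 profile is easy to create, but on its own it cannot express the "2 chunks per datacenter" constraint; a minimal sketch (names are placeholders), with the placement constraint then coming from a hand-edited crush rule as discussed later in this thread:

```
# Plain 4+2 EC profile; failure domain host only guarantees distinct hosts,
# not an even spread of 2 chunks per datacenter
ceph osd erasure-code-profile set ec42 k=4 m=2 crush-failure-domain=host
ceph osd pool create ecpool erasure ec42
```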

[ceph-users] Upgrading to 16.2.11 timing out on ceph-volume due to raw list performance bug, downgrade isn't possible due to new OP code in bluestore

2023-04-04 Thread Mikael Öhman
Trying to upgrade a containerized setup from 16.2.10 to 16.2.11 gave us two big surprises, which I wanted to share in case anyone else encounters the same. I don't see any nice solution to this apart from a new release that fixes the performance regression that completely breaks the container setup in ce

[ceph-users] Re: Crushmap rule for multi-datacenter erasure coding

2023-04-04 Thread Frédéric Nass
Hello Michel, What you need is: step choose indep 0 type datacenter step chooseleaf indep 2 type host step emit I think you're right about the need to tweak the crush rule by editing the crushmap directly. Regards, Frédéric. - On 3 Apr 23, at 18:34, Michel Jouvin mic
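
Putting those three steps into a complete rule, the hand-edited crushmap entry could look roughly like this (rule name, id and the tries values are illustrative, not from the original message):

```
rule ec_by_datacenter {
    id 42
    type erasure
    step set_chooseleaf_tries 5
    step set_choose_tries 100
    step take default
    # pick every datacenter under the root, then 2 distinct hosts in each
    step choose indep 0 type datacenter
    step chooseleaf indep 2 type host
    step emit
}
```

The usual workflow is then to recompile and inject the edited map (crushtool -c / ceph osd setcrushmap -i) and point the EC pool at the new rule with ceph osd pool set <pool> crush_rule <rule>.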