[ceph-users] Re: [MDS] Pacific memory leak

2024-07-23 Thread Adrien Georget
5:17 AM Adrien Georget wrote: Hi, For the last 2 months, our MDS has frequently been switching to another one because of a sudden memory leak. The host has 128 GB of RAM and most of the time the MDS occupies ~20% of memory, but in less than 3 minutes it climbs to 100% and crashes with tcmalloc: allocation
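
A minimal sketch of a first check here: comparing the configured MDS cache target with what the daemon actually reports, assuming the active daemon is named mds.cephfs-a (adjust to your daemon name):

    ceph config get mds mds_cache_memory_limit   # cache target the MDS tries to honour
    ceph tell mds.cephfs-a cache status          # what the cache actually holds right now

If the cache stays near its target while the process RSS explodes, the growth is happening outside the cache, which is consistent with a leak.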

[ceph-users] [MDS] Pacific memory leak

2024-07-22 Thread Adrien Georget
Hi, For the last 2 months, our MDS has frequently been switching to another one because of a sudden memory leak. The host has 128 GB of RAM and most of the time the MDS occupies ~20% of memory, but in less than 3 minutes it climbs to 100% and crashes with "tcmalloc: allocation failed". We tried to run heap
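
A minimal sketch of the tcmalloc heap commands that can capture the growth while it happens; the daemon name mds.cephfs-a is a placeholder, and the profiler only works on tcmalloc builds:

    ceph tell mds.cephfs-a heap stats            # heap in use vs. pages not yet returned to the OS
    ceph tell mds.cephfs-a heap start_profiler   # start writing heap profiles
    ceph tell mds.cephfs-a heap dump             # dump a profile while memory is climbing
    ceph tell mds.cephfs-a heap stop_profiler
    ceph tell mds.cephfs-a heap release          # ask tcmalloc to hand unused pages back to the OS

The dumped profiles can then be compared with pprof to see which allocation paths grow between dumps.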

[ceph-users] Re: Ceph 16.2.14: ceph-mgr getting oom-killed

2024-01-25 Thread Adrien Georget
We are heavily impacted by this issue with the MGR in Pacific. This has to be fixed. As someone suggested in the issue tracker, we limited the memory usage of the MGR in the systemd unit (MemoryLimit=16G) in order to kill the MGR before it consumes all the memory of the server and impacts other serv
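
A minimal sketch of such a systemd cap, assuming a package-based (non-cephadm) install where the unit is ceph-mgr@<hostname>.service; MemoryLimit= is the cgroup v1 directive, MemoryMax= is its cgroup v2 equivalent:

    mkdir -p /etc/systemd/system/ceph-mgr@.service.d
    cat > /etc/systemd/system/ceph-mgr@.service.d/memory.conf <<'EOF'
    [Service]
    MemoryLimit=16G
    EOF
    systemctl daemon-reload
    systemctl restart ceph-mgr@$(hostname -s).service

The idea is that when the limit is hit, only the mgr cgroup gets killed and systemd restarts the daemon, instead of the whole host suffering memory pressure.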

[ceph-users] Re: Ceph 16.2.14: ceph-mgr getting oom-killed

2023-11-22 Thread Adrien Georget
Hi, This memory leak with ceph-mgr seems to be due to a change in Ceph 16.2.12. Check this issue: https://tracker.ceph.com/issues/59580 We are also affected by this, with or without containerized services. Cheers, Adrien On 22/11/2023 at 14:14, Eugen Block wrote: One other difference is you
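
As a hedged aside, one simple way to see whether a given cluster is affected is to track the active mgr's resident size over time (plain procps, no Ceph-specific tooling assumed):

    ps -C ceph-mgr -o pid=,rss=,etime=,cmd=   # RSS in KiB; a steadily climbing value points at the leak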

[ceph-users] Re: OSDs spam log with scrub starts

2023-09-01 Thread Adrien Georget
On Thu, Aug 31, 2023, at 11:17, Zakhar Kirpichenko wrote: This is happening to our 16.2.14 cluster as well. I'm not sure whether this was happening before the upgrade to 16.2.14. /Z On Thu, 31 Aug 2023, 17:49, Adrien Georget wrote: Hello, On our 16.2.14 CephFS cluster, all OSDs are sp

[ceph-users] OSDs spam log with scrub starts

2023-08-31 Thread Adrien Georget
Hello, On our 16.2.14 CephFS cluster, all OSDs are spamming the logs with messages like "log_channel(cluster) log [DBG] : xxx scrub starts". All OSDs are affected, for different PGs. The cluster is healthy without any recovery ops. For a single PG, we can see hundreds of "scrub starts" messages in less th
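
A hedged sketch for quantifying the spam on a mon host and, as a stopgap only, keeping the DBG lines out of the cluster log file (this hides the messages, it does not address their cause):

    grep -c "scrub starts" /var/log/ceph/ceph.log          # how many of the DBG lines landed in the cluster log
    ceph config set mon mon_cluster_log_file_level info    # stop writing DBG-level cluster log entries to file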

[ceph-users] Re: cephadm automatic sizing of WAL/DB on SSD

2022-12-09 Thread Adrien Georget
Hi, We were also affected by this bug when we deployed a new Pacific cluster. Any news about the release of this fix for Ceph Pacific? It looks done for the Quincy version but not for Pacific. https://github.com/ceph/ceph/pull/47292 Regards, Adrien On 05/10/2022 at 13:21, Anh Phan Tuan wrote: It s
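
Until the backport lands, one hedged workaround is to pin the DB size explicitly in the OSD service spec instead of relying on the automatic sizing; the spec below is only a sketch (the device filters, service_id and the 60 GiB value are placeholders):

    cat > osd_spec.yml <<'EOF'
    service_type: osd
    service_id: hdd_with_ssd_db
    placement:
      host_pattern: '*'
    spec:
      data_devices:
        rotational: 1
      db_devices:
        rotational: 0
      block_db_size: 64424509440   # 60 GiB, expressed in bytes
    EOF
    ceph orch apply -i osd_spec.yml --dry-run   # review the planned layout before applying for real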

[ceph-users] Re: filesystem became read only after Quincy upgrade

2022-11-28 Thread Adrien Georget
wrote: On 25/11/2022 16:25, Adrien Georget wrote: Hi Xiubo, Thanks for your analysis. Is there anything I can do to put CephFS back into a healthy state? Or should I wait for a patch to fix that bug? Please try to trim the journals and umount all the clients first, and then see whether you
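
A hedged sketch of that sequence, with the daemon name mds.cephfs-a, the filesystem name cephfs and the mountpoint all placeholders (flush journal both persists and trims the MDS journal):

    ceph tell mds.cephfs-a flush journal   # flush and trim the MDS journal
    umount /mnt/cephfs                     # on every client
    ceph fs status cephfs                  # check whether the rank comes back read-write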

[ceph-users] Re: filesystem became read only after Quincy upgrade

2022-11-25 Thread Adrien Georget
one new tracker [1] to follow it, and raised a Ceph PR [2] to fix this. For more detail, please see my analysis in the tracker [2]. [1] https://tracker.ceph.com/issues/58082 [2] https://github.com/ceph/ceph/pull/49048 Thanks - Xiubo On 24/11/2022 16:33, Adrien Georget wrote: Hi Xiubo, We did the

[ceph-users] Re: filesystem became read only after Quincy upgrade

2022-11-24 Thread Adrien Georget
i wrote: On 23/11/2022 19:49, Adrien Georget wrote: Hi, This morning we upgraded a Pacific Ceph cluster to the latest Quincy version. The cluster was healthy before the upgrade, everything was done according to the upgrade procedure (non-cephadm) [1], all services have restarted correctl

[ceph-users] Re: filesystem became read only after Quincy upgrade

2022-11-23 Thread Adrien Georget
This bug looks very similar to this issue opened last year and closed without any solution: https://tracker.ceph.com/issues/52260 Adrien On 23/11/2022 at 12:49, Adrien Georget wrote: Hi, This morning we upgraded a Pacific Ceph cluster to the latest Quincy version. The cluster was healthy

[ceph-users] filesystem became read only after Quincy upgrade

2022-11-23 Thread Adrien Georget
Hi, This morning we upgraded a Pacific Ceph cluster to the latest Quincy version. The cluster was healthy before the upgrade, everything was done according to the upgrade procedure (non-cephadm) [1], all services restarted correctly, but the filesystem switched to read-only mode when it beca
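
A hedged sketch of the checks that confirm this state, with the filesystem name cephfs as a placeholder:

    ceph health detail                                 # a read-only rank shows up as an MDS health warning
    ceph fs status cephfs                              # shows which rank/daemon went read-only
    grep -i "read-only" /var/log/ceph/ceph-mds.*.log   # the MDS logs why it forced read-only (e.g. a write error to the metadata pool)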

[ceph-users] Best way to merge crush buckets?

2020-02-28 Thread Adrien Georget
Hi all, I'm looking for the best way to merge/remap existing host buckets into one. I'm running a Ceph Nautilus cluster used as a Ceph Cinder backend with 2 pools, "volume-service" and "volume-recherche", both with dedicated OSDs:

    host cccephnd00x-service {
        id -2   # do not
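
A hedged sketch of one way to do the merge, with bucket names, OSD ids and weights as placeholders; moving OSDs between host buckets triggers data rebalancing, so back up the CRUSH map and reuse each OSD's existing weight from ceph osd tree:

    ceph osd getcrushmap -o crushmap.bak                          # backup before touching the map
    ceph osd tree                                                 # note each OSD's current crush weight
    ceph osd crush create-or-move osd.12 1.819 host=cccephnd001   # repeat per OSD, keeping its weight
    ceph osd crush remove cccephnd001-service                     # drop the host bucket once it is empty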

[ceph-users] Ceph NIC partitioning (NPAR)

2019-09-25 Thread Adrien Georget
Hi, I need your advice about the following setup. Currently, we have a Ceph Nautilus cluster used by OpenStack Cinder with a single 10 Gbps NIC on the OSD hosts. We will upgrade the cluster by adding 7 new hosts dedicated to Nova/Glance, and we would like to add a cluster network to isolate replica
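
A hedged sketch of how such a cluster network is usually declared with the centralized config (subnets are placeholders; already-running OSDs only bind to the new network after a restart, and monitors stay on the public network):

    ceph config set global public_network 10.10.0.0/24
    ceph config set global cluster_network 192.168.42.0/24   # replication/recovery traffic between OSDs
    systemctl restart ceph-osd.target                        # per OSD host, one host at a time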