[ceph-users] libcephfs init hangs, is there a 'timeout' argument?

2023-08-09 Thread Harry G Coin
Libcephfs's 'init' call hangs when passed arguments that once worked normally but later refer to a cluster that's broken, on its way out of service, has too few mons, etc. At least the Python libcephfs wrapper hangs on init. Of course mount and session timeouts work, but is there
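
A possible stop-gap while libcephfs itself offers no timeout argument: run the blocking init/mount in a worker thread and stop waiting after a deadline. This is only a sketch, not something proposed on the list; mount_with_deadline, the 30-second deadline and the use of the client_mount_timeout option are assumptions, and the blocked worker thread itself cannot be cancelled.

import concurrent.futures

import cephfs


def mount_with_deadline(conffile='/etc/ceph/ceph.conf', deadline=30.0):
    fs = cephfs.LibCephFS(conffile=conffile)
    # Hedged: client_mount_timeout may bound how long the client waits
    # for monitors when the cluster is slow rather than entirely gone.
    fs.conf_set('client_mount_timeout', str(int(deadline)))

    def _connect():
        fs.init()    # the call reported to hang
        fs.mount()
        return fs

    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(_connect)
    try:
        return future.result(timeout=deadline)
    except concurrent.futures.TimeoutError:
        # The worker thread stays blocked inside libcephfs; we only stop
        # waiting for it. shutdown(wait=False) avoids blocking here too.
        pool.shutdown(wait=False)
        raise TimeoutError(f'libcephfs init/mount exceeded {deadline}s')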

[ceph-users] Ceph Leadership Team Meeting: 2023-08-09 Minutes

2023-08-09 Thread Patrick Donnelly
Today we discussed:
- Delegating more privileges for internal hardware to allow on-call folks to fix issues.
- Maybe using CephFS for the teuthology VM /home directory (it became full on Friday night).
- Preparation for Open Source Day: we are seeking "low-hanging-fruit" tickets for new developers

[ceph-users] Re: how to set load balance on multi active mds?

2023-08-09 Thread zxcs
Thanks a lot, Eugen! We are using dynamic subtree pinning. We have another cluster that uses manual pinning, but we have many directories and would need to pin each dir for each request, so in our new cluster we want to try dynamic subtree pinning; we don't want a human to have to step in every time. Because
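
For reference, a hedged sketch of what per-directory-free "dynamic" pinning can look like, assuming the thread refers to CephFS ephemeral distributed pinning: a single xattr on a parent directory spreads its child subtrees across the active MDS ranks, so nobody has to pin each new directory by hand. The mountpoint and directory are placeholders, and the mds_export_ephemeral_distributed option has to be enabled for the policy to take effect.

import os

# Hypothetical CephFS mountpoint and parent directory.
PARENT = '/mnt/cephfs/projects'

# Distribute the immediate child subtrees of PARENT across the active
# MDS ranks (ephemeral distributed pinning); no per-child pin needed.
os.setxattr(PARENT, 'ceph.dir.pin.distributed', b'1')

# Setting the value back to 0 would disable the policy again:
# os.setxattr(PARENT, 'ceph.dir.pin.distributed', b'0')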

[ceph-users] Re: OSD delete vs destroy vs purge

2023-08-09 Thread Eugen Block
Hi, I'll try to summarize as far as I understand the process, please correct me if I'm wrong:
- delete: drain and then delete (optionally keep OSD ID)
- destroy: mark as destroyed (to re-use OSD ID)
- purge: remove everything
I would call the "delete" option in the dashboard as a "safe
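
For reference, a hedged sketch of the CLI calls those three dashboard actions roughly correspond to, assuming a cephadm/orchestrator deployment. The OSD id is a placeholder and the mapping is an interpretation of the summary above, not something confirmed in the thread.

import subprocess


def ceph(*args):
    # Run a ceph CLI command and fail loudly if it returns non-zero.
    subprocess.run(['ceph', *args], check=True)


OSD_ID = '7'  # placeholder

# "delete": drain the OSD and remove it; --replace keeps the OSD id
# reserved for a replacement disk (the "Preserve OSD ID(s)" flag).
ceph('orch', 'osd', 'rm', OSD_ID, '--replace')

# "destroy": mark the OSD as destroyed so its id can be re-used,
# while keeping its entry in the CRUSH map.
# ceph('osd', 'destroy', OSD_ID, '--yes-i-really-mean-it')

# "purge": remove the OSD from the CRUSH map, OSD map and auth entirely.
# ceph('osd', 'purge', OSD_ID, '--yes-i-really-mean-it')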

[ceph-users] Re: CephFS metadata outgrow DISASTER during recovery

2023-08-09 Thread Anh Phan Tuan
Hi All, It seems I also faced a similar case last year. I have about 160 HDDs of mixed size and 12 x 480 GB NVMe SSDs for the metadata pool. I became aware of the incident when the SSD OSDs went into a near-full state; I increased the nearfull ratio, but these OSDs continued to grow for an unknown reason. This is production so
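
As an aside, a hedged sketch of the ratio adjustment mentioned above. The 0.9 value is a placeholder, and raising the threshold only postpones the nearfull warning; it does not address whatever is making the metadata OSDs grow.

import subprocess

# Raise the cluster-wide nearfull warning threshold (placeholder value).
subprocess.run(['ceph', 'osd', 'set-nearfull-ratio', '0.9'], check=True)

# The related backfillfull/full ratios can be adjusted the same way:
# subprocess.run(['ceph', 'osd', 'set-backfillfull-ratio', '0.92'], check=True)
# subprocess.run(['ceph', 'osd', 'set-full-ratio', '0.95'], check=True)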

[ceph-users] Re: Ceph bucket notification events stop working

2023-08-09 Thread Yuval Lifshitz
Hi Daniel, I assume you are using persistent topics? We recently fixed a bug where the queue of a persistent notification was not deleted when the deletion was done from radosgw-admin; see https://tracker.ceph.com/issues/61311. However, there are no plans to backport that fix to Pacific.
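
For illustration, a hedged sketch of one possible workaround on Pacific: remove persistent topics through the SNS-compatible REST API (here via boto3) instead of radosgw-admin, since the latter is the path the cited bug affects. The endpoint, credentials and topic ARN are placeholders, and whether this avoids the leftover queue in every case is not confirmed in the thread.

import boto3

# SNS-compatible client pointed at the RGW endpoint (all values placeholders).
sns = boto3.client(
    'sns',
    endpoint_url='http://rgw.example.com:8000',
    region_name='default',
    aws_access_key_id='ACCESS_KEY',
    aws_secret_access_key='SECRET_KEY',
)

# Delete the topic through the API rather than radosgw-admin.
sns.delete_topic(TopicArn='arn:aws:sns:default::my-topic')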

[ceph-users] Re: how to set load balance on multi active mds?

2023-08-09 Thread Eugen Block
Hi, you could benefit from directory pinning [1] or dynamic subtree pinning [2]. We had great results with manual pinning in an older Nautilus cluster, but didn't have a chance to test dynamic subtree pinning yet. It's difficult to tell in advance which option would best suit your
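
For reference, a hedged sketch of the manual pinning mentioned here: assigning whole directory subtrees to specific MDS ranks via the ceph.dir.pin extended attribute. The mountpoint, directory names and rank numbers are placeholders.

import os

# Hypothetical layout: pin each team's tree to a fixed MDS rank.
PINS = {
    '/mnt/cephfs/teamA': 0,
    '/mnt/cephfs/teamB': 1,
}

for path, rank in PINS.items():
    # A value of -1 removes the pin and hands the subtree back to the balancer.
    os.setxattr(path, 'ceph.dir.pin', str(rank).encode())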

[ceph-users] how to set load balance on multi active mds?

2023-08-09 Thread zxcs
Hi experts, we have a production environment built with Ceph version 16.2.11 Pacific, using CephFS. We also enabled multiple active MDS daemons (more than 10), but we usually see an unbalanced client-request load across these MDSs; see the picture below. The top MDS has 32.2k client requests, and the last one only
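
As a starting point for quantifying the imbalance, a hedged helper that dumps ceph fs status as JSON and prints the request rate per active rank. The JSON field names used here ('mdsmap', 'rank', 'state', 'rate') are assumptions from memory of the command's output and may need adjusting to what 16.2.11 actually emits.

import json
import subprocess

out = subprocess.run(['ceph', 'fs', 'status', '--format', 'json'],
                     check=True, capture_output=True, text=True).stdout
status = json.loads(out)

# Print requests/s per active MDS rank to see how skewed the load is.
for mds in status.get('mdsmap', []):
    if mds.get('state') == 'active':
        print(f"rank {mds.get('rank')}: {mds.get('rate', 0)} req/s")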

[ceph-users] OSD delete vs destroy vs purge

2023-08-09 Thread Nicola Mori
Dear Ceph users, I see that the OSD page of the Ceph dashboard offers three possibilities for "removing" an OSD: delete, destroy and purge. The delete operation lets you flag the "Preserve OSD ID(s) for replacement." option. I searched for explanations of the differences between