[ceph-users] Re: MDS crashes to damaged metadata

2024-06-10 Thread Patrick Donnelly
You could try manually deleting the files from the directory fragments, using `rados` commands. Make sure to flush your MDS journal first and take the fs offline (`ceph fs fail`). On Tue, Jun 4, 2024 at 8:50 AM Stolte, Felix wrote: > > Hi Patrick, > > it has been a year now and we did not have a
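A rough sketch of that procedure, assuming the metadata pool is named cephfs_metadata and the file system is named cephfs (the dirfrag object and dentry key below are placeholders; verify them before removing anything):

    # Flush the MDS journal so pending updates reach the metadata pool,
    # then take the file system offline
    ceph tell mds.cephfs:0 flush journal
    ceph fs fail cephfs

    # A directory fragment is an omap object in the metadata pool named
    # <dir inode in hex>.<fragment>, e.g. 10000000001.00000000, and its
    # dentries are stored as omap keys of the form "<name>_head"
    rados -p cephfs_metadata listomapkeys 10000000001.00000000
    rados -p cephfs_metadata rmomapkey 10000000001.00000000 damagedfile.xlsx_head

    # Bring the file system back afterwards
    ceph fs set cephfs joinable true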

[ceph-users] Re: MDS crashes to damaged metadata

2024-06-04 Thread Stolte, Felix
Hi Patrick, it has been a year now and we have not had a single crash since upgrading to 16.2.13. We still have the 19 corrupted files which are reported by 'damage ls'. Is it now possible to delete the corrupted files without taking the filesystem offline? On 22.05.2023 at 20:23, Patri
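For reference, the damage table can be inspected per MDS rank with something like the following (the file system name is a placeholder); note that `damage rm` only clears an entry from the damage table, it does not repair the underlying metadata:

    ceph tell mds.cephfs:0 damage ls
    ceph tell mds.cephfs:0 damage rm <damage_id>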

[ceph-users] Re: MDS crashes to damaged metadata

2023-05-24 Thread Patrick Donnelly
On Wed, May 24, 2023 at 4:26 AM Stefan Kooman wrote: > > On 5/22/23 20:24, Patrick Donnelly wrote: > > > > > The original script is here: > > https://github.com/ceph/ceph/blob/main/src/tools/cephfs/first-damage.py > > > "# Suggested recovery sequence (for single MDS cluster): > # > # 1) Unmount al

[ceph-users] Re: MDS crashes to damaged metadata

2023-05-24 Thread Stefan Kooman
On 5/22/23 20:24, Patrick Donnelly wrote: The original script is here: https://github.com/ceph/ceph/blob/main/src/tools/cephfs/first-damage.py "# Suggested recovery sequence (for single MDS cluster): # # 1) Unmount all clients." Is this a hard requirement? This might not be feasible for an M
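For context, the sequence in the script header looks roughly like this for a single-MDS cluster (flag names quoted from memory; check the script header in your Ceph version before running anything):

    # 1) Unmount all clients.

    # 2) Flush the journal (if possible):
    ceph tell mds.<fs_name>:0 flush journal

    # 3) Fail the file system:
    ceph fs fail <fs_name>

    # 4) Recover dentries from the journal and reset it:
    cephfs-journal-tool --rank=<fs_name>:0 event recover_dentries summary
    cephfs-journal-tool --rank=<fs_name>:0 journal reset

    # 5) Scan the metadata pool for damaged dentries:
    python3 first-damage.py --memo run.1 <metadata_pool>

    # 6) Optionally remove them, then make the fs joinable again:
    python3 first-damage.py --memo run.2 --remove <metadata_pool>
    ceph fs set <fs_name> joinable true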

[ceph-users] Re: MDS crashes to damaged metadata

2023-05-22 Thread Patrick Donnelly
On Mon, May 15, 2023 at 8:55 AM Stefan Kooman wrote: > > On 12/15/22 15:31, Stolte, Felix wrote: > > Hi Patrick, > > > > we used your script to repair the damaged objects on the weekend and it > > went smoothly. Thanks for your support. > > > > We adjusted your script to scan for damaged files on

[ceph-users] Re: MDS crashes to damaged metadata

2023-05-22 Thread Patrick Donnelly
Hi Felix, On Sat, May 13, 2023 at 9:18 AM Stolte, Felix wrote: > > Hi Patrick, > > we have been running one daily snapshot since December and our cephfs crashed > 3 times because of this https://tracker.ceph.com/issues/38452 > > We currently have 19 files with corrupt metadata found by your >

[ceph-users] Re: MDS crashes to damaged metadata

2023-05-15 Thread Stefan Kooman
On 12/15/22 15:31, Stolte, Felix wrote: Hi Patrick, we used your script to repair the damaged objects on the weekend and it went smoothly. Thanks for your support. We adjusted your script to scan for damaged files on a daily basis, runtime is about 6h. Until Thursday last week, we had exactly

[ceph-users] Re: MDS crashes to damaged metadata

2023-05-13 Thread Stolte, Felix
Hi Patrick, we have been running one daily snapshot since December and our cephfs crashed 3 times because of this https://tracker.ceph.com/issues/38452 We currently have 19 files with corrupt metadata found by your first-damage.py script. We isolated these files from access by users and ar

[ceph-users] Re: MDS crashes to damaged metadata

2023-01-08 Thread Venky Shankar
Hi Felix, On Thu, Dec 15, 2022 at 8:03 PM Stolte, Felix wrote: > > Hi Patrick, > > we used your script to repair the damaged objects on the weekend and it went > smoothly. Thanks for your support. > > We adjusted your script to scan for damaged files on a daily basis, runtime > is about 6h. Unt

[ceph-users] Re: MDS crashes to damaged metadata

2023-01-08 Thread Patrick Donnelly
On Thu, Dec 15, 2022 at 9:32 AM Stolte, Felix wrote: > > Hi Patrick, > > we used your script to repair the damaged objects on the weekend and it went > smoothly. Thanks for your support. > > We adjusted your script to scan for damaged files on a daily basis, runtime > is about 6h. Until Thursday

[ceph-users] Re: MDS crashes to damaged metadata

2022-12-15 Thread Stolte, Felix
Hi Patrick, we used your script to repair the damaged objects on the weekend and it went smoothly. Thanks for your support. We adjusted your script to scan for damaged files on a daily basis, runtime is about 6h. Until Thursday last week, we had exactly the same 17 files. On Thursday at 13:05
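A minimal sketch of how such a daily scan could be wired up (paths, pool name, and the wrapper itself are assumptions, not the actual setup described above; without --remove the script only reads omap data):

    #!/bin/bash
    # nightly-first-damage-scan.sh -- hypothetical wrapper, run from cron
    set -euo pipefail
    POOL=cephfs_metadata                 # metadata pool of the file system
    OUTDIR=/var/log/first-damage
    mkdir -p "$OUTDIR"
    # Scan-only run: first-damage.py only modifies omap data when --remove is passed
    python3 /opt/ceph/first-damage.py --memo "$OUTDIR/memo" "$POOL" \
        > "$OUTDIR/scan-$(date +%F).log" 2>&1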

[ceph-users] Re: MDS crashes to damaged metadata

2022-12-01 Thread Stolte, Felix
Had to reduce the debug level back to normal. Debug level 20 generated about 70 GB of log in one hour. Of course there was no crash in that period.

[ceph-users] Re: MDS crashes to damaged metadata

2022-11-30 Thread Patrick Donnelly
You can run this tool. Be sure to read the comments. https://github.com/ceph/ceph/blob/main/src/tools/cephfs/first-damage.py As of now, the cause of the damage is not yet known, but we are trying to reproduce it. If your workload reliably produces the damage, a debug_mds=20 MDS log would be extremel
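To capture such a log, raising the MDS debug level via the config database should work, roughly as follows (keep the note elsewhere in this thread about ~70 GB of logs per hour in mind):

    # Raise MDS logging to capture the crash (very verbose)
    ceph config set mds debug_mds 20

    # ... reproduce the workload ...

    # Drop back to the default afterwards
    ceph config rm mds debug_mds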

[ceph-users] Re: MDS crashes to damaged metadata

2022-11-30 Thread Stolte, Felix
Hi Patrick, it does seem like it. We are not using postgres on cephfs as far as I know. We narrowed it down to three damaged inodes, but the files in question were xlsx, pdf or pst. Do you have any suggestion on how to fix this? Is there a way to scan the cephfs for damaged inodes?

[ceph-users] Re: MDS crashes to damaged metadata

2022-11-30 Thread Patrick Donnelly
On Wed, Nov 30, 2022 at 3:10 PM Stolte, Felix wrote: > > Hey guys, > > our mds daemons are crashing constantly when someone is trying to delete a > file: > > -26> 2022-11-29T12:32:58.807+0100 7f081b458700 -1 > /build/ceph-16.2.10/src/mds/Server.cc: In function 'void > Server: