[ceph-users] Re: MDS_DAMAGE in 17.2.7 / Cannot delete affected files
Hi Patrick,

On 30.11.23 03:58, Patrick Donnelly wrote:
> I've not yet fully reviewed the logs but it seems there is a bug in the
> detection logic which causes a spurious abort. This does not appear to
> be actually new damage.

We are accessing the metadata (read-only) daily. The issue only popped up
after updating to 17.2.7. Of course, this does not mean that there was no
damage there before, only that it was not detected.

> Are you using postgres?

Not on top of CephFS, no. We do use postgres on some RBD volumes.

> If you can share details about your snapshot workflow and general
> workloads that would be helpful (privately if desired).

Our CephFS root looks like this:

/archive
/homes
/no-snapshot
/other-snapshot
/scratch

We are running snapshots on /homes and /other-snapshot with the same
schedule. We mount the filesystem with a kernel client on one of the Ceph
hosts (not running the MDS) and mkdir / rmdir as needed:

- daily between 06:00 and 19:45 UTC (inclusive): create a snapshot every
  15 minutes; delete it one hour later unless it is an hourly (xx:00)
  snapshot
- daily on the full hour: create a snapshot; delete the 24-hour-old
  snapshot unless it is the midnight one
- daily at midnight: delete the snapshot from 14 days ago unless it is a
  Sunday snapshot
- every Sunday at midnight: delete the snapshot from 8 weeks ago

The workload is two main Samba servers (one only sharing a subdirectory
which is generally not accessed on the other). Client access to those
servers is limited to 1 GBit/s each. Until Tuesday, we also had a mail
server with Dovecot running on top of CephFS. It was migrated on Tuesday
to an RBD volume because we had some issues with hanging access to some
files / directories (interestingly only in the main tree; access in
snapshots worked without issue).
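For illustration, the retention rules above can be sketched in Python. This is purely a hypothetical model of the schedule (the function name and structure are mine); the real rotation is a set of cron jobs issuing mkdir / rmdir in the .snap directories.

```python
from datetime import datetime, timedelta

def expired_snapshots(now: datetime) -> list[datetime]:
    # Return the creation times of snapshots that the schedule above
    # would remove at time `now` (all times UTC). Illustrative only.
    drop = []
    # Quarter-hour snapshots live for one hour unless hourly (xx:00);
    # they are only created during the 06:00-19:45 window.
    q = now - timedelta(hours=1)
    if q.minute != 0 and 6 <= q.hour <= 19:
        drop.append(q)
    # Hourly snapshots live for 24 hours unless taken at midnight.
    if now.minute == 0:
        h = now - timedelta(hours=24)
        if h.hour != 0:
            drop.append(h)
    # At midnight: drop the 14-day-old snapshot unless it was a Sunday;
    # on Sundays, additionally drop the 8-week-old one.
    if now.hour == 0 and now.minute == 0:
        d = now - timedelta(days=14)
        if d.weekday() != 6:  # 6 == Sunday
            drop.append(d)
        if now.weekday() == 6:
            drop.append(now - timedelta(weeks=8))
    return drop
```

At 07:15 UTC, for example, the sketch returns only the 06:15 quarter-hour snapshot for deletion; at midnight on a Monday, only the daily snapshot from 14 days earlier.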
Additionally, we have a Nextcloud instance with ~200 active users storing
data in CephFS, as well as some other kernel clients with little /
sporadic traffic: some running Samba, some NFS, some interactive SSH /
x2go servers with direct user access, and some specialised web
applications (notably OMERO).

We run daily incremental backups of most of the CephFS content with
Bareos running on a dedicated server which has the whole CephFS tree
mounted read-only. For most data a full backup is performed every two
months, for some data only every six months. The affected area is
contained in this "every six months" full backup portion of the file
system tree.

Two weeks ago we deleted a folder structure of 6 TB with an average file
size in the range of 1 GB. The structure was under /other-snapshot as
well. This led to severe load on the MDS, especially starting at
midnight. In conjunction with the Ubuntu kernel mount, we also had issues
with non-released capabilities preventing read access to the
/other-snapshot part. To combat these lingering problems, we deleted all
snapshots in /other-snapshot, which left half a dozen PGs stuck in
snaptrim state (and a few hundred in snaptrim_wait). Updating from 17.2.6
to 17.2.7 solved that issue quickly: the affected PGs became unstuck and
the whole cluster was active+clean a few hours later.

> > For now, I'll hold off on running first-damage.py to try to remove
> > the affected files / inodes. Ultimately however, this seems to be the
> > most sensible solution to me, at least with regards to cluster
> > downtime.
>
> Please give me another day to review then feel free to use
> first-damage.py to cleanup. If you see new damage please upload the
> logs.

We are in no hurry and will probably run first-damage.py sometime next
week. I will report new damage if it comes in.

Cheers
Sebastian

-- 
Dr. Sebastian Knust        | Bielefeld University
IT Administrator           | Faculty of Physics
Office: D2-110             | Universitätsstr. 25
Phone: +49 521 106 5234    | 33615 Bielefeld

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: MDS_DAMAGE in 17.2.7 / Cannot delete affected files
Hi Sebastian,

On Wed, Nov 29, 2023 at 3:11 PM Sebastian Knust wrote:
>
> Hello Patrick,
>
> On 27.11.23 19:05, Patrick Donnelly wrote:
> >
> > I would **really** love to see the debug logs from the MDS. Please
> > upload them using ceph-post-file [1]. If you can reliably reproduce,
> > turn on more debugging:
> >
> >> ceph config set mds debug_mds 20
> >> ceph config set mds debug_ms 1
> >
> > [1] https://docs.ceph.com/en/reef/man/8/ceph-post-file/
>
> Uploaded debug log and core dump, see ceph-post-file:
> 02f78445-7136-44c9-a362-410de37a0b7d
>
> Unfortunately, we cannot easily shut down normal access to the cluster
> for these tests, therefore there is quite some clutter in the logs. The
> logs show three crashes, the last one with core dumping enabled
> (ulimits set to unlimited).
>
> A note on reproducibility: To recreate the crash, reading the contents
> of the file prior to removal seems necessary. Simply calling stat on
> the file and then performing the removal also yields an Input/output
> error but does not crash the MDS.
>
> Interestingly, the MDS_DAMAGE flag is reset on restart of the MDS and
> only comes back once the files in question are accessed (a stat call is
> sufficient).

I've not yet fully reviewed the logs but it seems there is a bug in the
detection logic which causes a spurious abort. This does not appear to be
actually new damage.

Are you using postgres?

If you can share details about your snapshot workflow and general
workloads that would be helpful (privately if desired).

> For now, I'll hold off on running first-damage.py to try to remove the
> affected files / inodes. Ultimately however, this seems to be the most
> sensible solution to me, at least with regards to cluster downtime.

Please give me another day to review then feel free to use
first-damage.py to cleanup. If you see new damage please upload the logs.

-- 
Patrick Donnelly, Ph.D.
He / Him / His
Red Hat Partner Engineer
IBM, Inc.
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
[ceph-users] Re: MDS_DAMAGE in 17.2.7 / Cannot delete affected files
Hello Patrick,

On 27.11.23 19:05, Patrick Donnelly wrote:
> I would **really** love to see the debug logs from the MDS. Please
> upload them using ceph-post-file [1]. If you can reliably reproduce,
> turn on more debugging:
>
>> ceph config set mds debug_mds 20
>> ceph config set mds debug_ms 1
>
> [1] https://docs.ceph.com/en/reef/man/8/ceph-post-file/

Uploaded debug log and core dump, see ceph-post-file:
02f78445-7136-44c9-a362-410de37a0b7d

Unfortunately, we cannot easily shut down normal access to the cluster
for these tests, therefore there is quite some clutter in the logs. The
logs show three crashes, the last one with core dumping enabled (ulimits
set to unlimited).

A note on reproducibility: To recreate the crash, reading the contents of
the file prior to removal seems necessary. Simply calling stat on the
file and then performing the removal also yields an Input/output error
but does not crash the MDS.

Interestingly, the MDS_DAMAGE flag is reset on restart of the MDS and
only comes back once the files in question are accessed (a stat call is
sufficient).

For now, I'll hold off on running first-damage.py to try to remove the
affected files / inodes. Ultimately however, this seems to be the most
sensible solution to me, at least with regards to cluster downtime.

Cheers
Sebastian
[ceph-users] Re: MDS_DAMAGE in 17.2.7 / Cannot delete affected files
Hello Sebastian,

On Fri, Nov 24, 2023 at 8:49 AM Sebastian Knust wrote:
>
> Hi,
>
> After updating from 17.2.6 to 17.2.7 with cephadm, our cluster went
> into MDS_DAMAGE state. We had some prior issues with faulty kernel
> clients not releasing capabilities, therefore the update might just be
> a coincidence.
>
> `ceph tell mds.cephfs:0 damage ls` lists 56 affected files, all with
> these general details:
>
> {
>     "damage_type": "dentry",
>     "id": 123456,
>     "ino": 1234567890,
>     "frag": "*",
>     "dname": "some-filename.ext",
>     "snap_id": "head",
>     "path": "/full/path/to/file"
> }
>
> The behaviour upon trying to access file information in the (kernel
> mounted) filesystem is a bit inconsistent. Generally, the first `stat`
> call seems to result in "Input/output error"; the next call provides
> all `stat` data as expected from an undamaged file. Once the stat call
> succeeds, the file can be read with `cat` with full and correct content
> (verified against backup).
>
> Scrubbing the affected subdirectories with `ceph tell mds.cephfs:0
> scrub start /path/to/dir/ recursive,repair,force` does not fix the
> issue.
>
> Trying to delete the file results in an "Input/output error". If the
> stat calls beforehand succeeded, this also crashes the active MDS with
> these messages in the system journal:
>
> > Nov 24 14:21:15 iceph-18.servernet ceph-mds[1946861]:
> > mds.0.cache.den(0x10012271195 DisplaySettings.json) newly corrupt dentry to
> > be committed: [dentry
> > #0x1/homes/huser/d3data/transfer/hortkrass/FLIMSIM/2023-04-12-irf-characterization/2-qwp-no-extra-filter-pc-off-tirf-94-tirf-cursor/DisplaySettings.json
> > [1000275c4a0,head] auth (dversion lock) pv=0 v=225 ino=0x10012271197
> > state=1073741824 | inodepin=1 0x56413e1e2780]
> > Nov 24 14:21:15 iceph-18.servernet ceph-mds[1946861]: log_channel(cluster)
> > log [ERR] : MDS abort because newly corrupt dentry to be committed: [dentry
> > #0x1/homes/huser/d3data/transfer/hortkrass/FLIMSIM/2023-04-12-irf-characterization/2-qwp-no-extra-filter-pc-off-tirf-94-tirf-cursor/DisplaySettings.json
> > [1000275c4a0,head] auth (dversion lock) pv=0 v=225 ino=0x10012271197
> > state=1073741824 | inodepin=1 0x56413e1e2780]
> > Nov 24 14:21:15 iceph-18.servernet
> > ceph-eafd0514-3644-11eb-bc6a-3cecef2330fa-mds-cephfs-iceph-18-ujfqnd[1946838]:
> > 2023-11-24T13:21:15.654+0000 7f3fdcde0700 -1 mds.0.cache.den(0x10012271195
> > DisplaySettings.json) newly corrupt dentry to be committed: [dentry
> > #0x1/homes/huser/d3data/transfer/hortkrass/FLIMSIM/2023-04-12-irf-characterization/2-qwp-no-extra-filter-pc-off-tirf-94-tirf-cursor/DisplaySettings.json
> > [1000275c4a0,head] auth (dversion lock) pv=0 v=225 ino=0x1001>
> > Nov 24 14:21:15 iceph-18.servernet
> > ceph-eafd0514-3644-11eb-bc6a-3cecef2330fa-mds-cephfs-iceph-18-ujfqnd[1946838]:
> > 2023-11-24T13:21:15.654+0000 7f3fdcde0700 -1 log_channel(cluster) log
> > [ERR] : MDS abort because newly corrupt dentry to be committed: [dentry
> > #0x1/homes/huser/d3data/transfer/hortkrass/FLIMSIM/2023-04-12-irf-characterization/2-qwp-no-extra-filter-pc-off-tirf-94-tirf-cursor/DisplaySettings.json
> > [1000275c4a0,head] auth (dversion lock) pv=0 v=225 ino=0x10012>
> > Nov 24 14:21:15 iceph-18.servernet
> > ceph-eafd0514-3644-11eb-bc6a-3cecef2330fa-mds-cephfs-iceph-18-ujfqnd[1946838]:
> > /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.2.7/rpm/el8/BUILD/ceph-17.2.7/src/mds/MDSRank.cc:
> > In function 'void MDSRank::abort(std::string_view)' thread 7f3fdcde0700
> > time 2023-11-24T13:21:15.655088+0000
> > Nov 24 14:21:15 iceph-18.servernet ceph-mds[1946861]:
> > /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.2.7/rpm/el8/BUILD/ceph-17.2.7/src/mds/MDSRank.cc:
> > In function 'void MDSRank::abort(std::string_view)' thread 7f3fdcde0700
> > time 2023-11-24T13:21:15.655088+0000
> >
> > /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.2.7/rpm/el8/BUILD/ceph-17.2.7/src/mds/MDSRank.cc:
> > 937: ceph_abort_msg("abort() called")
> >
> >  ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
> >  1: (ceph::__ceph_abort(char const*, int, char const*,
> >     std::__cxx11::basic_string, std::allocator > const&)+0xd7) [0x7f3fe5a1cb03]
> >  2: (MDSRank::abort(std::basic_string_view >)+0x7d) [0x5640f2e6fa2d]
> >  3: (CDentry::check_corruption(bool)+0x740)
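As a side note, entries of the shape shown in the first mail can be pulled out of the `damage ls` JSON mechanically, e.g. to compare the affected paths against a backup or to feed a cleanup script. A small hypothetical Python sketch (values are the placeholders from the example entry, not real damage data):

```python
import json

# Hypothetical output of `ceph tell mds.cephfs:0 damage ls`; the single
# entry mirrors the example shown earlier, with placeholder values.
raw = """
[
  {"damage_type": "dentry", "id": 123456, "ino": 1234567890,
   "frag": "*", "dname": "some-filename.ext",
   "snap_id": "head", "path": "/full/path/to/file"}
]
"""

entries = json.loads(raw)
# Collect the paths of damaged dentries, sorted for stable output.
paths = sorted(e["path"] for e in entries if e["damage_type"] == "dentry")
print(paths)  # ['/full/path/to/file']
```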
[ceph-users] Re: MDS_DAMAGE in 17.2.7 / Cannot delete affected files
Hi Sebastian,

You can find some more discussion and fixes for this type of fs
corruption here: https://www.spinics.net/lists/ceph-users/msg76952.html

-- 
Dan van der Ster
CTO

Clyso GmbH
p: +49 89 215252722 | a: Vancouver, Canada
w: https://clyso.com | e: dan.vanders...@clyso.com
We are hiring: https://www.clyso.com/jobs/

On Fri, Nov 24, 2023 at 5:48 AM Sebastian Knust wrote:
>
> Hi,
>
> After updating from 17.2.6 to 17.2.7 with cephadm, our cluster went
> into MDS_DAMAGE state. We had some prior issues with faulty kernel
> clients not releasing capabilities, therefore the update might just be
> a coincidence.
>
> `ceph tell mds.cephfs:0 damage ls` lists 56 affected files, all with
> these general details:
>
> {
>     "damage_type": "dentry",
>     "id": 123456,
>     "ino": 1234567890,
>     "frag": "*",
>     "dname": "some-filename.ext",
>     "snap_id": "head",
>     "path": "/full/path/to/file"
> }
>
> The behaviour upon trying to access file information in the (kernel
> mounted) filesystem is a bit inconsistent. Generally, the first `stat`
> call seems to result in "Input/output error"; the next call provides
> all `stat` data as expected from an undamaged file. Once the stat call
> succeeds, the file can be read with `cat` with full and correct content
> (verified against backup).
>
> Scrubbing the affected subdirectories with `ceph tell mds.cephfs:0
> scrub start /path/to/dir/ recursive,repair,force` does not fix the
> issue.
>
> Trying to delete the file results in an "Input/output error".