[ceph-users] Re: Help with deep scrub warnings
Hi, just for the archives:

On Tue, 5 Mar 2024, Anthony D'Atri wrote:

> * Try applying the settings to global so that mons/mgrs get them.

Setting osd_deep_scrub_interval at global scope instead of at osd scope
immediately turns health back to OK and removes the false "PGs not
scrubbed in time" warning.

HTH, Sascha.
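For completeness, roughly what that looks like on the CLI; a minimal sketch only, and the interval value below is just an example, not a recommendation:

$ ceph config rm osd osd_deep_scrub_interval               # drop the osd-scope override, if one was set
$ ceph config set global osd_deep_scrub_interval 1209600   # example value: 14 days in seconds
$ ceph health detail                                       # the scrub warning should clear shortly after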
[ceph-users] Re: Permanent KeyError: 'TYPE' ->17.2.7: return self.blkid_api['TYPE'] == 'part'
Hi,

On Wed, 8 Nov 2023, Sascha Lucas wrote:

> On Tue, 7 Nov 2023, Harry G Coin wrote:
>
>> "/usr/lib/python3.6/site-packages/ceph_volume/util/device.py", line 482, in is_partition
>> /usr/bin/docker: stderr return self.blkid_api['TYPE'] == 'part'
>> /usr/bin/docker: stderr KeyError: 'TYPE'

Problem found: in my case this is caused by DRBD secondary block devices,
which cannot be read until promoted to primary. ceph_volume/util/disk.py
runs in blkid():

$ blkid -c /dev/null -p /dev/drbd4
blkid: error: /dev/drbd4: Wrong medium type

but does not care about its return code.

A quick fix is to use the get() method to automatically fall back to None
for non-existing keys:

--- a/ceph_volume/util/device.py	2023-11-10 07:00:01.552497107 +
+++ b/ceph_volume/util/device.py	2023-11-10 08:54:40.320718690 +
@@ -476,13 +476,13 @@
     @property
     def is_partition(self):
         self.load_blkid_api()
         if self.disk_api:
             return self.disk_api['TYPE'] == 'part'
         elif self.blkid_api:
-            return self.blkid_api['TYPE'] == 'part'
+            return self.blkid_api.get('TYPE') == 'part'
         return False

Don't know why this is triggered in 17.2.7.

@ceph-devs: should I report this somewhere else?

TIA, Sascha.
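To see the return code referred to above, something along these lines should work on an affected host (same device path as above; the exact non-zero value may differ between blkid versions, so it is not shown here):

$ blkid -c /dev/null -p /dev/drbd4; echo "blkid exit status: $?"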
[ceph-users] Re: Permanent KeyError: 'TYPE' ->17.2.7: return self.blkid_api['TYPE'] == 'part'
Hi,

On Tue, 7 Nov 2023, Harry G Coin wrote:

> These repeat for every host, only after upgrading from prev release Quincy
> to 17.2.7. As a result, the cluster is always warned, never indicates
> healthy.

I'm hitting this error, too.

> "/usr/lib/python3.6/site-packages/ceph_volume/util/device.py", line 482, in is_partition
> /usr/bin/docker: stderr return self.blkid_api['TYPE'] == 'part'
> /usr/bin/docker: stderr KeyError: 'TYPE'

Variable names indicate usage of BLKID(8). It seems that `blkid` usually
returns TYPE="something", but I have devices without TYPE:

/dev/mapper/data-4d323729--8fec--42c6--a1da--bacdea89fb37.disk0_data: PTUUID="c2901603-fae8-45cb-86fe-13d02e6b6dc6" PTTYPE="gpt"
/dev/mapper/data-8d485122--d8ca--4e11--85bb--3f795a4e31e9.disk0_data: PTUUID="2bc7a15e" PTTYPE="dos"
/dev/drbd3: PTUUID="2bc7a15e" PTTYPE="dos"

Maybe this indicates why the key is missing? Please tell me if there is
anything I can do to find the root cause.

Thanks, Sascha.
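A rough sketch for spotting such devices; it assumes blkid is installed on the host and only checks the device name patterns seen above, so the globs would need adjusting to other layouts:

for dev in /dev/drbd* /dev/mapper/*; do
    [ -b "$dev" ] || continue                      # skip anything that is not a block device
    if ! blkid -c /dev/null -p -o export "$dev" 2>/dev/null | grep -q '^TYPE='; then
        echo "no TYPE key: $dev"
    fi
done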
[ceph-users] Re: MDS_DAMAGE dir_frag
Hi Venky,

On Wed, 14 Dec 2022, Venky Shankar wrote:

> On Tue, Dec 13, 2022 at 6:43 PM Sascha Lucas wrote:
>
>> Just an update: "scrub / recursive,repair" does not uncover additional
>> errors. But also does not fix the single dirfrag error.
>
> File system scrub does not clear entries from the damage list. The damage
> type you are running into ("dir_frag") implies that the object for
> directory "V_7770505" is lost (from the metadata pool). This results in
> files under that directory being unavailable. Good news is that you can
> regenerate the lost object by scanning the data pool. This is documented
> here:
>
> https://docs.ceph.com/en/latest/cephfs/disaster-recovery-experts/#recovery-from-missing-metadata-objects
>
> (You need not run the cephfs-table-tool or cephfs-journal-tool commands
> though. Also, this could take time if you have lots of objects in the
> data pool.)
>
> Since you mention that you do not see directory "CV_MAGNETIC" and no other
> scrub errors are seen, it's possible that the application using cephfs
> removed it since it was no longer needed (the data pool might have some
> leftover objects though).

Thanks a lot for your help.

Just to be clear: it's the directory structure CV_MAGNETIC/V_7770505, where
V_7770505 cannot be seen/found. But the parent dir CV_MAGNETIC still
exists. However, it strengthens the idea that the application has removed
the V_7770505 directory itself. Otherwise I would expect to still find/see
this directory, but empty. Right? If that is the case, there is no data
needing recovery, just a cleanup of orphan objects.

Also very helpful: knowing which parts of the disaster-recovery-experts
docs to run and which commands to skip. This seems to boil down to:

cephfs-data-scan init|scan_extents|scan_inodes|scan_links|cleanup

The data pool has ~100M objects. I assume data scanning cannot be done
while the filesystem is online/in use?

Just a mystery remains how this damage could happen...

Thanks, Sascha.
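For the archives, this is roughly how I read that sequence from the linked docs; a sketch only, not verified here, with the data pool name as a placeholder and the filesystem assumed to be taken offline before any of it is run:

cephfs-data-scan init
cephfs-data-scan scan_extents <data-pool>     # slowest step on a large pool; the docs describe
                                              # running several workers via --worker_n/--worker_m
cephfs-data-scan scan_inodes <data-pool>
cephfs-data-scan scan_links
cephfs-data-scan cleanup <data-pool>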
[ceph-users] Re: MDS_DAMAGE dir_frag
Hi William,

On Mon, 12 Dec 2022, William Edwards wrote:

> On 12 Dec 2022, at 22:47, Sascha Lucas wrote:
>
>> Ceph "servers" like MONs, OSDs, MDSs etc. are all 17.2.5/cephadm/podman.
>> The filesystem kernel clients are co-located on the same hosts running
>> the "servers".
>
> Isn’t that discouraged?

Actually I don't know. Is this documented somewhere?

Thanks, Sascha.
[ceph-users] Re: MDS_DAMAGE dir_frag
Hi,

On Mon, 12 Dec 2022, Sascha Lucas wrote:

> On Mon, 12 Dec 2022, Gregory Farnum wrote:
>
>> Yes, we’d very much like to understand this. What versions of the server
>> and kernel client are you using? What platform stack — I see it looks
>> like you are using CephFS through the volumes interface? The simplest
>> possibility I can think of here is that you are running with a bad kernel
>> and it used async ops poorly, maybe? But I don’t remember other
>> spontaneous corruptions of this type anytime recent.
>
> Ceph "servers" like MONs, OSDs, MDSs etc. are all 17.2.5/cephadm/podman.
> The filesystem kernel clients are co-located on the same hosts running
> the "servers". For some other reason the OS is still RHEL 8.5 (yes, with
> community ceph). The kernel is 4.18.0-348.el8.x86_64 from the release
> media. Just one filesystem kernel client is at
> 4.18.0-348.23.1.el8_5.x86_64 from the EOL of 8.5. Are there known issues
> with these kernel versions?
>
>> Have you run a normal forward scrub (which is non-disruptive) to check
>> if there are other issues?
>
> So far I haven't dared, but will do so tomorrow.

Just an update: "scrub / recursive,repair" does not uncover additional
errors. But it also does not fix the single dirfrag error.

Thanks, Sascha.

[2] https://www.spinics.net/lists/ceph-users/msg53202.html
[3] https://docs.ceph.com/en/quincy/cephfs/disaster-recovery/#metadata-damage-and-repair
[ceph-users] Re: MDS_DAMAGE dir_frag
Hi Greg,

On Mon, 12 Dec 2022, Gregory Farnum wrote:

> On Mon, Dec 12, 2022 at 12:10 PM Sascha Lucas wrote:
>
>> A follow-up of [2] also mentioned having random meta-data corruption:
>> "We have 4 clusters (all running same version) and have experienced
>> meta-data corruption on the majority of them at some time or the other"
>
> Jewel (and upgrading from that version) was much less stable than Luminous
> (when we declared the filesystem “awesome” and said the Ceph upstream
> considered it production-ready), and things have generally gotten better
> with every release since then.

I see. The cited corruption belongs to older releases...

>> [3] tells me that metadata damage can happen either from data loss
>> (which I'm convinced not to have), or from software bugs. The latter
>> would be worth fixing. Is there a way to find the root cause?
>
> Yes, we’d very much like to understand this. What versions of the server
> and kernel client are you using? What platform stack — I see it looks like
> you are using CephFS through the volumes interface? The simplest
> possibility I can think of here is that you are running with a bad kernel
> and it used async ops poorly, maybe? But I don’t remember other
> spontaneous corruptions of this type anytime recent.

Ceph "servers" like MONs, OSDs, MDSs etc. are all 17.2.5/cephadm/podman.
The filesystem kernel clients are co-located on the same hosts running the
"servers". For some other reason the OS is still RHEL 8.5 (yes, with
community ceph). The kernel is 4.18.0-348.el8.x86_64 from the release
media. Just one filesystem kernel client is at 4.18.0-348.23.1.el8_5.x86_64
from the EOL of 8.5. Are there known issues with these kernel versions?

> Have you run a normal forward scrub (which is non-disruptive) to check if
> there are other issues?

So far I haven't dared, but will do so tomorrow.

Thanks, Sascha.

[2] https://www.spinics.net/lists/ceph-users/msg53202.html
[3] https://docs.ceph.com/en/quincy/cephfs/disaster-recovery/#metadata-damage-and-repair
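For anyone reading along: the forward scrub Greg suggests would look roughly like this; a sketch only, addressing rank 0 of the "disklib" filesystem named elsewhere in this thread, and leaving out "repair" so it only checks:

$ ceph tell mds.disklib:0 scrub start / recursive
$ ceph tell mds.disklib:0 scrub status      # poll until the scrub has finished
$ ceph tell mds.disklib:0 damage ls         # then re-list any recorded damage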
[ceph-users] Re: MDS_DAMAGE dir_frag
Hi Dhairya,

On Mon, 12 Dec 2022, Dhairya Parmar wrote:

> You might want to look at [1] for this, also I found a relevant thread [2]
> that could be helpful.

Thanks a lot. I had already found [1,2], too. But I did not consider them,
because I felt I was not having a "disaster"? Nothing seems broken nor
crashed: all servers/services have been up for weeks. No disk failures, no
modifications on the cluster, etc. Also the warning box in [1] tells me (as
a newbie) not to run anything of this. Or in other words: not to forcefully
start a disaster ;-).

A follow-up of [2] also mentioned having random meta-data corruption: "We
have 4 clusters (all running same version) and have experienced meta-data
corruption on the majority of them at some time or the other"

[3] tells me that metadata damage can happen either from data loss (which
I'm convinced not to have), or from software bugs. The latter would be
worth fixing. Is there a way to find the root cause?

And is going through [1] really the only option? It sounds like being
offline for days... At least I know now what dirfrags [4] are.

Thanks, Sascha.

[1] https://docs.ceph.com/en/latest/cephfs/disaster-recovery-experts/#disaster-recovery-experts
[2] https://www.spinics.net/lists/ceph-users/msg53202.html
[3] https://docs.ceph.com/en/quincy/cephfs/disaster-recovery/#metadata-damage-and-repair
[4] https://docs.ceph.com/en/quincy/cephfs/dirfrags/
[ceph-users] MDS_DAMAGE dir_frag
Hi,

Without any outage/disaster, cephFS (17.2.5/cephadm) reports damaged
metadata:

[root@ceph106 ~]# zcat /var/log/ceph/3cacfa58-55cf-11ed-abaf-5cba2c03dec0/ceph-mds.disklib.ceph106.kbzjbg.log-20221211.gz
2022-12-10T10:12:35.161+ 7fa46779d700  1 mds.disklib.ceph106.kbzjbg Updating MDS map to version 958 from mon.1
2022-12-10T10:12:50.974+ 7fa46779d700  1 mds.disklib.ceph106.kbzjbg Updating MDS map to version 959 from mon.1
2022-12-10T15:18:36.609+ 7fa461791700  0 mds.0.cache.dir(0x11516b1) _fetched missing object for [dir 0x11516b1 /volumes/_nogroup/ec-pool4p2/aa36abb9-a22e-405f-921c-76152599c6ba/LQ1WYG_10.28.2022_04.50/CV_MAGNETIC/V_7770505/ [2,head] auth v=0 cv=0/0 ap=1+0 state=1073741888|fetching f() n() hs=0+0,ss=0+0 | waiter=1 authpin=1 0x56541d3c5a80]
2022-12-10T15:18:36.615+ 7fa461791700 -1 log_channel(cluster) log [ERR] : dir 0x11516b1 object missing on disk; some files may be lost (/volumes/_nogroup/ec-pool4p2/aa36abb9-a22e-405f-921c-76152599c6ba/LQ1WYG_10.28.2022_04.50/CV_MAGNETIC/V_7770505)
2022-12-10T15:18:40.010+ 7fa46779d700  1 mds.disklib.ceph106.kbzjbg Updating MDS map to version 960 from mon.1
2022-12-11T02:32:01.474+ 7fa468fa0700 -1 received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm() ) UID: 0

[root@ceph101 ~]# ceph tell mds.disklib:0 damage ls
2022-12-12T10:20:42.484+0100 7fa9e37fe700  0 client.165258 ms_handle_reset on v2:xxx.xxx.xxx.xxx:6800/519677707
2022-12-12T10:20:42.504+0100 7fa9e37fe700  0 client.165264 ms_handle_reset on v2:xxx.xxx.xxx.xxx:6800/519677707
[
    {
        "damage_type": "dir_frag",
        "id": 2085830739,
        "ino": 1099513009841,
        "frag": "*",
        "path": "/volumes/_nogroup/ec-pool4p2/aa36abb9-a22e-405f-921c-76152599c6ba/LQ1WYG_10.28.2022_04.50/CV_MAGNETIC/V_7770505"
    }
]

The mentioned path CV_MAGNETIC/V_7770505 is not visible, but I can't tell
whether this is because it was lost, or because it was removed by the
application using the cephFS. Data is on an EC 4+2 pool; ROOT and METADATA
are on replica=3 pools.

Questions are: What happened? And how to fix the problem? Is running "ceph
tell mds.disklib:0 scrub start /what/path? recursive,repair" the right
thing? Is this a safe command? How is the impact on production? Can the
file-system stay mounted/used by clients? How long will it take for 340T?
What is a dir_frag damage?

TIA, Sascha.