[ceph-users] MDS crashing
Hi,

I have a small cluster with 11 OSDs and 4 filesystems. Each server (Debian 11, Ceph 17.2.7) usually runs several services. After trouble with a host holding OSDs, I removed the OSDs and let the cluster repair itself (x3 replica). After a while it returned to a healthy state and everything was well. This might not be important for what followed, but I mention it just in case.

A couple of days later an MDS service gave a health warning. The first one was (2024-05-28 10:02):

mds.cloudfs.stugan6.ywuomz(mds.0): 1 slow requests are blocked > 30 secs

followed by the filesystem being marked degraded (2024-05-28 10:22):

fs cloudfs is degraded

The other filesystems have been marked degraded from time to time, but those warnings later cleared. At 2024-05-28 10:28:

daemon mds.mediafs.stugan7.zzxavs on stugan7 is in error state
daemon mds.cloudfs.stugan7.cmjbun on stugan7 is in error state

At 2024-05-28 10:33:

daemon mds.cloudfs.stugan4.qxwzox on stugan4 is in error state
daemon mds.cloudfs.stugan5.hbkkad on stugan5 is in error state
daemon mds.oxylfs.stugan7.iazekf on stugan7 is in error state

MDS services went on crashing. I paused the OSDs and set the nodown, noout, nobackfill, norebalance and norecover flags; at present only the flags remain, since I have been trying to get the system up and running again. While the OSDs were paused I could clear up the mess and remove all services in error state. The monitors and managers seem to function well. I could also get the MDS services running again. BUT, when I removed the pause from the OSDs, the MDS services once again started to go into error state. I have now removed the mds label from all Ceph servers, so things have calmed down, but if I let the services be recreated the crashes will start over again.

If I check the filesystem status (I have marked the filesystems down for now), cloudfs looks strange:

oxylfs - 0 clients
======
POOL             TYPE      USED   AVAIL
oxylfs_metadata  metadata   154M  20.8T
oxylfs_data      data      1827G  20.8T

cloudfs - 0 clients
=======
RANK  STATE          MDS                      ACTIVITY  DNS  INOS  DIRS  CAPS
 0    replay(laggy)  backupfs.stugan6.bgcltx              0     0     0     0
POOL              TYPE      USED  AVAIL
cloudfs_metadata  metadata  337M  20.8T
cloudfs_data      data      356G  20.8T

mediafs - 0 clients
=======
POOL              TYPE       USED  AVAIL
mediafs_metadata  metadata  66.0M  20.8T
mediafs_data      data      2465G  20.8T

backupfs - 0 clients
========
POOL                  TYPE      USED   AVAIL
backupfsnew_metadata  metadata   221M  20.8T
backupfsnew_data      data      8740G  20.8T

MDS version: ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)

Why is the MDS (in error state) for the backupfs filesystem shown under the cloudfs filesystem?

Now... Is there a way back to normal?

/Johan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
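For readers untangling a similar situation, the flag juggling described above corresponds to commands along these lines (a sketch; "ceph osd set pause" sets both pauserd and pausewr, blocking all client I/O until unset):

# Freeze the cluster while cleaning up crashed MDS daemons:
ceph osd set pause
ceph osd set nodown
ceph osd set noout
ceph osd set nobackfill
ceph osd set norebalance
ceph osd set norecover

# Later, lift the flags one at a time, watching "ceph -s" between steps:
ceph osd unset pause
ceph osd unset nodown
ceph osd unset noout
ceph osd unset nobackfill
ceph osd unset norebalance
ceph osd unset norecover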
[ceph-users] Re: Removed host in maintenance mode
Looking at the history, I first tried

# ceph orch host rm hostname --offline --force

and then

# ceph orch host rm hostname --force

The second command must have removed the host (at least partially), because I didn't try any other commands after that. Now when I try these commands again, --offline gives me

Error EINVAL: hostname is online, please remove host without --offline.

and with only --force:

Error EINVAL: host hostname does not exist

As a side note, I had to manually clear the crushmap after removing the host. I also manually removed the keys for the OSDs that remained on the host (after the pools recovered/rebalanced).

/Johan

Den 2024-05-07 kl. 12:09, skrev Eugen Block:
Hi,

did you remove the host from the host list [0]?

ceph orch host rm [--force] [--offline]

[0] https://docs.ceph.com/en/latest/cephadm/host-management/#offline-host-removal

Zitat von Johan:
Hi all,

In my small cluster of 6 hosts I had trouble with a host (OSDs) and was planning to remove it from the cluster. Before I got to do that I needed to power down this host, and therefore put it in maintenance mode. Due to some mistakes on my part I couldn't boot the host again and simply decided to force the removal from the cluster. The host is now removed, but Ceph (17.2.7) keeps complaining about it being in maintenance mode.

How can I remove the last remnants of this host from the cluster?

/Johan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
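The manual cleanup mentioned above presumably looked something like the following (a sketch; the OSD id and host name are placeholders for whatever was left behind):

# Remove each leftover OSD from the CRUSH map and drop its cephx key:
ceph osd crush rm osd.12
ceph auth del osd.12

# Once the host bucket is empty, remove it from the CRUSH map as well:
ceph osd crush rm hostname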
[ceph-users] Removed host in maintenance mode
Hi all,

In my small cluster of 6 hosts I had trouble with a host (OSDs) and was planning to remove it from the cluster. Before I got to do that I needed to power down this host, and therefore put it in maintenance mode. Due to some mistakes on my part I couldn't boot the host again and simply decided to force the removal from the cluster. The host is now removed, but Ceph (17.2.7) keeps complaining about it being in maintenance mode.

How can I remove the last remnants of this host from the cluster?

/Johan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
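One avenue worth trying in this situation, before resorting to a forced removal (a sketch using commands that exist in cephadm; whether they clear the stale maintenance state depends on what the orchestrator still knows about the host):

# Take the host out of maintenance first, if the orchestrator still lists it:
ceph orch host maintenance exit hostname

# Then remove it via the documented offline-removal path:
ceph orch host rm hostname --offline --force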
[ceph-users] Re: Building new cluster had a couple of questions
On 2023-12-22 03:28, Robert Sander wrote:
Hi,

On 22.12.23 11:41, Albert Shih wrote:

for n in 1-100
  put the OSDs on server n offline
  uninstall docker on server n
  install podman on server n
  redeploy on server n
end

Yep, that's basically the procedure. But first try it on a test cluster.

Regards

For reference, this was also discussed about two years ago:
https://www.spinics.net/lists/ceph-users/msg70108.html

Worked for me; see the sketch below.

// Johan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
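A hedged sketch of the per-host loop, assuming a Debian host and a cephadm-managed cluster (the package names and the redeploy step are assumptions; as suggested above, try it on a test cluster first):

# Once per cluster, keep CRUSH from rebalancing while hosts bounce:
ceph osd set noout

# On each host, one at a time:
systemctl stop ceph.target              # stop all Ceph containers on this host
apt-get remove --purge docker.io        # or docker-ce, depending on how it was installed
apt-get install podman
systemctl start ceph.target
# If the systemd unit files still reference the old engine, regenerate them:
#   ceph orch daemon redeploy <daemon-name>
# Wait for HEALTH_OK before moving on to the next host.

ceph osd unset noout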
[ceph-users] Re: [EXTERNAL] [Pacific] ceph orch device ls do not returns any HDD
I have checked my disks as well; all devices are hot-swappable HDDs and have the removable flag set.

/Johan

Den 2023-10-24 kl. 13:38, skrev Patrick Begou:
Hi Eugen,

Yes Eugen, all the devices /dev/sd[abc] have the removable flag set to 1, maybe because they are hot-swappable hard drives. I have contacted the commit author Zack Cerza and he asked me for some additional tests this morning, too. I add him in copy to this mail.

Patrick

Le 24/10/2023 à 12:57, Eugen Block a écrit :
Hi,

just to confirm, could you check that the disk which is *not* discovered by 16.2.11 has a "removable" flag?

cat /sys/block/sdX/removable

I could reproduce it as well on a test machine with a USB thumb drive (live distro), which is excluded in 16.2.11 but shown in 16.2.10. Although I'm not a developer, I tried to understand what changes were made in
https://github.com/ceph/ceph/pull/46375/files#diff-330f9319b0fe352dff0486f66d3c4d6a6a3d48efd900b2ceb86551cfd88dc4c4R771
and there's this line:

if get_file_contents(os.path.join(_sys_block_path, dev, 'removable')) == "1":
    continue

The thumb drive is removable, of course; apparently that is what gets filtered here.

Regards,
Eugen

Zitat von Patrick Begou:
Le 23/10/2023 à 03:04, 544463...@qq.com a écrit :
I think you can try to roll back this part of the python code and wait for your good news :)

Not so easy:

[root@e9865d9a7f41 ceph]# git revert 4fc6bc394dffaf3ad375ff29cbb0a3eb9e4dbefc
Auto-merging src/ceph-volume/ceph_volume/tests/util/test_device.py
CONFLICT (content): Merge conflict in src/ceph-volume/ceph_volume/tests/util/test_device.py
Auto-merging src/ceph-volume/ceph_volume/util/device.py
CONFLICT (content): Merge conflict in src/ceph-volume/ceph_volume/util/device.py
Auto-merging src/ceph-volume/ceph_volume/util/disk.py
CONFLICT (content): Merge conflict in src/ceph-volume/ceph_volume/util/disk.py
error: could not revert 4fc6bc394df... ceph-volume: Optionally consume loop devices

Patrick
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
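A quick way to survey the removable flag across every block device on a host (a plain sysfs one-liner, nothing Ceph-specific):

# for f in /sys/block/*/removable; do echo "$f: $(cat "$f")"; done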
[ceph-users] Re: [EXTERNAL] [Pacific] ceph orch device ls do not returns any HDD
Which OS are you running? What is the outcome of these two tests?

# cephadm --image quay.io/ceph/ceph:v16.2.10-20220920 ceph-volume inventory
# cephadm --image quay.io/ceph/ceph:v16.2.11-20230125 ceph-volume inventory

/Johan

Den 2023-10-16 kl. 08:25, skrev 544463...@qq.com:
I encountered a similar problem on Ceph 17.2.5; could you find out which commit caused it?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: [EXTERNAL] [Pacific] ceph orch device ls do not returns any HDD
The problem appears in v16.2.11-20230125. I have no insight into the individual commits.

/Johan

Den 2023-10-16 kl. 08:25, skrev 544463...@qq.com:
I encountered a similar problem on Ceph 17.2.5; could you find out which commit caused it?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: [EXTERNAL] [Pacific] ceph orch device ls do not returns any HDD
At home I'm running a small cluster, Ceph v17.2.6, Debian 11 Bullseye. I have recently added a new server to the cluster but face the same problem as Patrick: I can't add any HDDs, Ceph doesn't recognise them. I have run the same tests as Patrick, using Ceph v14-v18, and as Patrick showed, the problem appears in Ceph v16.2.11-20230125.

=== Ceph v16.2.10-20220920 ===

$ sudo cephadm --image quay.io/ceph/ceph:v16.2.10-20220920 ceph-volume inventory
Inferring fsid 5592891c-30e4-11ed-b720-f02f741f58ac

Device Path   Size       rotates  available  Model name
/dev/nvme0n1  931.51 GB  False    False      KINGSTON SNV2S1000G
/dev/nvme1n1  931.51 GB  False    False      KINGSTON SNV2S1000G
/dev/sda      3.64 TB    True     False      WDC WD4003FFBX-6
/dev/sdb      5.46 TB    True     False      WDC WD6003FFBX-6
/dev/sdc      7.28 TB    True     False      ST8000NE001-2M71
/dev/sdd      7.28 TB    True     False      WDC WD8003FFBX-6

=== Ceph v16.2.11-20230125 ===

$ sudo cephadm --image quay.io/ceph/ceph:v16.2.11-20230125 ceph-volume inventory
Inferring fsid 5592891c-30e4-11ed-b720-f02f741f58ac

Device Path   Size       Device nodes         rotates  available  Model name
/dev/md0      9.30 GB    nvme1n1p2,nvme0n1p2  False    False
/dev/md1      59.57 GB   nvme0n1p3,nvme1n1p3  False    False
/dev/md2      279.27 GB  nvme1n1p4,nvme0n1p4  False    False
/dev/nvme0n1  931.51 GB  nvme0n1              False    False      KINGSTON SNV2S1000G
/dev/nvme1n1  931.51 GB  nvme1n1              False    False      KINGSTON SNV2S1000G

Note that the hot-swappable HDDs (/dev/sd[a-d]) are missing entirely from the v16.2.11 inventory.

/Johan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Misplaced objects greater than 100%
I think this is resolved, and you're right about the 0-weight of the root bucket being strange. I had created the rack buckets with

# ceph osd crush add-bucket rack-0 rack

whereas I should have used something like

# ceph osd crush add-bucket rack-0 rack root=default

There's a bit in the documentation (https://docs.ceph.com/en/quincy/rados/operations/crush-map) that says "Not all keys need to be specified" (in a different context, I admit). I might have saved a second or two by omitting "root=default", and maybe half a minute by not checking the CRUSH map carefully afterwards. It was not worth it.

// J

On 2023-04-05 12:01, c...@elchaka.de wrote:
I guess this is related to your crush rules. Unfortunately I don't know much about creating the rules... But someone could give more insights if you also provide

crush rule dump

Your "-1 0 root default" is a bit strange.

Am 1. April 2023 01:01:39 MESZ schrieb Johan Hattne:
Here goes:

# ceph -s
[...]

# ceph osd tree
[...]

# ceph osd pool ls detail
[...]
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
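For anyone landing here after making the same mistake: the stray buckets do not have to be recreated; they can be moved under the root after the fact (a sketch, matching the rack names above):

# ceph osd crush move rack-0 root=default
# ceph osd crush move rack-1 root=default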
[ceph-users] Re: Misplaced objects greater than 100%
Thanks, Mehmet; I took a closer look at what I sent you, and the problem appears to be in the CRUSH map. At some point since anything was last rebooted, I created rack buckets and moved the OSD nodes in under them:

# ceph osd crush add-bucket rack-0 rack
# ceph osd crush add-bucket rack-1 rack
# ceph osd crush move bcgonen-r0h0 rack=rack-0
# ceph osd crush move bcgonen-r0h1 rack=rack-0
# ceph osd crush move bcgonen-r1h0 rack=rack-1

All seemed fine at the time; it was not until bcgonen-r1h0 was rebooted that stuff got weird. But as per the "ceph osd tree" output, those rack buckets were sitting next to the default root as opposed to under it. Now that's fixed, and the cluster is backfilling remapped PGs.

// J

On 2023-03-31 16:01, Johan Hattne wrote:
Here goes:

# ceph -s
[...]

# ceph osd tree
[...]

# ceph osd pool ls detail
[...]
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Misplaced objects greater than 100%
Here goes:

# ceph -s
  cluster:
    id:     e1327a10-8b8c-11ed-88b9-3cecef0e3946
    health: HEALTH_OK

  services:
    mon: 5 daemons, quorum bcgonen-a,bcgonen-b,bcgonen-c,bcgonen-r0h0,bcgonen-r0h1 (age 16h)
    mgr: bcgonen-b.furndm(active, since 8d), standbys: bcgonen-a.qmmqxj
    mds: 1/1 daemons up, 2 standby
    osd: 36 osds: 36 up (since 16h), 36 in (since 3d); 1041 remapped pgs

  data:
    volumes: 1/1 healthy
    pools:   3 pools, 1041 pgs
    objects: 5.42M objects, 6.5 TiB
    usage:   19 TiB used, 428 TiB / 447 TiB avail
    pgs:     27087125/16252275 objects misplaced (166.667%)
             1039 active+clean+remapped
             2    active+clean+remapped+scrubbing+deep

# ceph osd tree
ID   CLASS  WEIGHT     TYPE NAME              STATUS  REWEIGHT  PRI-AFF
-14         149.02008  rack rack-1
 -7         149.02008      host bcgonen-r1h0
 20    hdd   14.55269          osd.20             up       1.0      1.0
 21    hdd   14.55269          osd.21             up       1.0      1.0
 22    hdd   14.55269          osd.22             up       1.0      1.0
 23    hdd   14.55269          osd.23             up       1.0      1.0
 24    hdd   14.55269          osd.24             up       1.0      1.0
 25    hdd   14.55269          osd.25             up       1.0      1.0
 26    hdd   14.55269          osd.26             up       1.0      1.0
 27    hdd   14.55269          osd.27             up       1.0      1.0
 28    hdd   14.55269          osd.28             up       1.0      1.0
 29    hdd   14.55269          osd.29             up       1.0      1.0
 34    ssd    1.74660          osd.34             up       1.0      1.0
 35    ssd    1.74660          osd.35             up       1.0      1.0
-13         298.04016  rack rack-0
 -3         149.02008      host bcgonen-r0h0
  0    hdd   14.55269          osd.0              up       1.0      1.0
  1    hdd   14.55269          osd.1              up       1.0      1.0
  2    hdd   14.55269          osd.2              up       1.0      1.0
  3    hdd   14.55269          osd.3              up       1.0      1.0
  4    hdd   14.55269          osd.4              up       1.0      1.0
  5    hdd   14.55269          osd.5              up       1.0      1.0
  6    hdd   14.55269          osd.6              up       1.0      1.0
  7    hdd   14.55269          osd.7              up       1.0      1.0
  8    hdd   14.55269          osd.8              up       1.0      1.0
  9    hdd   14.55269          osd.9              up       1.0      1.0
 30    ssd    1.74660          osd.30             up       1.0      1.0
 31    ssd    1.74660          osd.31             up       1.0      1.0
 -5         149.02008      host bcgonen-r0h1
 10    hdd   14.55269          osd.10             up       1.0      1.0
 11    hdd   14.55269          osd.11             up       1.0      1.0
 12    hdd   14.55269          osd.12             up       1.0      1.0
 13    hdd   14.55269          osd.13             up       1.0      1.0
 14    hdd   14.55269          osd.14             up       1.0      1.0
 15    hdd   14.55269          osd.15             up       1.0      1.0
 16    hdd   14.55269          osd.16             up       1.0      1.0
 17    hdd   14.55269          osd.17             up       1.0      1.0
 18    hdd   14.55269          osd.18             up       1.0      1.0
 19    hdd   14.55269          osd.19             up       1.0      1.0
 32    ssd    1.74660          osd.32             up       1.0      1.0
 33    ssd    1.74660          osd.33             up       1.0      1.0
 -1           0        root default

# ceph osd pool ls detail
pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 31 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr
pool 2 'cephfs.cephfs.meta' replicated size 3 min_size 2 crush_rule 2 object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode on last_change 9833 lfor 0/0/584 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
pool 3 'cephfs.cephfs.data' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 1024 pgp_num 1024 autoscale_mode on last_change 7630 lfor 0/1831/6544 flags hashpspool,bulk stripe_width 0 application cephfs

crush_rules 1 and 2 are just used to assign the data and meta pools to HDD and SSD, respectively (failure domain: host).

// J

On 2023-03-31 15:37, c...@elchaka.de wrote:
Need to know some more about your cluster...

ceph -s
ceph osd df tree

Replica or EC? ...

Perhaps this can give us some insight.

Mehmet

Am 31. März 2023 18:08:38 MESZ schrieb Johan Hattne:
[...]
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Misplaced objects greater than 100%
Dear all;

Up until a few hours ago, I had a seemingly normally-behaving cluster (Quincy, 17.2.5) with 36 OSDs, evenly distributed across 3 of its 6 nodes. The cluster is only used for CephFS, and the only non-standard configuration I can think of is that I had 2 active MDSs but only 1 standby. I had also doubled mds_cache_memory_limit to 8 GB (all OSD hosts have 256 G of RAM) at some point in the past.

Then I rebooted one of the OSD nodes. The rebooted node held one of the active MDSs. Now the node is back up: ceph -s says the cluster is healthy, but all PGs are in an active+clean+remapped state and 166.67% of the objects are misplaced (dashboard: -66.66% healthy). The data pool is a threefold replica with 5.4M objects; the number of misplaced objects is reported as 27087410/16252446. The denominator in that ratio makes sense to me (16.2M / 3 = 5.4M), but the numerator does not. I also note that the ratio is *exactly* 5/3.

The filesystem is still mounted and appears to be usable, but df reports it as 100% full; I suspect it would say 167% but that is capped somewhere.

Any ideas about what is going on? Any suggestions for recovery?

// Best wishes; Johan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
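The 5/3 observation above checks out exactly, which can be verified with a couple of one-liners (cross-multiplying avoids floating point):

# python3 -c 'print(27087410 * 3 == 16252446 * 5)'
True
# python3 -c 'print(16252446 // 3)'
5417482

The second result matches the ~5.42M objects reported for the pool.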
[ceph-users] Re: Failed to apply 1 service(s): mon
Hi,

As you suggested, it was the network that was wrong. It was set to 192.168.119.1/24, and when I changed it to ...119.0/24 the error went away. I even introduced the error again and the error messages reappeared. But it is strange that I had this error for weeks without any error message...

Thank you for your help!

/Johan

Den 2022-11-07 kl. 09:09, skrev Eugen Block:
Hi,

how does the mon section of your myservice.yaml look? Could you please paste it? How did you configure the public network? Can you share

# ceph config get mon public_network

It sounds like you have 192.168.119.1/24 but you wanted 192.168.119.0/24 configured (no host bits set), can you verify?

Regards,
Eugen

Zitat von Johan:
[...]
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
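The fix described above amounts to something like the following (assuming the intended subnet really is 192.168.119.0/24):

# ceph config set mon public_network 192.168.119.0/24
# ceph config get mon public_network
192.168.119.0/24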
[ceph-users] Failed to apply 1 service(s): mon
I'm quite new to Ceph and am changing a lot in my test storage (Debian 11, 4 hosts in total, Ceph 17.2.5) as I learn and find better ways to arrange things. Now I must have done something wrong and can't find a way to get rid of this error:

ceph -s gives HEALTH_WARN, Failed to apply 1 service(s): mon

Using the Dashboard, I find the error below showing up every minute. I have the mon service running on 3 servers. A few days ago a mon service gave errors and I removed it. I'm not sure if the error showed up before or after I removed the troublesome service; it may also have appeared as I tried to move a mon daemon from one host to another.

I have tried to use

ceph orch ls --service_name=mon --export > myservice.yaml

in order to find a possible error and correct it, but with no success. Is it this information in the error message that gives me trouble?

192.168.119.1/24 has host bits set

What have I done wrong, and how can I correct it?

/Johan

Failed to apply mon spec ServiceSpec.from_json(yaml.safe_load('''service_type: mon
service_name: mon
placement:
  count: 3
''')): 192.168.119.1/24 has host bits set
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 508, in _apply_all_services
    if self._apply_service(spec):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 650, in _apply_service
    all_slots, slots_to_add, daemons_to_remove = ha.place()
  File "/usr/share/ceph/mgr/cephadm/schedule.py", line 262, in place
    candidates = self.get_candidates()  # type: List[DaemonPlacement]
  File "/usr/share/ceph/mgr/cephadm/schedule.py", line 431, in get_candidates
    if self.filter_new_host(h.hostname):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 614, in matches_network
    public_network = ipaddress.ip_network(pn)
  File "/lib64/python3.6/ipaddress.py", line 74, in ip_network
    return IPv4Network(address, strict)
  File "/lib64/python3.6/ipaddress.py", line 1519, in __init__
    raise ValueError('%s has host bits set' % self)
ValueError: 192.168.119.1/24 has host bits set
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
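The ValueError at the bottom of the traceback can be reproduced in isolation: Python's ipaddress module refuses host bits under strict parsing, which is exactly what cephadm runs into here:

# python3 -c 'import ipaddress; ipaddress.ip_network("192.168.119.1/24")'
...
ValueError: 192.168.119.1/24 has host bits set
# python3 -c 'import ipaddress; print(ipaddress.ip_network("192.168.119.1/24", strict=False))'
192.168.119.0/24

In other words, the network in the mon spec needs to be written with the host bits cleared: 192.168.119.0/24.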
[ceph-users] Re: OSD failed to load OSD map for epoch
OK, thanks! This is the same package as in the Octopus images, so I would expect Pacific to fail just as spectacularly. What's the best way to have this fixed? A new issue on the Ceph tracker? I understand the Ceph images use CentOS packages, so should they be poked as well?

// Best wishes; Johan

On 2021-07-27 23:48, Eugen Block wrote:
Alright, it's great that you could fix it! In my one-node test cluster (Pacific) I see this smartctl version:

[ceph: root@pacific /]# rpm -q smartmontools
smartmontools-7.1-1.el8.x86_64

Zitat von Johan Hattne:
[...]
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: OSD failed to load OSD map for epoch
Thanks a lot, Eugen! I had not found those threads, but I did eventually recover; details below. And yes, this is a toy size-2 cluster with two OSDs, but I suspect I would have seen the same problem on a more reasonable setup, since this whole mess was caused by Octopus's smartmontools not playing nice with the NVMes.

Just as in the previous thread Eugen provided, I got an OSD map from the monitors:

# ceph osd getmap 4372 > /tmp/osd_map_4372

copied it to the OSD hosts, and imported it:

# CEPH_ARGS="--bluestore-ignore-data-csum" ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0/ --op set-osdmap --file /tmp/osd_map_4372

Given the initial cause of the error, I removed the WAL devices:

# ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-0 --devs-source /var/lib/ceph/osd/ceph-0/block.wal --dev-target /var/lib/ceph/osd/ceph-0/block --command bluefs-bdev-migrate
# ceph-volume lvm zap /var/lib/ceph/osd/ceph-0/block.wal

Here I got bitten by what looks like #49554, so:

# lvchange --deltag "ceph.wal_device=/dev/ceph-wal/wal-0" --deltag "ceph.wal_uuid=G7Z5qA-OaJQ-Spvs-X4ec-0SvX-vT2C-C0Dbpe" /dev/ceph-block-0/block-0

And analogously for osd.1. After restarting the OSDs, deep scrubbing, and a bit of manual repair, the cluster is healthy again.

The reason for the crash turns out to be a known problem with smartmontools <7.2 and the Micron 2200 NVMes that were used to back the WAL (https://www.smartmontools.org/ticket/1404). Unfortunately, the Octopus image ships with smartmontools 7.1, which will crash the kernel on e.g. "smartctl -a /dev/nvme0". Before switching to Octopus containers, I was using smartmontools from Debian backports, which does not have this problem. Does Pacific have newer smartmontools?

// Best wishes; Johan

On 2021-07-27 06:35, Eugen Block wrote:
Hi,

did you read this thread [1] reporting a similar issue? It refers to a solution described in [2], but the OP in [1] recreated all OSDs, so it's not clear what the root cause was. Can you start the OSD with more verbose (debug) output and share that? Does your cluster really have only two OSDs? Are you running it with size 2 pools?

[1] https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/EUFDKK3HEA5DPTUVJ5LBNQSWAKZH5ZM7/
[2] http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-August/036592.html

Zitat von Johan Hattne:
[...]
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
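After the lvchange workaround for #49554, the LV tags can be inspected to confirm the stale WAL references are really gone (a sketch; the VG/LV name matches the one above):

# lvs -o lv_tags ceph-block-0/block-0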
[ceph-users] OSD failed to load OSD map for epoch
Dear all;

We have a 3-node cluster that has two OSDs on separate nodes, each with its WAL on NVMe. It's been running fine for quite some time, albeit under very light load. This week we moved from package-based Octopus to container-based ditto (15.2.13, all on Debian stable). Within a few hours of that change, both OSDs crashed and dmesg filled up with stuff like:

DMAR: DRHD: handling fault status reg 2
DMAR: [DMA Read] Request device [06:00.0] PASID fault addr ffbc [fault reason 06] PTE Read access is not set

where 06:00.0 is the NVMe with the WAL. This happened at the same time on *both* OSD nodes, but I'll worry about why this happened later. I would first like to see if I can get the cluster back up.

From the cephadm shell, I activate OSD 1 and try to start it (I did create a minimal /etc/ceph/ceph.conf with global "fsid" and "mon host" for this purpose):

# ceph-volume lvm activate 1 cce125b2-2597-4be9-bd17-23eb059d2778 --no-systemd
# ceph-osd -d --cluster ceph --id 1

This gives "osd.1 0 OSD::init() : unable to read osd superblock", and the subsequent output indicates that this is due to checksum errors. So ignore checksum mismatches and try again:

# CEPH_ARGS="--bluestore-ignore-data-csum" ceph-osd -d --cluster ceph --id 1

which results in "osd.1 0 failed to load OSD map for epoch 4372, got 0 bytes". The monitors are at 4378, as per:

# ceph osd stat
2 osds: 0 up (since 47h), 1 in (since 47h); epoch: e4378

Is there any way to get past this? For instance, could I coax the OSDs into epoch 4378? This is the first time I deal with a Ceph disaster, so there may be all kinds of obvious things I'm missing.

// Best wishes; Johan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Fwd: Kinetic support
Hey,

What is the current status of Kinetic KV support in Ceph? I'm asking because of

https://www.crn.com.au/news/seagate-quietly-bins-open-storage-project-519345

... and the fact that kinetic-cpp-client hasn't been updated in four years and only compiles against OpenSSL 1.0.2, which will become EOL by the end of 2019. Or am I totally wrong?

Thank you in advance for your reply,

/Johan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io