Regards,
Eugen
Quoting kefu chai <[email protected]>:
> Hello Ceph community,
>
> I'm writing on behalf of a friend who is experiencing a critical cluster
> issue after upgrading and would appreciate any assistance.
>
> Environment:
>
> - 5 MON nodes, 2 MGR nodes, 40 OSD servers (306 OSDs total)
> - OS: CentOS 8.2 upgraded to 8.4
> - Ceph: 15.2.17 upgraded to 17.2.7
> - Upgrade method: yum update in rolling batches
>
> Timeline: The upgrade started on October 8th at 1:00 PM. We upgraded
> MON/MGR servers first, and then upgraded OSD nodes in batches of 5 nodes.
> The process appeared normal initially, but when approximately 10 OSD
> servers remained, OSDs began going down.
>
> MON Quorum Issue: When the OSDs began failing, the monitors failed to
> form a quorum. In an attempt to recover, we stopped 4 out of 5 monitors.
> However, the remaining monitor (mbjson20010) then failed to start due to
> a missing .ldb file. We eventually recovered this single monitor from the
> OSDs using the instructions at
> https://docs.ceph.com/en/quincy/rados/troubleshooting/troubleshooting-mon/#mon-store-recovery-using-osds
> and now have only 1 MON in the cluster instead of the original 5.
>
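> For context, the documented recovery boils down to roughly the following
> (a condensed sketch only: the mon data directory, keyring path, and the
> single local loop are placeholders, and the real run has to visit the
> OSDs on every host):
>
> ms=/root/mon-store
> mkdir -p $ms
> # collect cluster map updates from each stopped OSD
> for osd in /var/lib/ceph/osd/ceph-*; do
>     ceph-objectstore-tool --data-path $osd --no-mon-config \
>         --op update-mon-db --mon-store-path $ms
> done
> # rebuild the mon store (keyring must hold the mon. and client.admin keys)
> ceph-monstore-tool $ms rebuild -- --keyring /etc/ceph/ceph.client.admin.keyring
> # swap the rebuilt store into the surviving monitor (paths are placeholders)
> mv /var/lib/ceph/mon/ceph-mbjson20010/store.db /var/lib/ceph/mon/ceph-mbjson20010/store.db.bak
> mv $ms/store.db /var/lib/ceph/mon/ceph-mbjson20010/store.db
> chown -R ceph:ceph /var/lib/ceph/mon/ceph-mbjson20010/store.db
>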
> However, rebuilding the MON store did not help, and restarting the OSD
> servers also failed to resolve the issue. The cluster status remains
> problematic.
>
> Current Cluster Status:
>
> - Only 1 MON daemon active (quorum: mbjson20010) - down from 5 MONs
> - OSDs: 91 up / 229 in (out of 306 total)
> - 88.872% of PGs are not active
> - 4.779% of PGs are unknown
> - 3,918 PGs down
> - 1,311 PGs stale+down
> - Only 12 PGs active+clean
>
> Critical Error: When examining OSD logs, we discovered that some OSDs are
> failing to start with the following error:
>
> osd.43 39677784 init missing pg_pool_t for deleted pool 9 for pg 9.3ds3;
> please downgrade to luminous and allow pg deletion to complete before
> upgrading
>
> Full error context from one of the failing OSDs:
>
> # tail /var/log/ceph/ceph-osd.43.log
>
> -7> 2025-10-12T13:40:05.987+0800 7fdd13259540 1
> bluestore(/var/lib/ceph/osd/ceph-43) _upgrade_super from 4, latest 4
>
> -6> 2025-10-12T13:40:05.987+0800 7fdd13259540 1
> bluestore(/var/lib/ceph/osd/ceph-43) _upgrade_super done
>
> -5> 2025-10-12T13:40:05.987+0800 7fdd13259540 2 osd.43 0 journal looks like ssd
>
> -4> 2025-10-12T13:40:05.987+0800 7fdd13259540 2 osd.43 0 boot
>
> -3> 2025-10-12T13:40:05.987+0800 7fdceb2cc700 5
> bluestore.MempoolThread(0x55c7b0c66b40) _resize_shards cache_size:
> 8589934592 kv_alloc: 1717986918 kv_used: 91136 kv_onode_alloc: 343597383
> kv_onode_used: 23328 meta_alloc: 6871947673 meta_used: 2984 data_alloc: 0
> data_used: 0
>
> -2> 2025-10-12T13:40:05.989+0800 7fdd13259540 -1 osd.43 39677784 init
> missing pg_pool_t for deleted pool 9 for pg 9.3ds3; please downgrade to
> luminous and allow pg deletion to complete before upgrading
>
> -1> 2025-10-12T13:40:05.991+0800 7fdd13259540 -1
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.2.7/rpm/el8/BUILD/ceph-17.2.7/src/osd/OSD.cc:
> In function 'int OSD::init()' thread 7fdd13259540 time
> 2025-10-12T13:40:05.990845+0800
>
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.2.7/rpm/el8/BUILD/ceph-17.2.7/src/osd/OSD.cc:
> 3735: ceph_abort_msg("abort() called")
>
> # tail /var/log/ceph/ceph-osd.51.log
> -7> 2025-10-12T13:39:36.739+0800 7f603e5f7540 1
> bluestore(/var/lib/ceph/osd/ceph-51) _upgrade_super from 4, latest 4
> -6> 2025-10-12T13:39:36.739+0800 7f603e5f7540 1
> bluestore(/var/lib/ceph/osd/ceph-51) _upgrade_super done
> -5> 2025-10-12T13:39:36.739+0800 7f603e5f7540 2 osd.51 0 journal looks like ssd
> -4> 2025-10-12T13:39:36.739+0800 7f603e5f7540 2 osd.51 0 boot
> -3> 2025-10-12T13:39:36.739+0800 7f6016669700 5
> bluestore.MempoolThread(0x55e839d4cb40) _resize_shards cache_size:
> 8589934592 kv_alloc: 1717986918 kv_used: 31232 kv_onode_alloc: 343597383
> kv_onode_used: 21584 meta_alloc: 6871947673 meta_used: 1168 data_alloc: 0
> data_used: 0
> -2> 2025-10-12T13:39:36.741+0800 7f603e5f7540 -1 osd.51 39677784 init
> missing pg_pool_t for deleted pool 6 for pg 6.1f; please downgrade to
> luminous and allow pg deletion to complete before upgrading
> -1> 2025-10-12T13:39:36.742+0800 7f603e5f7540 -1
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.2.7/rpm/el8/BUILD/ceph-17.2.7/src/osd/OSD.cc:
> In function 'int OSD::init()' thread 7f603e5f7540 time
> 2025-10-12T13:39:36.742527+0800
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.2.7/rpm/el8/BUILD/ceph-17.2.7/src/osd/OSD.cc:
> 3735: ceph_abort_msg("abort() called")
>
> Investigation Findings: We examined all OSD instances that failed to
> start. All of them exhibit the same error pattern in their logs, and all
> contain PG references to non-existent pools. For example, running
> "ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-51 --op
> list-pgs" shows PG references to pools that no longer exist (e.g., pool
> 9, pool 10, pool 4, pool 6, pool 8), while the current pools are numbered
> 101, 140, 141, 149, 212, 213, 216, 217, 218, 219. Notably, each affected
> OSD contains only 2-3 PGs referencing these non-existent pools, which is
> significantly fewer than the hundreds of PGs a regular OSD typically
> contains. It appears the OSD metadata has been corrupted or overwritten
> with stale references to deleted pools from previous operations,
> preventing these OSDs from starting and causing widespread PG state
> abnormalities across the cluster.
>
> 2 PGs referencing non-existent pools were found in osd.51:
>
> # ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-51 --op list-pgs
> 1.0
> 6.1f
>
> # ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-51 --op list
> Error getting attr on : 1.0_head,#1:00000000::::head#, (61) No data
> available
> Error getting attr on : 6.1f_head,#6:f8000000::::head#, (61) No data
> available
>
> ["1.0",{"oid":"","key":"","snapid":-2,"hash":0,"max":0,"pool":1,"namespace":"","max":0}]
> ["1.0",{"oid":"main.db-journal.0000000000000000","key":"","snapid":-2,"hash":1969844440,"max":0,"pool":1,"namespace":"devicehealth","max":0}]
> ["1.0",{"oid":"main.db.0000000000000000","key":"","snapid":-2,"hash":1315310604,"max":0,"pool":1,"namespace":"devicehealth","max":0}]
> ["6.1f",{"oid":"","key":"","snapid":-2,"hash":31,"max":0,"pool":6,"namespace":"","max":0}]
>
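> In case it helps with any advice: our current plan is only to take safety
> copies of these stale PGs, nothing destructive. A sketch for osd.51
> follows (the export/remove invocations are our best guess at the right
> tooling, and the remove step has not been run):
>
> # the OSD already fails to start, so it stays stopped
> systemctl stop ceph-osd@51
> # keep a file-level safety copy of the stale PG
> ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-51 \
>     --pgid 6.1f --op export --file /root/osd51-pg6.1f.export
> # only if recommended here: drop the stale PG so the OSD can boot again
> # ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-51 \
> #     --pgid 6.1f --op remove --force
>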
> We also performed a comprehensive check by listing all PGs from all OSD
> nodes using "ceph-objectstore-tool --op list-pgs" and comparing the
> results with the output of "ceph pg dump". This comparison revealed that
> quite a few PGs are missing from the OSD listings. We suspect that some
> OSDs that previously held these missing PGs may now be corrupted, which
> would explain both the missing PGs and the widespread cluster
> degradation.
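>
> Roughly, that comparison amounted to something like the following (sketch
> only, simplified to a single host; the list-pgs loop really runs on each
> OSD host with its OSDs stopped, and the per-host results were merged):
>
> # PG ids the cluster expects, according to the recovered MON
> ceph pg dump pgs_brief 2>/dev/null | awk 'NR>1 {print $1}' | sort -u > /tmp/pgs-from-mon.txt
> # PG ids actually present on disk
> for osd in /var/lib/ceph/osd/ceph-*; do
>     ceph-objectstore-tool --data-path $osd --op list-pgs
> done | sort -u > /tmp/pgs-on-disk.txt
> # PGs the MON knows about but no OSD reports on disk
> comm -23 /tmp/pgs-from-mon.txt /tmp/pgs-on-disk.txt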
>
> Questions:
>
> 1. How can we safely restore the missing PGs from the OSDs without data
> loss? (See the sketch after this list for the kind of workflow we have in
> mind.)
> 2. Has anyone encountered similar issues when upgrading from Octopus
> (15.2.x) to Quincy (17.2.x)?
>
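> To make question 1 concrete, the kind of workflow we imagine is sketched
> below (placeholder OSD ids and PG id only; we have not attempted this and
> would appreciate confirmation whether export/import between OSDs is even
> the right direction):
>
> # on the host of an offline OSD that still holds a good copy of the PG
> ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-43 \
>     --pgid 101.2a --op export --file /root/pg101.2a.export
> # on the host of a healthy (stopped) OSD chosen to receive it
> ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-77 \
>     --op import --file /root/pg101.2a.export
>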
> We understand that skipping major versions may not be officially
> supported, but we urgently need guidance on the safest recovery path at
> this point.
>
> Any help would be greatly appreciated. Thank you in advance.
>
> --
> Regards
> Kefu Chai
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]