That's an excellent point! Between my last ceph upgrade and now, I did make a new crush ruleset and a new pool that uses that crush rule. It was just for SSDs, of which I have 5, one per host.
All of my other pools are using the default CRUSH rules: "replicated_rule" for the replica x3 pools, and "erasure-code" for the EC pools.
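For reference, here is the new rule. The dump below should be reproducible with something like the following (rule name as I defined it on my cluster):

  ceph osd crush rule ls
  ceph osd crush rule dump highspeedSSD
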
{
    "rule_id": 2,
    "rule_name": "highspeedSSD",
    "ruleset": 2,
    "type": 1,
    "min_size": 1,
    "max_size": 10,
    "steps": [
        {
            "op": "take",
            "item": -27,
            "item_name": "default~ssd"
        },
        {
            "op": "chooseleaf_firstn",
            "num": 0,
            "type": "host"
        },
        {
            "op": "emit"
        }
    ]
}
The SSDs are on OSDs 5, 14, 26, 32, and 33. The new "ssdpool" that uses this rule is pool number 13 and is replica x3.
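To double-check that the pool is actually bound to that rule and sized the way I expect, something like the following should do it (pool name as above):

  ceph osd pool get ssdpool crush_rule
  ceph osd pool get ssdpool size
  ceph osd pool get ssdpool min_size
  ceph osd pool ls detail   # crush_rule, size, and min_size for every pool at once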
So I looked at all the PGs starting with 13 (table below) and yes, just as one would hope, they are only using the SSD OSDs. And since each of those SSDs is on a different host, every one of the 32 PGs also has its three OSDs on three different hosts. No PG for this pool is putting two replicas on the same host.
If one of the PGs for this pool were not being properly replicated to 3 different OSDs on 3 different hosts, I would expect that to show up as a warning or error?
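(For anyone who wants to repeat this check on their own pool: I believe the table below can be reproduced with either of the following, with the UP and ACTING columns being the ones to look at. The pool ID 13 and name ssdpool are specific to my cluster, obviously.)

  ceph pg dump pgs | grep '^13\.'
  ceph pg ls-by-pool ssdpool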
PG_STAT | OBJECTS | MISSING_ON_PRIMARY | DEGRADED | MISPLACED | UNFOUND | BYTES | OMAP_BYTES* | OMAP_KEYS* | LOG | DISK_LOG | STATE | STATE_STAMP | VERSION | REPORTED | UP | UP_PRIMARY | ACTING | ACTING_PRIMARY | LAST_SCRUB | SCRUB_STAMP | LAST_DEEP_SCRUB | DEEP_SCRUB_STAMP | SNAPTRIMQ_LEN |
13.1a | 36 | 0 | 0 | 0 | 0 | 1.51E+08 | 0 | 0 | 190 | 190 | active+clean | 2022-02-10T02:52:45.595274+0000 | 8595'190 | 10088:2829 | [5,33,14] | 5 | [5,33,14] | 5 | 8595'190 | 2022-02-10T02:52:45.595191+0000 | 8595'190 | 2022-02-09T01:48:54.354375+0000 | 0 |
13.1b | 34 | 0 | 0 | 0 | 0 | 1.43E+08 | 0 | 0 | 186 | 186 | active+clean | 2022-02-10T09:57:13.864888+0000 | 8595'186 | 10088:2764 | [32,5,26] | 32 | [32,5,26] | 32 | 8595'186 | 2022-02-10T09:57:13.864816+0000 | 8595'186 | 2022-02-10T09:57:13.864816+0000 | 0 |
13.1c | 30 | 0 | 0 | 0 | 0 | 1.26E+08 | 0 | 0 | 174 | 174 | active+clean | 2022-02-10T00:51:38.215782+0000 | 8531'174 | 10088:2565 | [32,26,33] | 32 | [32,26,33] | 32 | 8531'174 | 2022-02-10T00:51:38.215699+0000 | 8531'174 | 2022-02-04T00:13:03.051036+0000 | 0 |
13.1d | 31 | 0 | 0 | 0 | 0 | 1.26E+08 | 0 | 0 | 182 | 182 | active+clean | 2022-02-09T22:34:36.010164+0000 | 8472'182 | 10088:2954 | [14,5,33] | 14 | [14,5,33] | 14 | 8472'182 | 2022-02-09T22:34:36.010085+0000 | 8472'182 | 2022-02-09T22:34:36.010085+0000 | 0 |
13.1e | 22 | 0 | 0 | 0 | 0 | 92274688 | 0 | 0 | 140 | 140 | active+clean | 2022-02-09T18:39:59.737323+0000 | 8563'140 | 10088:2286 | [33,14,32] | 33 | [33,14,32] | 33 | 8563'140 | 2022-02-09T09:27:55.107416+0000 | 8563'140 | 2022-02-08T07:54:49.474418+0000 | 0 |
13.1f | 31 | 0 | 0 | 0 | 0 | 1.3E+08 | 0 | 0 | 208 | 208 | active+clean | 2022-02-10T02:51:33.770146+0000 | 8412'208 | 10088:2784 | [32,26,5] | 32 | [32,26,5] | 32 | 8412'208 | 2022-02-10T02:51:33.770066+0000 | 8412'208 | 2022-02-08T19:42:17.625397+0000 | 0 |
13.a | 32 | 0 | 0 | 0 | 0 | 1.34E+08 | 0 | 0 | 188 | 188 | active+clean | 2022-02-10T01:13:32.640014+0000 | 8595'188 | 10088:2833 | [14,32,33] | 14 | [14,32,33] | 14 | 8595'188 | 2022-02-10T01:13:32.639917+0000 | 8595'188 | 2022-02-04T23:15:54.724121+0000 | 0 |
13.b | 30 | 0 | 0 | 0 | 0 | 1.26E+08 | 0 | 0 | 156 | 156 | active+clean | 2022-02-09T18:39:48.449523+0000 | 8595'156 | 10088:2739 | [26,33,14] | 26 | [26,33,14] | 26 | 8595'156 | 2022-02-09T18:39:48.449379+0000 | 8595'156 | 2022-02-09T18:39:48.449379+0000 | 0 |
13.c | 31 | 0 | 0 | 0 | 0 | 1.3E+08 | 0 | 0 | 186 | 186 | active+clean | 2022-02-09T18:39:59.737388+0000 | 8595'186 | 10088:2493 | [33,14,32] | 33 | [33,14,32] | 33 | 8595'186 | 2022-02-09T18:39:51.151311+0000 | 8595'186 | 2022-02-07T06:01:56.864006+0000 | 0 |
13.d | 29 | 0 | 0 | 0 | 0 | 1.22E+08 | 0 | 0 | 172 | 172 | active+clean | 2022-02-10T08:31:08.425299+0000 | 8564'172 | 10088:2621 | [26,14,32] | 26 | [26,14,32] | 26 | 8564'172 | 2022-02-10T08:31:08.425204+0000 | 8564'172 | 2022-02-10T08:31:08.425204+0000 | 0 |
13.e | 30 | 0 | 0 | 0 | 0 | 1.26E+08 | 0 | 0 | 184 | 184 | active+clean | 2022-02-09T23:47:01.504802+0000 | 8595'184 | 10088:2535 | [5,14,26] | 5 | [5,14,26] | 5 | 8595'184 | 2022-02-09T23:47:01.504705+0000 | 8595'184 | 2022-02-05T11:09:57.757107+0000 | 0 |
13.f | 26 | 0 | 0 | 0 | 0 | 1.09E+08 | 0 | 0 | 198 | 198 | active+clean | 2022-02-10T07:01:38.357554+0000 | 8595'198 | 10088:2651 | [33,14,5] | 33 | [33,14,5] | 33 | 8595'198 | 2022-02-10T07:01:38.357485+0000 | 8595'198 | 2022-02-07T20:45:51.724002+0000 | 0 |
13.0 | 34 | 0 | 0 | 0 | 0 | 1.43E+08 | 0 | 0 | 206 | 206 | active+clean | 2022-02-09T21:20:58.522212+0000 | 8564'206 | 10088:2889 | [26,14,5] | 26 | [26,14,5] | 26 | 8564'206 | 2022-02-09T21:20:58.522134+0000 | 8564'206 | 2022-02-08T13:25:16.519211+0000 | 0 |
13.1 | 34 | 0 | 0 | 0 | 0 | 1.43E+08 | 0 | 0 | 220 | 220 | active+clean | 2022-02-09T22:41:21.388826+0000 | 8412'220 | 10088:2887 | [14,32,26] | 14 | [14,32,26] | 14 | 8412'220 | 2022-02-09T22:41:21.388760+0000 | 8412'220 | 2022-02-04T20:58:46.282428+0000 | 0 |
13.10 | 32 | 0 | 0 | 0 | 0 | 1.34E+08 | 0 | 0 | 194 | 194 | active+clean | 2022-02-10T12:41:25.745944+0000 | 8472'194 | 10088:2574 | [33,26,32] | 33 | [33,26,32] | 33 | 8472'194 | 2022-02-10T12:41:25.745869+0000 | 8472'194 | 2022-02-08T01:28:32.427800+0000 | 0 |
13.11 | 28 | 0 | 0 | 0 | 0 | 1.17E+08 | 0 | 0 | 178 | 178 | active+clean | 2022-02-09T18:39:59.735355+0000 | 8595'178 | 10088:2626 | [33,14,5] | 33 | [33,14,5] | 33 | 8595'178 | 2022-02-09T14:01:45.095410+0000 | 8595'178 | 2022-02-08T06:08:39.135803+0000 | 0 |
13.12 | 34 | 0 | 0 | 0 | 0 | 1.43E+08 | 0 | 0 | 198 | 198 | active+clean | 2022-02-09T18:39:59.735275+0000 | 8595'198 | 10088:2781 | [33,14,5] | 33 | [33,14,5] | 33 | 8595'198 | 2022-02-09T16:22:31.596817+0000 | 8595'198 | 2022-02-09T16:22:31.596817+0000 | 0 |
13.13 | 35 | 0 | 0 | 0 | 0 | 1.47E+08 | 0 | 0 | 204 | 204 | active+clean | 2022-02-09T18:39:50.237871+0000 | 8595'204 | 10088:2849 | [5,33,32] | 5 | [5,33,32] | 5 | 8595'204 | 2022-02-09T18:39:50.237797+0000 | 8595'204 | 2022-02-08T16:32:38.006399+0000 | 0 |
13.14 | 28 | 0 | 0 | 0 | 0 | 1.17E+08 | 0 | 0 | 180 | 180 | active+clean | 2022-02-10T17:11:34.974225+0000 | 8626'180 | 10089:2498 | [32,33,26] | 32 | [32,33,26] | 32 | 8626'180 | 2022-02-10T17:11:34.974155+0000 | 8626'180 | 2022-02-09T09:19:54.199018+0000 | 0 |
13.15 | 30 | 0 | 0 | 0 | 0 | 1.26E+08 | 0 | 0 | 170 | 170 | active+clean | 2022-02-10T03:53:41.072711+0000 | 8511'170 | 10088:2876 | [5,33,26] | 5 | [5,33,26] | 5 | 8511'170 | 2022-02-10T03:53:41.072610+0000 | 8511'170 | 2022-02-08T20:51:10.657159+0000 | 0 |
13.16 | 35 | 0 | 0 | 0 | 0 | 1.47E+08 | 0 | 0 | 196 | 196 | active+clean | 2022-02-10T12:08:59.786545+0000 | 8472'196 | 10088:2877 | [14,26,33] | 14 | [14,26,33] | 14 | 8472'196 | 2022-02-10T12:08:59.786462+0000 | 8472'196 | 2022-02-09T04:02:45.033237+0000 | 0 |
13.17 | 21 | 0 | 0 | 0 | 0 | 88080384 | 0 | 0 | 150 | 150 | active+clean | 2022-02-10T06:19:19.045692+0000 | 8508'150 | 10088:2581 | [26,5,14] | 26 | [26,5,14] | 26 | 8508'150 | 2022-02-10T06:19:19.045612+0000 | 8508'150 | 2022-02-04T19:20:53.523798+0000 | 0 |
13.18 | 29 | 0 | 0 | 0 | 0 | 1.22E+08 | 0 | 0 | 186 | 186 | active+clean | 2022-02-10T01:10:28.630477+0000 | 8532'186 | 10088:2529 | [5,26,14] | 5 | [5,26,14] | 5 | 8532'186 | 2022-02-10T01:10:28.630406+0000 | 8532'186 | 2022-02-10T01:10:28.630406+0000 | 0 |
13.19 | 28 | 0 | 0 | 0 | 0 | 1.17E+08 | 0 | 0 | 174 | 174 | active+clean | 2022-02-09T23:29:46.574414+0000 | 8626'174 | 10088:2963 | [14,33,26] | 14 | [14,33,26] | 14 | 8626'174 | 2022-02-09T23:29:46.574337+0000 | 8626'174 | 2022-02-04T20:58:50.149946+0000 | 0 |
13.2 | 26 | 0 | 0 | 0 | 0 | 1.09E+08 | 0 | 0 | 178 | 178 | active+clean | 2022-02-10T01:31:11.275696+0000 | 8595'178 | 10088:2662 | [5,26,32] | 5 | [5,26,32] | 5 | 8595'178 | 2022-02-10T01:31:11.275593+0000 | 8595'178 | 2022-02-07T17:54:21.490486+0000 | 0 |
13.3 | 31 | 0 | 0 | 0 | 0 | 1.3E+08 | 0 | 0 | 166 | 166 | active+clean | 2022-02-10T06:57:38.939205+0000 | 8442'166 | 10088:2784 | [5,32,33] | 5 | [5,32,33] | 5 | 8442'166 | 2022-02-10T06:57:38.939091+0000 | 8442'166 | 2022-02-09T01:57:23.398996+0000 | 0 |
13.4 | 39 | 0 | 0 | 0 | 0 | 1.64E+08 | 0 | 0 | 208 | 208 | active+clean | 2022-02-10T17:13:14.834732+0000 | 8595'208 | 10089:2931 | [14,5,26] | 14 | [14,5,26] | 14 | 8595'208 | 2022-02-10T17:13:14.834656+0000 | 8595'208 | 2022-02-08T10:42:51.846976+0000 | 0 |
13.5 | 27 | 0 | 0 | 0 | 0 | 1.13E+08 | 0 | 0 | 188 | 188 | active+clean | 2022-02-09T18:39:47.423693+0000 | 8595'188 | 10088:2678 | [5,33,26] | 5 | [5,33,26] | 5 | 8595'188 | 2022-02-09T07:45:41.024361+0000 | 8595'188 | 2022-02-06T23:19:13.370525+0000 | 0 |
13.6 | 32 | 0 | 0 | 0 | 0 | 1.34E+08 | 0 | 0 | 180 | 180 | active+clean | 2022-02-10T03:22:39.617758+0000 | 8412'180 | 10088:2508 | [33,14,32] | 33 | [33,14,32] | 33 | 8412'180 | 2022-02-10T03:22:39.617672+0000 | 8412'180 | 2022-02-10T03:22:39.617672+0000 | 0 |
13.7 | 41 | 0 | 0 | 0 | 0 | 1.72E+08 | 0 | 0 | 186 | 186 | active+clean | 2022-02-10T08:17:50.715439+0000 | 8564'186 | 10088:3106 | [5,26,33] | 5 | [5,26,33] | 5 | 8564'186 | 2022-02-10T08:17:50.715345+0000 | 8564'186 | 2022-02-09T00:50:25.102262+0000 | 0 |
13.8 | 23 | 0 | 0 | 0 | 0 | 96468992 | 0 | 0 | 152 | 152 | active+clean | 2022-02-10T06:53:46.438741+0000 | 8597'152 | 10088:2574 | [5,14,33] | 5 | [5,14,33] | 5 | 8597'152 | 2022-02-10T06:53:46.438657+0000 | 8597'152 | 2022-02-10T06:53:46.438657+0000 | 0 |
13.9 | 29 | 0 | 0 | 0 | 0 | 1.22E+08 | 0 | 0 | 180 | 180 | active+clean | 2022-02-10T12:44:24.581217+0000 | 8595'180 | 10088:2478 | [33,14,32] | 33 | [33,14,32] | 33 | 8595'180 | 2022-02-10T12:44:24.581129+0000 | 8595'180 | 2022-02-06T18:32:25.066730+0000 | 0 |
I don't know how to get better errors out of cephadm, but the only way I can think of for this to happen is if your crush rule is somehow placing multiple replicas of a PG on a single host that cephadm wants to upgrade. So check your rules, your pool sizes, and osd tree?
-Greg
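(Concretely, I take Greg's suggested checks to be roughly the following, though he may have something more specific in mind:

  ceph osd tree             # which host each OSD sits under
  ceph osd pool ls detail   # size, min_size, and crush_rule for every pool
  ceph osd crush rule dump  # all rules, to spot anything that could place two replicas on one host
)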
On Thu, Feb 10, 2022 at 8:25 AM Zach Heise (SSCC) <he...@ssc.wisc.edu> wrote:
It could be an issue with the devicehealth pool; you are correct that it is a single PG. But when the cluster is reporting that everything is healthy, it's difficult to know where to go from there. What I don't understand is why it's refusing to upgrade ANY of the osd daemons; I have 33 of them, why would a single PG going offline be a problem for all of them?
I did try stopping the upgrade and restarting it, but it just picks up at the same place (11/56 daemons upgraded) and immediately reports the same issue.
Is there any way to at least tell which PG is the problematic one?
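(One way to narrow this down, I think, is to ask the cluster directly, per OSD, whether it is safe to stop - which should be the same check cephadm is running - and to look at the orchestrator's own status and log:

  ceph osd ok-to-stop 5            # repeat for each OSD id in turn
  ceph health detail
  ceph orch upgrade status
  ceph log last 100 info cephadm   # recent orchestrator log entries; the "unsafe to stop" messages should show up here

I'm not certain ok-to-stop names the offending PG in every release, but it at least shows which OSD trips the check.)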
_______________________________________________
Zach
On 2022-02-09 4:19 PM, anthony.da...@gmail.com wrote:
Speculation: might the devicehealth pool be involved? It seems to typically have just 1 PG.

On Feb 9, 2022, at 1:41 PM, Zach Heise (SSCC) <he...@ssc.wisc.edu> wrote:

Good afternoon, thank you for your reply. Yes, I know you are right; eventually we'll switch to an odd number of mons rather than even. We're still in 'testing' mode right now and only my coworkers and I are using the cluster.

Of the 7 pools, all but 2 are replica x3. The last two are EC 2+2.

Zach Heise

On 2022-02-09 3:38 PM, sascha.art...@gmail.com wrote:

Hello,

All your pools running replica > 1? Also, having 4 monitors is pretty bad for split brain situations..

Zach Heise (SSCC) <he...@ssc.wisc.edu> wrote on Wed., Feb 9, 2022, 22:02:

Hello, ceph health detail says my 5-node cluster is healthy, yet when I ran ceph orch upgrade start --ceph-version 16.2.7 everything seemed to go fine until we got to the OSD section. Now, for the past hour, every 15 seconds a new log entry of 'Upgrade: unsafe to stop osd(s) at this time (1 PGs are or would become offline)' appears in the logs.

ceph pg dump_stuck (unclean, degraded, etc) shows "ok" for everything too. Yet somehow 1 PG is (apparently) holding up all the OSD upgrades and not letting the process finish.

Should I stop the upgrade and try it again? (I haven't done that before so was just nervous to try it.) Any other ideas?

  cluster:
    id:     9aa000e8-b999-11eb-82f2-ecf4bbcc0ac0
    health: HEALTH_OK

  services:
    mon: 4 daemons, quorum ceph05,ceph04,ceph01,ceph03 (age 92m)
    mgr: ceph03.futetp(active, since 97m), standbys: ceph01.fblojp
    mds: 1/1 daemons up, 1 hot standby
    osd: 33 osds: 33 up (since 2h), 33 in (since 4h); 9 remapped pgs

  data:
    volumes: 1/1 healthy
    pools:   7 pools, 193 pgs
    objects: 3.72k objects, 14 GiB
    usage:   43 GiB used, 64 TiB / 64 TiB avail
    pgs:     231/11170 objects misplaced (2.068%)
             185 active+clean
             8   active+clean+remapped

  io:
    client:   1.2 KiB/s rd, 2 op/s rd, 0 op/s wr

  progress:
    Upgrade to 16.2.7 (5m)
      [=====.......................] (remaining: 24m)

--
Zach
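(For completeness, the knobs for watching and controlling a cephadm upgrade, as far as I know, are:

  ceph orch upgrade status   # current target version and progress
  ceph orch upgrade pause    # pause, keeping the target version set
  ceph orch upgrade resume
  ceph orch upgrade stop     # abandon the upgrade entirely
  ceph -W cephadm            # follow the cephadm log channel live

Stop-and-restart is what was already tried above; pause/resume keeps the upgrade state intact.)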
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io