That's an excellent point! Between my last Ceph upgrade and now, I did create a new CRUSH rule and a new pool that uses it. It was just for SSDs, of which I have 5, one per host.

All of my other pools use the default CRUSH rules: "replicated_rule" for the replica x3 pools and "erasure-code" for the EC pools.

{
    "rule_id": 2,
    "rule_name": "highspeedSSD",
    "ruleset": 2,
    "type": 1,
    "min_size": 1,
    "max_size": 10,
    "steps": [
        {
            "op": "take",
            "item": -27,
            "item_name": "default~ssd"
        },
        {
            "op": "chooseleaf_firstn",
            "num": 0,
            "type": "host"
        },
        {
            "op": "emit"
        }
    ]
}
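
For context, a rule and pool like that would normally be created along these lines (just a sketch of the standard commands; the PG count here is a placeholder rather than an exact value from my cluster):

    # replicated rule restricted to the "ssd" device class, failure domain = host
    ceph osd crush rule create-replicated highspeedSSD default host ssd
    # create the pool on that rule and make it replica x3
    ceph osd pool create ssdpool 32 32 replicated highspeedSSD
    ceph osd pool set ssdpool size 3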

The SSDs are on OSDs 5, 14, 26, 32, and 33. The new "ssdpool" is pool 13 and is replica x3.

So I looked at all the PGs whose IDs start with 13 (table below) and yes, just as one would hope, they only use the SSD OSDs. Each SSD, and therefore its OSD, is on a different host, so every one of the 32 PGs has its three OSDs on three different hosts. No PG in this pool is replicated to the same host twice.

If one of the PGs in this pool were not being properly replicated to 3 different OSDs on 3 different hosts, I'd expect that to show up as a warning or error, wouldn't it?

PG_STAT OBJECTS MISSING_ON_PRIMARY DEGRADED MISPLACED UNFOUND BYTES OMAP_BYTES* OMAP_KEYS* LOG DISK_LOG STATE STATE_STAMP VERSION REPORTED UP UP_PRIMARY ACTING ACTING_PRIMARY LAST_SCRUB SCRUB_STAMP LAST_DEEP_SCRUB DEEP_SCRUB_STAMP SNAPTRIMQ_LEN
13.1a 36 0 0 0 0 1.51E+08 0 0 190 190 active+clean 2022-02-10T02:52:45.595274+0000 8595'190 10088:2829 [5,33,14] 5 [5,33,14] 5 8595'190 2022-02-10T02:52:45.595191+0000 8595'190 2022-02-09T01:48:54.354375+0000 0
13.1b 34 0 0 0 0 1.43E+08 0 0 186 186 active+clean 2022-02-10T09:57:13.864888+0000 8595'186 10088:2764 [32,5,26] 32 [32,5,26] 32 8595'186 2022-02-10T09:57:13.864816+0000 8595'186 2022-02-10T09:57:13.864816+0000 0
13.1c 30 0 0 0 0 1.26E+08 0 0 174 174 active+clean 2022-02-10T00:51:38.215782+0000 8531'174 10088:2565 [32,26,33] 32 [32,26,33] 32 8531'174 2022-02-10T00:51:38.215699+0000 8531'174 2022-02-04T00:13:03.051036+0000 0
13.1d 31 0 0 0 0 1.26E+08 0 0 182 182 active+clean 2022-02-09T22:34:36.010164+0000 8472'182 10088:2954 [14,5,33] 14 [14,5,33] 14 8472'182 2022-02-09T22:34:36.010085+0000 8472'182 2022-02-09T22:34:36.010085+0000 0
13.1e 22 0 0 0 0 92274688 0 0 140 140 active+clean 2022-02-09T18:39:59.737323+0000 8563'140 10088:2286 [33,14,32] 33 [33,14,32] 33 8563'140 2022-02-09T09:27:55.107416+0000 8563'140 2022-02-08T07:54:49.474418+0000 0
13.1f 31 0 0 0 0 1.3E+08 0 0 208 208 active+clean 2022-02-10T02:51:33.770146+0000 8412'208 10088:2784 [32,26,5] 32 [32,26,5] 32 8412'208 2022-02-10T02:51:33.770066+0000 8412'208 2022-02-08T19:42:17.625397+0000 0
13.a 32 0 0 0 0 1.34E+08 0 0 188 188 active+clean 2022-02-10T01:13:32.640014+0000 8595'188 10088:2833 [14,32,33] 14 [14,32,33] 14 8595'188 2022-02-10T01:13:32.639917+0000 8595'188 2022-02-04T23:15:54.724121+0000 0
13.b 30 0 0 0 0 1.26E+08 0 0 156 156 active+clean 2022-02-09T18:39:48.449523+0000 8595'156 10088:2739 [26,33,14] 26 [26,33,14] 26 8595'156 2022-02-09T18:39:48.449379+0000 8595'156 2022-02-09T18:39:48.449379+0000 0
13.c 31 0 0 0 0 1.3E+08 0 0 186 186 active+clean 2022-02-09T18:39:59.737388+0000 8595'186 10088:2493 [33,14,32] 33 [33,14,32] 33 8595'186 2022-02-09T18:39:51.151311+0000 8595'186 2022-02-07T06:01:56.864006+0000 0
13.d 29 0 0 0 0 1.22E+08 0 0 172 172 active+clean 2022-02-10T08:31:08.425299+0000 8564'172 10088:2621 [26,14,32] 26 [26,14,32] 26 8564'172 2022-02-10T08:31:08.425204+0000 8564'172 2022-02-10T08:31:08.425204+0000 0
13.e 30 0 0 0 0 1.26E+08 0 0 184 184 active+clean 2022-02-09T23:47:01.504802+0000 8595'184 10088:2535 [5,14,26] 5 [5,14,26] 5 8595'184 2022-02-09T23:47:01.504705+0000 8595'184 2022-02-05T11:09:57.757107+0000 0
13.f 26 0 0 0 0 1.09E+08 0 0 198 198 active+clean 2022-02-10T07:01:38.357554+0000 8595'198 10088:2651 [33,14,5] 33 [33,14,5] 33 8595'198 2022-02-10T07:01:38.357485+0000 8595'198 2022-02-07T20:45:51.724002+0000 0
13.0 34 0 0 0 0 1.43E+08 0 0 206 206 active+clean 2022-02-09T21:20:58.522212+0000 8564'206 10088:2889 [26,14,5] 26 [26,14,5] 26 8564'206 2022-02-09T21:20:58.522134+0000 8564'206 2022-02-08T13:25:16.519211+0000 0
13.1 34 0 0 0 0 1.43E+08 0 0 220 220 active+clean 2022-02-09T22:41:21.388826+0000 8412'220 10088:2887 [14,32,26] 14 [14,32,26] 14 8412'220 2022-02-09T22:41:21.388760+0000 8412'220 2022-02-04T20:58:46.282428+0000 0
13.10 32 0 0 0 0 1.34E+08 0 0 194 194 active+clean 2022-02-10T12:41:25.745944+0000 8472'194 10088:2574 [33,26,32] 33 [33,26,32] 33 8472'194 2022-02-10T12:41:25.745869+0000 8472'194 2022-02-08T01:28:32.427800+0000 0
13.11 28 0 0 0 0 1.17E+08 0 0 178 178 active+clean 2022-02-09T18:39:59.735355+0000 8595'178 10088:2626 [33,14,5] 33 [33,14,5] 33 8595'178 2022-02-09T14:01:45.095410+0000 8595'178 2022-02-08T06:08:39.135803+0000 0
13.12 34 0 0 0 0 1.43E+08 0 0 198 198 active+clean 2022-02-09T18:39:59.735275+0000 8595'198 10088:2781 [33,14,5] 33 [33,14,5] 33 8595'198 2022-02-09T16:22:31.596817+0000 8595'198 2022-02-09T16:22:31.596817+0000 0
13.13 35 0 0 0 0 1.47E+08 0 0 204 204 active+clean 2022-02-09T18:39:50.237871+0000 8595'204 10088:2849 [5,33,32] 5 [5,33,32] 5 8595'204 2022-02-09T18:39:50.237797+0000 8595'204 2022-02-08T16:32:38.006399+0000 0
13.14 28 0 0 0 0 1.17E+08 0 0 180 180 active+clean 2022-02-10T17:11:34.974225+0000 8626'180 10089:2498 [32,33,26] 32 [32,33,26] 32 8626'180 2022-02-10T17:11:34.974155+0000 8626'180 2022-02-09T09:19:54.199018+0000 0
13.15 30 0 0 0 0 1.26E+08 0 0 170 170 active+clean 2022-02-10T03:53:41.072711+0000 8511'170 10088:2876 [5,33,26] 5 [5,33,26] 5 8511'170 2022-02-10T03:53:41.072610+0000 8511'170 2022-02-08T20:51:10.657159+0000 0
13.16 35 0 0 0 0 1.47E+08 0 0 196 196 active+clean 2022-02-10T12:08:59.786545+0000 8472'196 10088:2877 [14,26,33] 14 [14,26,33] 14 8472'196 2022-02-10T12:08:59.786462+0000 8472'196 2022-02-09T04:02:45.033237+0000 0
13.17 21 0 0 0 0 88080384 0 0 150 150 active+clean 2022-02-10T06:19:19.045692+0000 8508'150 10088:2581 [26,5,14] 26 [26,5,14] 26 8508'150 2022-02-10T06:19:19.045612+0000 8508'150 2022-02-04T19:20:53.523798+0000 0
13.18 29 0 0 0 0 1.22E+08 0 0 186 186 active+clean 2022-02-10T01:10:28.630477+0000 8532'186 10088:2529 [5,26,14] 5 [5,26,14] 5 8532'186 2022-02-10T01:10:28.630406+0000 8532'186 2022-02-10T01:10:28.630406+0000 0
13.19 28 0 0 0 0 1.17E+08 0 0 174 174 active+clean 2022-02-09T23:29:46.574414+0000 8626'174 10088:2963 [14,33,26] 14 [14,33,26] 14 8626'174 2022-02-09T23:29:46.574337+0000 8626'174 2022-02-04T20:58:50.149946+0000 0
13.2 26 0 0 0 0 1.09E+08 0 0 178 178 active+clean 2022-02-10T01:31:11.275696+0000 8595'178 10088:2662 [5,26,32] 5 [5,26,32] 5 8595'178 2022-02-10T01:31:11.275593+0000 8595'178 2022-02-07T17:54:21.490486+0000 0
13.3 31 0 0 0 0 1.3E+08 0 0 166 166 active+clean 2022-02-10T06:57:38.939205+0000 8442'166 10088:2784 [5,32,33] 5 [5,32,33] 5 8442'166 2022-02-10T06:57:38.939091+0000 8442'166 2022-02-09T01:57:23.398996+0000 0
13.4 39 0 0 0 0 1.64E+08 0 0 208 208 active+clean 2022-02-10T17:13:14.834732+0000 8595'208 10089:2931 [14,5,26] 14 [14,5,26] 14 8595'208 2022-02-10T17:13:14.834656+0000 8595'208 2022-02-08T10:42:51.846976+0000 0
13.5 27 0 0 0 0 1.13E+08 0 0 188 188 active+clean 2022-02-09T18:39:47.423693+0000 8595'188 10088:2678 [5,33,26] 5 [5,33,26] 5 8595'188 2022-02-09T07:45:41.024361+0000 8595'188 2022-02-06T23:19:13.370525+0000 0
13.6 32 0 0 0 0 1.34E+08 0 0 180 180 active+clean 2022-02-10T03:22:39.617758+0000 8412'180 10088:2508 [33,14,32] 33 [33,14,32] 33 8412'180 2022-02-10T03:22:39.617672+0000 8412'180 2022-02-10T03:22:39.617672+0000 0
13.7 41 0 0 0 0 1.72E+08 0 0 186 186 active+clean 2022-02-10T08:17:50.715439+0000 8564'186 10088:3106 [5,26,33] 5 [5,26,33] 5 8564'186 2022-02-10T08:17:50.715345+0000 8564'186 2022-02-09T00:50:25.102262+0000 0
13.8 23 0 0 0 0 96468992 0 0 152 152 active+clean 2022-02-10T06:53:46.438741+0000 8597'152 10088:2574 [5,14,33] 5 [5,14,33] 5 8597'152 2022-02-10T06:53:46.438657+0000 8597'152 2022-02-10T06:53:46.438657+0000 0
13.9 29 0 0 0 0 1.22E+08 0 0 180 180 active+clean 2022-02-10T12:44:24.581217+0000 8595'180 10088:2478 [33,14,32] 33 [33,14,32] 33 8595'180 2022-02-10T12:44:24.581129+0000 8595'180 2022-02-06T18:32:25.066730+0000 0
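
For anyone who would rather script that check than eyeball the table, something along these lines should work (a sketch only; it assumes jq is installed, and the JSON field names from pg ls-by-pool / pg map / osd find may need adjusting depending on your release):

    for pg in $(ceph pg ls-by-pool ssdpool -f json | jq -r '.pg_stats[].pgid'); do
        hosts=$(for osd in $(ceph pg map "$pg" -f json | jq -r '.acting[]'); do
                    ceph osd find "$osd" | jq -r '.crush_location.host // .host'
                done | sort -u | wc -l)
        echo "$pg: acting set spans $hosts distinct host(s)"
    done

Every PG should report 3 distinct hosts for a replica x3 pool with a host failure domain.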

Zach 


On 2022-02-10 11:07 AM, gfar...@redhat.com wrote:
I don't know how to get better errors out of cephadm, but the only way I can think of for this to happen is if your crush rule is somehow placing multiple replicas of a pg on a single host that cephadm wants to upgrade. So check your rules, your pool sizes, and osd tree?
-Greg
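
In concrete terms, those three checks map onto standard commands along these lines (nothing here is specific to this cluster):

    ceph osd crush rule dump      # failure domain and device class of every rule
    ceph osd pool ls detail       # size, min_size and crush_rule for each pool
    ceph osd tree                 # hosts, device classes and OSD placement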

On Thu, Feb 10, 2022 at 8:25 AM Zach Heise (SSCC) <he...@ssc.wisc.edu> wrote:

It could be an issue with the devicehealth pool as you are correct, it is a single PG - but when the cluster is reporting that everything is healthy, it's difficult to know where to go from there. What I don't understand is why it's refusing to upgrade ANY of the OSD daemons; I have 33 of them, so why would a single PG going offline be a problem for all of them?

I did try stopping the upgrade and restarting it, but it just picks up at the same place (11/56 daemons upgraded) and immediately reports the same issue.

Is there any way to at least tell which PG is the problematic one?
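
Would manually running the safety check that cephadm presumably uses under the hood show it? Something like this, perhaps (a sketch; the OSD id is just an example):

    ceph osd ok-to-stop 5      # on recent releases this reports the PGs that would go inactive
    ceph -W cephadm            # watch cephadm's own log channel while the upgrade runs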


Zach


On 2022-02-09 4:19 PM, anthony.da...@gmail.com wrote:
Speculation:  might the devicehealth pool be involved?  It seems to typically have just 1 PG.
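
If it is, that should be easy to confirm (a sketch; the pool is usually named device_health_metrics on Octopus/Pacific):

    ceph osd pool ls detail | grep device_health
    ceph pg ls-by-pool device_health_metrics    # shows which OSDs host its single PG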



On Feb 9, 2022, at 1:41 PM, Zach Heise (SSCC) <he...@ssc.wisc.edu> wrote:

Good afternoon, and thank you for your reply. Yes, I know you're right; eventually we'll switch to an odd number of mons rather than an even number. We're still in 'testing' mode right now and only my coworkers and I are using the cluster.

Of the 7 pools, all but 2 are replica x3. The last two are EC 2+2.

Zach Heise


On 2022-02-09 3:38 PM, sascha.art...@gmail.com wrote:
Hello,

Are all your pools running replica > 1?
Also, having 4 monitors is pretty bad for split-brain situations.

Zach Heise (SSCC) <he...@ssc.wisc.edu> wrote on Wed., 9 Feb. 2022, 22:02:

   Hello,

    ceph health detail says my 5-node cluster is healthy, yet when I ran
    ceph orch upgrade start --ceph-version 16.2.7 everything seemed to go
    fine until we got to the OSD section. Now, for the past hour, every 15
    seconds a new log entry of 'Upgrade: unsafe to stop osd(s) at this time
    (1 PGs are or would become offline)' appears in the logs.

    ceph pg dump_stuck (unclean, degraded, etc.) shows "ok" for everything
    too. Yet somehow 1 PG is (apparently) holding up all the OSD upgrades
    and not letting the process finish. Should I stop the upgrade and try
    it again? (I haven't done that before, so I was just nervous to try
    it.) Any other ideas?
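
    (If stopping and retrying is the right call, I assume it would just be
    something like the following, but I haven't tried it yet, so corrections
    are welcome:

        ceph orch upgrade status
        ceph orch upgrade stop
        ceph orch upgrade start --ceph-version 16.2.7
    )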

      cluster:
        id:     9aa000e8-b999-11eb-82f2-ecf4bbcc0ac0
        health: HEALTH_OK

      services:
        mon: 4 daemons, quorum ceph05,ceph04,ceph01,ceph03 (age 92m)
        mgr: ceph03.futetp(active, since 97m), standbys: ceph01.fblojp
        mds: 1/1 daemons up, 1 hot standby
        osd: 33 osds: 33 up (since 2h), 33 in (since 4h); 9 remapped pgs

      data:
        volumes: 1/1 healthy
        pools:   7 pools, 193 pgs
        objects: 3.72k objects, 14 GiB
        usage:   43 GiB used, 64 TiB / 64 TiB avail
        pgs:     231/11170 objects misplaced (2.068%)
                 185 active+clean
                 8   active+clean+remapped

      io:
        client:   1.2 KiB/s rd, 2 op/s rd, 0 op/s wr

      progress:
        Upgrade to 16.2.7 (5m)
          [=====.......................] (remaining: 24m)

   --     Zach
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io