That's an excellent point! Between my last ceph upgrade and now, I did make a new crush ruleset and a new pool that uses that crush rule. It was just for SSDs, of which I have 5, one per host.
All of my other pools are using the default CRUSH rules: "replicated_rule" for the replica x3 pools, and "erasure-code" for the EC pools.
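For reference, here is the new rule. The dump below should be reproducible with something like the following (rule name as I defined it on my cluster):

  ceph osd crush rule ls
  ceph osd crush rule dump highspeedSSD
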
{
    "rule_id": 2,
    "rule_name": "highspeedSSD",
    "ruleset": 2,
    "type": 1,
    "min_size": 1,
    "max_size": 10,
    "steps": [
        {
            "op": "take",
            "item": -27,
            "item_name": "default~ssd"
        },
        {
            "op": "chooseleaf_firstn",
            "num": 0,
            "type": "host"
        },
        {
            "op": "emit"
        }
    ]
}
The SSDs are on OSDs 5, 14, 26, 32, and 33. The new "ssdpool" that uses this rule is pool number 13 and is replica x3.
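To double-check that the pool is actually bound to that rule and sized the way I expect, something like the following should do it (pool name as above):

  ceph osd pool get ssdpool crush_rule
  ceph osd pool get ssdpool size
  ceph osd pool get ssdpool min_size
  ceph osd pool ls detail   # crush_rule, size, and min_size for every pool at once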
So I looked at all the PGs starting with 13 (table below) and yes, just as one would hope, they are only using the SSD OSDs. And since each of those SSDs is on a different host, every one of the 32 PGs also has its three OSDs on three different hosts. No PG for this pool is putting two replicas on the same host.
If one of the PGs for this pool were not being properly replicated to 3 different OSDs on 3 different hosts, I would expect that to show up as a warning or error?
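(For anyone who wants to repeat this check on their own pool: I believe the table below can be reproduced with either of the following, with the UP and ACTING columns being the ones to look at. The pool ID 13 and name ssdpool are specific to my cluster, obviously.)

  ceph pg dump pgs | grep '^13\.'
  ceph pg ls-by-pool ssdpool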
PG_STAT | OBJECTS | MISSING_ON_PRIMARY | DEGRADED | MISPLACED | UNFOUND | BYTES | OMAP_BYTES* | OMAP_KEYS* | LOG | DISK_LOG | STATE | STATE_STAMP | VERSION | REPORTED | UP | UP_PRIMARY | ACTING | ACTING_PRIMARY | LAST_SCRUB | SCRUB_STAMP | LAST_DEEP_SCRUB | DEEP_SCRUB_STAMP | SNAPTRIMQ_LEN |
13.1a | 36 | 0 | 0 | 0 | 0 | 1.51E+08 | 0 | 0 | 190 | 190 | active+clean | 2022-02-10T02:52:45.595274+0000 | 8595'190 | 10088:2829 | [5,33,14] | 5 | [5,33,14] | 5 | 8595'190 | 2022-02-10T02:52:45.595191+0000 | 8595'190 | 2022-02-09T01:48:54.354375+0000 | 0 |
13.1b | 34 | 0 | 0 | 0 | 0 | 1.43E+08 | 0 | 0 | 186 | 186 | active+clean | 2022-02-10T09:57:13.864888+0000 | 8595'186 | 10088:2764 | [32,5,26] | 32 | [32,5,26] | 32 | 8595'186 | 2022-02-10T09:57:13.864816+0000 | 8595'186 | 2022-02-10T09:57:13.864816+0000 | 0 |
13.1c | 30 | 0 | 0 | 0 | 0 | 1.26E+08 | 0 | 0 | 174 | 174 | active+clean | 2022-02-10T00:51:38.215782+0000 | 8531'174 | 10088:2565 | [32,26,33] | 32 | [32,26,33] | 32 | 8531'174 | 2022-02-10T00:51:38.215699+0000 | 8531'174 | 2022-02-04T00:13:03.051036+0000 | 0 |
13.1d | 31 | 0 | 0 | 0 | 0 | 1.26E+08 | 0 | 0 | 182 | 182 | active+clean | 2022-02-09T22:34:36.010164+0000 | 8472'182 | 10088:2954 | [14,5,33] | 14 | [14,5,33] | 14 | 8472'182 | 2022-02-09T22:34:36.010085+0000 | 8472'182 | 2022-02-09T22:34:36.010085+0000 | 0 |
13.1e | 22 | 0 | 0 | 0 | 0 | 92274688 | 0 | 0 | 140 | 140 | active+clean | 2022-02-09T18:39:59.737323+0000 | 8563'140 | 10088:2286 | [33,14,32] | 33 | [33,14,32] | 33 | 8563'140 | 2022-02-09T09:27:55.107416+0000 | 8563'140 | 2022-02-08T07:54:49.474418+0000 | 0 |
13.1f | 31 | 0 | 0 | 0 | 0 | 1.3E+08 | 0 | 0 | 208 | 208 | active+clean | 2022-02-10T02:51:33.770146+0000 | 8412'208 | 10088:2784 | [32,26,5] | 32 | [32,26,5] | 32 | 8412'208 | 2022-02-10T02:51:33.770066+0000 | 8412'208 | 2022-02-08T19:42:17.625397+0000 | 0 |
13.a | 32 | 0 | 0 | 0 | 0 | 1.34E+08 | 0 | 0 | 188 | 188 | active+clean | 2022-02-10T01:13:32.640014+0000 | 8595'188 | 10088:2833 | [14,32,33] | 14 | [14,32,33] | 14 | 8595'188 | 2022-02-10T01:13:32.639917+0000 | 8595'188 | 2022-02-04T23:15:54.724121+0000 | 0 |
13.b | 30 | 0 | 0 | 0 | 0 | 1.26E+08 | 0 | 0 | 156 | 156 | active+clean | 2022-02-09T18:39:48.449523+0000 | 8595'156 | 10088:2739 | [26,33,14] | 26 | [26,33,14] | 26 | 8595'156 | 2022-02-09T18:39:48.449379+0000 | 8595'156 | 2022-02-09T18:39:48.449379+0000 | 0 |
13.c | 31 | 0 | 0 | 0 | 0 | 1.3E+08 | 0 | 0 | 186 | 186 | active+clean | 2022-02-09T18:39:59.737388+0000 | 8595'186 | 10088:2493 | [33,14,32] | 33 | [33,14,32] | 33 | 8595'186 | 2022-02-09T18:39:51.151311+0000 | 8595'186 | 2022-02-07T06:01:56.864006+0000 | 0 |
13.d | 29 | 0 | 0 | 0 | 0 | 1.22E+08 | 0 | 0 | 172 | 172 | active+clean | 2022-02-10T08:31:08.425299+0000 | 8564'172 | 10088:2621 | [26,14,32] | 26 | [26,14,32] | 26 | 8564'172 | 2022-02-10T08:31:08.425204+0000 | 8564'172 | 2022-02-10T08:31:08.425204+0000 | 0 |
13.e | 30 | 0 | 0 | 0 | 0 | 1.26E+08 | 0 | 0 | 184 | 184 | active+clean | 2022-02-09T23:47:01.504802+0000 | 8595'184 | 10088:2535 | [5,14,26] | 5 | [5,14,26] | 5 | 8595'184 | 2022-02-09T23:47:01.504705+0000 | 8595'184 | 2022-02-05T11:09:57.757107+0000 | 0 |
13.f | 26 | 0 | 0 | 0 | 0 | 1.09E+08 | 0 | 0 | 198 | 198 | active+clean | 2022-02-10T07:01:38.357554+0000 | 8595'198 | 10088:2651 | [33,14,5] | 33 | [33,14,5] | 33 | 8595'198 | 2022-02-10T07:01:38.357485+0000 | 8595'198 | 2022-02-07T20:45:51.724002+0000 | 0 |
13.0 | 34 | 0 | 0 | 0 | 0 | 1.43E+08 | 0 | 0 | 206 | 206 | active+clean | 2022-02-09T21:20:58.522212+0000 | 8564'206 | 10088:2889 | [26,14,5] | 26 | [26,14,5] | 26 | 8564'206 | 2022-02-09T21:20:58.522134+0000 | 8564'206 | 2022-02-08T13:25:16.519211+0000 | 0 |
13.1 | 34 | 0 | 0 | 0 | 0 | 1.43E+08 | 0 | 0 | 220 | 220 | active+clean | 2022-02-09T22:41:21.388826+0000 | 8412'220 | 10088:2887 | [14,32,26] | 14 | [14,32,26] | 14 | 8412'220 | 2022-02-09T22:41:21.388760+0000 | 8412'220 | 2022-02-04T20:58:46.282428+0000 | 0 |
13.10 | 32 | 0 | 0 | 0 | 0 | 1.34E+08 | 0 | 0 | 194 | 194 | active+clean | 2022-02-10T12:41:25.745944+0000 | 8472'194 | 10088:2574 | [33,26,32] | 33 | [33,26,32] | 33 | 8472'194 | 2022-02-10T12:41:25.745869+0000 | 8472'194 | 2022-02-08T01:28:32.427800+0000 | 0 |
13.11 | 28 | 0 | 0 | 0 | 0 | 1.17E+08 | 0 | 0 | 178 | 178 | active+clean | 2022-02-09T18:39:59.735355+0000 | 8595'178 | 10088:2626 | [33,14,5] | 33 | [33,14,5] | 33 | 8595'178 | 2022-02-09T14:01:45.095410+0000 | 8595'178 | 2022-02-08T06:08:39.135803+0000 | 0 |
13.12 | 34 | 0 | 0 | 0 | 0 | 1.43E+08 | 0 | 0 | 198 | 198 | active+clean | 2022-02-09T18:39:59.735275+0000 | 8595'198 | 10088:2781 | [33,14,5] | 33 | [33,14,5] | 33 | 8595'198 | 2022-02-09T16:22:31.596817+0000 | 8595'198 | 2022-02-09T16:22:31.596817+0000 | 0 |
13.13 | 35 | 0 | 0 | 0 | 0 | 1.47E+08 | 0 | 0 | 204 | 204 | active+clean | 2022-02-09T18:39:50.237871+0000 | 8595'204 | 10088:2849 | [5,33,32] | 5 | [5,33,32] | 5 | 8595'204 | 2022-02-09T18:39:50.237797+0000 | 8595'204 | 2022-02-08T16:32:38.006399+0000 | 0 |
13.14 | 28 | 0 | 0 | 0 | 0 | 1.17E+08 | 0 | 0 | 180 | 180 | active+clean | 2022-02-10T17:11:34.974225+0000 | 8626'180 | 10089:2498 | [32,33,26] | 32 | [32,33,26] | 32 | 8626'180 | 2022-02-10T17:11:34.974155+0000 | 8626'180 | 2022-02-09T09:19:54.199018+0000 | 0 |
13.15 | 30 | 0 | 0 | 0 | 0 | 1.26E+08 | 0 | 0 | 170 | 170 | active+clean | 2022-02-10T03:53:41.072711+0000 | 8511'170 | 10088:2876 | [5,33,26] | 5 | [5,33,26] | 5 | 8511'170 | 2022-02-10T03:53:41.072610+0000 | 8511'170 | 2022-02-08T20:51:10.657159+0000 | 0 |
13.16 | 35 | 0 | 0 | 0 | 0 | 1.47E+08 | 0 | 0 | 196 | 196 | active+clean | 2022-02-10T12:08:59.786545+0000 | 8472'196 | 10088:2877 | [14,26,33] | 14 | [14,26,33] | 14 | 8472'196 | 2022-02-10T12:08:59.786462+0000 | 8472'196 | 2022-02-09T04:02:45.033237+0000 | 0 |
13.17 | 21 | 0 | 0 | 0 | 0 | 88080384 | 0 | 0 | 150 | 150 | active+clean | 2022-02-10T06:19:19.045692+0000 | 8508'150 | 10088:2581 | [26,5,14] | 26 | [26,5,14] | 26 | 8508'150 | 2022-02-10T06:19:19.045612+0000 | 8508'150 | 2022-02-04T19:20:53.523798+0000 | 0 |
13.18 | 29 | 0 | 0 | 0 | 0 | 1.22E+08 | 0 | 0 | 186 | 186 | active+clean | 2022-02-10T01:10:28.630477+0000 | 8532'186 | 10088:2529 | [5,26,14] | 5 | [5,26,14] | 5 | 8532'186 | 2022-02-10T01:10:28.630406+0000 | 8532'186 | 2022-02-10T01:10:28.630406+0000 | 0 |
13.19 | 28 | 0 | 0 | 0 | 0 | 1.17E+08 | 0 | 0 | 174 | 174 | active+clean | 2022-02-09T23:29:46.574414+0000 | 8626'174 | 10088:2963 | [14,33,26] | 14 | [14,33,26] | 14 | 8626'174 | 2022-02-09T23:29:46.574337+0000 | 8626'174 | 2022-02-04T20:58:50.149946+0000 | 0 |
13.2 | 26 | 0 | 0 | 0 | 0 | 1.09E+08 | 0 | 0 | 178 | 178 | active+clean | 2022-02-10T01:31:11.275696+0000 | 8595'178 | 10088:2662 | [5,26,32] | 5 | [5,26,32] | 5 | 8595'178 | 2022-02-10T01:31:11.275593+0000 | 8595'178 | 2022-02-07T17:54:21.490486+0000 | 0 |
13.3 | 31 | 0 | 0 | 0 | 0 | 1.3E+08 | 0 | 0 | 166 | 166 | active+clean | 2022-02-10T06:57:38.939205+0000 | 8442'166 | 10088:2784 | [5,32,33] | 5 | [5,32,33] | 5 | 8442'166 | 2022-02-10T06:57:38.939091+0000 | 8442'166 | 2022-02-09T01:57:23.398996+0000 | 0 |
13.4 | 39 | 0 | 0 | 0 | 0 | 1.64E+08 | 0 | 0 | 208 | 208 | active+clean | 2022-02-10T17:13:14.834732+0000 | 8595'208 | 10089:2931 | [14,5,26] | 14 | [14,5,26] | 14 | 8595'208 | 2022-02-10T17:13:14.834656+0000 | 8595'208 | 2022-02-08T10:42:51.846976+0000 | 0 |
13.5 | 27 | 0 | 0 | 0 | 0 | 1.13E+08 | 0 | 0 | 188 | 188 | active+clean | 2022-02-09T18:39:47.423693+0000 | 8595'188 | 10088:2678 | [5,33,26] | 5 | [5,33,26] | 5 | 8595'188 | 2022-02-09T07:45:41.024361+0000 | 8595'188 | 2022-02-06T23:19:13.370525+0000 | 0 |
13.6 | 32 | 0 | 0 | 0 | 0 | 1.34E+08 | 0 | 0 | 180 | 180 | active+clean | 2022-02-10T03:22:39.617758+0000 | 8412'180 | 10088:2508 | [33,14,32] | 33 | [33,14,32] | 33 | 8412'180 | 2022-02-10T03:22:39.617672+0000 | 8412'180 | 2022-02-10T03:22:39.617672+0000 | 0 |
13.7 | 41 | 0 | 0 | 0 | 0 | 1.72E+08 | 0 | 0 | 186 | 186 | active+clean | 2022-02-10T08:17:50.715439+0000 | 8564'186 | 10088:3106 | [5,26,33] | 5 | [5,26,33] | 5 | 8564'186 | 2022-02-10T08:17:50.715345+0000 | 8564'186 | 2022-02-09T00:50:25.102262+0000 | 0 |
13.8 | 23 | 0 | 0 | 0 | 0 | 96468992 | 0 | 0 | 152 | 152 | active+clean | 2022-02-10T06:53:46.438741+0000 | 8597'152 | 10088:2574 | [5,14,33] | 5 | [5,14,33] | 5 | 8597'152 | 2022-02-10T06:53:46.438657+0000 | 8597'152 | 2022-02-10T06:53:46.438657+0000 | 0 |
13.9 | 29 | 0 | 0 | 0 | 0 | 1.22E+08 | 0 | 0 | 180 | 180 | active+clean | 2022-02-10T12:44:24.581217+0000 | 8595'180 | 10088:2478 | [33,14,32] | 33 | [33,14,32] | 33 | 8595'180 | 2022-02-10T12:44:24.581129+0000 | 8595'180 | 2022-02-06T18:32:25.066730+0000 | 0 |
I don't know how to get better errors out of cephadm, but the only way I can think of for this to happen is if your crush rule is somehow placing multiple replicas of a PG on a single host that cephadm wants to upgrade. So check your rules, your pool sizes, and osd tree?
-Greg
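(Concretely, I take Greg's suggested checks to be roughly the following, though he may have something more specific in mind:

  ceph osd tree             # which host each OSD sits under
  ceph osd pool ls detail   # size, min_size, and crush_rule for every pool
  ceph osd crush rule dump  # all rules, to spot anything that could place two replicas on one host
)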
On Thu, Feb 10, 2022 at 8:25 AM Zach Heise (SSCC) <he...@ssc.wisc.edu> wrote:
It could be an issue with the devicehealth pool; you are correct that it is a single PG. But when the cluster is reporting that everything is healthy, it's difficult to know where to go from there. What I don't understand is why it's refusing to upgrade ANY of the osd daemons; I have 33 of them, why would a single PG going offline be a problem for all of them?
I did try stopping the upgrade and restarting it, but it just picks up at the same place (11/56 daemons upgraded) and immediately reports the same issue.
Is there any way to at least tell which PG is the problematic one?
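(One way to narrow this down, I think, is to ask the cluster directly, per OSD, whether it is safe to stop - which should be the same check cephadm is running - and to look at the orchestrator's own status and log:

  ceph osd ok-to-stop 5            # repeat for each OSD id in turn
  ceph health detail
  ceph orch upgrade status
  ceph log last 100 info cephadm   # recent orchestrator log entries; the "unsafe to stop" messages should show up here

I'm not certain ok-to-stop names the offending PG in every release, but it at least shows which OSD trips the check.)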
_______________________________________________
Zach
On 2022-02-09 4:19 PM, anthony.da...@gmail.com wrote:
Speculation: might the devicehealth pool be involved? It seems to typically have just 1 PG.

On Feb 9, 2022, at 1:41 PM, Zach Heise (SSCC) <he...@ssc.wisc.edu> wrote:

Good afternoon, thank you for your reply. Yes, I know you are right; eventually we'll switch to an odd number of mons rather than even. We're still in 'testing' mode right now and only my coworkers and I are using the cluster.

Of the 7 pools, all but 2 are replica x3. The last two are EC 2+2.

Zach Heise

On 2022-02-09 3:38 PM, sascha.art...@gmail.com wrote:

Hello,

All your pools running replica > 1? Also, having 4 monitors is pretty bad for split brain situations..

Zach Heise (SSCC) <he...@ssc.wisc.edu> wrote on Wed., Feb 9, 2022, 22:02:

Hello, ceph health detail says my 5-node cluster is healthy, yet when I ran ceph orch upgrade start --ceph-version 16.2.7 everything seemed to go fine until we got to the OSD section. Now, for the past hour, every 15 seconds a new log entry of 'Upgrade: unsafe to stop osd(s) at this time (1 PGs are or would become offline)' appears in the logs.

ceph pg dump_stuck (unclean, degraded, etc) shows "ok" for everything too. Yet somehow 1 PG is (apparently) holding up all the OSD upgrades and not letting the process finish.

Should I stop the upgrade and try it again? (I haven't done that before so was just nervous to try it.) Any other ideas?

  cluster:
    id:     9aa000e8-b999-11eb-82f2-ecf4bbcc0ac0
    health: HEALTH_OK

  services:
    mon: 4 daemons, quorum ceph05,ceph04,ceph01,ceph03 (age 92m)
    mgr: ceph03.futetp(active, since 97m), standbys: ceph01.fblojp
    mds: 1/1 daemons up, 1 hot standby
    osd: 33 osds: 33 up (since 2h), 33 in (since 4h); 9 remapped pgs

  data:
    volumes: 1/1 healthy
    pools:   7 pools, 193 pgs
    objects: 3.72k objects, 14 GiB
    usage:   43 GiB used, 64 TiB / 64 TiB avail
    pgs:     231/11170 objects misplaced (2.068%)
             185 active+clean
             8   active+clean+remapped

  io:
    client:   1.2 KiB/s rd, 2 op/s rd, 0 op/s wr

  progress:
    Upgrade to 16.2.7 (5m)
      [=====.......................] (remaining: 24m)

--
Zach
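(For completeness, the knobs for watching and controlling a cephadm upgrade, as far as I know, are:

  ceph orch upgrade status   # current target version and progress
  ceph orch upgrade pause    # pause, keeping the target version set
  ceph orch upgrade resume
  ceph orch upgrade stop     # abandon the upgrade entirely
  ceph -W cephadm            # follow the cephadm log channel live

Stop-and-restart is what was already tried above; pause/resume keeps the upgrade state intact.)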
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io