[ceph-users] Re: osd removal leaves 'stray daemon'

2022-12-05 Thread Holger Naundorf

Hello,
a mgr failover did not change the situation - the osd still shows up in 
'ceph node ls'. I assume this is more or less 'working as intended', 
since I did ask for the OSD to be kept in the CRUSH map to be replaced 
later - but as we are still not very experienced with Ceph here I 
wanted to get some input from other sites.
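
If the warning is only cosmetic until the spare arrives, one option 
(assuming the health mute facility available in recent releases) would 
be to silence just this check for the time being:

# ceph health mute CEPHADM_STRAY_DAEMON --sticky

and to unmute it again with 'ceph health unmute CEPHADM_STRAY_DAEMON' 
once the disk has been swapped and the osd has been recreated.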


Regards,
Holger

On 30.11.22 16:28, Adam King wrote:
I typically don't see this when I do OSD replacement. If you do a mgr 
failover ("ceph mgr fail") and wait a few minutes does this still show 
up? The stray daemon/host warning is roughly equivalent to comparing the 
daemons in `ceph node ls` and `ceph orch ps` and seeing if there's 
anything in the former but not the latter. Sometimes I have seen the mgr 
will have some out-of-date info in the node ls, and a failover will 
refresh it.


On Fri, Nov 25, 2022 at 6:07 AM Holger Naundorf wrote:


Hello,
I have a question about osd removal/replacement:

I just removed an osd where the disk was still running but had read 
errors, leading to failed deep scrubs. As the intent is to replace this 
as soon as we manage to get a spare, I removed it with the 
'--replace' flag:

# ceph orch osd rm 224 --replace
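
(The progress of the drain can be followed with

# ceph orch osd rm status

which should keep listing osd.224 until all of its PGs have been moved 
off.)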

After all placement groups were evacuated I now have 1 osd down/out, 
showing as 'destroyed':

# ceph osd tree
ID   CLASS  WEIGHT      TYPE NAME        STATUS     REWEIGHT  PRI-AFF
(...)
214    hdd    14.55269          osd.214         up   1.0  1.0
224    hdd    14.55269          osd.224  destroyed         0  1.0
234    hdd    14.55269          osd.234         up   1.0  1.0
(...)

All as expected - but now the health check complains that the
(destroyed) osd is not managed:

# ceph health detail
HEALTH_WARN 1 stray daemon(s) not managed by cephadm
[WRN] CEPHADM_STRAY_DAEMON: 1 stray daemon(s) not managed by cephadm
      stray daemon osd.224 on host ceph19 not managed by cephadm

Is this expected behaviour, so that I have to live with the yellow 
health check until we get a replacement disk and recreate the osd, or 
did something not finish correctly?

Regards,
Holger

-- 
Dr. Holger Naundorf

Christian-Albrechts-Universität zu Kiel
Rechenzentrum / HPC / Server und Storage
Tel: +49 431 880-1990
Fax:  +49 431 880-1523
naund...@rz.uni-kiel.de 

--
Dr. Holger Naundorf
Christian-Albrechts-Universität zu Kiel
Rechenzentrum / HPC / Server und Storage
Tel: +49 431 880-1990
Fax:  +49 431 880-1523
naund...@rz.uni-kiel.de
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: osd removal leaves 'stray daemon'

2022-11-30 Thread Adam King
I typically don't see this when I do OSD replacement. If you do a mgr
failover ("ceph mgr fail") and wait a few minutes does this still show up?
The stray daemon/host warning is roughly equivalent to comparing the
daemons in `ceph node ls` and `ceph orch ps` and seeing if there's anything
in the former but not the latter. Sometimes I have seen the mgr will have
some out-of-date info in the node ls, and a failover will refresh it.
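
Roughly, that check boils down to something like

# ceph node ls osd
# ceph orch ps --daemon-type osd

and looking for OSD ids that show up in the first list but not in the
second; a

# ceph mgr fail

forces a new active mgr, which rebuilds its inventory and usually
clears stale entries.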

On Fri, Nov 25, 2022 at 6:07 AM Holger Naundorf wrote:

> Hello,
> I have a question about osd removal/replacement:
>
> I just removed an osd where the disk was still running but had read
> errors, leading to failed deep scrubs. As the intent is to replace this
> as soon as we manage to get a spare, I removed it with the '--replace' flag:
>
> # ceph orch osd rm 224 --replace
>
> After all placement groups were evacuated I now have 1 osd down/out,
> showing as 'destroyed':
>
> # ceph osd tree
> ID   CLASS  WEIGHT      TYPE NAME        STATUS     REWEIGHT  PRI-AFF
> (...)
> 214    hdd    14.55269          osd.214         up   1.0  1.0
> 224    hdd    14.55269          osd.224  destroyed         0  1.0
> 234    hdd    14.55269          osd.234         up   1.0  1.0
> (...)
>
> All as expected - but now the health check complains that the
> (destroyed) osd is not managed:
>
> # ceph health detail
> HEALTH_WARN 1 stray daemon(s) not managed by cephadm
> [WRN] CEPHADM_STRAY_DAEMON: 1 stray daemon(s) not managed by cephadm
>  stray daemon osd.224 on host ceph19 not managed by cephadm
>
> Is this expected behaviour, so that I have to live with the yellow health
> check until we get a replacement disk and recreate the osd, or did
> something not finish correctly?
>
> Regards,
> Holger
>
> --
> Dr. Holger Naundorf
> Christian-Albrechts-Universität zu Kiel
> Rechenzentrum / HPC / Server und Storage
> Tel: +49 431 880-1990
> Fax:  +49 431 880-1523
> naund...@rz.uni-kiel.de
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io