[ceph-users] radosgw sync non-existent bucket ceph reef 18.2.2

2024-04-30 Thread Christopher Durham


Hi,
I have a reef cluster 18.2.2 on Rocky 8.9. This cluster has been upgraded from 
pacific->quincy->reef over the past few years. It is a multi site with one 
other cluster that works fine with s3/radosgw on both sides, with proper 
bidirectional data replication.
On one of the master cluster's radosgw logs, I noticed a sync request regarding 
a deleted bucket. I am not sure when this error started, but I know that the 
bucket in question was deleted a long time before the upgrade to reef. 
Perhaps this error existed prior to reef, I do not know. Here is the error in
the radosgw log:
:get_bucket_index_log_status ERROR: rgw_read_bucket_full_sync_status() on 
pipe{s={b=BUCKET_NAME:CLUSTERID ..., z=, az= ...},d={b=..,az=...}} returned 
ret=-2
My understanding:
s=source, d=destination, each of which is a tuple containing the relevant
bucket/zone information.

This happens for BUCKET_NAME every few minutes. Said bucket does not exist on 
either side of the multisite, but did in the past.
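
For reference, a few radosgw-admin diagnostics that may help locate leftover
sync state for a bucket that no longer exists. This is only a sketch, run on
the zone that logs the error, not a verified fix; BUCKET_NAME is the deleted
bucket, and whether stale bucket-instance metadata is actually the cause here
is an assumption:

# overall multisite state and any recorded sync errors
radosgw-admin sync status
radosgw-admin sync error list

# look for leftover bucket-instance metadata for the deleted bucket
radosgw-admin metadata list bucket.instance | grep BUCKET_NAME

# if a stale instance entry turns up, it can in principle be removed with
# "radosgw-admin metadata rm bucket.instance:<key>" and old errors cleared
# with "radosgw-admin sync error trim", but treat both as untested here
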
Any way I can force radosgw to stop trying to replicate?
Thanks
-Chris


[ceph-users] Re: Reconstructing an OSD server when the boot OS is corrupted

2024-04-30 Thread Eugen Block

Oh I'm sorry, Peter, I don't know why I wrote Karl. I apologize.

Zitat von Eugen Block :


Hi Karl,

I must admit that I haven't dealt with raw OSDs yet. We've been  
usually working with LVM based clusters (some of the customers used  
SUSE's product SES) and in SES there was a recommendation to switch  
to LVM before adopting with cephadm. So we usually did a rebuild of  
all OSDs before upgrading. Hopefully someone else has more  
experience with raw OSDs.
If the host is offline in the orch list, usually the MGR can't  
communicate via SSH. Make sure you have the ceph pub key in the  
authorized_keys, there's a troubleshooting page in the docs [1] for  
ssh errors.


Regards,
Eugen

[1] https://docs.ceph.com/en/reef/cephadm/troubleshooting/#ssh-errors

Zitat von Peter van Heusden :


Thanks Eugen and others for the advice. These are not, however, lvm-based
OSDs. I can get a list of what is out there with:

cephadm ceph-volume raw list

and tried

cephadm ceph-volume raw activate

but it tells me I need to manually run activate.

I was able to find the correct data disks with for example:

ceph-bluestore-tool show-label --dev /dev/sda2

but on running e.g.

cephadm ceph-volume raw activate --osd-id 20 --device /dev/sda --osd-uuid
74f4ce9c-4623-41b7-a7f9-cc81bb9467ef --block.db /dev/nvme1n1p1 --block.wal
/dev/nvme0n1p1

(OSD ID inferred from the list of down OSDs)

I got an error that "systemd support not yet implemented". On adding
--no-systemd to the command, I get the response:

stderr KeyError: 'osd_id'
"
The on-disk metadata indeed doesn't have an osd_id for most entries. For
the one instance I can find with the osd_id key in the metadata, the
"cephadm ceph-volume raw activate" completes but with no apparent change to
the system.

Is there any advice on how to recover the configuration with raw, not LVM,
OSDs?

And then once I have things added back in: the host is currently listed as
offline in the output of "ceph orch host ls". How can it be re-added to
this list?

Thank you,
Peter

BTW full error message:

Inferring fsid ed7b2c16-b053-45e2-a1fe-bf3474f90508
Using ceph image with id '59248721b0c7' and tag 'v17' created on 2024-04-24
16:06:51 + UTC
quay.io/ceph/ceph@sha256:96f2a53bc3028eec16e790c6225e7d7acad8a48737a57ec14eea7ce036733233
Non-zero exit code 1 from /usr/bin/docker run --rm --ipc=host
--stop-signal=SIGTERM --ulimit nofile=1048576 --net=host --entrypoint
/usr/sbin/ceph-volume --privileged --group-add=disk --init -e
CONTAINER_IMAGE=
quay.io/ceph/ceph@sha256:96f2a53bc3028eec16e790c6225e7d7acad8a48737a57ec14eea7ce036733233
-e NODE_NAME=ceph-osd3 -e CEPH_USE_RANDOM_NONCE=1 -e
CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 -v
/var/log/ceph/ed7b2c16-b053-45e2-a1fe-bf3474f90508:/var/log/ceph:z -v
/dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v
/run/lock/lvm:/run/lock/lvm -v /:/rootfs -v
/tmp/ceph-tmpjox0_hj0:/etc/ceph/ceph.conf:z
quay.io/ceph/ceph@sha256:96f2a53bc3028eec16e790c6225e7d7acad8a48737a57ec14eea7ce036733233
raw activate --osd-id 20 --device /dev/sda --osd-uuid
74f4ce9c-4623-41b7-a7f9-cc81bb9467ef --block.db /dev/nvme1n1p1 --block.wal
/dev/nvme0n1p1 --no-systemd
/usr/bin/docker: stderr Traceback (most recent call last):
/usr/bin/docker: stderr   File "/usr/sbin/ceph-volume", line 11, in 
/usr/bin/docker: stderr load_entry_point('ceph-volume==1.0.0',
'console_scripts', 'ceph-volume')()
/usr/bin/docker: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 41, in __init__
/usr/bin/docker: stderr self.main(self.argv)
/usr/bin/docker: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 59, in
newfunc
/usr/bin/docker: stderr return f(*a, **kw)
/usr/bin/docker: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 153, in main
/usr/bin/docker: stderr terminal.dispatch(self.mapper, subcommand_args)
/usr/bin/docker: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in
dispatch
/usr/bin/docker: stderr instance.main()
/usr/bin/docker: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/main.py", line
32, in main
/usr/bin/docker: stderr terminal.dispatch(self.mapper, self.argv)
/usr/bin/docker: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in
dispatch
/usr/bin/docker: stderr instance.main()
/usr/bin/docker: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/activate.py",
line 166, in main
/usr/bin/docker: stderr systemd=not self.args.no_systemd)
/usr/bin/docker: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in
is_root
/usr/bin/docker: stderr return func(*a, **kw)
/usr/bin/docker: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/activate.py",
line 79, in activate
/usr/bin/docker: stderr osd_id = meta['osd_id']
/usr/bin/docker: stderr KeyError: 'osd_id'
Traceback (most recent call 

[ceph-users] Re: Reconstructing an OSD server when the boot OS is corrupted

2024-04-30 Thread Eugen Block

Hi Karl,

I must admit that I haven't dealt with raw OSDs yet. We've been  
usually working with LVM based clusters (some of the customers used  
SUSE's product SES) and in SES there was a recommendation to switch to  
LVM before adopting with cephadm. So we usually did a rebuild of all  
OSDs before upgrading. Hopefully someone else has more experience with  
raw OSDs.
If the host is offline in the orch list, usually the MGR can't  
communicate via SSH. Make sure you have the ceph pub key in the  
authorized_keys, there's a troubleshooting page in the docs [1] for  
ssh errors.


Regards,
Eugen

[1] https://docs.ceph.com/en/reef/cephadm/troubleshooting/#ssh-errors
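
For completeness, the troubleshooting page above shows how to test the
connection manually with the same key and SSH config the MGR uses; a short
sketch (the file paths are just examples, the host name is taken from the
log output quoted below):

ceph cephadm get-ssh-config > /tmp/cephadm_ssh_config
ceph config-key get mgr/cephadm/ssh_identity_key > /tmp/cephadm_key
chmod 0600 /tmp/cephadm_key
ssh -F /tmp/cephadm_ssh_config -i /tmp/cephadm_key root@ceph-osd3

If that login fails, fixing SSH access usually brings the host back from
"offline" in ceph orch host ls.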

Zitat von Peter van Heusden :


Thanks Eugen and others for the advice. These are not, however, lvm-based
OSDs. I can get a list of what is out there with:

cephadm ceph-volume raw list

and tried

cephadm ceph-volume raw activate

but it tells me I need to manually run activate.

I was able to find the correct data disks with for example:

ceph-bluestore-tool show-label --dev /dev/sda2

but on running e.g.

cephadm ceph-volume raw activate --osd-id 20 --device /dev/sda --osd-uuid
74f4ce9c-4623-41b7-a7f9-cc81bb9467ef --block.db /dev/nvme1n1p1 --block.wal
/dev/nvme0n1p1

(OSD ID inferred from the list of down OSDs)

I got an error that "systemd support not yet implemented". On adding
--no-systemd to the command, I get the response:

stderr KeyError: 'osd_id'
"
The on-disk metadata indeed doesn't have an osd_id for most entries. For
the one instance I can find with the osd_id key in the metadata, the
"cephadm ceph-volume raw activate" completes but with no apparent change to
the system.

Is there any advice on how to recover the configuration with raw, not LVM,
OSDs?

And then once I have things added back in: the host is currently listed as
offline in the output of "ceph orch host ls". How can it be re-added to
this list?

Thank you,
Peter

BTW full error message:

Inferring fsid ed7b2c16-b053-45e2-a1fe-bf3474f90508
Using ceph image with id '59248721b0c7' and tag 'v17' created on 2024-04-24
16:06:51 + UTC
quay.io/ceph/ceph@sha256:96f2a53bc3028eec16e790c6225e7d7acad8a48737a57ec14eea7ce036733233
Non-zero exit code 1 from /usr/bin/docker run --rm --ipc=host
--stop-signal=SIGTERM --ulimit nofile=1048576 --net=host --entrypoint
/usr/sbin/ceph-volume --privileged --group-add=disk --init -e
CONTAINER_IMAGE=
quay.io/ceph/ceph@sha256:96f2a53bc3028eec16e790c6225e7d7acad8a48737a57ec14eea7ce036733233
-e NODE_NAME=ceph-osd3 -e CEPH_USE_RANDOM_NONCE=1 -e
CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 -v
/var/log/ceph/ed7b2c16-b053-45e2-a1fe-bf3474f90508:/var/log/ceph:z -v
/dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v
/run/lock/lvm:/run/lock/lvm -v /:/rootfs -v
/tmp/ceph-tmpjox0_hj0:/etc/ceph/ceph.conf:z
quay.io/ceph/ceph@sha256:96f2a53bc3028eec16e790c6225e7d7acad8a48737a57ec14eea7ce036733233
raw activate --osd-id 20 --device /dev/sda --osd-uuid
74f4ce9c-4623-41b7-a7f9-cc81bb9467ef --block.db /dev/nvme1n1p1 --block.wal
/dev/nvme0n1p1 --no-systemd
/usr/bin/docker: stderr Traceback (most recent call last):
/usr/bin/docker: stderr   File "/usr/sbin/ceph-volume", line 11, in 
/usr/bin/docker: stderr load_entry_point('ceph-volume==1.0.0',
'console_scripts', 'ceph-volume')()
/usr/bin/docker: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 41, in __init__
/usr/bin/docker: stderr self.main(self.argv)
/usr/bin/docker: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 59, in
newfunc
/usr/bin/docker: stderr return f(*a, **kw)
/usr/bin/docker: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 153, in main
/usr/bin/docker: stderr terminal.dispatch(self.mapper, subcommand_args)
/usr/bin/docker: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in
dispatch
/usr/bin/docker: stderr instance.main()
/usr/bin/docker: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/main.py", line
32, in main
/usr/bin/docker: stderr terminal.dispatch(self.mapper, self.argv)
/usr/bin/docker: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in
dispatch
/usr/bin/docker: stderr instance.main()
/usr/bin/docker: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/activate.py",
line 166, in main
/usr/bin/docker: stderr systemd=not self.args.no_systemd)
/usr/bin/docker: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in
is_root
/usr/bin/docker: stderr return func(*a, **kw)
/usr/bin/docker: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/activate.py",
line 79, in activate
/usr/bin/docker: stderr osd_id = meta['osd_id']
/usr/bin/docker: stderr KeyError: 'osd_id'
Traceback (most recent call last):
  File "/usr/sbin/cephadm", line 9679, in 
main()
  File "/usr/sbin/cephadm", 

[ceph-users] Re: Reconstructing an OSD server when the boot OS is corrupted

2024-04-30 Thread Bogdan Adrian Velica
Hi,

If I may, I would try something like this but I haven't tested this so
please take this with a grain of salt...

1. I would reinstall the Operating System in this case...
Since the root filesystem is accessible but the OS is not bootable, the
most straightforward approach would be to perform a clean install of Ubuntu
20.04 on the corrupted server. Make sure to not format the data drives used
by Ceph during this process (the OSDs here).

2. Then reinstall Ceph:
After reinstalling the operating system, you'll need to reinstall Ceph.
Ensure that the version you install is compatible with the existing cluster
(v15.2.17 as per your setup).

3. Reattach OSDs:
- Using ceph-bluestore-tool show-label to identify the correct devices is a
good start. You've successfully identified the devices, which is crucial in
this case.
- You attempted to use cephadm ceph-volume raw activate but encountered
issues due to the missing osd_id in the metadata and systemd support not
being implemented for this command in your environment.
This suggests that there might be an inconsistency or corruption in the
metadata of the OSDs, but I am not sure... maybe a dev could help here.

Handling the osd_id KeyError:
- The KeyError 'osd_id' indicates, I think, that the metadata required to
map the OSD ID to the device is missing (maybe corruption?). If possible,
check the output of "ceph-bluestore-tool show-label --dev
/dev/your_disk_drive_here" and the other devices to verify whether osd_id is
present in any metadata there. If it's consistently missing, there might be
a need to recreate or repopulate this metadata (I don't know how to do this
part). One way to check across devices is sketched below.
- For OSDs where osd_id is available, try activating them individually
using the "cephadm ceph-volume raw activate" command with the *--no-systemd*
flag.
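
One way to see which entries actually carry an osd_id is to filter the JSON
that the raw list prints. A sketch only: it assumes jq is installed, that the
JSON goes to stdout, and that each entry has device/osd_id fields (which can
vary between ceph-volume versions):

# entries that do have an osd_id
cephadm ceph-volume raw list | jq -r \
  'to_entries[] | select(.value.osd_id != null) | "\(.value.device) osd_id=\(.value.osd_id)"'

# entries that are missing it
cephadm ceph-volume raw list | jq -r \
  'to_entries[] | select(.value.osd_id == null) | .value.device'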

4. Re-adding the host to the cluster if it appears offline
If the host appears as offline, then once the system is operational and the
Ceph services are running, you can add the host back to the cluster using
the "ceph orch host add" command - I am not sure this step is needed (see
the sketch after this list).

5. Make sure the hostname matches what the cluster expects, and the
networking is correctly configured - similar to the old config if possible.
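
For step 4, a minimal sketch of re-adding a host with cephadm (the IP is a
placeholder, the host name is taken from the error output in this thread;
only needed if the orchestrator has really lost the host):

# put the cluster's public key on the reinstalled host
ceph cephadm get-pub-key > ~/ceph.pub
ssh-copy-id -f -i ~/ceph.pub root@ceph-osd3

# then (re-)add the host
ceph orch host add ceph-osd3 192.0.2.10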

Just my 2 cents.

Thank you,
Bogdan Velica
croit.io

On Tue, Apr 30, 2024 at 10:38 PM Peter van Heusden  wrote:

> Thanks Eugen and others for the advice. These are not, however, lvm-based
> OSDs. I can get a list of what is out there with:
>
> cephadm ceph-volume raw list
>
> and tried
>
> cephadm ceph-volume raw activate
>
> but it tells me I need to manually run activate.
>
> I was able to find the correct data disks with for example:
>
> ceph-bluestore-tool show-label --dev /dev/sda2
>
> but on running e.g.
>
> cephadm ceph-volume raw activate --osd-id 20 --device /dev/sda --osd-uuid
> 74f4ce9c-4623-41b7-a7f9-cc81bb9467ef --block.db /dev/nvme1n1p1 --block.wal
> /dev/nvme0n1p1
>
> (OSD ID inferred from the list of down OSDs)
>
> I got an error that "systemd support not yet implemented". On adding
> --no-systemd to the command, I get the response:
>
> stderr KeyError: 'osd_id'
> "
> The on-disk metadata indeed doesn't have an osd_id for most entries. For
> the one instance I can find with the osd_id key in the metadata, the
> "cephadm ceph-volume raw activate" completes but with no apparent change to
> the system.
>
> Is there any advice on how to recover the configuration with raw, not LVM,
> OSDs?
>
> And then once I have things added back in: the host is currently listed as
> offline in the output of "ceph orch host ls". How can it be re-added to
> this list?
>
> Thank you,
> Peter
>
> BTW full error message:
>
> Inferring fsid ed7b2c16-b053-45e2-a1fe-bf3474f90508
> Using ceph image with id '59248721b0c7' and tag 'v17' created on 2024-04-24
> 16:06:51 + UTC
>
> quay.io/ceph/ceph@sha256:96f2a53bc3028eec16e790c6225e7d7acad8a48737a57ec14eea7ce036733233
> Non-zero
> 
> exit code 1 from /usr/bin/docker run --rm --ipc=host
> --stop-signal=SIGTERM --ulimit nofile=1048576 --net=host --entrypoint
> /usr/sbin/ceph-volume --privileged --group-add=disk --init -e
> CONTAINER_IMAGE=
>
> quay.io/ceph/ceph@sha256:96f2a53bc3028eec16e790c6225e7d7acad8a48737a57ec14eea7ce036733233
> -e
> 
> NODE_NAME=ceph-osd3 -e CEPH_USE_RANDOM_NONCE=1 -e
> CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 -v
> /var/log/ceph/ed7b2c16-b053-45e2-a1fe-bf3474f90508:/var/log/ceph:z -v
> /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v
> /run/lock/lvm:/run/lock/lvm -v /:/rootfs -v
> /tmp/ceph-tmpjox0_hj0:/etc/ceph/ceph.conf:z
>
> quay.io/ceph/ceph@sha256:96f2a53bc3028eec16e790c6225e7d7acad8a48737a57ec14eea7ce036733233
> raw activate --osd-id 20 --device /dev/sda --osd-uuid
> 74f4ce9c-4623-41b7-a7f9-cc81bb9467ef --block.db /dev/nvme1n1p1 

[ceph-users] Re: Reconstructing an OSD server when the boot OS is corrupted

2024-04-30 Thread Peter van Heusden
Thanks Eugen and others for the advice. These are not, however, lvm-based
OSDs. I can get a list of what is out there with:

cephadm ceph-volume raw list

and tried

cephadm ceph-volume raw activate

but it tells me I need to manually run activate.

I was able to find the correct data disks with for example:

ceph-bluestore-tool show-label --dev /dev/sda2

but on running e.g.

cephadm ceph-volume raw activate --osd-id 20 --device /dev/sda --osd-uuid
74f4ce9c-4623-41b7-a7f9-cc81bb9467ef --block.db /dev/nvme1n1p1 --block.wal
/dev/nvme0n1p1

(OSD ID inferred from the list of down OSDs)

I got an error that "systemd support not yet implemented". On adding
--no-systemd to the command, I get the response:

stderr KeyError: 'osd_id'
"
The on-disk metadata indeed doesn't have an osd_id for most entries. For
the one instance I can find with the osd_id key in the metadata, the
"cephadm ceph-volume raw activate" completes but with no apparent change to
the system.

Is there any advice on how to recover the configuration with raw, not LVM,
OSDs?

And then once I have things added back in: the host is currently listed as
offline in the output of "ceph orch host ls". How can it be re-added to
this list?

Thank you,
Peter

BTW full error message:

Inferring fsid ed7b2c16-b053-45e2-a1fe-bf3474f90508
Using ceph image with id '59248721b0c7' and tag 'v17' created on 2024-04-24
16:06:51 + UTC
quay.io/ceph/ceph@sha256:96f2a53bc3028eec16e790c6225e7d7acad8a48737a57ec14eea7ce036733233
Non-zero exit code 1 from /usr/bin/docker run --rm --ipc=host
--stop-signal=SIGTERM --ulimit nofile=1048576 --net=host --entrypoint
/usr/sbin/ceph-volume --privileged --group-add=disk --init -e
CONTAINER_IMAGE=
quay.io/ceph/ceph@sha256:96f2a53bc3028eec16e790c6225e7d7acad8a48737a57ec14eea7ce036733233
-e NODE_NAME=ceph-osd3 -e CEPH_USE_RANDOM_NONCE=1 -e
CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 -v
/var/log/ceph/ed7b2c16-b053-45e2-a1fe-bf3474f90508:/var/log/ceph:z -v
/dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v
/run/lock/lvm:/run/lock/lvm -v /:/rootfs -v
/tmp/ceph-tmpjox0_hj0:/etc/ceph/ceph.conf:z
quay.io/ceph/ceph@sha256:96f2a53bc3028eec16e790c6225e7d7acad8a48737a57ec14eea7ce036733233
raw activate --osd-id 20 --device /dev/sda --osd-uuid
74f4ce9c-4623-41b7-a7f9-cc81bb9467ef --block.db /dev/nvme1n1p1 --block.wal
/dev/nvme0n1p1 --no-systemd
/usr/bin/docker: stderr Traceback (most recent call last):
/usr/bin/docker: stderr   File "/usr/sbin/ceph-volume", line 11, in 
/usr/bin/docker: stderr load_entry_point('ceph-volume==1.0.0',
'console_scripts', 'ceph-volume')()
/usr/bin/docker: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 41, in __init__
/usr/bin/docker: stderr self.main(self.argv)
/usr/bin/docker: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 59, in
newfunc
/usr/bin/docker: stderr return f(*a, **kw)
/usr/bin/docker: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 153, in main
/usr/bin/docker: stderr terminal.dispatch(self.mapper, subcommand_args)
/usr/bin/docker: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in
dispatch
/usr/bin/docker: stderr instance.main()
/usr/bin/docker: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/main.py", line
32, in main
/usr/bin/docker: stderr terminal.dispatch(self.mapper, self.argv)
/usr/bin/docker: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in
dispatch
/usr/bin/docker: stderr instance.main()
/usr/bin/docker: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/activate.py",
line 166, in main
/usr/bin/docker: stderr systemd=not self.args.no_systemd)
/usr/bin/docker: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in
is_root
/usr/bin/docker: stderr return func(*a, **kw)
/usr/bin/docker: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/activate.py",
line 79, in activate
/usr/bin/docker: stderr osd_id = meta['osd_id']
/usr/bin/docker: stderr KeyError: 'osd_id'
Traceback (most recent call last):
  File "/usr/sbin/cephadm", line 9679, in 
main()
  File "/usr/sbin/cephadm", line 9667, in main
r = ctx.func(ctx)
  File "/usr/sbin/cephadm", line 2116, in _infer_config
return func(ctx)
  File "/usr/sbin/cephadm", line 2061, in _infer_fsid
return func(ctx)
  File "/usr/sbin/cephadm", line 2144, in _infer_image
return func(ctx)
  File "/usr/sbin/cephadm", line 2019, in _validate_fsid
return func(ctx)
  File "/usr/sbin/cephadm", line 6272, in command_ceph_volume
out, err, code = call_throws(ctx, c.run_cmd(),
verbosity=CallVerbosity.QUIET_UNLESS_ERROR)
  File "/usr/sbin/cephadm", line 1807, in call_throws
raise RuntimeError('Failed command: %s' % ' '.join(command))
RuntimeError: Failed command: /usr/bin/docker run --rm --ipc=host
--stop-signal=SIGTERM 

[ceph-users] Re: Remove an OSD with hardware issue caused rgw 503

2024-04-30 Thread Mary Zhang
Sorry Frank, I typed the wrong name.

On Tue, Apr 30, 2024, 8:51 AM Mary Zhang  wrote:

> Sounds good. Thank you Kevin and have a nice day!
>
> Best Regards,
> Mary
>
> On Tue, Apr 30, 2024, 8:21 AM Frank Schilder  wrote:
>
>> I think you are panicking way too much. Chances are that you will never
>> need that command, so don't get fussed out by an old post.
>>
>> Just follow what I wrote and, in the extremely rare case that recovery
>> does not complete due to missing information, send an e-mail to this list
>> and state that you still have the disk of the down OSD. Someone will send
>> you the export/import commands within a short time.
>>
>> So stop worrying and just administrate your cluster with common storage
>> admin sense.
>>
>> Best regards,
>> =
>> Frank Schilder
>> AIT Risø Campus
>> Bygning 109, rum S14
>>
>> 
>> From: Mary Zhang 
>> Sent: Tuesday, April 30, 2024 5:00 PM
>> To: Frank Schilder
>> Cc: Eugen Block; ceph-users@ceph.io; Wesley Dillingham
>> Subject: Re: [ceph-users] Re: Remove an OSD with hardware issue caused
>> rgw 503
>>
>> Thank you Frank for sharing such valuable experience! I really appreciate
>> it.
>> We observe similar timelines: it took more than 1 week to drain our OSD.
>> Regarding export PGs from failed disk and inject it back to the cluster,
>> do you have any documentations? I find this online Ceph.io — Incomplete PGs
>> -- OH MY!, but
>> not sure whether it's the standard process.
>>
>> Thanks,
>> Mary
>>
>> On Tue, Apr 30, 2024 at 3:27 AM Frank Schilder > fr...@dtu.dk>> wrote:
>> Hi all,
>>
>> I second Eugen's recommendation. We have a cluster with large HDD OSDs
>> where the following timings are found:
>>
>> - drain an OSD: 2 weeks.
>> - down an OSD and let cluster recover: 6 hours.
>>
>> The drain OSD procedure is - in my experience - a complete waste of time,
>> actually puts your cluster at higher risk of a second failure (its not
>> guaranteed that the bad PG(s) is/are drained first) and also screws up all
>> sorts of internal operations like scrub etc for an unnecessarily long time.
>> The recovery procedure is much faster, because it uses all-to-all recovery
>> while drain is limited to no more than max_backfills PGs at a time and your
>> broken disk sits much longer in the cluster.
>>
>> On SSDs the "down OSD"-method shows a similar speed-up factor.
>>
>> For a security measure, don't destroy the OSD right away, wait for
>> recovery to complete and only then destroy the OSD and throw away the disk.
>> In case an error occurs during recovery, you can almost always still export
>> PGs from a failed disk and inject it back into the cluster. This, however,
>> requires to take disks out as soon as they show problems and before they
>> fail hard. Keep a little bit of life time to have a chance to recover data.
>> Look at the manual of ddrescue why it is important to stop IO from a
>> failing disk as soon as possible.
>>
>> Best regards,
>> =
>> Frank Schilder
>> AIT Risø Campus
>> Bygning 109, rum S14
>>
>> 
>> From: Eugen Block mailto:ebl...@nde.ag>>
>> Sent: Saturday, April 27, 2024 10:29 AM
>> To: Mary Zhang
>> Cc: ceph-users@ceph.io; Wesley Dillingham
>> Subject: [ceph-users] Re: Remove an OSD with hardware issue caused rgw 503
>>
>> If the rest of the cluster is healthy and your resiliency is
>> configured properly, for example to sustain the loss of one or more
>> hosts at a time, you don’t need to worry about a single disk. Just
>> take it out and remove it (forcefully) so it doesn’t have any clients
>> anymore. Ceph will immediately assign different primary OSDs and your
>> clients will be happy again. ;-)
>>
>> Zitat von Mary Zhang > maryzhang0...@gmail.com>>:
>>
>> > Thank you Wesley for the clear explanation between the 2 methods!
>> > The tracker issue you mentioned https://tracker.ceph.com/issues/44400
>> talks
>> > about primary-affinity. Could primary-affinity help remove an OSD with
>> > hardware issue from the cluster gracefully?
>> >
>> > Thanks,
>> > Mary
>> >
>> >
>> > On Fri, Apr 26, 2024 at 8:43 AM Wesley Dillingham <
>> w...@wesdillingham.com>
>> > wrote:
>> >
>> >> What you want to do is to stop the OSD (and all its copies of data it
>> >> contains) by stopping the OSD service immediately. The downside of this
>> >> approach is it causes the PGs on that OSD to be degraded. But the
>> upside is
>> >> the OSD which has bad hardware is immediately no  longer participating
>> in
>> >> any client IO (the source of your RGW 503s). In this situation the PGs
>> go
>> >> into degraded+backfilling
>> >>
>> >> The alternative method is to keep the failing OSD up and in the cluster
>> >> but slowly migrate the data off of it, this would be a long drawn out
>> >> period of time in which the failing disk would continue to serve 

[ceph-users] Re: Remove an OSD with hardware issue caused rgw 503

2024-04-30 Thread Mary Zhang
Sounds good. Thank you Kevin and have a nice day!

Best Regards,
Mary

On Tue, Apr 30, 2024, 8:21 AM Frank Schilder  wrote:

> I think you are panicking way too much. Chances are that you will never
> need that command, so don't get fussed out by an old post.
>
> Just follow what I wrote and, in the extremely rare case that recovery
> does not complete due to missing information, send an e-mail to this list
> and state that you still have the disk of the down OSD. Someone will send
> you the export/import commands within a short time.
>
> So stop worrying and just administrate your cluster with common storage
> admin sense.
>
> Best regards,
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> 
> From: Mary Zhang 
> Sent: Tuesday, April 30, 2024 5:00 PM
> To: Frank Schilder
> Cc: Eugen Block; ceph-users@ceph.io; Wesley Dillingham
> Subject: Re: [ceph-users] Re: Remove an OSD with hardware issue caused rgw
> 503
>
> Thank you Frank for sharing such valuable experience! I really appreciate
> it.
> We observe similar timelines: it took more than 1 week to drain our OSD.
> Regarding export PGs from failed disk and inject it back to the cluster,
> do you have any documentations? I find this online Ceph.io — Incomplete PGs
> -- OH MY!, but
> not sure whether it's the standard process.
>
> Thanks,
> Mary
>
> On Tue, Apr 30, 2024 at 3:27 AM Frank Schilder  fr...@dtu.dk>> wrote:
> Hi all,
>
> I second Eugen's recommendation. We have a cluster with large HDD OSDs
> where the following timings are found:
>
> - drain an OSD: 2 weeks.
> - down an OSD and let cluster recover: 6 hours.
>
> The drain OSD procedure is - in my experience - a complete waste of time,
> actually puts your cluster at higher risk of a second failure (its not
> guaranteed that the bad PG(s) is/are drained first) and also screws up all
> sorts of internal operations like scrub etc for an unnecessarily long time.
> The recovery procedure is much faster, because it uses all-to-all recovery
> while drain is limited to no more than max_backfills PGs at a time and your
> broken disk sits much longer in the cluster.
>
> On SSDs the "down OSD"-method shows a similar speed-up factor.
>
> For a security measure, don't destroy the OSD right away, wait for
> recovery to complete and only then destroy the OSD and throw away the disk.
> In case an error occurs during recovery, you can almost always still export
> PGs from a failed disk and inject it back into the cluster. This, however,
> requires to take disks out as soon as they show problems and before they
> fail hard. Keep a little bit of life time to have a chance to recover data.
> Look at the manual of ddrescue why it is important to stop IO from a
> failing disk as soon as possible.
>
> Best regards,
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> 
> From: Eugen Block mailto:ebl...@nde.ag>>
> Sent: Saturday, April 27, 2024 10:29 AM
> To: Mary Zhang
> Cc: ceph-users@ceph.io; Wesley Dillingham
> Subject: [ceph-users] Re: Remove an OSD with hardware issue caused rgw 503
>
> If the rest of the cluster is healthy and your resiliency is
> configured properly, for example to sustain the loss of one or more
> hosts at a time, you don’t need to worry about a single disk. Just
> take it out and remove it (forcefully) so it doesn’t have any clients
> anymore. Ceph will immediately assign different primary OSDs and your
> clients will be happy again. ;-)
>
> Zitat von Mary Zhang  maryzhang0...@gmail.com>>:
>
> > Thank you Wesley for the clear explanation between the 2 methods!
> > The tracker issue you mentioned https://tracker.ceph.com/issues/44400
> talks
> > about primary-affinity. Could primary-affinity help remove an OSD with
> > hardware issue from the cluster gracefully?
> >
> > Thanks,
> > Mary
> >
> >
> > On Fri, Apr 26, 2024 at 8:43 AM Wesley Dillingham  >
> > wrote:
> >
> >> What you want to do is to stop the OSD (and all its copies of data it
> >> contains) by stopping the OSD service immediately. The downside of this
> >> approach is it causes the PGs on that OSD to be degraded. But the
> upside is
> >> the OSD which has bad hardware is immediately no  longer participating
> in
> >> any client IO (the source of your RGW 503s). In this situation the PGs
> go
> >> into degraded+backfilling
> >>
> >> The alternative method is to keep the failing OSD up and in the cluster
> >> but slowly migrate the data off of it, this would be a long drawn out
> >> period of time in which the failing disk would continue to serve client
> >> reads and also facilitate backfill but you wouldnt take a copy of the
> data
> >> out of the cluster and cause degraded PGs. In this scenario the PGs
> would
> >> be remapped+backfilling
> >>
> >> I tried to find a way to have 

[ceph-users] Re: Remove an OSD with hardware issue caused rgw 503

2024-04-30 Thread Frank Schilder
I think you are panicking way too much. Chances are that you will never need 
that command, so don't get fussed out by an old post.

Just follow what I wrote and, in the extremely rare case that recovery does not 
complete due to missing information, send an e-mail to this list and state that 
you still have the disk of the down OSD. Someone will send you the 
export/import commands within a short time.

So stop worrying and just administrate your cluster with common storage admin 
sense.
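
For reference, the export/import Frank mentions is typically done with
ceph-objectstore-tool while the OSDs involved are stopped. A rough sketch
only, with placeholder IDs and paths - check the current documentation and
this list before using it on real data:

# on the host with the failed-but-still-readable OSD (OSD stopped)
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> \
  --pgid <pgid> --op export --file /tmp/<pgid>.export

# inject the PG into a healthy OSD (also stopped while importing)
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<target-id> \
  --op import --file /tmp/<pgid>.export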

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Mary Zhang 
Sent: Tuesday, April 30, 2024 5:00 PM
To: Frank Schilder
Cc: Eugen Block; ceph-users@ceph.io; Wesley Dillingham
Subject: Re: [ceph-users] Re: Remove an OSD with hardware issue caused rgw 503

Thank you Frank for sharing such valuable experience! I really appreciate it.
We observe similar timelines: it took more than 1 week to drain our OSD.
Regarding export PGs from failed disk and inject it back to the cluster, do you 
have any documentations? I find this online Ceph.io — Incomplete PGs -- OH 
MY!, but not sure 
whether it's the standard process.

Thanks,
Mary

On Tue, Apr 30, 2024 at 3:27 AM Frank Schilder 
mailto:fr...@dtu.dk>> wrote:
Hi all,

I second Eugen's recommendation. We have a cluster with large HDD OSDs where 
the following timings are found:

- drain an OSD: 2 weeks.
- down an OSD and let cluster recover: 6 hours.

The drain OSD procedure is - in my experience - a complete waste of time, 
actually puts your cluster at higher risk of a second failure (its not 
guaranteed that the bad PG(s) is/are drained first) and also screws up all 
sorts of internal operations like scrub etc for an unnecessarily long time. The 
recovery procedure is much faster, because it uses all-to-all recovery while 
drain is limited to no more than max_backfills PGs at a time and your broken 
disk sits much longer in the cluster.

On SSDs the "down OSD"-method shows a similar speed-up factor.

For a security measure, don't destroy the OSD right away, wait for recovery to 
complete and only then destroy the OSD and throw away the disk. In case an 
error occurs during recovery, you can almost always still export PGs from a 
failed disk and inject it back into the cluster. This, however, requires to 
take disks out as soon as they show problems and before they fail hard. Keep a 
little bit of life time to have a chance to recover data. Look at the manual of 
ddrescue why it is important to stop IO from a failing disk as soon as possible.

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Eugen Block mailto:ebl...@nde.ag>>
Sent: Saturday, April 27, 2024 10:29 AM
To: Mary Zhang
Cc: ceph-users@ceph.io; Wesley Dillingham
Subject: [ceph-users] Re: Remove an OSD with hardware issue caused rgw 503

If the rest of the cluster is healthy and your resiliency is
configured properly, for example to sustain the loss of one or more
hosts at a time, you don’t need to worry about a single disk. Just
take it out and remove it (forcefully) so it doesn’t have any clients
anymore. Ceph will immediately assign different primary OSDs and your
clients will be happy again. ;-)

Zitat von Mary Zhang mailto:maryzhang0...@gmail.com>>:

> Thank you Wesley for the clear explanation between the 2 methods!
> The tracker issue you mentioned https://tracker.ceph.com/issues/44400 talks
> about primary-affinity. Could primary-affinity help remove an OSD with
> hardware issue from the cluster gracefully?
>
> Thanks,
> Mary
>
>
> On Fri, Apr 26, 2024 at 8:43 AM Wesley Dillingham 
> mailto:w...@wesdillingham.com>>
> wrote:
>
>> What you want to do is to stop the OSD (and all its copies of data it
>> contains) by stopping the OSD service immediately. The downside of this
>> approach is it causes the PGs on that OSD to be degraded. But the upside is
>> the OSD which has bad hardware is immediately no  longer participating in
>> any client IO (the source of your RGW 503s). In this situation the PGs go
>> into degraded+backfilling
>>
>> The alternative method is to keep the failing OSD up and in the cluster
>> but slowly migrate the data off of it, this would be a long drawn out
>> period of time in which the failing disk would continue to serve client
>> reads and also facilitate backfill but you wouldnt take a copy of the data
>> out of the cluster and cause degraded PGs. In this scenario the PGs would
>> be remapped+backfilling
>>
>> I tried to find a way to have your cake and eat it to in relation to this
>> "predicament" in this tracker issue: https://tracker.ceph.com/issues/44400
>> but it was deemed "wont fix".
>>
>> Respectfully,
>>
>> *Wes Dillingham*
>> LinkedIn 
>> w...@wesdillingham.com

[ceph-users] Re: Remove an OSD with hardware issue caused rgw 503

2024-04-30 Thread Mary Zhang
Thank you Frank for sharing such valuable experience! I really appreciate
it.
We observe similar timelines: it took more than 1 week to drain our OSD.
Regarding export PGs from failed disk and inject it back to the cluster, do
you have any documentations? I find this online Ceph.io — Incomplete PGs --
OH MY! , but not
sure whether it's the standard process.

Thanks,
Mary

On Tue, Apr 30, 2024 at 3:27 AM Frank Schilder  wrote:

> Hi all,
>
> I second Eugen's recommendation. We have a cluster with large HDD OSDs
> where the following timings are found:
>
> - drain an OSD: 2 weeks.
> - down an OSD and let cluster recover: 6 hours.
>
> The drain OSD procedure is - in my experience - a complete waste of time,
> actually puts your cluster at higher risk of a second failure (its not
> guaranteed that the bad PG(s) is/are drained first) and also screws up all
> sorts of internal operations like scrub etc for an unnecessarily long time.
> The recovery procedure is much faster, because it uses all-to-all recovery
> while drain is limited to no more than max_backfills PGs at a time and your
> broken disk sits much longer in the cluster.
>
> On SSDs the "down OSD"-method shows a similar speed-up factor.
>
> For a security measure, don't destroy the OSD right away, wait for
> recovery to complete and only then destroy the OSD and throw away the disk.
> In case an error occurs during recovery, you can almost always still export
> PGs from a failed disk and inject it back into the cluster. This, however,
> requires to take disks out as soon as they show problems and before they
> fail hard. Keep a little bit of life time to have a chance to recover data.
> Look at the manual of ddrescue why it is important to stop IO from a
> failing disk as soon as possible.
>
> Best regards,
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> 
> From: Eugen Block 
> Sent: Saturday, April 27, 2024 10:29 AM
> To: Mary Zhang
> Cc: ceph-users@ceph.io; Wesley Dillingham
> Subject: [ceph-users] Re: Remove an OSD with hardware issue caused rgw 503
>
> If the rest of the cluster is healthy and your resiliency is
> configured properly, for example to sustain the loss of one or more
> hosts at a time, you don’t need to worry about a single disk. Just
> take it out and remove it (forcefully) so it doesn’t have any clients
> anymore. Ceph will immediately assign different primary OSDs and your
> clients will be happy again. ;-)
>
> Zitat von Mary Zhang :
>
> > Thank you Wesley for the clear explanation between the 2 methods!
> > The tracker issue you mentioned https://tracker.ceph.com/issues/44400
> talks
> > about primary-affinity. Could primary-affinity help remove an OSD with
> > hardware issue from the cluster gracefully?
> >
> > Thanks,
> > Mary
> >
> >
> > On Fri, Apr 26, 2024 at 8:43 AM Wesley Dillingham  >
> > wrote:
> >
> >> What you want to do is to stop the OSD (and all its copies of data it
> >> contains) by stopping the OSD service immediately. The downside of this
> >> approach is it causes the PGs on that OSD to be degraded. But the
> upside is
> >> the OSD which has bad hardware is immediately no  longer participating
> in
> >> any client IO (the source of your RGW 503s). In this situation the PGs
> go
> >> into degraded+backfilling
> >>
> >> The alternative method is to keep the failing OSD up and in the cluster
> >> but slowly migrate the data off of it, this would be a long drawn out
> >> period of time in which the failing disk would continue to serve client
> >> reads and also facilitate backfill but you wouldnt take a copy of the
> data
> >> out of the cluster and cause degraded PGs. In this scenario the PGs
> would
> >> be remapped+backfilling
> >>
> >> I tried to find a way to have your cake and eat it to in relation to
> this
> >> "predicament" in this tracker issue:
> https://tracker.ceph.com/issues/44400
> >> but it was deemed "wont fix".
> >>
> >> Respectfully,
> >>
> >> *Wes Dillingham*
> >> LinkedIn 
> >> w...@wesdillingham.com
> >>
> >>
> >>
> >>
> >> On Fri, Apr 26, 2024 at 11:25 AM Mary Zhang 
> >> wrote:
> >>
> >>> Thank you Eugen for your warm help!
> >>>
> >>> I'm trying to understand the difference between 2 methods.
> >>> For method 1, or "ceph orch osd rm osd_id", OSD Service — Ceph
> >>> Documentation
> >>> 
> >>> says
> >>> it involves 2 steps:
> >>>
> >>>1.
> >>>
> >>>evacuating all placement groups (PGs) from the OSD
> >>>2.
> >>>
> >>>removing the PG-free OSD from the cluster
> >>>
> >>> For method 2, or the procedure you recommended, Adding/Removing OSDs —
> >>> Ceph
> >>> Documentation
> >>> <
> >>>
> https://docs.ceph.com/en/latest/rados/operations/add-or-rm-osds/#removing-osds-manual
> >>> >
> >>> says
> >>> "After the OSD has been taken out of 

[ceph-users] Re: Ceph Squid released?

2024-04-30 Thread James Page
Hi Robert

On Mon, Apr 29, 2024 at 8:06 AM Robert Sander 
wrote:

> On 4/29/24 08:50, Alwin Antreich wrote:
>
> > well it says it in the article.
> >
> > The upcoming Squid release serves as a testament to how the Ceph
> > project continues to deliver innovative features to users without
> > compromising on quality.
> >
> >
> > I believe it is more a statement of having new members and tiers and to
> > sound the marketing drums a bit. :)
>
> The Ubuntu 24.04 release notes also claim that this release comes with
> Ceph Squid:
>
> https://discourse.ubuntu.com/t/noble-numbat-release-notes/39890


Almost - we've been using snapshots from the squid branch to get early
visibility of the upcoming Squid release, as we knew it would come after the
Ubuntu 24.04 release date. The release notes were not quite accurate on this
front (I've now updated them).

The Squid release will be provided via Ubuntu's Stable Release Updates
process once the Ceph project publishes the release.


[ceph-users] Re: Ceph reef and (slow) backfilling - how to speed it up

2024-04-30 Thread Pierre Riteau
Hi Götz,

You can change the value of osd_max_backfills (for all OSDs or specific
ones) using `ceph config`, but you need
enable osd_mclock_override_recovery_settings. See
https://docs.ceph.com/en/quincy/rados/configuration/mclock-config-ref/#steps-to-modify-mclock-max-backfills-recovery-limits
for more information.
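
In short, the sequence from that page looks roughly like this (the values
are examples, not recommendations):

# allow overriding the mclock-managed recovery/backfill limits
ceph config set osd osd_mclock_override_recovery_settings true

# raise the limits, globally or per OSD
ceph config set osd osd_max_backfills 8
ceph config set osd osd_recovery_max_active 10

# verify what an OSD actually uses
ceph config show osd.0 osd_max_backfills

# when backfill is done, remove the overrides again
ceph config rm osd osd_max_backfills
ceph config rm osd osd_recovery_max_active
ceph config set osd osd_mclock_override_recovery_settings false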

Best regards,
Pierre Riteau

On Sat, 27 Apr 2024 at 08:32, Götz Reinicke 
wrote:

> Dear ceph community,
>
> I’ve a ceph cluster which got upgraded from nautilus/pacific/…to reef over
> time. Now I added two new nodes to an existing EC pool as I did with the
> previous versions of ceph.
>
> Now I face the fact that the previous „backfilling tuning“ I’ve used by
> increasing injectargs --osd-max-backfills=XX --osd-recovery-max-active=YY
> does not work anymore.
>
> With adjusting those parameters the backfill was running with up to 2k +-
> objects/s.
>
> As I’m not (yet) familiar with the reef options, the only speed-up so far I
> found is „ceph config set osd osd_mclock_profile high_recovery_ops“ which
> currently runs the backfill with up to 600 objects/s.
>
> My question: What is the best (simple) way to speed that backfill up?
>
> I’ve tried to understand the custom profiles (?) but without success - and
> did not apply anything else yet.
>
> Thanks for feedback and suggestions ! Best regards . Götz
>
>
>


[ceph-users] SPDK with cephadm and reef

2024-04-30 Thread R A
Hello Community,

is there a guide / documentation on how to configure SPDK with cephadm (running
in containers) in reef?

BR



[ceph-users] Re: Remove an OSD with hardware issue caused rgw 503

2024-04-30 Thread Frank Schilder
Hi all,

I second Eugen's recommendation. We have a cluster with large HDD OSDs where 
the following timings are found:

- drain an OSD: 2 weeks.
- down an OSD and let cluster recover: 6 hours.

The drain OSD procedure is - in my experience - a complete waste of time,
actually puts your cluster at higher risk of a second failure (it's not
guaranteed that the bad PG(s) is/are drained first) and also screws up all
sorts of internal operations like scrub etc. for an unnecessarily long time.
The recovery procedure is much faster, because it uses all-to-all recovery
while drain is limited to no more than max_backfills PGs at a time and your
broken disk sits much longer in the cluster.

On SSDs the "down OSD" method shows a similar speed-up factor.

As a safety measure, don't destroy the OSD right away; wait for recovery to
complete and only then destroy the OSD and throw away the disk. In case an
error occurs during recovery, you can almost always still export PGs from a
failed disk and inject them back into the cluster. This, however, requires
taking disks out as soon as they show problems and before they fail hard.
Keep a little bit of lifetime left to have a chance to recover data. Look at
the manual of ddrescue for why it is important to stop IO from a failing disk
as soon as possible.
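
As a rough sketch of the "down OSD" method described above (osd.<id> is a
placeholder, the systemd unit name differs on cephadm-managed hosts, and
this is not meant as an exact procedure):

# stop the failing OSD immediately and mark it out
systemctl stop ceph-osd@<id>
ceph osd out <id>

# wait for recovery to finish (watch ceph -s), keep the disk around,
# then check and remove the OSD
ceph osd safe-to-destroy osd.<id>
ceph osd purge <id> --yes-i-really-mean-it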

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Eugen Block 
Sent: Saturday, April 27, 2024 10:29 AM
To: Mary Zhang
Cc: ceph-users@ceph.io; Wesley Dillingham
Subject: [ceph-users] Re: Remove an OSD with hardware issue caused rgw 503

If the rest of the cluster is healthy and your resiliency is
configured properly, for example to sustain the loss of one or more
hosts at a time, you don’t need to worry about a single disk. Just
take it out and remove it (forcefully) so it doesn’t have any clients
anymore. Ceph will immediately assign different primary OSDs and your
clients will be happy again. ;-)

Zitat von Mary Zhang :

> Thank you Wesley for the clear explanation between the 2 methods!
> The tracker issue you mentioned https://tracker.ceph.com/issues/44400 talks
> about primary-affinity. Could primary-affinity help remove an OSD with
> hardware issue from the cluster gracefully?
>
> Thanks,
> Mary
>
>
> On Fri, Apr 26, 2024 at 8:43 AM Wesley Dillingham 
> wrote:
>
>> What you want to do is to stop the OSD (and all its copies of data it
>> contains) by stopping the OSD service immediately. The downside of this
>> approach is it causes the PGs on that OSD to be degraded. But the upside is
>> the OSD which has bad hardware is immediately no  longer participating in
>> any client IO (the source of your RGW 503s). In this situation the PGs go
>> into degraded+backfilling
>>
>> The alternative method is to keep the failing OSD up and in the cluster
>> but slowly migrate the data off of it, this would be a long drawn out
>> period of time in which the failing disk would continue to serve client
>> reads and also facilitate backfill but you wouldnt take a copy of the data
>> out of the cluster and cause degraded PGs. In this scenario the PGs would
>> be remapped+backfilling
>>
>> I tried to find a way to have your cake and eat it to in relation to this
>> "predicament" in this tracker issue: https://tracker.ceph.com/issues/44400
>> but it was deemed "wont fix".
>>
>> Respectfully,
>>
>> *Wes Dillingham*
>> LinkedIn 
>> w...@wesdillingham.com
>>
>>
>>
>>
>> On Fri, Apr 26, 2024 at 11:25 AM Mary Zhang 
>> wrote:
>>
>>> Thank you Eugen for your warm help!
>>>
>>> I'm trying to understand the difference between 2 methods.
>>> For method 1, or "ceph orch osd rm osd_id", OSD Service — Ceph
>>> Documentation
>>> 
>>> says
>>> it involves 2 steps:
>>>
>>>1.
>>>
>>>evacuating all placement groups (PGs) from the OSD
>>>2.
>>>
>>>removing the PG-free OSD from the cluster
>>>
>>> For method 2, or the procedure you recommended, Adding/Removing OSDs —
>>> Ceph
>>> Documentation
>>> <
>>> https://docs.ceph.com/en/latest/rados/operations/add-or-rm-osds/#removing-osds-manual
>>> >
>>> says
>>> "After the OSD has been taken out of the cluster, Ceph begins rebalancing
>>> the cluster by migrating placement groups out of the OSD that was removed.
>>> "
>>>
>>> What's the difference between "evacuating PGs" in method 1 and "migrating
>>> PGs" in method 2? I think method 1 must read the OSD to be removed.
>>> Otherwise, we would not see slow ops warning. Does method 2 not involve
>>> reading this OSD?
>>>
>>> Thanks,
>>> Mary
>>>
>>> On Fri, Apr 26, 2024 at 5:15 AM Eugen Block  wrote:
>>>
>>> > Hi,
>>> >
>>> > if you remove the OSD this way, it will be drained. Which means that
>>> > it will try to recover PGs from this OSD, and in case of hardware
>>> > failure it might lead to slow requests. It might make sense to
>>> > forcefully 

[ceph-users] Re: stretched cluster new pool and second pool with nvme

2024-04-30 Thread Stefan Kooman

On 30-04-2024 11:22, ronny.lippold wrote:

hi stefan ... you are the hero of the month ;)


:p.



i don't know why i did not find your bug report.

i have the exact same problem and resolved the HEALTH warning only with "ceph 
osd force_healthy_stretch_mode --yes-i-really-mean-it"

will comment on the report soon.

actually, we are thinking about 4/2 size without stretch mode enabled.

what was your solution?


This specific setup (on which I did the testing) is going to be full 
flash (SSD), so the HDDs are going to be phased out and only the 
default non-device-class crush rule will be used. While that will work 
for this (small) cluster, it is not a solution. This issue should be 
fixed, as I figure there are quite a few clusters that want to use 
device classes and stretch mode at the same time.
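
For anyone comparing the two approaches, the commands involved look roughly
like this (pool and rule names are placeholders):

# see which crush rule a pool currently uses
ceph osd pool get <pool> crush_rule

# a device-class specific replicated rule (the kind that clashes with
# stretch mode per the bug report mentioned above)
ceph osd crush rule create-replicated rule-ssd default host ssd

# plain 4/2 replication without enabling stretch mode
ceph osd pool set <pool> size 4
ceph osd pool set <pool> min_size 2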


Gr. Stefan