[ceph-users] Re: [CEPH] OSD Memory Usage

2023-11-15 Thread Nguyễn Hữu Khôi
Hello,
Yes, I see it does not exceed the RSS, but in "ceph orch ps" it is over the
target. Does MEM USE include cache? Am I right?

NAME     HOST      PORTS  STATUS        REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
osd.7    sg-osd01         running (3d)  8m ago     4w   4231M    4096M    17.2.6   90a2664234e1  922185643cb8
osd.8    sg-osd03         running (3d)  7m ago     4w   3407M    4096M    17.2.6   90a2664234e1  0ec74fe54bbe
osd.9    sg-osd01         running (3d)  8m ago     4w   4575M    4096M    17.2.6   90a2664234e1  c2f1c1ee2087
osd.10   sg-osd03         running (3d)  7m ago     4w   3821M    4096M    17.2.6   90a2664234e1  fecbd5e910de
osd.11   sg-osd01         running (3d)  8m ago     4w   3578M    4096M    17.2.6   90a2664234e1  f201704e9026
osd.12   sg-osd03         running (3d)  7m ago     4w   3076M    4096M    17.2.6   90a2664234e1  e741b67b6582
osd.13   sg-osd01         running (3d)  8m ago     4w   3688M    4096M    17.2.6   90a2664234e1  bffa59278fc2
osd.14   sg-osd03         running (3d)  7m ago     4w   3652M    4096M    17.2.6   90a2664234e1  7d9eb3fb9c1e
osd.15   sg-osd01         running (3d)  8m ago     4w   3343M    4096M    17.2.6   90a2664234e1  d96a425ae5c9
osd.16   sg-osd03         running (3d)  7m ago     4w   2492M    4096M    17.2.6   90a2664234e1  637c43176fdc
osd.17   sg-osd01         running (3d)  8m ago     4w   3011M    4096M    17.2.6   90a2664234e1  a39456dd2c0c
osd.18   sg-osd03         running (3d)  7m ago     4w   2341M    4096M    17.2.6   90a2664234e1  7b750672391b
osd.19   sg-osd01         running (3d)  8m ago     4w   2672M    4096M    17.2.6   90a2664234e1  6358234e95f5
osd.20   sg-osd03         running (3d)  7m ago     4w   3297M    4096M    17.2.6   90a2664234e1  2ecba6b066fd
osd.21   sg-osd01         running (3d)  8m ago     4w   5147M    4096M    17.2.6   90a2664234e1  1d0e4efe48bd
osd.22   sg-osd03         running (3d)  7m ago     4w   3432M    4096M    17.2.6   90a2664234e1  5bb6d4f71f9d
osd.23   sg-osd03         running (3d)  7m ago     4w   2893M    4096M    17.2.6   90a2664234e1  f7e1948e57d5
osd.24   sg-osd02         running (3d)  7m ago     12d  3007M    4096M    17.2.6   90a2664234e1  85d896abe467
osd.25   sg-osd02         running (3d)  7m ago     12d  2666M    4096M    17.2.6   90a2664234e1  9800cd8ff1a1
osd.26   sg-osd02         running (3d)  7m ago     12d  2918M    4096M    17.2.6   90a2664234e1  f2e0b2d50625
osd.27   sg-osd02         running (3d)  7m ago     12d  3586M    4096M    17.2.6   90a2664234e1  ee2fa3a9b40a
osd.28   sg-osd02         running (3d)  7m ago     12d  2391M    4096M    17.2.6   90a2664234e1  4cf7adf9f60a
osd.29   sg-osd02         running (3d)  7m ago     12d  5642M    4096M    17.2.6   90a2664234e1  8c1ba98a1738
osd.30   sg-osd02         running (3d)  7m ago     12d  4728M    4096M    17.2.6   90a2664234e1  e308497de2e5
osd.31   sg-osd02         running (3d)  7m ago     12d  3615M    4096M    17.2.6   90a2664234e1  89b80d464627
osd.32   sg-osd02         running (3d)  7m ago     12d  1703M    4096M    17.2.6   90a2664234e1  1e4608786078
osd.33   sg-osd02         running (3d)  7m ago     12d  3039M    4096M    17.2.6   90a2664234e1  16e04a1da987
osd.34   sg-osd02         running (3d)  7m ago     12d  2434M    4096M    17.2.6   90a2664234e1  014076e28182



BTW, as you said, I feel this value does not have much impact: whether we set
1 GB or 4 GB, the OSDs can still consume much more memory when they need it.
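
Is the OSD's own mempool accounting the right way to see how much of that
figure is cache? For example (osd.7 is just an example id):

ceph tell osd.7 dump_mempools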

Nguyen Huu Khoi


On Thu, Nov 16, 2023 at 2:13 PM Zakhar Kirpichenko  wrote:

> You're most welcome!
>
> I'd say that real leak issues are very rare. For example, these are my
> OSDs with memory target=16GB which have been running for quite a while, as
> you can see they don't exceed 16 GB RSS:
>
>     PID USER  PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>   92298 167   20   0   18.7g  15.8g  12264 S   1.3   4.2   1974:06 ceph-osd
>   94527 167   20   0   19.5g  15.8g  12248 S   2.3   4.2   2287:26 ceph-osd
>   93749 167   20   0   19.1g  15.7g  12804 S   2.3   4.2   1768:22 ceph-osd
>   89534 167   20   0   20.1g  15.7g  12412 S   4.0   4.2   2512:18 ceph-osd
> 3706552 167   20   0   20.5g  15.7g  15588 S   2.3   4.2   1385:26 ceph-osd
>   90297 167   20   0   19.5g  15.6g  12432 S   3.0

[ceph-users] Re: iSCSI GW trusted IPs

2023-11-15 Thread Eugen Block

Hi,

I don't have a solution for you, I just wanted to make you aware of  
this note in the docs:



Warning
The iSCSI gateway is in maintenance as of November 2022. This means  
that it is no longer in active development and will not be updated  
to add new features.


Here's some more information [2]:

The planned replacement is based on the newer NVMe-oF protocol and  
SPDK. See this presentation for the purported performance benefits:  
https://ci.spdk.io/download/2022-virtual-forum-prc/D2_4_Yue_A_Performance_S…


The git repository is here: https://github.com/ceph/ceph-nvmeof.  
However, this is not yet something recommended for a  
production-grade setup. At the very least, wait until this  
subproject makes it into Ceph documentation and becomes available as RPMs
and DEBs.
For now, you can still use ceph-iscsi - assuming that you need it,  
i.e. that raw RBD is not an option.


It will probably still work but you might encounter issues which won't  
be resolved anymore.


Regards,
Eugen

[1] https://docs.ceph.com/en/quincy/rbd/iscsi-overview/#ceph-iscsi
[2]  
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/GDJJL7VSDUJITPM3JV7RCVXVOIQO2CAN/


Quoting Ramon Orrù:


Hi,
I'm configuring the iSCSI GW services on a quincy 17.2.3 cluster.

I brought almost everything up and running (using cephadm), but I'm
stuck on a configuration detail:


if I check the gateway status in the Block -> iSCSI -> Overview
section of the dashboard, they're showing "Down" status, while the
gateways are actually running. It makes me think the mgr is not able
to talk to the iSCSI APIs in order to collect info on the gateways,
even though I correctly added my mgr hosts' IPs to the trusted_ip_list
parameter in my iscsi service definition yaml.
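
For reference, my service definition follows the documented iscsi spec shape;
the values below are placeholders, not my real pool, hosts or IPs:

cat > iscsi-spec.yaml <<'EOF'
service_type: iscsi
service_id: iscsi
placement:
  hosts:
    - gw-host-1
    - gw-host-2
spec:
  pool: iscsi-pool
  api_user: admin
  api_password: admin
  trusted_ip_list: "10.0.0.11,10.0.0.12,10.0.0.13"
EOF
ceph orch apply -i iscsi-spec.yaml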


While further checking the gateway logs I found some messages like:

debug :::172.17.17.22 - - [15/Nov/2023 10:54:05] "GET  
/api/config?decrypt_passwords=True HTTP/1.1" 200 -
debug :::172.17.17.22 - - [15/Nov/2023 10:54:05] "GET /api/_ping  
HTTP/1.1" 200 -
debug :::172.17.17.22 - - [15/Nov/2023 10:54:05] "GET  
/api/gatewayinfo HTTP/1.1" 200 -


These appear just after I reload the dashboard page. So I tried adding the
172.17.17.22 IP address to trusted_ip_list and it worked: the iSCSI
gateway status went green and Up on the dashboard.
It sounds to me like some container private network address,
but I can't find any evidence of it when inspecting the containers
cephadm spawned.


My question is: how can I identify the IPs I need so that the iSCSI
gateways are properly reachable? I tried adding the whole 172.16.0.0/24
private range but no luck: the iscsi container starts but does not
allow 172.17.17.22 to access the APIs.


Thanks in advance

regards

Ramon


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [CEPH] OSD Memory Usage

2023-11-15 Thread Zakhar Kirpichenko
You're most welcome!

I'd say that real leak issues are very rare. For example, these are my OSDs
with memory target=16GB which have been running for quite a while; as you
can see, they don't exceed 16 GB RSS:

    PID USER  PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  92298 167   20   0   18.7g  15.8g  12264 S   1.3   4.2   1974:06 ceph-osd
  94527 167   20   0   19.5g  15.8g  12248 S   2.3   4.2   2287:26 ceph-osd
  93749 167   20   0   19.1g  15.7g  12804 S   2.3   4.2   1768:22 ceph-osd
  89534 167   20   0   20.1g  15.7g  12412 S   4.0   4.2   2512:18 ceph-osd
3706552 167   20   0   20.5g  15.7g  15588 S   2.3   4.2   1385:26 ceph-osd
  90297 167   20   0   19.5g  15.6g  12432 S   3.0   4.1   2261:00 ceph-osd
   9799 167   20   0   22.9g  15.4g  12432 S   2.0   4.1   2494:00 ceph-osd
   9778 167   20   0   23.1g  15.3g  12556 S   2.6   4.1   2591:25 ceph-osd
   9815 167   20   0   23.4g  15.1g  12584 S   2.0   4.0   2722:28 ceph-osd
   9809 167   20   0   22.3g  15.1g  12068 S   3.6   4.0   5234:52 ceph-osd
   9811 167   20   0   23.4g  14.9g  12952 S   2.6   4.0   2593:19 ceph-osd
   9819 167   20   0   23.9g  14.9g  12636 S   2.6   4.0   3043:19 ceph-osd
   9820 167   20   0   23.3g  14.8g  12884 S   2.0   3.9   3073:43 ceph-osd
   9769 167   20   0   22.4g  14.7g  12612 S   2.6   3.9   2840:22 ceph-osd
   9836 167   20   0   24.0g  14.7g  12648 S   2.6   3.9   3300:34 ceph-osd
   9818 167   20   0   22.0g  14.7g  12152 S   2.3   3.9   5729:06 ceph-osd

Long story short, if you set reasonable targets, OSDs are unlikely to
exceed them during normal operations. If you set memory targets too low, it
is likely that they will be exceeded as OSDs need reasonable amounts of
memory to operate.
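
If it helps, you can check both the configured target and what a running OSD
has actually applied (osd.0 is just an example):

ceph config get osd osd_memory_target
ceph tell osd.0 config get osd_memory_target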

/Z

On Thu, 16 Nov 2023 at 08:37, Nguyễn Hữu Khôi 
wrote:

> Hello. Thank you very much for your explanation.
>
> Because I thought that  osd_memory_target will help me limit OSD memory
> usage which will help prevent memory leak - I tried google and many people
> talked about memory leak. A nice man, @Anthony D'Atri  ,
> on this forum helped me to understand that it wont help to limit OSD usage.
>
> I set it to 1GB because I want to see how this option works.
>
> I will read and test with caches options.
>
> Nguyen Huu Khoi
>
>
> On Thu, Nov 16, 2023 at 12:23 PM Zakhar Kirpichenko 
> wrote:
>
>> Hi,
>>
>> osd_memory_target is a "target", i.e. an OSD make an effort to consume up
>> to the specified amount of RAM, but won't consume less than required for
>> its operation and caches, which have some minimum values such as for
>> example osd_memory_cache_min, bluestore_cache_size,
>> bluestore_cache_size_hdd, bluestore_cache_size_ssd, etc. The recommended
>> and default OSD memory target is 4 GB.
>>
>> Your nodes have a sufficient amount of RAM, thus I don't see why you
>> would want to reduce OSD memory consumption below the recommended defaults,
>> especially considering that in-memory caches are important for Ceph
>> operations as they're many times faster than the fastest storage devices. I
>> run my OSDs with osd_memory_target=17179869184 (16 GB) and it helps,
>> especially with slower HDD-backed OSDs.
>>
>> /Z
>>
>> On Thu, 16 Nov 2023 at 01:02, Nguyễn Hữu Khôi 
>> wrote:
>>
>>> Hello,
>>> I am using a CEPH cluster. After monitoring it, I set:
>>>
>>> ceph config set osd osd_memory_target_autotune false
>>>
>>> ceph config set osd osd_memory_target 1G
>>>
>>> Then restart all OSD services then do test again, I just use fio commands
>>> from multi clients and I see that OSD memory consume is over 1GB. Would
>>> you
>>> like to help me understand this case?
>>>
>>> Ceph version: Quincy
>>>
>>> OSD: 3 nodes with 11 nvme each and 512GB ram per node.
>>>
>>> CPU: 2 socket xeon gold 6138 cpu with 56 cores per socket.
>>>
>>> Network: 25Gbps x 2 for public network and 25Gbps x 2 for storage
>>> network.
>>> MTU is 9000
>>>
>>> Thank you very much.
>>>
>>>
>>> Nguyen Huu Khoi
>>> ___
>>> ceph-users mailing list -- ceph-users@ceph.io
>>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>>
>>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [CEPH] OSD Memory Usage

2023-11-15 Thread Nguyễn Hữu Khôi
Hello. Thank you very much for your explanation.

I thought that osd_memory_target would help me limit OSD memory usage and
therefore prevent memory leaks - I googled it and many people talked about
memory leaks. A kind person on this forum, @Anthony D'Atri, helped me
understand that it won't hard-limit OSD usage.

I set it to 1 GB because I wanted to see how this option works.

I will read up on the cache options and test with them.

Nguyen Huu Khoi


On Thu, Nov 16, 2023 at 12:23 PM Zakhar Kirpichenko 
wrote:

> Hi,
>
> osd_memory_target is a "target", i.e. an OSD make an effort to consume up
> to the specified amount of RAM, but won't consume less than required for
> its operation and caches, which have some minimum values such as for
> example osd_memory_cache_min, bluestore_cache_size,
> bluestore_cache_size_hdd, bluestore_cache_size_ssd, etc. The recommended
> and default OSD memory target is 4 GB.
>
> Your nodes have a sufficient amount of RAM, thus I don't see why you would
> want to reduce OSD memory consumption below the recommended defaults,
> especially considering that in-memory caches are important for Ceph
> operations as they're many times faster than the fastest storage devices. I
> run my OSDs with osd_memory_target=17179869184 (16 GB) and it helps,
> especially with slower HDD-backed OSDs.
>
> /Z
>
> On Thu, 16 Nov 2023 at 01:02, Nguyễn Hữu Khôi 
> wrote:
>
>> Hello,
>> I am using a CEPH cluster. After monitoring it, I set:
>>
>> ceph config set osd osd_memory_target_autotune false
>>
>> ceph config set osd osd_memory_target 1G
>>
>> Then restart all OSD services then do test again, I just use fio commands
>> from multi clients and I see that OSD memory consume is over 1GB. Would
>> you
>> like to help me understand this case?
>>
>> Ceph version: Quincy
>>
>> OSD: 3 nodes with 11 nvme each and 512GB ram per node.
>>
>> CPU: 2 socket xeon gold 6138 cpu with 56 cores per socket.
>>
>> Network: 25Gbps x 2 for public network and 25Gbps x 2 for storage network.
>> MTU is 9000
>>
>> Thank you very much.
>>
>>
>> Nguyen Huu Khoi
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [CEPH] OSD Memory Usage

2023-11-15 Thread Zakhar Kirpichenko
Hi,

osd_memory_target is a "target", i.e. an OSD makes an effort to consume up
to the specified amount of RAM, but it won't consume less than required for
its operation and caches, which have some minimum values such as, for
example, osd_memory_cache_min, bluestore_cache_size,
bluestore_cache_size_hdd, bluestore_cache_size_ssd, etc. The recommended
and default OSD memory target is 4 GB.

Your nodes have a sufficient amount of RAM, thus I don't see why you would
want to reduce OSD memory consumption below the recommended defaults,
especially considering that in-memory caches are important for Ceph
operations as they're many times faster than the fastest storage devices. I
run my OSDs with osd_memory_target=17179869184 (16 GB) and it helps,
especially with slower HDD-backed OSDs.

/Z

On Thu, 16 Nov 2023 at 01:02, Nguyễn Hữu Khôi 
wrote:

> Hello,
> I am using a CEPH cluster. After monitoring it, I set:
>
> ceph config set osd osd_memory_target_autotune false
>
> ceph config set osd osd_memory_target 1G
>
> Then restart all OSD services then do test again, I just use fio commands
> from multi clients and I see that OSD memory consume is over 1GB. Would you
> like to help me understand this case?
>
> Ceph version: Quincy
>
> OSD: 3 nodes with 11 nvme each and 512GB ram per node.
>
> CPU: 2 socket xeon gold 6138 cpu with 56 cores per socket.
>
> Network: 25Gbps x 2 for public network and 25Gbps x 2 for storage network.
> MTU is 9000
>
> Thank you very much.
>
>
> Nguyen Huu Khoi
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: migrate wal/db to block device

2023-11-15 Thread Chris Dunlop

Hi Igor,

The immediate answer is to use "ceph-volume lvm zap" on the db LV after 
running the migrate. But for the longer term I think the "lvm zap" should 
be included in the "lvm migrate" process.


I.e. this works to migrate a separate wal/db to the block device:

#
# WARNING! DO NOT ZAP AFTER STARTING THE OSD!!
#
$ cephadm ceph-volume lvm list "${osd}" > ~/"osd.${osd}.list"
$ systemctl stop "${osd_service}"
$ cephadm shell --fsid "${fsid}" --name "osd.${osd}" -- \
  ceph-volume lvm migrate --osd-id "${osd}" --osd-fsid "${osd_fsid}" \
  --from db wal --target "${vg_lv}"
$ cephadm shell --fsid "${fsid}" --name "osd.${osd}" -- \
  ceph-volume lvm zap "${db_lv}"
$ systemctl start "${osd_service}"

WARNING! If you don't do the zap before starting the osd, the osd will be 
running with the db still on the LV. If you then stop the osd and zap the 
LV and start the osd again, you'll be running on the original db as it was 
copied to the block device before the migrate, which will be missing any 
updates done in the meantime. I don't know what problems that might cause.  
In this situation I've restored the LV tags (i.e. all tags on the db LV, 
the db_device and db_uuid tags on the block LV) using the info from 
~/osd.${osd}.list (otherwise the migrate fails!) and then gone through the 
migrate process again.


The problem is, it turns out the osd is being activated as a "raw" device 
rather than an "lvm" device, and the "raw" db device (which is actually an 
lvm LV) still has a bluestore label on it after the migrate, so it's still 
seen as a component of the osd.


E.g. before the migrate, both of these show the osd with the separate db:

$ cephadm ceph-volume lvm list
$ cephadm ceph-volume raw list

After the migrate (without zap), the "lvm list" does NOT show the separate 
db (because the appropriate LV tags have been removed), but the "raw list" 
still shows the osd with the separate db.
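
A quick way to see that stale label is ceph-bluestore-tool's show-label on the
old db LV (a sketch, reusing the variables from above):

$ cephadm shell --fsid "${fsid}" --name "osd.${osd}" -- \
  ceph-bluestore-tool show-label --dev "${db_lv}"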


And the osd is being activated as a "raw" device, both before and after 
the migrate. E.g. extract from the journal before the migrate:


Nov 15 22:39:05 k12 bash[3829222]: Running command: /usr/bin/chown -R ceph:ceph 
/var/lib/ceph/osd/ceph-25
Nov 15 22:39:05 k12 bash[3829222]: Running command: 
/usr/bin/ceph-bluestore-tool prime-osd-dir --path /var/lib/ceph/osd/ceph-25 
--no-mon-config --dev /dev/mapper/ceph--5ccbb386--142b--4bf7--
Nov 15 22:39:05 k12 bash[3829222]: Running command: /usr/bin/chown -h ceph:ceph 
/dev/mapper/ceph--5ccbb386--142b--4bf7--a180--04bcf9a1f61b-osd--block--7710024b--ec71--4fd3--b94c--c4c4b9af2d2
Nov 15 22:39:05 k12 bash[3829222]: Running command: /usr/bin/chown -R ceph:ceph 
/dev/dm-1
Nov 15 22:39:05 k12 bash[3829222]: Running command: /usr/bin/ln -s 
/dev/mapper/ceph--5ccbb386--142b--4bf7--a180--04bcf9a1f61b-osd--block--7710024b--ec71--4fd3--b94c--c4c4b9af2d21
 /var/lib/ce
Nov 15 22:39:05 k12 bash[3829222]: Running command: /usr/bin/chown -h ceph:ceph 
/dev/mapper/ceph--d4b1e932--4557--4b88--bed2--9305a07e76eb-osd--db--6a507f57--884c--4947--a147--cd50f98f1a23
Nov 15 22:39:05 k12 bash[3829222]: Running command: /usr/bin/chown -R ceph:ceph 
/dev/dm-2
Nov 15 22:39:05 k12 bash[3829222]: Running command: /usr/bin/ln -s 
/dev/mapper/ceph--d4b1e932--4557--4b88--bed2--9305a07e76eb-osd--db--6a507f57--884c--4947--a147--cd50f98f1a23
 /var/lib/ceph/
Nov 15 22:39:05 k12 bash[3829222]: Running command: /usr/bin/chown -R ceph:ceph 
/var/lib/ceph/osd/ceph-25
Nov 15 22:39:05 k12 bash[3829222]: --> ceph-volume raw activate successful for 
osd ID: 25

After a migrate without a zap - note there are still two mapper/lv devices 
found, which includes the now-unwanted db LV:


Nov 16 09:08:31 k12 bash[4012506]: Running command: /usr/bin/chown -R ceph:ceph 
/var/lib/ceph/osd/ceph-25
Nov 16 09:08:31 k12 bash[4012506]: Running command: 
/usr/bin/ceph-bluestore-tool prime-osd-dir --path /var/lib/ceph/osd/ceph-25 
--no-mon-config --dev /dev/mapper/ceph--5ccbb386--142b--4bf7--
Nov 16 09:08:31 k12 bash[4012506]: Running command: /usr/bin/chown -h ceph:ceph 
/dev/mapper/ceph--5ccbb386--142b--4bf7--a180--04bcf9a1f61b-osd--block--7710024b--ec71--4fd3--b94c--c4c4b9af2d2
Nov 16 09:08:31 k12 bash[4012506]: Running command: /usr/bin/chown -R ceph:ceph 
/dev/dm-1
Nov 16 09:08:31 k12 bash[4012506]: Running command: /usr/bin/ln -s 
/dev/mapper/ceph--5ccbb386--142b--4bf7--a180--04bcf9a1f61b-osd--block--7710024b--ec71--4fd3--b94c--c4c4b9af2d21
 /var/lib/ce
Nov 16 09:08:31 k12 bash[4012506]: Running command: /usr/bin/chown -h ceph:ceph 
/dev/mapper/ceph--d4b1e932--4557--4b88--bed2--9305a07e76eb-osd--db--6a507f57--884c--4947--a147--cd50f98f1a23
Nov 16 09:08:31 k12 bash[4012506]: Running command: /usr/bin/chown -R ceph:ceph 
/dev/dm-2
Nov 16 09:08:31 k12 bash[4012506]: Running command: /usr/bin/ln -s 
/dev/mapper/ceph--d4b1e932--4557--4b88--bed2--9305a07e76eb-osd--db--6a507f57--884c--4947--a147--cd50f98f1a23
 /var/lib/ceph/
Nov 16 09:08:31 k12 bash[4012506]: Running command: /usr/bin/chown -R ceph:ceph 
/var/lib/ceph/osd/ceph-25
Nov 16

[ceph-users] Re: Different behaviors for ceph kernel client in limiting IOPS when data pool enters `nearfull`?

2023-11-15 Thread Xiubo Li

Hi Matt,

On 11/15/23 02:40, Matt Larson wrote:

On CentOS 7 systems with the CephFS kernel client, if the data pool has a
`nearfull` status there is a slight reduction in write speeds (possibly
20-50% fewer IOPS).

On a similar Rocky 8 system with the CephFS kernel client, if the data pool
has `nearfull` status, a similar test shows write speeds at different block
sizes shows the IOPS < 150 bottlenecked vs the typical write
performance that might be with 2-3 IOPS at a particular block size.

Is there any way to avoid the extremely bottlenecked IOPS seen on the Rocky
8 system CephFS kernel clients during the `nearfull` condition or to have
behavior more similar to the CentOS 7 CephFS clients?

Do different OS or Linux kernels have greatly different ways they respond
or limit on the IOPS? Are there any options to adjust how they limit on
IOPS?


Just to be clear, the kernel on CentOS 7 is older than the kernel on
Rocky 8, so they may behave differently in some ways. BTW, are the ceph
versions the same for your tests on CentOS 7 and Rocky 8?


I saw that libceph.ko has some code to handle the OSD FULL case, but I
didn't find anything for the near full case; let's get help from Ilya about this.


@Ilya,

Do you know whether the osdc will behave differently when it detects that
the pool is near full?


Thanks

- Xiubo


Thanks,
   Matt


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] remove spurious data

2023-11-15 Thread Giuliano Maggi
Hi,

I’d like to remove some “spurious" data:

root@nerffs03:/# ceph df
--- RAW STORAGE ---
CLASS    SIZE     AVAIL    USED    RAW USED  %RAW USED
hdd      1.0 PiB  1.0 PiB  47 GiB  47 GiB            0
TOTAL    1.0 PiB  1.0 PiB  47 GiB  47 GiB            0

--- POOLS ---
POOL  ID  PGS  STORED  OBJECTS  USED     %USED  MAX AVAIL
.mgr   1    1  70 MiB       19  209 MiB      0    327 TiB
.nfs   3   32   572 B        4   36 KiB      0    327 TiB
root@nerffs03:/#

as you can see, there are no data pools.
The 47 GiB could be from previous pools/filesystems that I used for testing.

How can I remove this "spurious" data without affecting the OSDs?
I was checking "ceph-volume lvm zap", but I am not sure if this is the
right tool.
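
Would the per-pool and per-OSD breakdowns help to pin down where those 47 GiB
are accounted? For example:

root@nerffs03:/# rados df
root@nerffs03:/# ceph osd df tree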

(I am running ceph version quincy 17.2.7)

Thanks,
Giuliano.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] rasize= in ceph.conf some section?

2023-11-15 Thread Pat Riehecky
Hello,

I'm trying to make it easy to distribute the expected config settings for my
ceph volumes to other admin groups. Is there a place I can set rasize in
ceph.conf where the client would pick it up? The NFS world these days has
nfsmount.conf, which I've grown very fond of.

I'm a bit concerned that there will be drift in rasize and whatnot that causes
performance issues. While it doesn't seem like much, having two places to look
for mount information is less than ideal.
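
For context, today the option only lives on each client's mount, e.g. something
like the following (names and paths are made up):

mount -t ceph mon1,mon2,mon3:/ /mnt/cephfs \
    -o name=app1,secretfile=/etc/ceph/app1.secret,rasize=67108864

so every admin group has to get that -o string right on their own.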
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] ceph -s very slow in my rdma eviroment

2023-11-15 Thread WeiGuo Ren
Today I ran some "ceph -s" commands in my RDMA environment, but they were very
slow. After profiling "ceph -s" with perf and turning the result into a
FlameGraph, I found that almost all the time is spent on compact zone.
Has anyone encountered this?
My environment:
rpm -qa | grep ibverb
libibverbs-41mlnx1-OFED.4.1.0.1.1.41102.x86_64
libibverbs-devel-41mlnx1-OFED.4.1.0.1.1.41102.x86_64
ceph -v 14.2.5
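
(By "perf ceph -s to FlameGraph" I mean roughly the usual pipeline, assuming
Brendan Gregg's FlameGraph scripts are in PATH:)

perf record -F 99 -g -- ceph -s
perf script | stackcollapse-perf.pl | flamegraph.pl > ceph-s.svg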
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] planning upgrade from pacific to quincy

2023-11-15 Thread Simon Oosthoek

Hi All
(apologies if you get this twice, I suspect mails from my @science.ru.nl 
account get dropped by most receiving mail servers, due to the strict 
DMARC policies in place)


after a long while of being in health_err state (due to an unfound object,
which we eventually decided to "forget"), we are now planning to upgrade
our cluster, which is running Pacific (at least on the mons/mdss/osds;
the gateways are by accident already running quincy). The installation
is via packages from ceph.com, except for quincy, which comes from ubuntu.


ceph versions:
"mon": { "ceph version 16.2.13 (5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)": 3 },
"mgr": { "ceph version 16.2.13 (5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)": 3 },
"osd": { "ceph version 16.2.13 (5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)": 252,
         "ceph version 16.2.14 (238ba602515df21ea7ffc75c88db29f9e5ef12c9) pacific (stable)": 12 },
"mds": { "ceph version 16.2.13 (5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)": 2 },
"rgw": { "ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)": 8 },
"overall": { "ceph version 16.2.13 (5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)": 260,
             "ceph version 16.2.14 (238ba602515df21ea7ffc75c88db29f9e5ef12c9) pacific (stable)": 12,
             "ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)": 8 }


The OS on the mons and mdss are still ubuntu 18.04, the osds are a mix 
of ubuntu 18 and ubuntu 20. The gateways are ubuntu 22.04, which is why 
these are already on quincy.


The plan is to move to quincy and eventually to cephadm/containerized ceph,
since that is apparently "the way to go", though I have my doubts.


The steps, in what we think is the right order, are:
- reinstall the mons with ubuntu 22.04 + quincy
- reinstall the osds (same)
- reinstall the mdss (same)

Once this is up and running, we want to investigate and migrate to 
cephadm orchestration.
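
(For the record, the cephadm migration we would be investigating is the
documented adoption flow, roughly along these lines per host/daemon; the
daemon names below are just examples:)

# after installing a cephadm binary matching the cluster's release
cephadm adopt --style legacy --name mon.$(hostname -s)
cephadm adopt --style legacy --name osd.12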


An alternative appears to be: move to orchestration first and then upgrade
ceph to quincy (possibly skipping the ubuntu upgrade?)


Another alternative could be to upgrade to quincy on ubuntu 18.04 using 
packages, but I haven't investigated the availability of quincy packages 
for ubuntu 18.04 (which is out of free (LTS) support by canonical)


Cheers

/Simon
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Issue with using the block device inside a pod.

2023-11-15 Thread Kushagr Gupta
Hi Team,

Components:
Kubernetes, Ceph

Problem statement:
We are trying to integrate Ceph with kubernetes.
We are unable to utilize the block volume mode in the pod.

Description:
OS: Almalinux 8.8
Ceph version: 18.2
Kubernetes version: 1.28.2

We have deployed a single node kubernetes cluster and a 3 node ceph cluster.
We are trying to integrate kubernetes with ceph using csi-rbd plugin.
We have used the following link to do the same:
https://docs.ceph.com/en/reef/rbd/rbd-kubernetes/

We are struggling with the usage of the PVC we created in Block volume mode.
Kindly refer to "raw-block-pvc-4.yaml".
We created a pod using the file "raw-block-pod-4.yaml".
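
For reference, the shape of the PVC and pod specs we are using is roughly the
following (a sketch only: names, size, image and storage class are
placeholders, not the exact contents of raw-block-pvc-4.yaml /
raw-block-pod-4.yaml):

cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: raw-block-pvc
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Block          # raw block device, no filesystem
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-rbd-sc
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-raw-block-volume
spec:
  containers:
    - name: shell
      image: quay.io/centos/centos:stream8
      command: ["/bin/sleep", "infinity"]
      volumeDevices:         # block device, not volumeMounts
        - name: data
          devicePath: /dev/xvda
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: raw-block-pvc
EOF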

The pod was created successfully. The device path we used was /dev/xvda.
We are unable to use this path inside the pod. It does not show up in
lsblk inside the pod, and it is not shown in df -kh of the pod either.
When we try to mount /dev/xvda on /tmp, we get a permission denied error.
We get a similar error when we try to partition the disk: the partition
gets created but the partition table is not updated.
At the same time, we are able to format the disk using the command
mkfs.ext4 /dev/xvda.
Kindly refer to the output:
"
[root@kube-cluster-1 ceph_provisioner]# k exec -it
pod-with-raw-block-volume-2 sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future
version. Use kubectl exec [POD] -- [COMMAND] instead.
sh-4.4# hostname
pod-with-raw-block-volume-2
sh-4.4# df -kh
Filesystem  Size  Used Avail Use% Mounted on
overlay  70G   14G   57G  19% /
tmpfs64M 0   64M   0% /dev
tmpfs32G 0   32G   0% /sys/fs/cgroup
/dev/mapper/almalinux-root   70G   14G   57G  19% /etc/hosts
shm  64M 0   64M   0% /dev/shm
tmpfs63G   12K   63G   1% /run/secrets/
kubernetes.io/serviceaccount
tmpfs32G 0   32G   0% /proc/acpi
tmpfs32G 0   32G   0% /proc/scsi
tmpfs32G 0   32G   0% /sys/firmware
sh-4.4# lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
loop07:00 1G  0 loop
loop17:10   500G  0 loop
sda  8:00 446.6G  0 disk
|-sda1   8:1200M  0 part
|-sda2   8:2  1M  0 part
|-sda3   8:3  1G  0 part
`-sda4   8:40 445.4G  0 part
sdb  8:16   0 893.3G  0 disk
rbd0   252:00 1G  0 disk
rbd1   252:16   0   500G  0 disk
rbd2   252:32   0   100G  0 disk
sh-4.4# ls /dev
core  fd  full  mqueue  null  ptmx  pts  random  shm  stderr  stdin  stdout
 termination-log  tty  urandom  xvda  zero
sh-4.4#
"

Although we are able to see a volume corresponding to the PV,
we are not able to use the block device.

Could anyone help me out?

Thanks and Regards,
Kushagra Gupta
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph Allocation - used space is unreasonably higher than stored space

2023-11-15 Thread motaharesdq
Thank you Igor,

Yeah, the 25K waste per rados object seems reasonable; a couple of questions
though:

1. Is the story of blobs re-using empty sub-sections of already allocated
"min_alloc_size"ed blocks just for RBD/FS? I read some blogs about the
onode->extent->blob->min_alloc->pextent->disk flow and how a write smaller than
min_alloc_size counts as a small write, and if another small write comes by
later which fits into the empty area of the blob, it will use that area; I
expected this re-use behaviour in rados overall.
Is my assumption of how allocation works totally wrong, or does it just not
apply to s3 (maybe because objects are immutable?), and do you know of any
documentation about the allocation details? I couldn't find much official data
about it.

2. We have a ceph cluster that was updated to pacific, but the OSDs are from a
previous octopus cluster with bluestore_min_alloc_size_hdd = 65536,
bluestore_prefer_deferred_size_hdd = 65536, and both bluestore_allocator and
bluefs_allocator set to bitmap, and the OSDs were not re-deployed afterward. We
were concerned that re-deploying the HDD OSDs with bluestore_min_alloc_size_hdd
= 4KB might cause i/o performance issues, since the number of blocks and hence
write operations will increase. Do you know how it might affect the cluster?
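
(In case it is useful: we read the configured value with "ceph config get", and,
if I'm not mistaken, on pacific "ceph osd metadata" also reports the
min_alloc_size an existing OSD was actually built with, so a before/after check
would be something like:)

ceph config get osd bluestore_min_alloc_size_hdd
ceph osd metadata 0 | grep -i min_alloc    # per-OSD value, if your version reports it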

Many thanks
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: CephFS mirror very slow (maybe for small files?)

2023-11-15 Thread Stuart Cornell
Hi Jos,
I have tried adding multiple daemons, but it seems only 1 is active, and there
is no improvement in throughput.
On further reading, your suggestion conflicts with the docs
(https://docs.ceph.com/en/reef/dev/cephfs-mirroring/#:~:text=Multiple%20mirror%20daemons%20can%20be,set%20thus%20providing%20high%2Davailability.)
They recommend against multiple daemons, and they also say that it "...rebalances
the directory assignment amongst the new set thus providing high-availability."
This sounds like it can only balance if there are multiple directories
registered for snapshots. As mentioned in my OP, we must use only 1.
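
(For completeness, "only 1 is active" is what I read from the mirror daemon
status output, roughly:)

ceph fs snapshot mirror daemon status -f json-pretty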
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: CephFS mirror very slow (maybe for small files?)

2023-11-15 Thread Stuart Cornell
Thank you, Jos.
I will try multiple daemons to see how that helps. It looks like I need to wait
for the fix [1] to be in a release (it is currently pending review) before I can
apply it.

Stuart
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: reef 18.2.1 QE Validation status

2023-11-15 Thread Redouane Kachach
Yes, cephadm has some tests for monitoring that should be enough to ensure
basic functionality is working properly. The rest of the changes in the PR
are for rook orchestrator.

On Tue, Nov 14, 2023 at 5:04 AM Nizamudeen A  wrote:

> dashboard changes are minimal and approved. and since the dashboard change
> is related to the
> monitoring stack (prometheus..) which is something not covered in the
> dashboard test suites, I don't think running it is necessary.
> But maybe the cephadm suite has some monitoring stack related testings
> written?
>
> On Tue, Nov 14, 2023 at 1:10 AM Yuri Weinstein 
> wrote:
>
>> Ack Travis.
>>
>> Since it touches a dashboard, Nizam - please reply/approve.
>>
>> I assume that rados/dashboard tests will be sufficient, but expecting
>> your recommendations.
>>
>> This addition will make the final release likely to be pushed.
>>
>> On Mon, Nov 13, 2023 at 11:30 AM Travis Nielsen 
>> wrote:
>> >
>> > I'd like to see these changes for much improved dashboard integration
>> with Rook. The changes are to the rook mgr orchestrator module, and
>> supporting test changes. Thus, this should be very low risk to the ceph
>> release. I don't know the details of the tautology suites, but I would
>> think suites involving the mgr modules would only be necessary.
>> >
>> > Travis
>> >
>> > On Mon, Nov 13, 2023 at 12:14 PM Yuri Weinstein 
>> wrote:
>> >>
>> >> Redouane
>> >>
>> >> What would be a sufficient level of testing (tautology suite(s))
>> >> assuming this PR is approved to be added?
>> >>
>> >> On Mon, Nov 13, 2023 at 9:13 AM Redouane Kachach 
>> wrote:
>> >> >
>> >> > Hi Yuri,
>> >> >
>> >> > I've just backported to reef several fixes that I introduced in the
>> last months for the rook orchestrator. Most of them are fixes for dashboard
>> issues/crashes that only happen on Rook environments. The PR [1] has all
>> the changes and it was merged into reef this morning. We really need these
>> changes to be part of the next reef release as the upcoming Rook stable
>> version will be based on it.
>> >> >
>> >> > Please, can you include those changes in the upcoming reef 18.2.1
>> release?
>> >> >
>> >> > [1] https://github.com/ceph/ceph/pull/54224
>> >> >
>> >> > Thanks a lot,
>> >> > Redouane.
>> >> >
>> >> >
>> >> > On Mon, Nov 13, 2023 at 6:03 PM Yuri Weinstein 
>> wrote:
>> >> >>
>> >> >> -- Forwarded message -
>> >> >> From: Venky Shankar 
>> >> >> Date: Thu, Nov 9, 2023 at 11:52 PM
>> >> >> Subject: Re: [ceph-users] Re: reef 18.2.1 QE Validation status
>> >> >> To: Yuri Weinstein 
>> >> >> Cc: dev , ceph-users 
>> >> >>
>> >> >>
>> >> >> Hi Yuri,
>> >> >>
>> >> >> On Fri, Nov 10, 2023 at 4:55 AM Yuri Weinstein 
>> wrote:
>> >> >> >
>> >> >> > I've updated all approvals and merged PRs in the tracker and it
>> looks
>> >> >> > like we are ready for gibba, LRC upgrades pending approval/update
>> from
>> >> >> > Venky.
>> >> >>
>> >> >> The smoke test failure is caused by missing (kclient) patches in
>> >> >> Ubuntu 20.04 that certain parts of the fs suite (via smoke tests)
>> rely
>> >> >> on. More details here
>> >> >>
>> >> >> https://tracker.ceph.com/issues/63488#note-8
>> >> >>
>> >> >> The kclient tests in smoke pass with other distro's and the fs suite
>> >> >> tests have been reviewed and look good. Run details are here
>> >> >>
>> >> >>
>> https://tracker.ceph.com/projects/cephfs/wiki/Reef#07-Nov-2023
>> >> >>
>> >> >> The smoke failure is noted as a known issue for now. Consider this
>> run
>> >> >> as "fs approved".
>> >> >>
>> >> >> >
>> >> >> > On Thu, Nov 9, 2023 at 1:31 PM Radoslaw Zarzynski <
>> rzarz...@redhat.com> wrote:
>> >> >> > >
>> >> >> > > rados approved!
>> >> >> > >
>> >> >> > > Details are here:
>> https://tracker.ceph.com/projects/rados/wiki/REEF#1821-Review.
>> >> >> > >
>> >> >> > > On Mon, Nov 6, 2023 at 10:33 PM Yuri Weinstein <
>> ywein...@redhat.com> wrote:
>> >> >> > > >
>> >> >> > > > Details of this release are summarized here:
>> >> >> > > >
>> >> >> > > > https://tracker.ceph.com/issues/63443#note-1
>> >> >> > > >
>> >> >> > > > Seeking approvals/reviews for:
>> >> >> > > >
>> >> >> > > > smoke - Laura, Radek, Prashant, Venky (POOL_APP_NOT_ENABLE
>> failures)
>> >> >> > > > rados - Neha, Radek, Travis, Ernesto, Adam King
>> >> >> > > > rgw - Casey
>> >> >> > > > fs - Venky
>> >> >> > > > orch - Adam King
>> >> >> > > > rbd - Ilya
>> >> >> > > > krbd - Ilya
>> >> >> > > > upgrade/quincy-x (reef) - Laura PTL
>> >> >> > > > powercycle - Brad
>> >> >> > > > perf-basic - Laura, Prashant (POOL_APP_NOT_ENABLE failures)
>> >> >> > > >
>> >> >> > > > Please reply to this email with approval and/or trackers of
>> known
>> >> >> > > > issues/PRs to address them.
>> >> >> > > >
>> >> >> > > > TIA
>> >> >> > > > YuriW
>> >> >> > > > ___
>> >> >> > > > Dev mailing list -- d...@ceph.io
>> >> >> > > > To unsubscribe send an email to dev-le...@ceph.io
>> >> >> > > >
>> >> >> > >
>> >> >> > _

[ceph-users] Upgrading From RHCS v4 to OSS Ceph

2023-11-15 Thread jarulsam
Hi everyone,

I have a storage cluster running RHCS v4 (old, I know) and am looking to upgrade
it soon. I would also like to migrate from RHCS to the open source version of
Ceph at some point, as our support contract with RedHat for Ceph is likely going
to not be renewed going forward.

I was wondering if anyone has any advice as to how to upgrade our cluster with
minimal production impact. I have the following server configuration:

  + 3x monitors
  + 3x metadata servers
  + 2x RadosGWs with 2x servers running HAproxy and keepalived for HA RadosGWs.
  + 19x OSDs - 110TB HDD and 1TB NVMe each. (Total ~2.1PB raw)

Currently, I have RHCS v4 installed bare metal on RHEL 7. I see that newer
versions of Ceph require containerized deployments, so I am thinking it is best
to first migrate to a containerized installation and then try to upgrade
everything else.

My first inclination is to do the upgrade like this:

  1. Move existing installation to containerized, maintain all the same versions
 and OS installations.

  2. Pull one monitor, fresh reinstall RHEL 9, reinstall RHCS v4, readd to
 cluster. Repeat for all the monitors.

  3. Pull one MDS, do the same as step 2 but for MDS.

  4. Pull one RadosGW, do the same as step 2 but for RadosGW.

  5. Pull one OSD, rebalance, fresh reinstall RHEL 9, reinstall RHCS v4, readd
 to cluster, rebalance. Repeat for all OSDs. 

  6. Upgrade RHCS to OSS Ceph Octopus -> Pacific -> Quincy -> Reef.

Does such a plan seem reasonable? Are there any major pitfalls of an approach
like this? Ideally, I would just rebuild an entire new cluster on Ceph Reef,
however there are obvious budgetary issues with such a plan.

My biggest concerns are with moving to a containerized installation, then
migrating from RHCS to OSS Ceph.

Any advice or feedback is much appreciated.

Best,

Josh
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: reef 18.2.1 QE Validation status

2023-11-15 Thread Redouane Kachach
Hi Yuri,

I've just backported to reef several fixes that I introduced in the last
months for the rook orchestrator. Most of them are fixes for dashboard
issues/crashes that only happen on Rook environments. The PR [1] has all
the changes and it was merged into reef this morning. We really
need these changes to be part of the next reef release as the upcoming Rook
stable version will be based on it.

Please, can you include those changes in the upcoming reef 18.2.1 release?

[1] https://github.com/ceph/ceph/pull/54224

Thanks a lot,
Redouane.


On Mon, Nov 13, 2023 at 6:03 PM Yuri Weinstein  wrote:

> -- Forwarded message -
> From: Venky Shankar 
> Date: Thu, Nov 9, 2023 at 11:52 PM
> Subject: Re: [ceph-users] Re: reef 18.2.1 QE Validation status
> To: Yuri Weinstein 
> Cc: dev , ceph-users 
>
>
> Hi Yuri,
>
> On Fri, Nov 10, 2023 at 4:55 AM Yuri Weinstein 
> wrote:
> >
> > I've updated all approvals and merged PRs in the tracker and it looks
> > like we are ready for gibba, LRC upgrades pending approval/update from
> > Venky.
>
> The smoke test failure is caused by missing (kclient) patches in
> Ubuntu 20.04 that certain parts of the fs suite (via smoke tests) rely
> on. More details here
>
> https://tracker.ceph.com/issues/63488#note-8
>
> The kclient tests in smoke pass with other distro's and the fs suite
> tests have been reviewed and look good. Run details are here
>
> https://tracker.ceph.com/projects/cephfs/wiki/Reef#07-Nov-2023
>
> The smoke failure is noted as a known issue for now. Consider this run
> as "fs approved".
>
> >
> > On Thu, Nov 9, 2023 at 1:31 PM Radoslaw Zarzynski 
> wrote:
> > >
> > > rados approved!
> > >
> > > Details are here:
> https://tracker.ceph.com/projects/rados/wiki/REEF#1821-Review.
> > >
> > > On Mon, Nov 6, 2023 at 10:33 PM Yuri Weinstein 
> wrote:
> > > >
> > > > Details of this release are summarized here:
> > > >
> > > > https://tracker.ceph.com/issues/63443#note-1
> > > >
> > > > Seeking approvals/reviews for:
> > > >
> > > > smoke - Laura, Radek, Prashant, Venky (POOL_APP_NOT_ENABLE failures)
> > > > rados - Neha, Radek, Travis, Ernesto, Adam King
> > > > rgw - Casey
> > > > fs - Venky
> > > > orch - Adam King
> > > > rbd - Ilya
> > > > krbd - Ilya
> > > > upgrade/quincy-x (reef) - Laura PTL
> > > > powercycle - Brad
> > > > perf-basic - Laura, Prashant (POOL_APP_NOT_ENABLE failures)
> > > >
> > > > Please reply to this email with approval and/or trackers of known
> > > > issues/PRs to address them.
> > > >
> > > > TIA
> > > > YuriW
> > > > ___
> > > > Dev mailing list -- d...@ceph.io
> > > > To unsubscribe send an email to dev-le...@ceph.io
> > > >
> > >
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
>
> --
> Cheers,
> Venky
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph Allocation - used space is unreasonably higher than stored space

2023-11-15 Thread motaharesdq
Igor Fedotov wrote:
> Hi Motahare,
> 
> On 13/11/2023 14:44, Motahare S wrote:
> >   Hello everyone,
> > 
> >  Recently we have noticed that the results of "ceph df" stored and used
> >  space does not match; as the amount of stored data *1.5 (ec factor) is
> >  still like 5TB away from used amount:
> > 
> >  POOL                      ID  PGS   STORED   OBJECTS  USED     %USED  MAX AVAIL
> >  default.rgw.buckets.data  12  1024  144 TiB  70.60M   221 TiB  18.68    643 TiB
> > 
> >  blob and alloc configs are as below:
> >  bluestore_min_alloc_size_hdd : 65536
> >  bluestore_min_alloc_size_ssd  : 4096
> >  bluestore_max_blob_size_hdd : 524288
> > 
> >  bluestore_max_blob_size_ssd : 65536
> > 
> >  bluefs_shared_alloc_size : 65536
> > 
> >   From sources across web about how ceph actually writes on the disk, I
> >  presumed that It will zero-pad the extents of an object to match the
> >  4KB bdev_block_size, and then writes it in a blob which matches the
> >  min_alloc_size, however it can re-use parts of the blob's unwritten (but
> >  allocated because of min_alloc_size) space for another extent later.
> >  The problem though, was that we tested different configs in a minimal ceph
> >  octopus cluster with a 2G osd and bluestore_min_alloc_size_hdd = 65536.
> >  When we uploaded a 1KB file with aws s3 client, the amount of used/stored
> >  space was 64KB/1KB. We then uploaded another 1KB, and it went 128K/2K; kept
> >  doing it until 100% of the pool was used, but only 32MB stored. I expected
> >  ceph to start writing new 1KB files in the wasted 63KB(60KB)s of
> >  min_alloc_size blocks, but the cluster was totally acting as a full cluster
> >  and could no longer receive any new object. Is this behaviour expected for
> >  s3? Does ceph really use 64x space if your dataset is made of 1KB files?
> >  and all your object sizes should be a multiple of 64KB? Note that 5TB /
> >  (70.6M*1.5) ~ 50 so for every rados object about 50KB is wasted on average.
> >  we didn't observe this problem in RBD pools, probably because it cuts all
> >  objects in 4MB. 
> The above analysis is correct, indeed BlueStore will waste up to 64K for 
> every object unaligned to 64K (i.e. both 1K and 65K objects will waste 
> 63K).
> 
> Hence n*1K objects take n*64K bytes.
> 
> And since S3 objects are unaligned it tend to waste 32K bytes in average 
> on each object (assuming their sizes are distributed equally).
> 
> The only correction to the above math would be due to the actual m+n EC 
> layout. E.g. for 2+1 EC object count multiplier would be 3 not 1.5. 
> Hence the overhead per rados object is rather less than 50K in your case.
> 
> >   I know that min_alloc_hdd is changed to 4KB in
> > pacific, but I'm still
> >  curious how allocation really works and why it doesn't behave as expected?
> >  Also, re-deploying OSDs is a headache. 
> >
> > Sincerely
> > Motahare
> > ___
> > ceph-users mailing list -- ceph-users(a)ceph.io
> > To unsubscribe send an email to ceph-users-leave(a)ceph.io

Thank you Igor,

Yeah, the 25K waste per rados object seems reasonable. A couple of questions
though:

1. Is the whole flow of blobs re-using allocated space (empty sub-sections of
already allocated "min_alloc_size"ed blocks) just for RBD/FS? I read some blogs
about onode->extent->blob->min_alloc->pextent re-use via small writes, and I
expected this behavior in rados overall,
e.g. https://blog.51cto.com/u_15265005/2888373
Is my assumption totally wrong, or does it just not apply to s3? (maybe because
objects are immutable?)

2. We have a ceph cluster that was updated to pacific, but the OSDs are from a
previous octopus cluster and were updated but not re-deployed afterward. We
were concerned that re-deploying the OSDs with bluestore_min_alloc_size_hdd =
4KB might cause i/o performance issues, since the number of blocks and hence
r/w operations will increase. Do you have any views on how it might affect our
cluster?

Many thanks
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] [CEPH] OSD Memory Usage

2023-11-15 Thread Nguyễn Hữu Khôi
Hello,
I am using a CEPH cluster. After monitoring it, I set:

ceph config set osd osd_memory_target_autotune false

ceph config set osd osd_memory_target 1G

Then I restarted all OSD services and ran the test again. I just use fio
commands from multiple clients, and I see that OSD memory consumption is over
1GB. Could you help me understand this case?

Ceph version: Quincy

OSD: 3 nodes with 11 nvme each and 512GB ram per node.

CPU: 2 socket xeon gold 6138 cpu with 56 cores per socket.

Network: 25Gbps x 2 for public network and 25Gbps x 2 for storage network.
MTU is 9000

Thank you very much.


Nguyen Huu Khoi
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Reinitialize rgw garbage collector

2023-11-15 Thread Pierre GINDRAUD

Hello Michael,

Did you receive any help on this?

We are experiencing the same problem, with no solution so far.

Regards

--
Pierre
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: reef 18.2.1 QE Validation status

2023-11-15 Thread Radoslaw Zarzynski
rados approved!

Details are here: https://tracker.ceph.com/projects/rados/wiki/REEF#1821-Review.

On Mon, Nov 6, 2023 at 10:33 PM Yuri Weinstein  wrote:
>
> Details of this release are summarized here:
>
> https://tracker.ceph.com/issues/63443#note-1
>
> Seeking approvals/reviews for:
>
> smoke - Laura, Radek, Prashant, Venky (POOL_APP_NOT_ENABLE failures)
> rados - Neha, Radek, Travis, Ernesto, Adam King
> rgw - Casey
> fs - Venky
> orch - Adam King
> rbd - Ilya
> krbd - Ilya
> upgrade/quincy-x (reef) - Laura PTL
> powercycle - Brad
> perf-basic - Laura, Prashant (POOL_APP_NOT_ENABLE failures)
>
> Please reply to this email with approval and/or trackers of known
> issues/PRs to address them.
>
> TIA
> YuriW
> ___
> Dev mailing list -- d...@ceph.io
> To unsubscribe send an email to dev-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: reef 18.2.1 QE Validation status

2023-11-15 Thread Kaleb Keithley
On Tue, Nov 7, 2023 at 4:12 PM Adam King  wrote:

> I think the orch code itself is doing fine, but a bunch of tests are
> failing due to https://tracker.ceph.com/issues/63151. I think that's
> likely related to the ganesha build we have included in the container and
> if we want nfs over rgw to work properly in this release I think we'll have
> to update it. From previous notes in the tracker, it looks like 5.5-2 is
> currently in there (specifically nfs-ganesha-rgw-5.5-2.el8s.x86_64  package
> probably has an issue).
>
>
AIUI there's a build of ganesha-5.7 available now in shaman. We need a
better process for keeping shaman up to date.


-- 

Kaleb
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Large size differences between pgs

2023-11-15 Thread Miroslav Svoboda
Namely, the problem I am trying to solve is that with such a large cluster I
will lose a lot of capacity as unused space. I have the deviation set to 1 in
the balancer, that is, if I'm not mistaken, +-1 PG per OSD, and on top of that
there is the size dispersion between the largest and smallest PGs on a single
OSD host.

Svoboda Miroslav

-------- Original message --------
From: Anthony D'Atri
Date: 15.11.23 21:54 (GMT+01:00)
To: Miroslav Svoboda
Subject: Re: [ceph-users] Large size differences between pgs

How are you determining PG size?

> On Nov 15, 2023, at 15:46, Miroslav Svoboda wrote:
>
> Hi, is it possible to decrease the large size differences between PGs? I have
> a 5PB cluster and the differences between the smallest and biggest PGs are
> somewhere about 25GB.
> Thanks,
> Svoboda Miroslav
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Large size differences between pgs

2023-11-15 Thread Miroslav Svoboda
Hi, is it possible to decrease the large size differences between PGs? I have a
5PB cluster and the differences between the smallest and biggest PGs are
somewhere about 25GB.
Thanks,
Svoboda Miroslav
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: reef 18.2.1 QE Validation status

2023-11-15 Thread Travis Nielsen
The tests were re-run  with
Guillaume's changes and are passing now!

Thanks,
Travis

On Wed, Nov 15, 2023 at 1:19 PM Yuri Weinstein  wrote:

> Sounds like it's a must to be added.
>
> When the reef backport PR can be merged?
>
> On Wed, Nov 15, 2023 at 12:13 PM Travis Nielsen 
> wrote:
> >
> > Thanks Guiilaume and Redo for tracking down this issue. After talking
> more with Guillaume I now realized that not all the tests were using the
> expected latest-reef-devel label so Rook tests were incorrectly showing
> green for Reef. :(
> > Now that I ran the tests again in the test PR with all tests using the
> latest-reef-devel label, all the tests with OSDs on PVCs are failing that
> use ceph-volume raw mode. So this is a blocker for Rook scenarios, we
> really need to fix to avoid breaking OSDs.
> >
> > Thanks,
> > Travis
> >
> > On Wed, Nov 15, 2023 at 1:03 PM Guillaume Abrioux 
> wrote:
> >>
> >> Hi Yuri, (thanks)
> >>
> >> Indeed, we had a regression in ceph-volume impacting rook scenarios
> which was supposed to be fixed by [1].
> >> It turns out rook's CI didn't catch that fix wasn't enough for some
> reason (I believe the CI run wasn't using the right image, Travis might
> confirm or give more details).
> >> Another patch [2] is needed in order to fix this regression.
> >>
> >> Let me know if more details are needed.
> >>
> >> Thanks,
> >>
> >> [1]
> https://github.com/ceph/ceph/pull/54429/commits/ee26074a5e7e90b4026659bf3adb1bc973595e91
> >> [2] https://github.com/ceph/ceph/pull/54514/files
> >>
> >>
> >> --
> >> Guillaume Abrioux
> >> Software Engineer
> >>
> >> 
> >> From: Yuri Weinstein 
> >> Sent: 15 November 2023 20:23
> >> To: Nizamudeen A ; Guillaume Abrioux <
> gabri...@redhat.com>; Travis Nielsen 
> >> Cc: Adam King ; Redouane Kachach <
> rkach...@redhat.com>; dev ; ceph-users 
> >> Subject: [EXTERNAL] [ceph-users] Re: reef 18.2.1 QE Validation status
> >>
> >> This is on behalf of Guillaume.
> >>
> >> We have one more last mites issue that may have to be included
> >> https://tracker.ceph.com/issues/63545
> https://github.com/ceph/ceph/pull/54514
> >>
> >> Travis, Redo, Guillaume will provide more context and details.
> >>
> >> We are assessing the situation as 18.2.1 has been built and signed.
> >>
> >> On Tue, Nov 14, 2023 at 11:07 AM Yuri Weinstein 
> wrote:
> >> >
> >> > OK thx!
> >> >
> >> > We have completed the approvals.
> >> >
> >> > On Tue, Nov 14, 2023 at 9:13 AM Nizamudeen A  wrote:
> >> > >
> >> > > dashboard approved. Failure known and unrelated!
> >> > >
> >> > > On Tue, Nov 14, 2023, 22:34 Adam King  wrote:
> >> > >>
> >> > >> orch approved.  After reruns, orch/cephadm was just hitting two
> known (nonblocker) issues and orch/rook teuthology suite is known to not be
> functional currently.
> >> > >>
> >> > >> On Tue, Nov 14, 2023 at 10:33 AM Yuri Weinstein <
> ywein...@redhat.com> wrote:
> >> > >>>
> >> > >>> Build 4 with https://github.com/ceph/ceph/pull/54224  was built
> and I
> >> > >>> ran the tests below and asking for approvals:
> >> > >>>
> >> > >>> smoke - Laura
> >> > >>> rados/mgr - PASSED
> >> > >>> rados/dashboard - Nizamudeen
> >> > >>> orch - Adam King
> >> > >>>
> >> > >>> See Build 4 runs - https://tracker.ceph.com/issues/63443#note-1
> >> > >>>
> >> > >>> On Tue, Nov 14, 2023 at 12:21 AM Redouane Kachach <
> rkach...@redhat.com> wrote:
> >> > >>> >
> >> > >>> > Yes, cephadm has some tests for monitoring that should be
> enough to ensure basic functionality is working properly. The rest of the
> changes in the PR are for rook orchestrator.
> >> > >>> >
> >> > >>> > On Tue, Nov 14, 2023 at 5:04 AM Nizamudeen A 
> wrote:
> >> > >>> >>
> >> > >>> >> dashboard changes are minimal and approved. and since the
> dashboard change is related to the
> >> > >>> >> monitoring stack (prometheus..) which is something not covered
> in the dashboard test suites, I don't think running it is necessary.
> >> > >>> >> But maybe the cephadm suite has some monitoring stack related
> testings written?
> >> > >>> >>
> >> > >>> >> On Tue, Nov 14, 2023 at 1:10 AM Yuri Weinstein <
> ywein...@redhat.com> wrote:
> >> > >>> >>>
> >> > >>> >>> Ack Travis.
> >> > >>> >>>
> >> > >>> >>> Since it touches a dashboard, Nizam - please reply/approve.
> >> > >>> >>>
> >> > >>> >>> I assume that rados/dashboard tests will be sufficient, but
> expecting
> >> > >>> >>> your recommendations.
> >> > >>> >>>
> >> > >>> >>> This addition will make the final release likely to be pushed.
> >> > >>> >>>
> >> > >>> >>> On Mon, Nov 13, 2023 at 11:30 AM Travis Nielsen <
> tniel...@redhat.com> wrote:
> >> > >>> >>> >
> >> > >>> >>> > I'd like to see these changes for much improved dashboard
> integration with Rook. The changes are to the rook mgr orchestrator module,
> and supporting test changes. Thus, this should be very low risk to the ceph
> release. I don't know the details of the tautology suites, but I would
> think suites involving 

[ceph-users] Re: reef 18.2.1 QE Validation status

2023-11-15 Thread Yuri Weinstein
Sounds like it's a must to be added.

When the reef backport PR can be merged?

On Wed, Nov 15, 2023 at 12:13 PM Travis Nielsen  wrote:
>
> Thanks Guiilaume and Redo for tracking down this issue. After talking more 
> with Guillaume I now realized that not all the tests were using the expected 
> latest-reef-devel label so Rook tests were incorrectly showing green for 
> Reef. :(
> Now that I ran the tests again in the test PR with all tests using the 
> latest-reef-devel label, all the tests with OSDs on PVCs are failing that use 
> ceph-volume raw mode. So this is a blocker for Rook scenarios, we really need 
> to fix to avoid breaking OSDs.
>
> Thanks,
> Travis
>
> On Wed, Nov 15, 2023 at 1:03 PM Guillaume Abrioux  wrote:
>>
>> Hi Yuri, (thanks)
>>
>> Indeed, we had a regression in ceph-volume impacting rook scenarios which 
>> was supposed to be fixed by [1].
>> It turns out rook's CI didn't catch that fix wasn't enough for some reason 
>> (I believe the CI run wasn't using the right image, Travis might confirm or 
>> give more details).
>> Another patch [2] is needed in order to fix this regression.
>>
>> Let me know if more details are needed.
>>
>> Thanks,
>>
>> [1] 
>> https://github.com/ceph/ceph/pull/54429/commits/ee26074a5e7e90b4026659bf3adb1bc973595e91
>> [2] https://github.com/ceph/ceph/pull/54514/files
>>
>>
>> --
>> Guillaume Abrioux
>> Software Engineer
>>
>> 
>> From: Yuri Weinstein 
>> Sent: 15 November 2023 20:23
>> To: Nizamudeen A ; Guillaume Abrioux ; 
>> Travis Nielsen 
>> Cc: Adam King ; Redouane Kachach ; 
>> dev ; ceph-users 
>> Subject: [EXTERNAL] [ceph-users] Re: reef 18.2.1 QE Validation status
>>
>> This is on behalf of Guillaume.
>>
> We have one more last-minute issue that may have to be included
>> https://tracker.ceph.com/issues/63545  
>> https://github.com/ceph/ceph/pull/54514
>>
>> Travis, Redo, Guillaume will provide more context and details.
>>
>> We are assessing the situation as 18.2.1 has been built and signed.
>>
>> On Tue, Nov 14, 2023 at 11:07 AM Yuri Weinstein  wrote:
>> >
>> > OK thx!
>> >
>> > We have completed the approvals.
>> >
>> > On Tue, Nov 14, 2023 at 9:13 AM Nizamudeen A  wrote:
>> > >
>> > > dashboard approved. Failure known and unrelated!
>> > >
>> > > On Tue, Nov 14, 2023, 22:34 Adam King  wrote:
>> > >>
>> > >> orch approved.  After reruns, orch/cephadm was just hitting two known 
>> > >> (nonblocker) issues and orch/rook teuthology suite is known to not be 
>> > >> functional currently.
>> > >>
>> > >> On Tue, Nov 14, 2023 at 10:33 AM Yuri Weinstein  
>> > >> wrote:
>> > >>>
>> > >>> Build 4 with https://github.com/ceph/ceph/pull/54224  was built and I
>> > >>> ran the tests below and asking for approvals:
>> > >>>
>> > >>> smoke - Laura
>> > >>> rados/mgr - PASSED
>> > >>> rados/dashboard - Nizamudeen
>> > >>> orch - Adam King
>> > >>>
>> > >>> See Build 4 runs - https://tracker.ceph.com/issues/63443#note-1
>> > >>>
>> > >>> On Tue, Nov 14, 2023 at 12:21 AM Redouane Kachach 
>> > >>>  wrote:
>> > >>> >
>> > >>> > Yes, cephadm has some tests for monitoring that should be enough to 
>> > >>> > ensure basic functionality is working properly. The rest of the 
>> > >>> > changes in the PR are for rook orchestrator.
>> > >>> >
>> > >>> > On Tue, Nov 14, 2023 at 5:04 AM Nizamudeen A  wrote:
>> > >>> >>
>> > >>> >> dashboard changes are minimal and approved. and since the dashboard 
>> > >>> >> change is related to the
>> > >>> >> monitoring stack (prometheus..) which is something not covered in 
>> > >>> >> the dashboard test suites, I don't think running it is necessary.
>> > >>> >> But maybe the cephadm suite has some monitoring stack related 
>> > >>> >> testings written?
>> > >>> >>
>> > >>> >> On Tue, Nov 14, 2023 at 1:10 AM Yuri Weinstein 
>> > >>> >>  wrote:
>> > >>> >>>
>> > >>> >>> Ack Travis.
>> > >>> >>>
>> > >>> >>> Since it touches a dashboard, Nizam - please reply/approve.
>> > >>> >>>
>> > >>> >>> I assume that rados/dashboard tests will be sufficient, but 
>> > >>> >>> expecting
>> > >>> >>> your recommendations.
>> > >>> >>>
>> > >>> >>> This addition will make the final release likely to be pushed.
>> > >>> >>>
>> > >>> >>> On Mon, Nov 13, 2023 at 11:30 AM Travis Nielsen 
>> > >>> >>>  wrote:
>> > >>> >>> >
>> > >>> >>> > I'd like to see these changes for much improved dashboard 
>> > >>> >>> > integration with Rook. The changes are to the rook mgr 
>> > >>> >>> > orchestrator module, and supporting test changes. Thus, this 
>> > >>> >>> > should be very low risk to the ceph release. I don't know the 
> >> > >>> >>> > details of the teuthology suites, but I would think suites 
>> > >>> >>> > involving the mgr modules would only be necessary.
>> > >>> >>> >
>> > >>> >>> > Travis
>> > >>> >>> >
>> > >>> >>> > On Mon, Nov 13, 2023 at 12:14 PM Yuri Weinstein 
>> > >>> >>> >  wrote:
>> > >>> >>> >>
>> > >>> >>> >> Redouane
>> > >>> >>> >>
> >> > >>> >>> >> What would be a sufficient level of testing (teuthology s

[ceph-users] Re: reef 18.2.1 QE Validation status

2023-11-15 Thread Travis Nielsen
Thanks Guillaume and Redo for tracking down this issue. After talking more
with Guillaume I now realized that not all the tests were using the
expected latest-reef-devel label so Rook tests were incorrectly showing
green for Reef. :(
Now that I ran the tests again in the test PR
 with all tests using the
latest-reef-devel label, all the tests with OSDs on PVCs are failing that
use ceph-volume raw mode. So this is a blocker for Rook scenarios, we
really need to fix to avoid breaking OSDs.

Thanks,
Travis

On Wed, Nov 15, 2023 at 1:03 PM Guillaume Abrioux  wrote:

> Hi Yuri, (thanks)
>
> Indeed, we had a regression in ceph-volume impacting rook scenarios which
> was supposed to be fixed by [1].
> It turns out rook's CI didn't catch that the fix wasn't enough for some reason
> (I believe the CI run wasn't using the right image, Travis might confirm or
> give more details).
> Another patch [2] is needed in order to fix this regression.
>
> Let me know if more details are needed.
>
> Thanks,
>
> [1]
> https://github.com/ceph/ceph/pull/54429/commits/ee26074a5e7e90b4026659bf3adb1bc973595e91
> [2] https://github.com/ceph/ceph/pull/54514/files
>
>
> --
> Guillaume Abrioux
> Software Engineer
>
> --
> *From:* Yuri Weinstein 
> *Sent:* 15 November 2023 20:23
> *To:* Nizamudeen A ; Guillaume Abrioux <
> gabri...@redhat.com>; Travis Nielsen 
> *Cc:* Adam King ; Redouane Kachach ;
> dev ; ceph-users 
> *Subject:* [EXTERNAL] [ceph-users] Re: reef 18.2.1 QE Validation status
>
> This is on behalf of Guillaume.
>
> We have one more last-minute issue that may have to be included
> https://tracker.ceph.com/issues/63545
> https://github.com/ceph/ceph/pull/54514
>
> Travis, Redo, Guillaume will provide more context and details.
>
> We are assessing the situation as 18.2.1 has been built and signed.
>
> On Tue, Nov 14, 2023 at 11:07 AM Yuri Weinstein 
> wrote:
> >
> > OK thx!
> >
> > We have completed the approvals.
> >
> > On Tue, Nov 14, 2023 at 9:13 AM Nizamudeen A  wrote:
> > >
> > > dashboard approved. Failure known and unrelated!
> > >
> > > On Tue, Nov 14, 2023, 22:34 Adam King  wrote:
> > >>
> > >> orch approved.  After reruns, orch/cephadm was just hitting two known
> (nonblocker) issues and orch/rook teuthology suite is known to not be
> functional currently.
> > >>
> > >> On Tue, Nov 14, 2023 at 10:33 AM Yuri Weinstein 
> wrote:
> > >>>
> > >>> Build 4 with https://github.com/ceph/ceph/pull/54224  was built and
> I
> > >>> ran the tests below and asking for approvals:
> > >>>
> > >>> smoke - Laura
> > >>> rados/mgr - PASSED
> > >>> rados/dashboard - Nizamudeen
> > >>> orch - Adam King
> > >>>
> > >>> See Build 4 runs - https://tracker.ceph.com/issues/63443#note-1
> > >>>
> > >>> On Tue, Nov 14, 2023 at 12:21 AM Redouane Kachach <
> rkach...@redhat.com> wrote:
> > >>> >
> > >>> > Yes, cephadm has some tests for monitoring that should be enough
> to ensure basic functionality is working properly. The rest of the changes
> in the PR are for rook orchestrator.
> > >>> >
> > >>> > On Tue, Nov 14, 2023 at 5:04 AM Nizamudeen A 
> wrote:
> > >>> >>
> > >>> >> dashboard changes are minimal and approved. and since the
> dashboard change is related to the
> > >>> >> monitoring stack (prometheus..) which is something not covered in
> the dashboard test suites, I don't think running it is necessary.
> > >>> >> But maybe the cephadm suite has some monitoring stack related
> testings written?
> > >>> >>
> > >>> >> On Tue, Nov 14, 2023 at 1:10 AM Yuri Weinstein <
> ywein...@redhat.com> wrote:
> > >>> >>>
> > >>> >>> Ack Travis.
> > >>> >>>
> > >>> >>> Since it touches a dashboard, Nizam - please reply/approve.
> > >>> >>>
> > >>> >>> I assume that rados/dashboard tests will be sufficient, but
> expecting
> > >>> >>> your recommendations.
> > >>> >>>
> > >>> >>> This addition will make the final release likely to be pushed.
> > >>> >>>
> > >>> >>> On Mon, Nov 13, 2023 at 11:30 AM Travis Nielsen <
> tniel...@redhat.com> wrote:
> > >>> >>> >
> > >>> >>> > I'd like to see these changes for much improved dashboard
> integration with Rook. The changes are to the rook mgr orchestrator module,
> and supporting test changes. Thus, this should be very low risk to the ceph
> release. I don't know the details of the teuthology suites, but I would
> think suites involving the mgr modules would only be necessary.
> > >>> >>> >
> > >>> >>> > Travis
> > >>> >>> >
> > >>> >>> > On Mon, Nov 13, 2023 at 12:14 PM Yuri Weinstein <
> ywein...@redhat.com> wrote:
> > >>> >>> >>
> > >>> >>> >> Redouane
> > >>> >>> >>
> > >>> >>> >> What would be a sufficient level of testing (teuthology
> suite(s))
> > >>> >>> >> assuming this PR is approved to be added?
> > >>> >>> >>
> > >>> >>> >> On Mon, Nov 13, 2023 at 9:13 AM Redouane Kachach <
> rkach...@redhat.com> wrote:
> > >>> >>> >> >
> > >>> >>> >> > Hi Yuri,
> > >>> >>> >> >
> > >>> >>> >> > I've just backported to reef several fixes that I
> introd

[ceph-users] Re: reef 18.2.1 QE Validation status

2023-11-15 Thread Guillaume Abrioux
Hi Yuri, (thanks)

Indeed, we had a regression in ceph-volume impacting rook scenarios which was 
supposed to be fixed by [1].
It turns out rook's CI didn't catch that the fix wasn't enough for some reason (I 
believe the CI run wasn't using the right image, Travis might confirm or give 
more details).
Another patch [2] is needed in order to fix this regression.

Let me know if more details are needed.

Thanks,

[1] 
https://github.com/ceph/ceph/pull/54429/commits/ee26074a5e7e90b4026659bf3adb1bc973595e91
[2] https://github.com/ceph/ceph/pull/54514/files


--
Guillaume Abrioux
Software Engineer


From: Yuri Weinstein 
Sent: 15 November 2023 20:23
To: Nizamudeen A ; Guillaume Abrioux ; 
Travis Nielsen 
Cc: Adam King ; Redouane Kachach ; dev 
; ceph-users 
Subject: [EXTERNAL] [ceph-users] Re: reef 18.2.1 QE Validation status

This is on behalf of Guillaume.

We have one more last-minute issue that may have to be included
https://tracker.ceph.com/issues/63545  https://github.com/ceph/ceph/pull/54514

Travis, Redo, Guillaume will provide more context and details.

We are assessing the situation as 18.2.1 has been built and signed.

On Tue, Nov 14, 2023 at 11:07 AM Yuri Weinstein  wrote:
>
> OK thx!
>
> We have completed the approvals.
>
> On Tue, Nov 14, 2023 at 9:13 AM Nizamudeen A  wrote:
> >
> > dashboard approved. Failure known and unrelated!
> >
> > On Tue, Nov 14, 2023, 22:34 Adam King  wrote:
> >>
> >> orch approved.  After reruns, orch/cephadm was just hitting two known 
> >> (nonblocker) issues and orch/rook teuthology suite is known to not be 
> >> functional currently.
> >>
> >> On Tue, Nov 14, 2023 at 10:33 AM Yuri Weinstein  
> >> wrote:
> >>>
> >>> Build 4 with https://github.com/ceph/ceph/pull/54224  was built and I
> >>> ran the tests below and asking for approvals:
> >>>
> >>> smoke - Laura
> >>> rados/mgr - PASSED
> >>> rados/dashboard - Nizamudeen
> >>> orch - Adam King
> >>>
> >>> See Build 4 runs - https://tracker.ceph.com/issues/63443#note-1
> >>>
> >>> On Tue, Nov 14, 2023 at 12:21 AM Redouane Kachach  
> >>> wrote:
> >>> >
> >>> > Yes, cephadm has some tests for monitoring that should be enough to 
> >>> > ensure basic functionality is working properly. The rest of the changes 
> >>> > in the PR are for rook orchestrator.
> >>> >
> >>> > On Tue, Nov 14, 2023 at 5:04 AM Nizamudeen A  wrote:
> >>> >>
> >>> >> dashboard changes are minimal and approved. and since the dashboard 
> >>> >> change is related to the
> >>> >> monitoring stack (prometheus..) which is something not covered in the 
> >>> >> dashboard test suites, I don't think running it is necessary.
> >>> >> But maybe the cephadm suite has some monitoring stack related testings 
> >>> >> written?
> >>> >>
> >>> >> On Tue, Nov 14, 2023 at 1:10 AM Yuri Weinstein  
> >>> >> wrote:
> >>> >>>
> >>> >>> Ack Travis.
> >>> >>>
> >>> >>> Since it touches a dashboard, Nizam - please reply/approve.
> >>> >>>
> >>> >>> I assume that rados/dashboard tests will be sufficient, but expecting
> >>> >>> your recommendations.
> >>> >>>
> >>> >>> This addition will make the final release likely to be pushed.
> >>> >>>
> >>> >>> On Mon, Nov 13, 2023 at 11:30 AM Travis Nielsen  
> >>> >>> wrote:
> >>> >>> >
> >>> >>> > I'd like to see these changes for much improved dashboard 
> >>> >>> > integration with Rook. The changes are to the rook mgr orchestrator 
> >>> >>> > module, and supporting test changes. Thus, this should be very low 
> >>> >>> > risk to the ceph release. I don't know the details of the teuthology 
> >>> >>> > suites, but I would think suites involving the mgr modules would 
> >>> >>> > only be necessary.
> >>> >>> >
> >>> >>> > Travis
> >>> >>> >
> >>> >>> > On Mon, Nov 13, 2023 at 12:14 PM Yuri Weinstein 
> >>> >>> >  wrote:
> >>> >>> >>
> >>> >>> >> Redouane
> >>> >>> >>
> >>> >>> >> What would be a sufficient level of testing (teuthology suite(s))
> >>> >>> >> assuming this PR is approved to be added?
> >>> >>> >>
> >>> >>> >> On Mon, Nov 13, 2023 at 9:13 AM Redouane Kachach 
> >>> >>> >>  wrote:
> >>> >>> >> >
> >>> >>> >> > Hi Yuri,
> >>> >>> >> >
> >>> >>> >> > I've just backported to reef several fixes that I introduced in 
> >>> >>> >> > the last months for the rook orchestrator. Most of them are 
> >>> >>> >> > fixes for dashboard issues/crashes that only happen on Rook 
> >>> >>> >> > environments. The PR [1] has all the changes and it was merged 
> >>> >>> >> > into reef this morning. We really need these changes to be part 
> >>> >>> >> > of the next reef release as the upcoming Rook stable version 
> >>> >>> >> > will be based on it.
> >>> >>> >> >
> >>> >>> >> > Please, can you include those changes in the upcoming reef 
> >>> >>> >> > 18.2.1 release?
> >>> >>> >> >
> >>> >>> >> > [1] https://github.com/ceph/ceph/pull/54224
> >>> >>> >> >
> >>> >>> >> > Thanks a lot,
> >>> >>> >> > Redouane.
> >>> >>> >> >
> >>> >>> >> >
> >>> >>> >> > On Mon, Nov 13, 2023 at 6:03 PM Yuri Weinstein 
> >>> >>> >> 

[ceph-users] Re: reef 18.2.1 QE Validation status

2023-11-15 Thread Yuri Weinstein
This is on behalf of Guillaume.

We have one more last-minute issue that may have to be included
https://tracker.ceph.com/issues/63545 https://github.com/ceph/ceph/pull/54514

Travis, Redo, Guillaume will provide more context and details.

We are assessing the situation as 18.2.1 has been built and signed.

On Tue, Nov 14, 2023 at 11:07 AM Yuri Weinstein  wrote:
>
> OK thx!
>
> We have completed the approvals.
>
> On Tue, Nov 14, 2023 at 9:13 AM Nizamudeen A  wrote:
> >
> > dashboard approved. Failure known and unrelated!
> >
> > On Tue, Nov 14, 2023, 22:34 Adam King  wrote:
> >>
> >> orch approved.  After reruns, orch/cephadm was just hitting two known 
> >> (nonblocker) issues and orch/rook teuthology suite is known to not be 
> >> functional currently.
> >>
> >> On Tue, Nov 14, 2023 at 10:33 AM Yuri Weinstein  
> >> wrote:
> >>>
> >>> Build 4 with https://github.com/ceph/ceph/pull/54224 was built and I
> >>> ran the tests below and asking for approvals:
> >>>
> >>> smoke - Laura
> >>> rados/mgr - PASSED
> >>> rados/dashboard - Nizamudeen
> >>> orch - Adam King
> >>>
> >>> See Build 4 runs - https://tracker.ceph.com/issues/63443#note-1
> >>>
> >>> On Tue, Nov 14, 2023 at 12:21 AM Redouane Kachach  
> >>> wrote:
> >>> >
> >>> > Yes, cephadm has some tests for monitoring that should be enough to 
> >>> > ensure basic functionality is working properly. The rest of the changes 
> >>> > in the PR are for rook orchestrator.
> >>> >
> >>> > On Tue, Nov 14, 2023 at 5:04 AM Nizamudeen A  wrote:
> >>> >>
> >>> >> dashboard changes are minimal and approved. and since the dashboard 
> >>> >> change is related to the
> >>> >> monitoring stack (prometheus..) which is something not covered in the 
> >>> >> dashboard test suites, I don't think running it is necessary.
> >>> >> But maybe the cephadm suite has some monitoring stack related testings 
> >>> >> written?
> >>> >>
> >>> >> On Tue, Nov 14, 2023 at 1:10 AM Yuri Weinstein  
> >>> >> wrote:
> >>> >>>
> >>> >>> Ack Travis.
> >>> >>>
> >>> >>> Since it touches a dashboard, Nizam - please reply/approve.
> >>> >>>
> >>> >>> I assume that rados/dashboard tests will be sufficient, but expecting
> >>> >>> your recommendations.
> >>> >>>
> >>> >>> This addition will make the final release likely to be pushed.
> >>> >>>
> >>> >>> On Mon, Nov 13, 2023 at 11:30 AM Travis Nielsen  
> >>> >>> wrote:
> >>> >>> >
> >>> >>> > I'd like to see these changes for much improved dashboard 
> >>> >>> > integration with Rook. The changes are to the rook mgr orchestrator 
> >>> >>> > module, and supporting test changes. Thus, this should be very low 
> >>> >>> > risk to the ceph release. I don't know the details of the teuthology 
> >>> >>> > suites, but I would think suites involving the mgr modules would 
> >>> >>> > only be necessary.
> >>> >>> >
> >>> >>> > Travis
> >>> >>> >
> >>> >>> > On Mon, Nov 13, 2023 at 12:14 PM Yuri Weinstein 
> >>> >>> >  wrote:
> >>> >>> >>
> >>> >>> >> Redouane
> >>> >>> >>
> >>> >>> >> What would be a sufficient level of testing (teuthology suite(s))
> >>> >>> >> assuming this PR is approved to be added?
> >>> >>> >>
> >>> >>> >> On Mon, Nov 13, 2023 at 9:13 AM Redouane Kachach 
> >>> >>> >>  wrote:
> >>> >>> >> >
> >>> >>> >> > Hi Yuri,
> >>> >>> >> >
> >>> >>> >> > I've just backported to reef several fixes that I introduced in 
> >>> >>> >> > the last months for the rook orchestrator. Most of them are 
> >>> >>> >> > fixes for dashboard issues/crashes that only happen on Rook 
> >>> >>> >> > environments. The PR [1] has all the changes and it was merged 
> >>> >>> >> > into reef this morning. We really need these changes to be part 
> >>> >>> >> > of the next reef release as the upcoming Rook stable version 
> >>> >>> >> > will be based on it.
> >>> >>> >> >
> >>> >>> >> > Please, can you include those changes in the upcoming reef 
> >>> >>> >> > 18.2.1 release?
> >>> >>> >> >
> >>> >>> >> > [1] https://github.com/ceph/ceph/pull/54224
> >>> >>> >> >
> >>> >>> >> > Thanks a lot,
> >>> >>> >> > Redouane.
> >>> >>> >> >
> >>> >>> >> >
> >>> >>> >> > On Mon, Nov 13, 2023 at 6:03 PM Yuri Weinstein 
> >>> >>> >> >  wrote:
> >>> >>> >> >>
> >>> >>> >> >> -- Forwarded message -
> >>> >>> >> >> From: Venky Shankar 
> >>> >>> >> >> Date: Thu, Nov 9, 2023 at 11:52 PM
> >>> >>> >> >> Subject: Re: [ceph-users] Re: reef 18.2.1 QE Validation status
> >>> >>> >> >> To: Yuri Weinstein 
> >>> >>> >> >> Cc: dev , ceph-users 
> >>> >>> >> >>
> >>> >>> >> >>
> >>> >>> >> >> Hi Yuri,
> >>> >>> >> >>
> >>> >>> >> >> On Fri, Nov 10, 2023 at 4:55 AM Yuri Weinstein 
> >>> >>> >> >>  wrote:
> >>> >>> >> >> >
> >>> >>> >> >> > I've updated all approvals and merged PRs in the tracker and 
> >>> >>> >> >> > it looks
> >>> >>> >> >> > like we are ready for gibba, LRC upgrades pending 
> >>> >>> >> >> > approval/update from
> >>> >>> >> >> > Venky.
> >>> >>> >> >>
> >>> >>> >> >> The smoke test failure is caused by missing (kclient) patches in
> >>> >>> >> >> Ubuntu 20.0

[ceph-users] Re: Debian 12 support

2023-11-15 Thread Daniel Baumann
On 11/15/23 19:52, Daniel Baumann wrote:
> for 18.2.0, there's only one trivial thing needed:
> https://git.progress-linux.org/packages/graograman-backports-extras/ceph/commit/?id=ed59c69244ec7b81ec08f7a2d1a1f0a90e765de0

or, for mainline inclusion, an alternative depends would be suitable too:

Build-Depends: g++-11 | g++ (>= 7)

Regards,
Daniel
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: per-rbd snapshot limitation

2023-11-15 Thread David C.
I don't think this parameter exists (today)
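
In the meantime, a trivial wrapper around the two rbd commands quoted below
would give the same effect at creation time. Just a sketch, not a ceph
feature; names and the default limit are placeholders:

#!/bin/bash
# create an rbd image and immediately cap its snapshot count
set -euo pipefail
image_spec="$1"        # e.g. testpool/test3
size="$2"              # e.g. 100M
limit="${3:-3}"        # snapshot limit to enforce
rbd create "${image_spec}" --size="${size}"
rbd snap limit set "${image_spec}" --limit "${limit}"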

Le mer. 15 nov. 2023 à 19:25, Wesley Dillingham  a
écrit :

> Are you aware of any config item that can be set (perhaps in the ceph.conf
> or config db) so the limit is enforced immediately at creation time without
> needing to set it for each rbd?
>
> Respectfully,
>
> *Wes Dillingham*
> w...@wesdillingham.com
> LinkedIn 
>
>
> On Wed, Nov 15, 2023 at 1:14 PM David C.  wrote:
>
>> rbd create testpool/test3 --size=100M
>> rbd snap limit set testpool/test3 --limit 3
>>
>>
>> Le mer. 15 nov. 2023 à 17:58, Wesley Dillingham 
>> a écrit :
>>
>>> looking into how to limit snapshots at the ceph level for RBD snapshots.
>>> Ideally ceph would enforce an arbitrary number of snapshots allowable per
>>> rbd.
>>>
>>> Reading the man page for rbd command I see this option:
>>> https://docs.ceph.com/en/quincy/man/8/rbd/#cmdoption-rbd-limit
>>>
>>> --limit
>>>
>>> Specifies the limit for the number of snapshots permitted.
>>>
>>> Seems perfect. But on attempting to use it as such I get an error:
>>>
>>> admin@rbdtest:~$ rbd create testpool/test3 --size=100M --limit=3
>>> rbd: unrecognised option '--limit=3'
>>>
>>> Where am I going wrong here? Is there another way to enforce a limit of
>>> snapshots for RBD? Thanks.
>>>
>>> Respectfully,
>>>
>>> *Wes Dillingham*
>>> w...@wesdillingham.com
>>> LinkedIn 
>>> ___
>>> ceph-users mailing list -- ceph-users@ceph.io
>>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>>
>>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Debian 12 support

2023-11-15 Thread Daniel Baumann
On 11/15/23 19:31, Gregory Farnum wrote:
> There are versioning and dependency issues

for 18.2.0, there's only one trivial thing needed:

https://git.progress-linux.org/packages/graograman-backports-extras/ceph/commit/?id=ed59c69244ec7b81ec08f7a2d1a1f0a90e765de0

then, the packages build fine/as-is on bookworm.

Regards,
Daniel
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Debian 12 support

2023-11-15 Thread Gregory Farnum
There are versioning and dependency issues (both of packages, and compiler
toolchain pieces) which mean that the existing reef releases do not build
on Debian. Our upstream support for Debian has always been inconsistent
because we don’t have anybody dedicated or involved enough in both Debian
and Ceph to keep it working on a day-to-day basis. (I think the basic
problem is Debian makes much bigger and less frequent jumps in compiler
toolchains and packages than the other distros we work with, and none of
the developers have used it as their working OS since ~2010.)

Matthew has submitted a number of PRs to deal with those issues that are in
the reef branch and will let us do upstream builds for the next point
release. (Thanks Matthew!) Proxmox may have grabbed them or done their own
changes without pushing them upstream, or they might have found some other
workarounds that fit their needs.
-Greg

On Mon, Nov 13, 2023 at 8:42 AM Luke Hall 
wrote:

> On 13/11/2023 16:28, Daniel Baumann wrote:
> > On 11/13/23 17:14, Luke Hall wrote:
> >> How is it that Proxmox were able to release Debian12 packages for Quincy
> >> quite some time ago?
> >
> > because you can, as always, just (re-)build the package yourself.
>
> I guess I was just trying to point out that there seems to be nothing
> fundamentally blocking these builds which makes it more surprising that
> the official Ceph repo doesn't have Debian12 packages yet.
>
> >> My understanding is that they change almost nothing in their packages
> >> and just roll them to fit with their naming schema etc.
> >
> > yes, we're doing the same since kraken and put them in our own repo
> > (either builds of the "original" ceph sources, or backports from debian
> > - whichever is earlier available).. which is easier/simpler/more
> > reliable and avoids any dependency on external repositories.
> >
> > Regards,
> > Daniel
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
> --
> All postal correspondence to:
> The Positive Internet Company, 24 Ganton Street, London. W1F 7QY
>
> *Follow us on Twitter* @posipeople
>
> The Positive Internet Company Limited is registered in England and Wales.
> Registered company number: 3673639. VAT no: 726 7072 28.
> Registered office: Northside House, Mount Pleasant, Barnet, Herts, EN4 9EE.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: per-rbd snapshot limitation

2023-11-15 Thread Wesley Dillingham
Are you aware of any config item that can be set (perhaps in the ceph.conf
or config db) so the limit is enforced immediately at creation time without
needing to set it for each rbd?

Respectfully,

*Wes Dillingham*
w...@wesdillingham.com
LinkedIn 


On Wed, Nov 15, 2023 at 1:14 PM David C.  wrote:

> rbd create testpool/test3 --size=100M
> rbd snap limit set testpool/test3 --limit 3
>
>
> Le mer. 15 nov. 2023 à 17:58, Wesley Dillingham  a
> écrit :
>
>> looking into how to limit snapshots at the ceph level for RBD snapshots.
>> Ideally ceph would enforce an arbitrary number of snapshots allowable per
>> rbd.
>>
>> Reading the man page for rbd command I see this option:
>> https://docs.ceph.com/en/quincy/man/8/rbd/#cmdoption-rbd-limit
>>
>> --limit
>>
>> Specifies the limit for the number of snapshots permitted.
>>
>> Seems perfect. But on attempting to use it as such I get an error:
>>
>> admin@rbdtest:~$ rbd create testpool/test3 --size=100M --limit=3
>> rbd: unrecognised option '--limit=3'
>>
>> Where am I going wrong here? Is there another way to enforce a limit of
>> snapshots for RBD? Thanks.
>>
>> Respectfully,
>>
>> *Wes Dillingham*
>> w...@wesdillingham.com
>> LinkedIn 
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Join us for the User + Dev Monthly Meetup - November 16!

2023-11-15 Thread Laura Flores
Hi Ceph users and developers,

I wanted to inform you about a change in the agenda for tomorrow's User +
Dev meeting.

Our originally scheduled speaker, Christian Theune, needs to reschedule his
presentation on "Operational Reliability and Flexibility in Ceph Upgrades"
to a later date.

However, we are excited to introduce a new focus topic: Zac Dover, the Ceph
Documentation lead, will be presenting a first draft of the "Ceph
Beginner's Guide", a guide geared toward making Ceph concepts accessible to
first-time users. Whether you are a Ceph beginner, seasoned user, or
developer, your feedback on this guide will be most welcome.

The meeting remains scheduled for tomorrow, November 16th at 10:00 AM EST.
Feel free to update any questions or comments you may have under the "Open
Discussion" section accordingly:
https://pad.ceph.com/p/ceph-user-dev-monthly-minutes

As always, if you have an idea for a focus topic you'd like to present at a
future meeting, you are welcome to submit it through our Google Form:
https://docs.google.com/forms/d/e/1FAIpQLSdboBhxVoBZoaHm8xSmeBoemuXoV_rmh4vJDGBrp6d-D3-BlQ/viewform?usp=sf_link
.

Thanks,
Laura Flores

On Mon, Nov 13, 2023 at 12:42 PM Laura Flores  wrote:

> Hi Ceph users and developers,
>
> You are invited to join us at the User + Dev meeting this week Thursday,
> November 16th at 10:00 AM EST! See below for more meeting details.
>
> The focus topic, "Operational Reliability and Flexibility in Ceph
> Upgrades", will be presented by Christian Theune. His presentation will
> highlight some issues encountered in a long running cluster when upgrading
> to stable releases, including migration between EC profiles, challenges
> related to RGW zone replication, and low-level bugs that need more
> attention from developers.
>
> The last part of the meeting will be dedicated to open discussion. Feel
> free to add questions for the speakers or additional topics under the "Open
> Discussion" section on the agenda:
> https://pad.ceph.com/p/ceph-user-dev-monthly-minutes
>
> If you have an idea for a focus topic you'd like to present at a future
> meeting, you are welcome to submit it to this Google Form:
> https://docs.google.com/forms/d/e/1FAIpQLSdboBhxVoBZoaHm8xSmeBoemuXoV_rmh4vJDGBrp6d-D3-BlQ/viewform?usp=sf_link
> Any Ceph user or developer is eligible to submit!
>
> Thanks,
> Laura Flores
>
> Meeting link: https://meet.jit.si/ceph-user-dev-monthly
>
> Time conversions:
> UTC:   Thursday, November 16, 15:00 UTC
> Mountain View, CA, US: Thursday, November 16,  7:00 PST
> Phoenix, AZ, US:   Thursday, November 16,  8:00 MST
> Denver, CO, US:Thursday, November 16,  8:00 MST
> Huntsville, AL, US:Thursday, November 16,  9:00 CST
> Raleigh, NC, US:   Thursday, November 16, 10:00 EST
> London, England:   Thursday, November 16, 15:00 GMT
> Paris, France: Thursday, November 16, 16:00 CET
> Helsinki, Finland: Thursday, November 16, 17:00 EET
> Tel Aviv, Israel:  Thursday, November 16, 17:00 IST
> Pune, India:   Thursday, November 16, 20:30 IST
> Brisbane, Australia:   Friday, November 17,  1:00 AEST
> Singapore, Asia:   Thursday, November 16, 23:00 +08
> Auckland, New Zealand: Friday, November 17,  4:00 NZDT
>
> --
>
> Laura Flores
>
> She/Her/Hers
>
> Software Engineer, Ceph Storage 
>
> Chicago, IL
>
> lflo...@ibm.com | lflo...@redhat.com 
> M: +17087388804
>
>
>

-- 

Laura Flores

She/Her/Hers

Software Engineer, Ceph Storage 

Chicago, IL

lflo...@ibm.com | lflo...@redhat.com 
M: +17087388804
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: per-rbd snapshot limitation

2023-11-15 Thread David C.
rbd create testpool/test3 --size=100M
rbd snap limit set testpool/test3 --limit 3


Le mer. 15 nov. 2023 à 17:58, Wesley Dillingham  a
écrit :

> looking into how to limit snapshots at the ceph level for RBD snapshots.
> Ideally ceph would enforce an arbitrary number of snapshots allowable per
> rbd.
>
> Reading the man page for rbd command I see this option:
> https://docs.ceph.com/en/quincy/man/8/rbd/#cmdoption-rbd-limit
>
> --limit
>
> Specifies the limit for the number of snapshots permitted.
>
> Seems perfect. But on attempting to use it as such I get an error:
>
> admin@rbdtest:~$ rbd create testpool/test3 --size=100M --limit=3
> rbd: unrecognised option '--limit=3'
>
> Where am I going wrong here? Is there another way to enforce a limit of
> snapshots for RBD? Thanks.
>
> Respectfully,
>
> *Wes Dillingham*
> w...@wesdillingham.com
> LinkedIn 
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: per-rbd snapshot limitation

2023-11-15 Thread Wesley Dillingham
Perfect, thank you.

Respectfully,

*Wes Dillingham*
w...@wesdillingham.com
LinkedIn 


On Wed, Nov 15, 2023 at 1:00 PM Ilya Dryomov  wrote:

> On Wed, Nov 15, 2023 at 5:57 PM Wesley Dillingham 
> wrote:
> >
> > looking into how to limit snapshots at the ceph level for RBD snapshots.
> > Ideally ceph would enforce an arbitrary number of snapshots allowable per
> > rbd.
> >
> > Reading the man page for rbd command I see this option:
> > https://docs.ceph.com/en/quincy/man/8/rbd/#cmdoption-rbd-limit
> >
> > --limit
> >
> > Specifies the limit for the number of snapshots permitted.
> >
> > Seems perfect. But on attempting to use it as such I get an error:
> >
> > admin@rbdtest:~$ rbd create testpool/test3 --size=100M --limit=3
> > rbd: unrecognised option '--limit=3'
> >
> > Where am I going wrong here? Is there another way to enforce a limit of
> > snapshots for RBD? Thanks.
>
> Hi Wes,
>
> I think you want "rbd snap limit set --limit 3 testpool/test3".
>
> Thanks,
>
> Ilya
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: per-rbd snapshot limitation

2023-11-15 Thread Ilya Dryomov
On Wed, Nov 15, 2023 at 5:57 PM Wesley Dillingham  
wrote:
>
> looking into how to limit snapshots at the ceph level for RBD snapshots.
> Ideally ceph would enforce an arbitrary number of snapshots allowable per
> rbd.
>
> Reading the man page for rbd command I see this option:
> https://docs.ceph.com/en/quincy/man/8/rbd/#cmdoption-rbd-limit
>
> --limit
>
> Specifies the limit for the number of snapshots permitted.
>
> Seems perfect. But on attempting to use it as such I get an error:
>
> admin@rbdtest:~$ rbd create testpool/test3 --size=100M --limit=3
> rbd: unrecognised option '--limit=3'
>
> Where am I going wrong here? Is there another way to enforce a limit of
> snapshots for RBD? Thanks.

Hi Wes,

I think you want "rbd snap limit set --limit 3 testpool/test3".

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] per-rbd snapshot limitation

2023-11-15 Thread Wesley Dillingham
looking into how to limit snapshots at the ceph level for RBD snapshots.
Ideally ceph would enforce an arbitrary number of snapshots allowable per
rbd.

Reading the man page for rbd command I see this option:
https://docs.ceph.com/en/quincy/man/8/rbd/#cmdoption-rbd-limit

--limit

Specifies the limit for the number of snapshots permitted.

Seems perfect. But on attempting to use it as such I get an error:

admin@rbdtest:~$ rbd create testpool/test3 --size=100M --limit=3
rbd: unrecognised option '--limit=3'

Where am I going wrong here? Is there another way to enforce a limit of
snapshots for RBD? Thanks.

Respectfully,

*Wes Dillingham*
w...@wesdillingham.com
LinkedIn 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Ceph Leadership Team Meeting Minutes Nov 15, 2023

2023-11-15 Thread Ernesto Puerta
Hi Cephers,

These are the topics discussed today:


   - 18.2.1


   - Almost ready, packages built/signed


   - Plan to release on Monday


   - Last minute PR for Rook


   - Lab update to be finished by tomorrow


   - Finalize CDM APAC time


   - Review
   https://doodle.com/meeting/participate/id/aM9XGZ3a/vote?reauth=true


   - New time: 9.30 Pacific Time


   - Laura and Neha will sync up about changing the time


   - Squid dev freeze date


   - Proposal: end of January


   - Docs question: https://tracker.ceph.com/issues/11385: Can a member of
   the community just raise a PR attempting to standardize commands, without
   coordinating with a team?


   - ballpark date for pacific eol? docs still have the '2023-10-01'
   estimate


   - Discussion regarding EOL being at time of release or at some point
   shortly after to account for regressions


   - Conversation regarding clarity of messaging being important re:
   expectations for post-release fixes


   - concerns on stability of minor releases


   - tentatively this year


   - distro status update: still working to remove centos8/rhel8/ubuntu20
   from main. https://github.com/ceph/ceph/pull/53901 stalled on container
   stuff in teuthology, and the need to rebuild containers with centos9 base


   - discuss the quincy/dashboard-v3 backports? was tabled from 11/1
   [postponed to Nov 22]


   - Docs (Zac): CQ January 2024
   https://pad.ceph.com/p/ceph_quarterly_2024_01


   - Unittestability of dencoding: ask for review --
   
https://trello.com/c/R0h47dq2/870-unittestability-of-dencoding#comment-65539327d7112bc652ce43aa


   - User + Dev meeting tomorrow


   - Need RGW representatives and people with EC profile knowledge


Kind Regards,

Ernesto
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: iSCSI GW trusted IPs

2023-11-15 Thread Brent Kennedy
I just set up iSCSI on a reef cluster and I couldn’t add targets properly until 
I put in the username and password entered for the gateways via the "Discovery 
Authentication" button at the top of the targets page in the iscsi area.  I 
don’t remember if the quincy console had that though.  In my previous setup, it 
was something you entered through the command line.

-Brent

-Original Message-
From: Ramon Orrù  
Sent: Wednesday, November 15, 2023 6:27 AM
To: ceph-users@ceph.io
Subject: [ceph-users] iSCSI GW trusted IPs

Hi,
I’m configuring  the  iSCSI GW services on a quincy  17.2.3 cluster.

I brought almost everything up and running (using cephadm), but I’m stuck in a 
configuration detail:

if I check the gateway status in the   Block -> iSCSI -> Overview section of 
the dashboard, they’re showing “Down” status, while the gateways are actually 
running. It makes me think the mgr is not able to talk with iSCSI APIs in order 
to collect info on the gateways, even though I correctly added my mgr hosts IPs to 
the trusted_ip_list parameter in my iscsi service definition yaml.

While further checking the gateway logs I found some messages like: 

debug :::172.17.17.22 - - [15/Nov/2023 10:54:05] "GET 
/api/config?decrypt_passwords=True HTTP/1.1" 200 - debug :::172.17.17.22 - 
- [15/Nov/2023 10:54:05] "GET /api/_ping HTTP/1.1" 200 - debug 
:::172.17.17.22 - - [15/Nov/2023 10:54:05] "GET /api/gatewayinfo HTTP/1.1" 
200 -

Just after I reload the dashboard page. So I tried to add the 172.17.17.22 IP 
address to trusted_ip_list and it worked: iSCSI gateways status went green and 
Up on the dashboard.
It sounds to me like it's some container private network address, but I can’t 
find any evidence of it when inspecting the containers cephadm spawned.

My question is: how can I identify the IPs I need to make the iSCSI gateways 
properly reachable? I tried to add the whole  172.16.0.0/24 private class but 
no luck , the iscsi container starts but is not allowing  172.17.17.22 to 
access the APIs.

Thanks in advance

regards

Ramon


___
ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to 
ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: migrate wal/db to block device

2023-11-15 Thread Chris Dunlop

Hi Igor,

On Wed, Nov 15, 2023 at 12:30:57PM +0300, Igor Fedotov wrote:

Hi Chris,

haven't checked your actions thoroughly, but migration is to be done on a 
down OSD which is apparently not the case here.


Maybe that's the culprit and we/you somehow missed the relevant error 
during the migration process?


The migration was done with the container still running, but the osd 
process was stopped within the container, like so:


$ cephadm  unit --fsid ${fsid} --name osd.${osdid} stop

I've confirmed that command indeed stops the ceph-osd process.

I restored the tags on both the db and block LVs (the db LV had all its 
tags removed, and the block LV had the db_device and db_uuid tags removed 
during the previous "lvm migrate" attempt) and confirmed "ceph-volume lvm 
list" then returned the same as before the previous "lvm migrate" attempt.  
(I'm pretty sure "ceph-volume lvm list" just reads the tags direct from 
the LVs and presents them in a formatted output.)


I then tried the migrate again, this time stopping the container before 
the migrate:


$ systemctl stop "${osd_service}"
$ cephadm shell --fsid "${fsid}" --name "osd.${osd}" -- \
  ceph-volume lvm migrate --osd-id "${osd}" --osd-fsid "${osd_fsid}" \
  --from db wal --target "${vg_lv}"
$ systemctl start "${osd_service}"

Unfortunately that had precisely the same result:

- "lsof" shows the new osd process still has the original fast wal/db 
  device open

- "iostat" shows this device is still getting i/o
- both "ceph-volume lvm list" and "lvs -o tag" show all the tags have been 
  removed from the db device, and the db_device and db_uuid tags have been 
  removed from the block device.


Notably, whilst the "lvm migrate" is running, "iostat" on the db device 
shows very high read activity (and no write activity), so it's certainly 
reading whatever is on there, presumably to copy the data to the block 
device.


However even after the migrate something is making the osd start up with 
the original db device rather than using the block device for the db.
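
For completeness, the checks behind those observations are along these lines
(the device, VG/LV and id variables are placeholders for my actual values):

$ lsof /dev/${db_vg}/${db_lv}                         # which process still holds the old db LV open
$ ls -l /var/lib/ceph/${fsid}/osd.${osdid}/block*     # where the block / block.db symlinks point
$ lvs -o lv_tags ${db_vg}/${db_lv}                    # tags remaining on the old db LV
$ cephadm shell --fsid ${fsid} --name osd.${osdid} -- ceph-volume lvm list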


Any ideas?

Cheers,

Chris
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] iSCSI GW trusted IPs

2023-11-15 Thread Ramon Orrù
Hi, 
I’m configuring  the  iSCSI GW services on a quincy  17.2.3 cluster.

I brought almost everything up and running (using cephadm), but I’m stuck in a 
configuration detail:

if I check the gateway status in the   Block -> iSCSI -> Overview section of 
the dashboard, they’re showing “Down” status, while the gateways are actually 
running. It makes me think the mgr is not able to talk with iSCSI APIs in order 
to collect info on the gateways, even though I correctly added my mgr hosts IPs to 
the trusted_ip_list parameter in my iscsi service definition yaml.
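
For reference, the service definition is roughly of this shape (host names,
pool and credentials below are placeholders, not my real values), applied
with "ceph orch apply -i iscsi.yaml":

service_type: iscsi
service_id: iscsi
placement:
  hosts:
    - gw-host-1
    - gw-host-2
spec:
  pool: iscsi-pool
  api_user: admin
  api_password: secret
  trusted_ip_list: "192.168.1.10,192.168.1.11"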

While further checking the gateway logs I found some messages like: 

debug :::172.17.17.22 - - [15/Nov/2023 10:54:05] "GET 
/api/config?decrypt_passwords=True HTTP/1.1" 200 -
debug :::172.17.17.22 - - [15/Nov/2023 10:54:05] "GET /api/_ping HTTP/1.1" 
200 -
debug :::172.17.17.22 - - [15/Nov/2023 10:54:05] "GET /api/gatewayinfo 
HTTP/1.1" 200 -

Just after I reload the dashboard page. So I tried to add the 172.17.17.22 IP 
address to trusted_ip_list and it worked: iSCSI gateways status went green and 
Up on the dashboard.
It sounds to me like it's some container private network address, but I can’t 
find any evidence of it when inspecting the containers cephadm spawned.

My question is: how can I identify the IPs I need to make the iSCSI gateways 
properly reachable? I tried to add the whole  172.16.0.0/24 private class but 
no luck , the iscsi container starts but is not allowing  172.17.17.22 to 
access the APIs.

Thanks in advance

regards

Ramon


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] planning upgrade from pacific to quincy

2023-11-15 Thread Simon Oosthoek

Hi All

after a long while being in health_err state (due to an unfound object, 
which we eventually decided to "forget"), we are now planning to upgrade 
our cluster which is running Pacific (at least on the mons/mdss/osds, 
the gateways are by accident running quincy already). The installation 
is via packages from ceph.com, except for the quincy packages, which come from Ubuntu.


ceph versions:
"mon": {"ceph version 16.2.13 
(5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)": 3},
"mgr": {"ceph version 16.2.13 
(5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)": 3},
"osd": {"ceph version 16.2.13 
(5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)": 252,
 "ceph version 16.2.14 (238ba602515df21ea7ffc75c88db29f9e5ef12c9) 
pacific (stable)": 12 },
"mds": { "ceph version 16.2.13 
(5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)": 2 },
"rgw": {"ceph version 17.2.6 
(d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)": 8 },
"overall": {"ceph version 16.2.13 
(5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)": 260,
"ceph version 16.2.14 
(238ba602515df21ea7ffc75c88db29f9e5ef12c9) pacific (stable)": 12,
"ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) 
quincy (stable)": 8 }


The OS on the mons and mdss are still ubuntu 18.04, the osds are a mix 
of ubuntu 18 and ubuntu 20. The gateways are ubuntu 22.04, which is why 
these are already on quincy.


The plan is to move to quincy and eventually cephadm/containerized ceph, 
since that is apparently "the way to go", though I have my doubts.


The steps, in what we think is the right order, are:
- reinstall the mons with ubuntu 22.04 + quincy
- reinstall the osds (same)
- reinstall the mdss (same)

Once this is up and running, we want to investigate and migrate to 
cephadm orchestration.
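
If we go that route, my understanding from the cephadm docs is that per-host
adoption would look roughly like this (untested by us; daemon names are
placeholders):

$ cephadm adopt --style legacy --name mon.$(hostname -s)
$ cephadm adopt --style legacy --name mgr.$(hostname -s)
$ cephadm adopt --style legacy --name osd.<id>    # repeated for each OSD on the host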


An alternative appears to be: move to orchestration first and then upgrade 
ceph to quincy (possibly skipping the ubuntu upgrade?)


Another alternative could be to upgrade to quincy on ubuntu 18.04 using 
packages, but I haven't investigated the availability of quincy packages 
for ubuntu 18.04 (which is out of free (LTS) support by canonical)


Cheers

/Simon
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: migrate wal/db to block device

2023-11-15 Thread Eugen Block
Oh right, I responded from my mobile phone and missed the examples.  
Thanks for the clarification!

OP did stop the OSD according to his output:


$ cephadm  unit --fsid ${fsid} --name osd.${osdid} stop


But there might have been an error anyway, I guess.

Zitat von Igor Fedotov :


Hi Eugen,

this scenario is supported, see the last example on the relevant doc page:

Moves BlueFS data from main, DB and WAL devices to main device, WAL  
and DB are removed:


ceph-volume lvm migrate --osd-id 1 --osd-fsid <osd-fsid> --from db wal --target vgname/data



Thanks,
Igor

On 11/15/2023 11:20 AM, Eugen Block wrote:

Hi,

AFAIU, you can’t migrate back to the slow device. It’s either  
migrating from the slow device to a fast device or move between  
fast devices. I’m not aware that your scenario was considered in  
that tool. The docs don’t specifically say that, but they also  
don’t mention going back to slow device only. Someone please  
correct me, but I’d say you’ll have to rebuild that OSD to detach  
it from the fast device.


Regards,
Eugen

Zitat von Chris Dunlop :


Hi,

What's the correct way to migrate an OSD wal/db from a fast device  
to the (slow) block device?


I have an osd with wal/db on a fast LV device and block on a slow  
LV device. I want to move the wal/db onto the block device so I  
can reconfigure the fast device before moving the wal/db back to  
the fast device.


This link says to use "ceph-volume lvm migrate" (I'm on pacific,  
but the quincy and reef docs are the same):


https://docs.ceph.com/en/pacific/ceph-volume/lvm/migrate/

I tried:

$ cephadm  unit --fsid ${fsid} --name osd.${osdid} stop
$ cephadm shell --fsid ${fsid} --name osd.${osdid} -- \
  ceph-volume lvm migrate --osd-id ${osdid} --osd-fsid ${osd_fsid} \
  --from db wal --target ${block_vglv}
$ systemctl stop ${osd_service}
$ systemctl start ${osd_service}

"cephadm ceph-volume lvm list" now shows only the (slow) block  
device whereas before the migrate it was showing both the block  
and db devices.  However "lsof" shows the new osd process still  
has the original fast wal/db device open and "iostat" shows this  
device is still getting i/o.


Also:

$ ls -l /var/lib/ceph/${fsid}/osd.${osdid}/block*

...shows both the "block" and "block.db" symlinks to the original  
separate devices.


And there are now no lv_tags on the original wal/db LV:

$ lvs -o lv_tags ${original_db_vg_lv}

Now I'm concerned there's device mismatch for this osd: "cephadm  
ceph-volume lvm list" believes there's no separate wal/db, but the  
osd is currently *using* the original separate wal/db.


I guess if the server were to restart this osd would be in all  
sorts of trouble.


What's going on there, and what can be done to fix it?  Is it a  
matter of recreating the tags on the original db device?  (But  
then what happens to whatever did get migrated to the block device  
- e.g. is that space lost?)
Or is it a matter of using ceph-bluestore-tool to do a  
bluefs-bdev-migrate, e.g. something like:


$ cephadm  unit --fsid ${fsid} --name osd.${osdid} stop
$ osddir=/var/lib/ceph/osd/ceph-${osdid}
$ cephadm shell --fsid ${fsid} --name osd.${osdid} -- \
  ceph-bluestore-tool --path ${osddir} --devs-source ${osddir}/block.db \
  --dev-target ${osddir}/block bluefs-bdev-migrate
$ rm /var/lib/ceph/${fsid}/osd.${osdid}/block.db
$ systemctl stop ${osd_service}
$ systemctl start ${osd_service}

Or... something else?


And how *should* moving the wal/db be done?

Cheers,

Chris
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


--
Igor Fedotov
Ceph Lead Developer

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] How to configure something like osd_deep_scrub_min_interval?

2023-11-15 Thread Frank Schilder
Hi folks,

I am fighting a bit with odd deep-scrub behavior on HDDs and discovered a 
likely cause of why the distribution of last_deep_scrub_stamps is so weird. I 
wrote a small script to extract a histogram of scrubs by "days not scrubbed" 
(more precisely, intervals not scrubbed; see code) to find out how (deep-) 
scrub times are distributed. Output below.
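
(The script itself isn't included inline; a rough sketch of what it does,
assuming jq, GNU date and the JSON layout of "ceph pg dump", is below. The
scrub report is the same thing with last_scrub_stamp and 6h intervals.)

#!/bin/bash
# bucket PGs by how many 24h intervals have passed since their last deep scrub
interval_h=24
now=$(date +%s)
ceph pg dump --format json 2>/dev/null |
  jq -r '.pg_map.pg_stats[].last_deep_scrub_stamp' |
  while read -r stamp; do
    age_h=$(( (now - $(date -d "$stamp" +%s)) / 3600 ))
    echo $(( age_h / interval_h + 1 ))
  done |
  sort -n | uniq -c |
  awk -v ih="$interval_h" \
    '{printf "%8d PGs not deep-scrubbed since %2d intervals (%dh)\n", $1, $2, ih}'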

What I expected is along the lines that HDD-OSDs try to scrub every 1-3 days, 
while they try to deep-scrub every 7-14 days. In other words, OSDs that have 
been deep-scrubbed within the last 7 days would *never* be in scrubbing+deep 
state. However, what I see is completely different. There seems to be no 
distinction between scrub- and deep-scrub start times. This is really 
unexpected as nobody would try to deep-scrub HDDs every day. Weekly to 
bi-weekly is normal, specifically for large drives.

Is there a way to configure something like osd_deep_scrub_min_interval (no, I 
don't want to run cron jobs for scrubbing yet)? In the output below, I would 
like to be able to configure a minimum period of 1-2 weeks before the next 
deep-scrub happens. How can I do that?

The observed behavior is very unusual for RAID systems (if it's not a bug in the 
report script). With this behavior it's not surprising that people complain 
about "not deep-scrubbed in time" messages and too high deep-scrub IO load when 
such a large percentage of OSDs is needlessly deep-scrubbed again after only 
1-6 days.

Sample output:

# scrub-report 
dumped pgs

Scrub report:
   4121 PGs not scrubbed since  1 intervals (6h)
   3831 PGs not scrubbed since  2 intervals (6h)
   4012 PGs not scrubbed since  3 intervals (6h)
   3986 PGs not scrubbed since  4 intervals (6h)
   2998 PGs not scrubbed since  5 intervals (6h)
   1488 PGs not scrubbed since  6 intervals (6h)
909 PGs not scrubbed since  7 intervals (6h)
771 PGs not scrubbed since  8 intervals (6h)
582 PGs not scrubbed since  9 intervals (6h) 2 scrubbing
431 PGs not scrubbed since 10 intervals (6h)
333 PGs not scrubbed since 11 intervals (6h) 1 scrubbing
265 PGs not scrubbed since 12 intervals (6h)
195 PGs not scrubbed since 13 intervals (6h)
116 PGs not scrubbed since 14 intervals (6h)
 78 PGs not scrubbed since 15 intervals (6h) 1 scrubbing
 72 PGs not scrubbed since 16 intervals (6h)
 37 PGs not scrubbed since 17 intervals (6h)
  5 PGs not scrubbed since 18 intervals (6h) 14.237* 19.5cd* 19.12cc* 
19.1233* 14.40e*
 33 PGs not scrubbed since 20 intervals (6h)
 23 PGs not scrubbed since 21 intervals (6h)
 16 PGs not scrubbed since 22 intervals (6h)
 12 PGs not scrubbed since 23 intervals (6h)
  8 PGs not scrubbed since 24 intervals (6h)
  2 PGs not scrubbed since 25 intervals (6h) 19.eef* 19.bb3*
  4 PGs not scrubbed since 26 intervals (6h) 19.b4c* 19.10b8* 19.f13* 
14.1ed*
  5 PGs not scrubbed since 27 intervals (6h) 19.43f* 19.231* 19.1dbe* 
19.1788* 19.16c0*
  6 PGs not scrubbed since 28 intervals (6h)
  2 PGs not scrubbed since 30 intervals (6h) 19.10f6* 14.9d*
  3 PGs not scrubbed since 31 intervals (6h) 19.1322* 19.1318* 8.a*
  1 PGs not scrubbed since 32 intervals (6h) 19.133f*
  1 PGs not scrubbed since 33 intervals (6h) 19.1103*
  3 PGs not scrubbed since 36 intervals (6h) 19.19cc* 19.12f4* 19.248*
  1 PGs not scrubbed since 39 intervals (6h) 19.1984*
  1 PGs not scrubbed since 41 intervals (6h) 14.449*
  1 PGs not scrubbed since 44 intervals (6h) 19.179f*

Deep-scrub report:
   3723 PGs not deep-scrubbed since  1 intervals (24h)
   4621 PGs not deep-scrubbed since  2 intervals (24h) 8 scrubbing+deep
   3588 PGs not deep-scrubbed since  3 intervals (24h) 8 scrubbing+deep
   2929 PGs not deep-scrubbed since  4 intervals (24h) 3 scrubbing+deep
   1705 PGs not deep-scrubbed since  5 intervals (24h) 4 scrubbing+deep
   1904 PGs not deep-scrubbed since  6 intervals (24h) 5 scrubbing+deep
   1540 PGs not deep-scrubbed since  7 intervals (24h) 7 scrubbing+deep
   1304 PGs not deep-scrubbed since  8 intervals (24h) 7 scrubbing+deep
923 PGs not deep-scrubbed since  9 intervals (24h) 5 scrubbing+deep
557 PGs not deep-scrubbed since 10 intervals (24h) 7 scrubbing+deep
501 PGs not deep-scrubbed since 11 intervals (24h) 2 scrubbing+deep
363 PGs not deep-scrubbed since 12 intervals (24h) 2 scrubbing+deep
377 PGs not deep-scrubbed since 13 intervals (24h) 1 scrubbing+deep
383 PGs not deep-scrubbed since 14 intervals (24h) 2 scrubbing+deep
252 PGs not deep-scrubbed since 15 intervals (24h) 2 scrubbing+deep
116 PGs not deep-scrubbed since 16 intervals (24h) 5 scrubbing+deep
 47 PGs not deep-scrubbed since 17 intervals (24h) 2 scrubbing+deep
 10 PGs not deep-scrubbed since 18 intervals (24h)
  2 PGs not deep-scrubbed since 19 intervals (24h) 19.1c6c* 19.a01*
  1 PGs not deep-scrubbed since 20 intervals (24h) 14.1ed*
  2 PGs not deep-scrubbed since 21 intervals (24h) 1

[ceph-users] Re: migrate wal/db to block device

2023-11-15 Thread Igor Fedotov

Hi Chris,

haven't checked your actions thoroughly, but migration is to be done on a 
down OSD which is apparently not the case here.


Maybe that's the culprit and we/you somehow missed the relevant error 
during the migration process?



Thanks,

Igor

On 11/15/2023 5:33 AM, Chris Dunlop wrote:

Hi,

What's the correct way to migrate an OSD wal/db from a fast device to 
the (slow) block device?


I have an osd with wal/db on a fast LV device and block on a slow LV 
device. I want to move the wal/db onto the block device so I can 
reconfigure the fast device before moving the wal/db back to the fast 
device.


This link says to use "ceph-volume lvm migrate" (I'm on pacific, but 
the quincy and reef docs are the same):


https://docs.ceph.com/en/pacific/ceph-volume/lvm/migrate/

I tried:

$ cephadm  unit --fsid ${fsid} --name osd.${osdid} stop
$ cephadm shell --fsid ${fsid} --name osd.${osdid} -- \
  ceph-volume lvm migrate --osd-id ${osdid} --osd-fsid ${osd_fsid} \
  --from db wal --target ${block_vglv}
$ systemctl stop ${osd_service}
$ systemctl start ${osd_service}

"cephadm ceph-volume lvm list" now shows only the (slow) block device 
whereas before the migrate it was showing both the block and db 
devices.  However "lsof" shows the new osd process still has the 
original fast wal/db device open and "iostat" shows this device is 
still getting i/o.


Also:

$ ls -l /var/lib/ceph/${fsid}/osd.${osdid}/block*

...shows both the "block" and "block.db" symlinks to the original 
separate devices.


And there are now no lv_tags on the original wal/db LV:

$ lvs -o lv_tags ${original_db_vg_lv}

Now I'm concerned there's device mismatch for this osd: "cephadm 
ceph-volume lvm list" believes there's no separate wal/db, but the osd 
is currently *using* the original separate wal/db.


I guess if the server were to restart this osd would be in all sorts 
of trouble.


What's going on there, and what can be done to fix it?  Is it a matter 
of recreating the tags on the original db device?  (But then what 
happens to whatever did get migrated to the block device - e.g. is 
that space lost?)
Or is it a matter of using ceph-bluestore-tool to do a 
bluefs-bdev-migrate, e.g. something like:


$ cephadm  unit --fsid ${fsid} --name osd.${osdid} stop
$ osddir=/var/lib/ceph/osd/ceph-${osdid}
$ cephadm shell --fsid ${fsid} --name osd.${osdid} -- \
  ceph-bluestore-tool --path ${osddir} --devs-source ${osddir}/block.db \
  --dev-target ${osddir}/block bluefs-bdev-migrate
$ rm /var/lib/ceph/${fsid}/osd.${osdid}/block.db
$ systemctl stop ${osd_service}
$ systemctl start ${osd_service}

Or... something else?


And how *should* moving the wal/db be done?

Cheers,

Chris
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


--
Igor Fedotov
Ceph Lead Developer

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: migrate wal/db to block device

2023-11-15 Thread Igor Fedotov

Hi Eugen,

this scenario is supported, see the last example on the relevant doc page:

Moves BlueFS data from main, DB and WAL devices to main device; WAL and 
DB are removed:


ceph-volume lvm migrate --osd-id 1 --osd-fsid <osd-fsid> --from db wal \
  --target vgname/data
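
For the later step of moving the wal/db back onto the reconfigured fast 
device, ceph-volume also has "lvm new-db" to attach a fresh DB volume, after 
which migrate can move BlueFS data onto it. Roughly, and untested - with the 
OSD stopped and ${fast_vglv} being the new fast VG/LV:

$ cephadm shell --fsid ${fsid} --name osd.${osdid} -- \
  ceph-volume lvm new-db --osd-id ${osdid} --osd-fsid ${osd_fsid} \
  --target ${fast_vglv}                # attach a new, empty DB volume
$ cephadm shell --fsid ${fsid} --name osd.${osdid} -- \
  ceph-volume lvm migrate --osd-id ${osdid} --osd-fsid ${osd_fsid} \
  --from data --target ${fast_vglv}    # move BlueFS data from the main device onto it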


Thanks,
Igor

On 11/15/2023 11:20 AM, Eugen Block wrote:

Hi,

AFAIU, you can’t migrate back to the slow device. It’s either 
migrating from the slow device to a fast device or moving between fast 
devices. I’m not aware that your scenario was considered in that tool. 
The docs don’t specifically say that, but they also don’t mention 
going back to the slow device only. Someone please correct me, but I’d say 
you’ll have to rebuild that OSD to detach it from the fast device.


Regards,
Eugen

Zitat von Chris Dunlop :


Hi,

What's the correct way to migrate an OSD wal/db from a fast device to 
the (slow) block device?


I have an osd with wal/db on a fast LV device and block on a slow LV 
device. I want to move the wal/db onto the block device so I can 
reconfigure the fast device before moving the wal/db back to the fast 
device.


This link says to use "ceph-volume lvm migrate" (I'm on pacific, but 
the quincy and reef docs are the same):


https://docs.ceph.com/en/pacific/ceph-volume/lvm/migrate/

I tried:

$ cephadm  unit --fsid ${fsid} --name osd.${osdid} stop
$ cephadm shell --fsid ${fsid} --name osd.${osdid} -- \
  ceph-volume lvm migrate --osd-id ${osdid} --osd-fsid ${osd_fsid} \
  --from db wal --target ${block_vglv}
$ systemctl stop ${osd_service}
$ systemctl start ${osd_service}

"cephadm ceph-volume lvm list" now shows only the (slow) block device 
whereas before the migrate it was showing both the block and db 
devices.  However "lsof" shows the new osd process still has the 
original fast wal/db device open and "iostat" shows this device is 
still getting i/o.


Also:

$ ls -l /var/lib/ceph/${fsid}/osd.${osdid}/block*

...shows both the "block" and "block.db" symlinks to the original 
separate devices.


And there are now no lv_tags on the original wal/db LV:

$ lvs -o lv_tags ${original_db_vg_lv}

Now I'm concerned there's device mismatch for this osd: "cephadm 
ceph-volume lvm list" believes there's no separate wal/db, but the 
osd is currently *using* the original separate wal/db.


I guess if the server were to restart this osd would be in all sorts 
of trouble.


What's going on there, and what can be done to fix it?  Is it a 
matter of recreating the tags on the original db device?  (But then 
what happens to whatever did get migrated to the block device - e.g. 
is that space lost?)
Or is it a matter of using ceph-bluestore-tool to do a 
bluefs-bdev-migrate, e.g. something like:


$ cephadm  unit --fsid ${fsid} --name osd.${osdid} stop
$ osddir=/var/lib/ceph/osd/ceph-${osdid}
$ cephadm shell --fsid ${fsid} --name osd.${osdid} -- \
  ceph-bluestore-tool --path ${osddir} --devs-source ${osddir}/block.db \
  --dev-target ${osddir}/block bluefs-bdev-migrate
$ rm /var/lib/ceph/${fsid}/osd.${osdid}/block.db
$ systemctl stop ${osd_service}
$ systemctl start ${osd_service}

Or... something else?


And how *should* moving the wal/db be done?

Cheers,

Chris
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


--
Igor Fedotov
Ceph Lead Developer

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RGW: user modify default_storage_class does not work

2023-11-15 Thread Huy Nguyen
Thanks for your reply. You are right, newly-created buckets will now have 
"placement_rule": "default-placement/COLD". But then I have another question: 
can we specify the default storage class when creating a new bucket? I found 
a way to set the placement, but not the storage class:

s3.create_bucket(Bucket=bucket_name, 
CreateBucketConfiguration={'LocationConstraint': 'default:default-placement'})
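
As far as I know the S3 CreateBucket call itself does not accept a storage 
class, so one option is to keep the placement as above and pass the storage 
class per object; RGW resolves it against the storage classes defined for the 
bucket's placement target. A rough boto3 sketch - endpoint, credentials and 
bucket/key names are made up:

import boto3

s3 = boto3.client(
    's3',
    endpoint_url='http://rgw.example.com:8080',  # hypothetical RGW endpoint
    aws_access_key_id='ACCESS_KEY',              # placeholder credentials
    aws_secret_access_key='SECRET_KEY',
)

# Placement is still chosen at bucket creation time, as in the snippet above
s3.create_bucket(
    Bucket='cold-bucket',
    CreateBucketConfiguration={'LocationConstraint': 'default:default-placement'})

# The storage class is supplied per request; 'COLD' must exist as a storage
# class under the bucket's placement target in the zone configuration
s3.put_object(
    Bucket='cold-bucket',
    Key='example.txt',
    Body=b'hello world',
    StorageClass='COLD')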
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Stretch mode size

2023-11-15 Thread Eugen Block
No, it’s not too late; it will take some time till we get there. So 
thanks for the additional input; I am aware of the MON communication.


Zitat von Sake Ceph :

Don't forget that with stretch mode, OSDs only communicate with MONs in 
the same DC, and the tiebreaker only communicates with the other MONs 
(to prevent split-brain scenarios).


Little late response, but I wanted you to know this :)



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: migrate wal/db to block device

2023-11-15 Thread Eugen Block

Hi,

AFAIU, you can’t migrate back to the slow device. It’s either  
migrating from the slow device to a fast device or moving between fast  
devices. I’m not aware that your scenario was considered in that tool.  
The docs don’t specifically say that, but they also don’t mention  
going back to the slow device only. Someone please correct me, but I’d say  
you’ll have to rebuild that OSD to detach it from the fast device.


Regards,
Eugen

Zitat von Chris Dunlop :


Hi,

What's the correct way to migrate an OSD wal/db from a fast device  
to the (slow) block device?


I have an osd with wal/db on a fast LV device and block on a slow LV  
device. I want to move the wal/db onto the block device so I can  
reconfigure the fast device before moving the wal/db back to the  
fast device.


This link says to use "ceph-volume lvm migrate" (I'm on pacific, but  
the quincy and reef docs are the same):


https://docs.ceph.com/en/pacific/ceph-volume/lvm/migrate/

I tried:

$ cephadm  unit --fsid ${fsid} --name osd.${osdid} stop
$ cephadm shell --fsid ${fsid} --name osd.${osdid} -- \
  ceph-volume lvm migrate --osd-id ${osdid} --osd-fsid ${osd_fsid} \
  --from db wal --target ${block_vglv}
$ systemctl stop ${osd_service}
$ systemctl start ${osd_service}

"cephadm ceph-volume lvm list" now shows only the (slow) block  
device whereas before the migrate it was showing both the block and  
db devices.  However "lsof" shows the new osd process still has the  
original fast wal/db device open and "iostat" shows this device is  
still getting i/o.


Also:

$ ls -l /var/lib/ceph/${fsid}/osd.${osdid}/block*

...shows both the "block" and "block.db" symlinks to the original  
separate devices.


And there are now no lv_tags on the original wal/db LV:

$ lvs -o lv_tags ${original_db_vg_lv}

Now I'm concerned there's device mismatch for this osd: "cephadm  
ceph-volume lvm list" believes there's no separate wal/db, but the  
osd is currently *using* the original separate wal/db.


I guess if the server were to restart this osd would be in all sorts  
of trouble.


What's going on there, and what can be done to fix it?  Is it a  
matter of recreating the tags on the original db device?  (But then  
what happens to whatever did get migrated to the block device - e.g.  
is that space lost?)
Or is it a matter of using ceph-bluestore-tool to do a  
bluefs-bdev-migrate, e.g. something like:


$ cephadm  unit --fsid ${fsid} --name osd.${osdid} stop
$ osddir=/var/lib/ceph/osd/ceph-${osdid}
$ cephadm shell --fsid ${fsid} --name osd.${osdid} -- \
  ceph-bluestore-tool --path ${osddir} --devs-source ${osddir}/block.db \
  --dev-target ${osddir}/block bluefs-bdev-migrate
$ rm /var/lib/ceph/${fsid}/osd.${osdid}/block.db
$ systemctl stop ${osd_service}
$ systemctl start ${osd_service}

Or... something else?


And how *should* moving the wal/db be done?

Cheers,

Chris
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Stretch mode size

2023-11-15 Thread Sake Ceph
Don't forget that with stretch mode, OSDs only communicate with MONs in the same DC, 
and the tiebreaker only communicates with the other MONs (to prevent split-brain 
scenarios).

Little late response, but I wanted you to know this :)
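
For reference, a rough sketch of the kind of layout that note refers to, 
adapted from memory of the stretch-mode docs (mon names, CRUSH rule name and 
datacenter labels are illustrative):

$ ceph mon set election_strategy connectivity   # required before enabling stretch mode
$ ceph mon set_location a datacenter=dc1        # two mons in each data site...
$ ceph mon set_location b datacenter=dc1
$ ceph mon set_location c datacenter=dc2
$ ceph mon set_location d datacenter=dc2
$ ceph mon set_location e datacenter=dc3        # ...and the tiebreaker elsewhere
$ ceph mon enable_stretch_mode e stretch_rule datacenter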
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io