[ceph-users] Re: Quincy: failure to enable mgr rgw module if not --force

2023-10-25 Thread Eugen Block

Hi,

just gave it a shot on Reef where the commands are available after  
enabling the module. It seems to work, but I did just a few tests like  
creating a realm, zonegroup and zone. This bootstrapped 2 rgw daemons  
and created 3 pools (.log, .control, .meta). But then the MGRs started  
to respawn every minute or so, so I removed the cluster. But you can  
do basically the same with the 'ceph orch apply rgw ...' command [1] to
create a realm, zonegroup and zone, or just use a spec file. I don't
see a real benefit in using the rgw module here, but as I said, I only
did some minimal testing.
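For reference, a minimal spec sketch for the orchestrator route (service id,
realm, zone and host names below are placeholders, not from a real
deployment):

service_type: rgw
service_id: myrealm.myzone
placement:
  hosts:
    - host1
    - host2
spec:
  rgw_realm: myrealm
  rgw_zone: myzone

Saved as rgw.yaml and applied with 'ceph orch apply -i rgw.yaml'; the
one-liner equivalent is 'ceph orch apply rgw myrealm.myzone --realm=myrealm
--zone=myzone --placement="2 host1 host2"'.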


Regards,
Eugen

[1] https://docs.ceph.com/en/quincy/cephadm/services/rgw/

Quoting Michel Jouvin:


Hi,

I'm trying to use the rgw mgr module to configure RGWs.
Unfortunately it is not present in the 'ceph mgr module ls' list, and any
attempt to enable it suggests that one mgr doesn't support it and
that --force should be added. Adding --force did enable it.


It is strange as it is a brand new cluster, created in Quincy, using
cephadm. Why is --force needed? And even though the module is listed
as enabled, the 'ceph rgw' command is not recognized and no help is
available for the rgw subcommand. What are we doing wrong?
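One hedged thing to check here: the --force prompt appears when at least
one mgr daemon does not report the module as available, which can happen
if the mgrs run different versions. Something along these lines would show
whether they all agree (the grep is only illustrative):

ceph versions
ceph mgr module ls | grep -A 2 rgw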


Cheers,

Michel


[ceph-users] Re: [EXTERNAL] [Pacific] ceph orch device ls do not returns any HDD

2023-10-25 Thread Patrick Begou

Hi,

Running git pull this morning I saw the patch on the main branch and tried
to compile it, but the build fails in Cython for rbd.pyx. I get many similar
errors:


rbd.pyx:760:44: Cannot assign type 'int (*)(uint64_t, uint64_t, void *) 
except? -1' to 'librbd_progress_fn_t'. Exception values are 
incompatible. Suggest adding 'noexcept' to type 'int (uint64_t, 
uint64_t, void *) except? -1'.
rbd.pyx:763:23: Cannot assign type 'int (*)(uint64_t, uint64_t, void *) 
except? -1 nogil' to 'librbd_progress_fn_t'. Exception values are 
incompatible. Suggest adding 'noexcept' to type 'int (uint64_t, 
uint64_t, void *) except? -1 nogil'.
rbd.pyx:868:44: Cannot assign type 'int (*)(uint64_t, uint64_t, void *) 
except? -1' to 'librbd_progress_fn_t'. Exception values are 
incompatible. Suggest adding 'noexcept' to type 'int (uint64_t, 
uint64_t, void *) except? -1'.



I don't know Cython at all.

I've just run
./install-deps.sh
./do_cmake.sh
cd build
ninja

# gcc --version
gcc (GCC) 11.2.1 20220127 (Red Hat 11.2.1-9)

Any suggestions?
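For what it's worth, these errors are characteristic of Cython 3.0's
stricter exception semantics for C function pointers; rbd.pyx predates
that change. A possible workaround, untested here and assuming the build
picks up a pip-installed Cython, is to pin it below 3.0 and rebuild:

# assumption: Cython is installed via pip in this build environment
python3 -m pip install 'cython<3'
cd build && ninja

If Cython comes from a distro package instead, downgrading that package
would be the equivalent.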

Thanks

Patrick

On 24/10/2023 at 22:43, Zack Cerza wrote:

That's correct - it's the removable flag that's causing the disks to
be excluded.

I actually just merged this PR last week:
https://github.com/ceph/ceph/pull/49954

One of the changes it made was to enable removable (but not USB)
devices, as there are vendors that report hot-swappable drives as
removable. Patrick, it looks like this may resolve your issue as well.


On Tue, Oct 24, 2023 at 5:57 AM Eugen Block  wrote:

Hi,


Maybe it's because they are hot-swappable hard drives.

yes, that's my assumption as well.


Quoting Patrick Begou:


Hi Eugen,

Yes Eugen, all the devices /dev/sd[abc] have the removable flag set
to 1. Maybe it's because they are hot-swappable hard drives.

I have contacted the commit author Zack Cerza, and he asked me for
some additional tests this morning. I have added him in copy on this
mail.

Patrick

On 24/10/2023 at 12:57, Eugen Block wrote:

Hi,

just to confirm, could you check that the disk which is *not*
discovered by 16.2.11 has a "removable" flag?

cat /sys/block/sdX/removable

I could reproduce it as well on a test machine with a USB thumb
drive (live distro) which is excluded in 16.2.11 but is shown in
16.2.10. Although I'm not a developer, I tried to understand what
changes were made in
https://github.com/ceph/ceph/pull/46375/files#diff-330f9319b0fe352dff0486f66d3c4d6a6a3d48efd900b2ceb86551cfd88dc4c4R771
and there's this line:


if get_file_contents(os.path.join(_sys_block_path, dev, 'removable')) == "1":
    continue

The thumb drive is removable, of course, so apparently it gets filtered out here.

Regards,
Eugen

Quoting Patrick Begou:


On 23/10/2023 at 03:04, 544463...@qq.com wrote:

I think you can try to roll back this part of the python code and
wait for your good news :)


Not so easy 😕


[root@e9865d9a7f41 ceph]# git revert
4fc6bc394dffaf3ad375ff29cbb0a3eb9e4dbefc
Auto-merging src/ceph-volume/ceph_volume/tests/util/test_device.py
CONFLICT (content): Merge conflict in
src/ceph-volume/ceph_volume/tests/util/test_device.py
Auto-merging src/ceph-volume/ceph_volume/util/device.py
CONFLICT (content): Merge conflict in
src/ceph-volume/ceph_volume/util/device.py
Auto-merging src/ceph-volume/ceph_volume/util/disk.py
CONFLICT (content): Merge conflict in
src/ceph-volume/ceph_volume/util/disk.py
error: could not revert 4fc6bc394df... ceph-volume: Optionally
consume loop devices

Patrick


[ceph-users] Combining masks in ceph config

2023-10-25 Thread Frank Schilder
Hi all,

I have a case where I want to set options for a set of HDDs under a common
sub-tree with root A. I also have HDDs in another disjoint sub-tree with root
B. Therefore, I would like to do something like

ceph config set osd/class:hdd,datacenter:A option value

The above does not give a syntax error, but I'm also not sure it does the right 
thing. Does the above mean "class:hdd and datacenter:A" or does it mean "for 
OSDs with device class 'hdd,datacenter:A'"?
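One hedged way to see how the mon parsed the mask, and which daemons
actually pick the value up, rather than guessing (<option> and the OSD ids
are placeholders):

# dump what was stored, including the who/mask column
ceph config dump | grep <option>
# ask one OSD under root A and one under root B what value they see
ceph config get osd.<id-under-A> <option>
ceph config get osd.<id-under-B> <option>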

Thanks and best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


[ceph-users] Re: Ceph 16.2.14: pgmap updated every few seconds for no apparent reason

2023-10-25 Thread Eugen Block

Hi,

this setting is not as harmless as I assumed. There seem to be more
ticks/periods/health checks involved. When I chose a mgr_tick_period
value > 30 seconds, the two MGRs kept respawning. 30 seconds is the
highest value that still seemed to work without MGR respawns, even with
an increased mon_mgr_beacon_grace (default 30 sec.). So if you decide to
increase the mgr_tick_period, don't go over 30 unless you find out what
else you need to change.
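A sketch of the pair of settings involved (values are examples, not
recommendations):

# staying at or below 30 seconds is what worked in my tests
ceph config set mgr mgr_tick_period 30
# the related mon-side grace period (default 30 sec.); raising it alone
# was not enough to allow larger tick periods
ceph config set mon mon_mgr_beacon_grace 60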


Regards,
Eugen


Quoting Eugen Block:


Hi,

you can change the report interval with this config option (default  
2 seconds):


$ ceph config get mgr mgr_tick_period
2

$ ceph config set mgr mgr_tick_period 10

Regards,
Eugen

Quoting Chris Palmer:

I have just checked 2 quincy 17.2.6 clusters, and I see exactly the  
same. The pgmap version is bumping every two seconds (which ties in  
with the frequency you observed). Both clusters are healthy with  
nothing apart from client IO happening.


On 13/10/2023 12:09, Zakhar Kirpichenko wrote:

Hi,

I am investigating excessive mon writes in our cluster and wondering
whether excessive pgmap updates could be the culprit. Basically pgmap is
updated every few seconds, sometimes over ten times per minute, in a
healthy cluster with no OSD and/or PG changes:

Oct 13 11:03:03 ceph03 bash[4019]: cluster 2023-10-13T11:03:01.515438+
mgr.ceph01.vankui (mgr.336635131) 838252 : cluster [DBG] pgmap v606575:
2400 pgs: 5 active+clean+scrubbing+deep, 2395 active+clean; 16 TiB data, 61
TiB used, 716 TiB / 777 TiB avail; 60 MiB/s rd, 109 MiB/s wr, 5.65k op/s
Oct 13 11:03:04 ceph03 bash[4019]: cluster 2023-10-13T11:03:03.520953+
mgr.ceph01.vankui (mgr.336635131) 838253 : cluster [DBG] pgmap v606576:
2400 pgs: 5 active+clean+scrubbing+deep, 2395 active+clean; 16 TiB data, 61
TiB used, 716 TiB / 777 TiB avail; 64 MiB/s rd, 128 MiB/s wr, 5.76k op/s
Oct 13 11:03:06 ceph03 bash[4019]: cluster 2023-10-13T11:03:05.524474+
mgr.ceph01.vankui (mgr.336635131) 838255 : cluster [DBG] pgmap v606577:
2400 pgs: 5 active+clean+scrubbing+deep, 2395 active+clean; 16 TiB data, 61
TiB used, 716 TiB / 777 TiB avail; 64 MiB/s rd, 122 MiB/s wr, 5.57k op/s
Oct 13 11:03:08 ceph03 bash[4019]: cluster 2023-10-13T11:03:07.530484+
mgr.ceph01.vankui (mgr.336635131) 838256 : cluster [DBG] pgmap v606578:
2400 pgs: 5 active+clean+scrubbing+deep, 2395 active+clean; 16 TiB data, 61
TiB used, 716 TiB / 777 TiB avail; 79 MiB/s rd, 127 MiB/s wr, 6.62k op/s
Oct 13 11:03:10 ceph03 bash[4019]: cluster 2023-10-13T11:03:09.57+
mgr.ceph01.vankui (mgr.336635131) 838258 : cluster [DBG] pgmap v606579:
2400 pgs: 5 active+clean+scrubbing+deep, 2395 active+clean; 16 TiB data, 61
TiB used, 716 TiB / 777 TiB avail; 66 MiB/s rd, 104 MiB/s wr, 5.38k op/s
Oct 13 11:03:12 ceph03 bash[4019]: cluster 2023-10-13T11:03:11.537908+
mgr.ceph01.vankui (mgr.336635131) 838259 : cluster [DBG] pgmap v606580:
2400 pgs: 5 active+clean+scrubbing+deep, 2395 active+clean; 16 TiB data, 61
TiB used, 716 TiB / 777 TiB avail; 85 MiB/s rd, 121 MiB/s wr, 6.43k op/s
Oct 13 11:03:13 ceph03 bash[4019]: cluster 2023-10-13T11:03:13.543490+
mgr.ceph01.vankui (mgr.336635131) 838260 : cluster [DBG] pgmap v606581:
2400 pgs: 5 active+clean+scrubbing+deep, 2395 active+clean; 16 TiB data, 61
TiB used, 716 TiB / 777 TiB avail; 78 MiB/s rd, 127 MiB/s wr, 6.54k op/s
Oct 13 11:03:16 ceph03 bash[4019]: cluster 2023-10-13T11:03:15.547122+
mgr.ceph01.vankui (mgr.336635131) 838262 : cluster [DBG] pgmap v606582:
2400 pgs: 5 active+clean+scrubbing+deep, 2395 active+clean; 16 TiB data, 61
TiB used, 716 TiB / 777 TiB avail; 71 MiB/s rd, 122 MiB/s wr, 6.08k op/s
Oct 13 11:03:18 ceph03 bash[4019]: cluster 2023-10-13T11:03:17.553180+
mgr.ceph01.vankui (mgr.336635131) 838263 : cluster [DBG] pgmap v606583:
2400 pgs: 1 active+clean+scrubbing, 5 active+clean+scrubbing+deep, 2394
active+clean; 16 TiB data, 61 TiB used, 716 TiB / 777 TiB avail; 75 MiB/s
rd, 176 MiB/s wr, 6.83k op/s
Oct 13 11:03:20 ceph03 bash[4019]: cluster 2023-10-13T11:03:19.555960+
mgr.ceph01.vankui (mgr.336635131) 838264 : cluster [DBG] pgmap v606584:
2400 pgs: 1 active+clean+scrubbing, 5 active+clean+scrubbing+deep, 2394
active+clean; 16 TiB data, 61 TiB used, 716 TiB / 777 TiB avail; 58 MiB/s
rd, 161 MiB/s wr, 5.55k op/s
Oct 13 11:03:22 ceph03 bash[4019]: cluster 2023-10-13T11:03:21.560597+
mgr.ceph01.vankui (mgr.336635131) 838266 : cluster [DBG] pgmap v606585:
2400 pgs: 1 active+clean+scrubbing, 5 active+clean+scrubbing+deep, 2394
active+clean; 16 TiB data, 61 TiB used, 716 TiB / 777 TiB avail; 62 MiB/s
rd, 221 MiB/s wr, 6.19k op/s
Oct 13 11:03:24 ceph03 bash[4019]: cluster 2023-10-13T11:03:23.565974+
mgr.ceph01.vankui (mgr.336635131) 838267 : cluster [DBG] pgmap v606586:
2400 pgs: 1 active+clean+scrubbing, 5 active+clean+scrubbing+deep, 2394
active+clean; 16 TiB data, 61 TiB used, 716 TiB / 777 TiB avail; 50 MiB/s
rd, 246 MiB/s wr, 5.93k op/s
Oct 13 11:03:26 ceph03 bash[4019]: cluster 2023-10-13T11:03:25

[ceph-users] Re: Ceph 16.2.14: pgmap updated every few seconds for no apparent reason

2023-10-25 Thread Zakhar Kirpichenko
Thanks for the warning, Eugen.

/Z


[ceph-users] cephadm failing to add hosts despite a working SSH connection

2023-10-25 Thread Michel Jouvin

Hi,

I'm struggling with a problem adding some hosts to our Quincy cluster
with cephadm. "ceph orch host add host addr" fails with the famous "missing 2
required positional arguments: 'hostname' and 'addr'" because of bug
https://tracker.ceph.com/issues/59081, but looking at cephadm messages
with "ceph -W cephadm", I can see:




Log: Opening SSH connection to 10.81.22.183, port 22
[conn=736] Connected to SSH server at 10.81.22.183, port 22
[conn=736]   Local address: 10.81.22.151, port 53640
[conn=736]   Peer address: 10.81.22.183, port 22
[conn=736] Login timeout expired
[conn=736] Aborting connection
Traceback (most recent call last): (removed)
cephadm.ssh.HostConnectionError: Failed to connect to jc-rgw3 
(10.81.22.183). Login timeout expired

Log: Opening SSH connection to 10.81.22.183, port 22
[conn=736] Connected to SSH server at 10.81.22.183, port 22
[conn=736]   Local address: 10.81.22.151, port 53640
[conn=736]   Peer address: 10.81.22.183, port 22
[conn=736] Login timeout expired
[conn=736] Aborting connection


It is very strange to me because "ssh -i /tmp/cephadm_identity_xxx
10.81.22.183" works fine when executed in the active mgr container.

The host I'm trying to add is an RGW that has 3 active network
connections: the Ceph public network, our intranet network (used for
managing the server) and the network of the application that will use
the RGW. It seems to be somewhat related to this network configuration,
as the main cluster servers (MONs, OSDs), which have only the 2 Ceph
networks and the intranet one, don't suffer from the same problem. In
particular, what is strange is that I can successfully add the host if I
use its intranet address rather than the Ceph public network one
(10.81.22.183) in the cephadm command.


I have 3 hosts sharing the same network configuration and having the 
same problem.


Any hint or suggestion to troubleshoot further this problem would be 
highly appreciated!


Best regards,

Michel


[ceph-users] Re: radosgw - octopus - 500 Bad file descriptor on upload

2023-10-25 Thread BEAUDICHON Hubert (Acoss)
Hi,
We encountered the same kind of error for one of our users.
Ceph version: 16.2.10

2023-10-24T17:57:22.438+0200 7fc27ab44700  0 WARNING: set_req_state_err 
err_no=125 resorting to 500
2023-10-24T17:57:22.439+0200 7fc584957700  0 req 12200560481916573577 
143.735748291s ERROR: RESTFUL_IO(s)->complete_header() returned err=Bad file 
descriptor
2023-10-24T17:57:22.439+0200 7fbecfaed700  1 == req done req=0x7fbdb86ab600 
op status=-125 http_status=500 latency=143.735748291s ==
2023-10-24T17:57:22.439+0200 7fbecfaed700  1 beast: 0x7fbdb86ab600: 
10.227.131.117 - dev-centralog-save [24/Oct/2023:17:54:58.703 +0200] "PUT 
&partNumber=1 HTTP/1.1" 500 58720313 - "" - 
latency=143.735748291s

I haven't got any clue about the cause...


[ceph-users] Ceph Leadership Team notes 10/25

2023-10-25 Thread Dan van der Ster
Hi all,

Here are this week's notes from the CLT:

* Collective review of the Reef/Squid "State of Cephalopod" slides.
* Smoke test suite was unscheduled but it's back on now.
* Releases:
   * 17.2.7: was about to start building last week, but was delayed by a few
issues (https://tracker.ceph.com/issues/63257,
https://tracker.ceph.com/issues/63305,
https://github.com/ceph/ceph/pull/54169). ceph_exporter test coverage
will be prioritized.
   * 18.2.1: all PRs in testing or merged.
* Ceph Board approved a new Foundation member tiers model, Silver,
Gold, Platinum, Diamond. Working on implementation with LF.

-- dan


[ceph-users] Re: radosgw - octopus - 500 Bad file descriptor on upload

2023-10-25 Thread David C.
Hi Hubert,

It's an error "125" (ECANCELED)  (and there may be many reasons for it).

I see a high latency (144sec), is the object big ?
No network problems ?
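If it helps, a sketch for getting more detail out of the gateway; the
config section name depends on how the RGW daemons are deployed and named:

ceph config set client.rgw.<instance> debug_rgw 20
ceph config set client.rgw.<instance> debug_ms 1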


Kind regards,

*David CASIER*







[ceph-users] Re: How to trigger scrubbing in Ceph on-demand ?

2023-10-25 Thread Jayjeet Chakraborty
Hi Reto,

Thanks a lot for the instructions. I tried the same, but still couldn't
trigger scrubbing deterministically. The first time I initiated scrubbing,
I saw the scrubbing status in ceph -s, but on subsequent attempts I didn't see
any scrubbing status. Do you know what might be going on? Any
ideas would be appreciated. Thanks.
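A hedged guess at what to check: a repeated request against a PG that was
just deep-scrubbed may not be picked up right away, so comparing the scrub
stamps before and after is more reliable than watching ceph -s (25.1 is
just the example PG from the instructions below):

ceph pg 25.1 query | grep -i scrub_stamp
ceph pg ls scrubbing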

Best Regards,
*Jayjeet Chakraborty*
Ph.D. Student
Department of Computer Science and Engineering
University of California, Santa Cruz
*Email: jayje...@ucsc.edu *


On Wed, Oct 18, 2023 at 7:47 AM Reto Gysi  wrote:

> Hi
>
> I haven't updated to reef yet. I've tried this on quincy.
>
> # create a testfile on cephfs.rgysi.data pool
> root@zephir:/home/rgysi/misc# echo cephtest123 > cephtest.txt
>
> #list inode of new file
> root@zephir:/home/rgysi/misc# ls -i cephtest.txt
> 1099518867574 cephtest.txt
>
> # convert inode value to hex value
> root@zephir:/home/rgysi/misc# printf "%x" 1099518867574
> 16e7876
>
> # search for this value in the rados pool cephfs.rgysi.data, to find
> object(s)
> root@zephir:/home/rgysi/misc# rados -p cephfs.rgysi.data ls | grep
> 16e7876
> 16e7876.
>
> # find pg for the object
> root@zephir:/home/rgysi/misc# ceph osd map cephfs.rgysi.data
> 16e7876.
> osdmap e105365 pool 'cephfs.rgysi.data' (25) object '16e7876.'
> -> pg 25.ee1befa1 (25.1) -> up ([0,2,8], p0) acting ([0,2,8], p0)
>
> #Initiate a deep-scrub for this pg
> root@zephir:/home/rgysi/misc# ceph pg deep-scrub 25.1
> instructing pg 25.1 on osd.0 to deep-scrub
>
> # check status of scrubbing
> root@zephir:/home/rgysi/misc# ceph pg ls scrubbing
> PGOBJECTS  DEGRADED  MISPLACED  UNFOUND  BYTESOMAP_BYTES*
>  OMAP_KEYS*  LOG   STATESINCE  VERSION
> REPORTEDUP ACTING SCRUB_STAMP
>  DEEP_SCRUB_STAMP LAST_S
> CRUB_DURATION  SCRUB_SCHEDULING
> 25.137774 0  00  628698231420
>   0  2402  active+clean+scrubbing+deep 7s  105365'1178098
>  105365:8066292  [0,2,8]p0  [0,2,8]p0  2023-10-18T05:17:48.631392+
>  2023-10-08T11:30:58.883164+
>3  deep scrubbing for 1s
>
>
> Best Regards,
>
> Reto
>
> On Wed, 18 Oct 2023 at 16:24, Jayjeet Chakraborty <jayje...@ucsc.edu> wrote:
>
>> Hi all,
>>
>> Just checking if someone had a chance to go through the scrub trigger
>> issue
>> above. Thanks.
>>
>> Best Regards,
>> *Jayjeet Chakraborty*
>> Ph.D. Student
>> Department of Computer Science and Engineering
>> University of California, Santa Cruz
>> *Email: jayje...@ucsc.edu *
>>
>>
>> On Mon, Oct 16, 2023 at 9:01 PM Jayjeet Chakraborty 
>> wrote:
>>
>> > Hi all,
>> >
>> > I am trying to trigger deep scrubbing in Ceph reef (18.2.0) on demand
>> on a
>> > set of files that I randomly write to CephFS. I have tried both invoking
>> > deep-scrub on CephFS using ceph tell and just deep scrubbing a
>> > particular PG. Unfortunately, none of that seems to be working for me.
>> I am
>> > monitoring the ceph status output, it never shows any scrubbing
>> > information. Can anyone please help me out on this ? In a nutshell, I
>> need
>> > Ceph to scrub for me anytime I want. I am using Ceph with default
>> configs
>> > for scrubbing. Thanks all.
>> >
>> > Best Regards,
>> > *Jayjeet Chakraborty*
>> > Ph.D. Student
>> > Department of Computer Science and Engineering
>> > University of California, Santa Cruz
>> > *Email: jayje...@ucsc.edu *
>> >


[ceph-users] Dashboard crash with rook/reef and external prometheus

2023-10-25 Thread r-ceph
I'm fairly new to the community so I figured I'd ask about this here before 
creating an issue - I'm not sure how supported this config is.

I am running rook v1.12.6 and ceph 18.2.0.  I've enabled the dashboard in the 
CRD and it has been working for a while.  However, the charts are empty.

I do have Prometheus+Grafana running on my cluster, and I can access many of 
the ceph metrics from there.  With the upgrade to reef I noticed that many of
the quincy dashboard elements have been replaced by charts, so I wanted to get
those working.

I discovered that if I run ceph dashboard set-prometheus-api-host  the
charts are immediately populated (including historical data).  However, when I
do this I rapidly start getting ceph health alerts due to a crashing mgr
module.  If I set the prometheus api host url to '' the crashes stop
accumulating, though this disables the charts.

I am running the prometheus-community/prometheus-25.2.0 chart.  Various ceph 
grafana dashboards that I've found published work fine.

The following are relevant dumps.  Please let me know if you have any ideas, or 
if I should go ahead and create an issue for this...
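If it helps triage, the crashes behind the health alerts can be inspected
from the cluster itself, which would give a module traceback to attach to
an issue:

ceph crash ls
ceph crash info <crash-id>   # <crash-id> taken from the ls output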

mgr console output during crash:
debug 2023-10-24T15:11:23.498+ 7fc81fa5d700 -1 mgr.server reply reply (2) 
No such file or directory This Orchestrator does not support `orch prometheus 
access info`
debug 2023-10-24T15:11:23.502+ 7fc7ea3f3700  0 [dashboard INFO request] 
[:::10.1.0.106:49760] [GET] [200] [0.012s] [admin] [101.0B] 
/api/health/get_cluster_capacity
debug 2023-10-24T15:11:23.502+ 7fc813985700  0 [stats WARNING root] cmdtag  
not found in client metadata
debug 2023-10-24T15:11:23.502+ 7fc813985700  0 [stats WARNING root] cmdtag  
not found in client metadata
debug 2023-10-24T15:11:23.502+ 7fc7e83ef700  0 [dashboard INFO request] 
[:::10.1.0.106:5580] [GET] [200] [0.011s] [admin] [73.0B] /api/osd/settings
debug 2023-10-24T15:11:23.506+ 7fc85411a700  0 log_channel(audit) log [DBG] 
: from='mon.2 -' entity='mon.' cmd=[{"prefix": "balancer status", "format": 
"json"}]: dispatch
debug 2023-10-24T15:11:23.506+ 7fc813985700  0 [stats WARNING root] cmdtag  
not found in client metadata
debug 2023-10-24T15:11:23.506+ 7fc7e9bf2700  0 [dashboard INFO request] 
[:::10.1.0.106:20241] [GET] [200] [0.014s] [admin] [34.0B] 
/api/prometheus/rules
debug 2023-10-24T15:11:23.630+ 7fc7ebbf6700  0 [dashboard INFO 
orchestrator] is orchestrator available: True,
debug 2023-10-24T15:11:23.734+ 7fc7ebbf6700  0 [dashboard INFO 
orchestrator] is orchestrator available: True,
debug 2023-10-24T15:11:23.802+ 7fc86511c700  0 log_channel(cluster) log 
[DBG] : pgmap v126: 617 pgs: 53 active+remapped+backfill_wait, 2 
active+remapped+backfilling, 562 active+clean; 34 TiB data, 68 TiB used, 64 TiB 
/ 132 TiB avail; 2.4 MiB/s rd, 93 KiB/s wr, 21 op/s; 1213586/22505781 objects 
misplaced (5.392%)
debug 2023-10-24T15:11:23.862+ 7fc7ebbf6700  0 [dashboard INFO 
orchestrator] is orchestrator available: True,
debug 2023-10-24T15:11:23.962+ 7fc7ebbf6700  0 [dashboard INFO 
orchestrator] is orchestrator available: True,
debug 2023-10-24T15:11:24.058+ 7fc7ebbf6700  0 [dashboard INFO 
orchestrator] is orchestrator available: True,
debug 2023-10-24T15:11:24.158+ 7fc7ebbf6700  0 [dashboard INFO 
orchestrator] is orchestrator available: True,
debug 2023-10-24T15:11:24.270+ 7fc7ebbf6700  0 [dashboard INFO 
orchestrator] is orchestrator available: True,
debug 2023-10-24T15:11:24.546+ 7fc7ebbf6700  0 [dashboard INFO 
orchestrator] is orchestrator available: True,
debug 2023-10-24T15:11:24.654+ 7fc7ebbf6700  0 [dashboard INFO 
orchestrator] is orchestrator available: True,
debug 2023-10-24T15:11:24.654+ 7fc7ebbf6700  0 [dashboard INFO request] 
[:::10.1.0.106:13711] [GET] [200] [1.170s] [admin] [3.2K] 
/api/health/minimal
debug 2023-10-24T15:11:25.802+ 7fc86511c700  0 log_channel(cluster) log 
[DBG] : pgmap v127: 617 pgs: 53 active+remapped+backfill_wait, 2 
active+remapped+backfilling, 562 active+clean; 34 TiB data, 68 TiB used, 64 TiB 
/ 132 TiB avail; 1.1 MiB/s rd, 53 KiB/s wr, 17 op/s; 1213586/22505781 objects 
misplaced (5.392%)
debug 2023-10-24T15:11:27.802+ 7fc86511c700  0 log_channel(cluster) log 
[DBG] : pgmap v128: 617 pgs: 53 active+remapped+backfill_wait, 2 
active+remapped+backfilling, 562 active+clean; 34 TiB data, 68 TiB used, 64 TiB 
/ 132 TiB avail; 1.8 MiB/s rd, 58 KiB/s wr, 18 op/s; 1213586/22505781 objects 
misplaced (5.392%)
debug 2023-10-24T15:11:28.494+ 7fc813985700  0 [stats WARNING root] cmdtag  
not found in client metadata
debug 2023-10-24T15:11:28.498+ 7fc7eb3f5700  0 [dashboard INFO request] 
[:::10.1.0.106:20241] [GET] [200] [0.011s] [admin] [73.0B] /api/osd/settings
debug 2023-10-24T15:11:28.498+ 7fc85411a700  0 log_channel(audit) log [DBG] 
: from='mon.2 -' entity='mon.' cmd=[{"prefix": "orch prometheus access info"}]: 
dispatch
debug 2023-10-24T15:11:28.502+00

[ceph-users] Re: [EXTERNAL] [Pacific] ceph orch device ls do not returns any HDD

2023-10-25 Thread Zack Cerza
That's correct - it's the removable flag that's causing the disks to
be excluded.

I actually just merged this PR last week:
https://github.com/ceph/ceph/pull/49954

One of the changes it made was to enable removable (but not USB)
devices, as there are vendors that report hot-swappable drives as
removable. Patrick, it looks like this may resolve your issue as well.




[ceph-users] [quincy - 17.2.6] Lua scripting in the rados gateway - HTTP_REMOTE-ADDR missing

2023-10-25 Thread stephan
Hi Ceph users,

currently I'm using the lua script feature in radosgw to send "put_obj" and
"get_obj" request stats to a MongoDB.
So far it's working quite well, but I'm missing a field which is very important
for our traffic stats.
I'm looking for the HTTP_REMOTE-ADDR field, which is available in the ops_log,
but couldn't find it here:
https://docs.ceph.com/en/quincy/radosgw/lua-scripting/#request-fields

Does someone know how to get this field via lua script?

Cheers

Stephan


[ceph-users] Re: cephadm failing to add hosts despite a working SSH connection

2023-10-25 Thread Michel Jouvin
Answering my own question... I hesitated to send this email to the list, as the
problem didn't seem to be related to Ceph itself but rather a
configuration problem that Ceph was a victim of. I managed to find the
problem: we are using jumbo frames on all servers, but the VLAN shared by
the servers and the RGWs goes through an intermediate (campus)
network that doesn't seem to support jumbo frames (we were not aware of
this). The problem did not appear when using the intranet address
because the Ceph servers don't use jumbo frames on that
network/interface (it is a 1 Gb management network, so there is no point in
using jumbo frames there). I cannot think of anything that Ceph could have
mentioned to help diagnose this.
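For anyone hitting this later, a quick path-MTU check that would have
caught it (assuming a jumbo MTU of 9000 bytes):

# 8972 = 9000 minus 28 bytes of IP+ICMP headers; -M do forbids fragmentation
ping -M do -s 8972 10.81.22.183

If an intermediate network drops jumbo frames, this fails while a plain
ping succeeds.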


Best regards,

Michel



[ceph-users] Re: quincy v17.2.7 QE Validation status

2023-10-25 Thread Ilya Dryomov
On Mon, Oct 23, 2023 at 5:15 PM Yuri Weinstein  wrote:
>
> If no one has anything else left, we have all issues resolved and
> ready for the 17.2.7 release

A last-minute issue with the exporter daemon [1][2] necessitated a revert
[3].  The 17.2.7 builds will need to be respun; since the tag created
by Jenkins hasn't been merged and packages haven't been pushed, there is
no further impact.

The lack of test coverage in this area was brought up in the CLT call
earlier today.  I have bumped [4] by summarizing the history there.

[1] https://github.com/ceph/ceph/pull/54153#discussion_r1369834098
[2] https://github.com/ceph/ceph/pull/50749#pullrequestreview-1694336396
[3] https://github.com/ceph/ceph/pull/54169
[4] https://tracker.ceph.com/issues/59561

Thanks,

Ilya

>
> On Mon, Oct 23, 2023 at 8:12 AM Laura Flores  wrote:
> >
> > Regarding the crash in quincy-p2p (tracked in
> > https://tracker.ceph.com/issues/63257), @Prashant Dhange
> >  and I evaluated it, and we've concluded it isn't a
> > blocker for 17.2.7.
> >
> > So, quincy-p2p is approved.
> >
> > Thanks,
> > Laura
> >
> >
> >
> > On Sat, Oct 21, 2023 at 12:27 AM Venky Shankar  wrote:
> >
> > > Hi Yuri,
> > >
> > > On Fri, Oct 20, 2023 at 9:44 AM Venky Shankar  wrote:
> > > >
> > > > Hi Yuri,
> > > >
> > > > On Thu, Oct 19, 2023 at 10:48 PM Venky Shankar 
> > > wrote:
> > > > >
> > > > > Hi Yuri,
> > > > >
> > > > > On Thu, Oct 19, 2023 at 9:32 PM Yuri Weinstein 
> > > wrote:
> > > > > >
> > > > > > We are still finishing off:
> > > > > >
> > > > > > - revert PR https://github.com/ceph/ceph/pull/54085, needs smoke
> > > suite rerun
> > > > > > - removed s3tests https://github.com/ceph/ceph/pull/54078 merged
> > > > > >
> > > > > > Venky, Casey FYI
> > > > >
> > > > > https://github.com/ceph/ceph/pull/53139 is causing a smoke test
> > > > > failure. Details:
> > > > > https://github.com/ceph/ceph/pull/53139#issuecomment-1771388202
> > > > >
> > > > > I've sent a revert for that change -
> > > > > https://github.com/ceph/ceph/pull/54108 - will let you know when it's
> > > > > ready for testing.
> > > >
> > > > smoke passes with this revert
> > > >
> > > >
> > > https://pulpito.ceph.com/vshankar-2023-10-19_20:24:36-smoke-wip-vshankar-testing-quincy-20231019.172112-testing-default-smithi/
> > > >
> > > > fs suite running now...
> > >
> > > Test results are here -
> > > https://tracker.ceph.com/projects/cephfs/wiki/Quincy#2023-October-19
> > >
> > > Yuri, please merge change - https://github.com/ceph/ceph/pull/54108
> > >
> > > and consider this as "fs approved".
> > >
> > > >
> > > > >
> > > > > >
> > > > > > On Wed, Oct 18, 2023 at 9:07 PM Venky Shankar 
> > > wrote:
> > > > > > >
> > > > > > > On Tue, Oct 17, 2023 at 12:23 AM Yuri Weinstein <
> > > ywein...@redhat.com> wrote:
> > > > > > > >
> > > > > > > > Details of this release are summarized here:
> > > > > > > >
> > > > > > > > https://tracker.ceph.com/issues/63219#note-2
> > > > > > > > Release Notes - TBD
> > > > > > > >
> > > > > > > > Issue https://tracker.ceph.com/issues/63192 appears to be
> > > failing several runs.
> > > > > > > > Should it be fixed for this release?
> > > > > > > >
> > > > > > > > Seeking approvals/reviews for:
> > > > > > > >
> > > > > > > > smoke - Laura
> > > > > > >
> > > > > > > There's one failure in the smoke tests
> > > > > > >
> > > > > > >
> > > https://pulpito.ceph.com/yuriw-2023-10-18_14:58:31-smoke-quincy-release-distro-default-smithi/
> > > > > > >
> > > > > > > caused by
> > > > > > >
> > > > > > > https://github.com/ceph/ceph/pull/53647
> > > > > > >
> > > > > > > (which was marked DNM but got merged). However, it's a test case
> > > thing
> > > > > > > and we can live with it.
> > > > > > >
> > > > > > > Yuri mention in slack that he might do another round of
> > > build/tests,
> > > > > > > so, Yuri, here's the reverted change:
> > > > > > >
> > > > > > >https://github.com/ceph/ceph/pull/54085
> > > > > > >
> > > > > > > > rados - Laura, Radek, Travis, Ernesto, Adam King
> > > > > > > >
> > > > > > > > rgw - Casey
> > > > > > > > fs - Venky
> > > > > > > > orch - Adam King
> > > > > > > >
> > > > > > > > rbd - Ilya
> > > > > > > > krbd - Ilya
> > > > > > > >
> > > > > > > > upgrade/quincy-p2p - Known issue IIRC, Casey pls confirm/approve
> > > > > > > >
> > > > > > > > client-upgrade-quincy-reef - Laura
> > > > > > > >
> > > > > > > > powercycle - Brad pls confirm
> > > > > > > >
> > > > > > > > ceph-volume - Guillaume pls take a look
> > > > > > > >
> > > > > > > > Please reply to this email with approval and/or trackers of 
> > > > > > > > known
> > > > > > > > issues/PRs to address them.
> > > > > > > >
> > > > > > > > Josh, Neha - gibba and LRC upgrades -- N/A for quincy now after
> > > reef release.
> > > > > > > >
> > > > > > > > Thx
> > > > > > > > YuriW

[ceph-users] Re: quincy v17.2.7 QE Validation status

2023-10-25 Thread Laura Flores
Another outstanding issue is https://tracker.ceph.com/issues/63305, a
compile-time issue we noticed when building for Debian Bullseye. We have raised
a small PR to fix the issue, which has been merged and is now undergoing
testing.

After this, we will be ready to rebuild 17.2.7.

[ceph-users] owner locked out of bucket via bucket policy

2023-10-25 Thread Wesley Dillingham
I have a bucket which got injected with a bucket policy that locks the
bucket even against the bucket owner. The bucket now cannot be accessed (even
getting its info or deleting the bucket policy does not work). I have looked in
the radosgw-admin command for a way to delete a bucket policy but do not see
anything. I presume I will need to somehow remove the bucket policy from
wherever it is stored in the bucket metadata / omap etc. If anyone can point
me in the right direction on that I would appreciate it. Thanks

Respectfully,

*Wes Dillingham*
w...@wesdillingham.com
LinkedIn 


[ceph-users] Re: owner locked out of bucket via bucket policy

2023-10-25 Thread Casey Bodley
if you have an administrative user (created with --admin), you should
be able to use its credentials with awscli to delete or overwrite this
bucket policy
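Roughly like this; the endpoint and profile are placeholders:

aws --profile admin --endpoint-url http://rgw.example.com:8080 \
    s3api delete-bucket-policy --bucket <bucket>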



[ceph-users] init unable to update_crush_location: (34) Numerical result out of range

2023-10-25 Thread Pardhiv Karri
Hi,

Getting an error while adding a new node/OSD with bluestore OSDs to the
cluster. The OSD is added without any host and stays down; trying to bring it
up didn't work. The same method of adding OSDs in other clusters doesn't have
any issue. Any idea what the problem is?

Ceph Version: ceph version 12.2.11
(26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable)
Ceph Health: OK

2023-10-25 20:40:40.867878 7f1f478cde40  4 rocksdb: EVENT_LOG_v1
{"time_micros": 1698266440867866, "job": 1, "event": "recovery_started",
"log_files": [270]}
2023-10-25 20:40:40.867883 7f1f478cde40  4 rocksdb:
[/build/ceph-U0cfoi/ceph-12.2.11/src/rocksdb/db/db_impl_open.cc:482]
Recovering log #270 mode 0
2023-10-25 20:40:40.867904 7f1f478cde40  4 rocksdb:
[/build/ceph-U0cfoi/ceph-12.2.11/src/rocksdb/db/version_set.cc:2395]
Creating manifest 272

2023-10-25 20:40:40.869553 7f1f478cde40  4 rocksdb: EVENT_LOG_v1
{"time_micros": 1698266440869548, "job": 1, "event": "recovery_finished"}
2023-10-25 20:40:40.870924 7f1f478cde40  4 rocksdb:
[/build/ceph-U0cfoi/ceph-12.2.11/src/rocksdb/db/db_impl_open.cc:1063] DB
pointer 0x55c9061ba000
2023-10-25 20:40:40.870964 7f1f478cde40  1
bluestore(/var/lib/ceph/osd/ceph-721) _open_db opened rocksdb path db
options
compression=kNoCompression,max_write_buffer_number=4,min_write_buffer_number_to_merge=1,recycle_log_file_num=4,write_buffer_size=268435456,writable_file_max_buffer_size=0,compaction_readahead_size=2097152
2023-10-25 20:40:40.871234 7f1f478cde40  1 freelist init
2023-10-25 20:40:40.871293 7f1f478cde40  1
bluestore(/var/lib/ceph/osd/ceph-721) _open_alloc opening allocation
metadata
2023-10-25 20:40:40.871314 7f1f478cde40  1
bluestore(/var/lib/ceph/osd/ceph-721) _open_alloc loaded 3.49TiB in 1
extents
2023-10-25 20:40:40.874700 7f1f478cde40  0 
/build/ceph-U0cfoi/ceph-12.2.11/src/cls/cephfs/cls_cephfs.cc:197: loading
cephfs
2023-10-25 20:40:40.874721 7f1f478cde40  0 _get_class not permitted to load
sdk
2023-10-25 20:40:40.874955 7f1f478cde40  0 _get_class not permitted to load
kvs
2023-10-25 20:40:40.875638 7f1f478cde40  0 _get_class not permitted to load
lua
2023-10-25 20:40:40.875724 7f1f478cde40  0 
/build/ceph-U0cfoi/ceph-12.2.11/src/cls/hello/cls_hello.cc:296: loading
cls_hello
2023-10-25 20:40:40.875776 7f1f478cde40  0 osd.721 0 crush map has features
288232575208783872, adjusting msgr requires for clients
2023-10-25 20:40:40.875780 7f1f478cde40  0 osd.721 0 crush map has features
288232575208783872 was 8705, adjusting msgr requires for mons
2023-10-25 20:40:40.875784 7f1f478cde40  0 osd.721 0 crush map has features
288232575208783872, adjusting msgr requires for osds
2023-10-25 20:40:40.875837 7f1f478cde40  0 osd.721 0 load_pgs
2023-10-25 20:40:40.875840 7f1f478cde40  0 osd.721 0 load_pgs opened 0 pgs
2023-10-25 20:40:40.875844 7f1f478cde40  0 osd.721 0 using weightedpriority
op queue with priority op cut off at 64.
2023-10-25 20:40:40.877401 7f1f478cde40 -1 osd.721 0 log_to_monitors
{default=true}
2023-10-25 20:40:40.888408 7f1f478cde40 -1 osd.721 0
mon_cmd_maybe_osd_create fail: '(34) Numerical result out of range': (34)
Numerical result out of range
2023-10-25 20:40:40.891367 7f1f478cde40 -1 osd.721 0
mon_cmd_maybe_osd_create fail: '(34) Numerical result out of range': (34)
Numerical result out of range
2023-10-25 20:40:40.891409 7f1f478cde40 -1 osd.721 0 init unable to
update_crush_location: (34) Numerical result out of range
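A hedged way to narrow this down: the OSD effectively issues a crush
create-or-move on startup, so running the equivalent by hand shows the
mon-side rejection directly (host and weight below are guesses based on
the log):

ceph osd crush create-or-move osd.721 3.49 host=<newhost> root=default

If that returns the same ERANGE error, the problem is in the crush
location/weight arguments rather than in the OSD itself.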

Thanks,
Pardhiv


[ceph-users] Re: owner locked out of bucket via bucket policy

2023-10-25 Thread Wesley Dillingham
Thank you, I am not sure (inherited cluster). I presume such an admin user
created after-the-fact would work? Is there a good way to discover an admin
user other than iterating over all users and retrieving their information? (I
presume "radosgw-admin user info --uid=..." would illustrate such
administrative access?)

Respectfully,

*Wes Dillingham*
w...@wesdillingham.com
LinkedIn 




[ceph-users] Re: owner locked out of bucket via bucket policy

2023-10-25 Thread Casey Bodley
On Wed, Oct 25, 2023 at 4:59 PM Wesley Dillingham wrote:
>
> Thank you, I am not sure (inherited cluster). I presume such an admin user 
> created after-the-fact would work?

yes

> Is there a good way to discover an admin user other than iterate over all 
> users and retrieve user information? (I presume radosgw-admin user info 
> --uid=" would illustrate such administrative access?

not sure there's an easy way to search existing users, but you could
create a temporary admin user for this repair
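Something like this, with an arbitrary uid/display-name:

radosgw-admin user create --uid=tmp-admin --display-name="temporary admin" --admin
# use its keys with awscli to delete the bucket policy, then:
radosgw-admin user rm --uid=tmp-admin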
