[ceph-users] Re: [EXTERNAL] Re: RGW Bucket Notifications and MultiPart Uploads

2022-07-19 Thread Yuval Lifshitz
yes, that would work. you would get a "404" until the object is fully
uploaded.
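
In case it helps as a stop gap: a minimal sketch of that existence check
using the AWS CLI (endpoint, bucket and key are placeholders here; any S3
SDK's HEAD request works the same way):

# HEAD the object; until CompleteMultipartUpload fires, RGW returns 404
if aws --endpoint-url "$RGW_ENDPOINT" s3api head-object \
       --bucket "$BUCKET" --key "$KEY" >/dev/null 2>&1; then
    echo "object fully uploaded - safe to act on the notification"
else
    echo "object not visible yet - likely an in-flight multipart upload"
fi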

On Tue, Jul 19, 2022 at 6:00 PM Mark Selby  wrote:

> If you can, that would be great, as it is likely to be a while before we
> make the 16 -> 17 switch.
>
>
>
> Question: If I receive the 1st put notification for the partial object
> and I check RGW to see if the object actually exists – do you know if RGW
> will tell me that it is missing until the
> ObjectCreated:CompleteMultipartUpload is sent?
>
>
>
> I was thinking that as a stop-gap I could add some code to just check for the
> object's existence in RGW before taking any action?
>
>
>
> Thanks very much for taking the time out to answer this.
>
>
>
> --
>
> Mark Selby
>
> Sr Linux Administrator, The Voleon Group
>
> mse...@voleon.com
>
>
>
>  This email is subject to important conditions and disclosures that are
> listed on this web page: https://voleon.com/disclaimer/.
>
>
>
>
>
> *From: *Yuval Lifshitz 
> *Date: *Monday, July 18, 2022 at 9:33 PM
> *To: *Mark Selby 
> *Cc: *"ceph-users@ceph.io" 
> *Subject: *[EXTERNAL] Re: [ceph-users] RGW Bucket Notifications and
> MultiPart Uploads
>
>
>
> *CAUTION:* This email originated from outside of the organization. Use
> caution when opening attachments or links.
>
>
>
> Hi Mark,
>
> It is in quincy but wasn't backported to pacific yet.
>
> I can do this backport, but I'm not sure when the next pacific release is.
>
>
>
> Yuval
>
>
>
> On Tue, Jul 19, 2022 at 5:04 AM Mark Selby  wrote:
>
> I am trying to use RGW Bucket Notifications to trigger events on object
> creation and have run into a bit of an issue when multipart uploads come into
> play for large objects.
>
>
>
> With a small object only a single notification is generated ->
> ObjectCreated:Put
>
>
>
> When a multipart upload is performed a string of Notifications are sent:
>
> ObjectCreated:Post
>
> ObjectCreated:Put
>
> ObjectCreated:Put
>
> ObjectCreated:Put
>
> …
>
> ObjectCreated:CompleteMultipartUpload
>
>
>
> I can ignore the Post, but all of the Put notifications look the same as a
> single-part upload message, and the object will not actually be created
> until the CompleteMultipartUpload notification happens.
>
>
>
> There is https://tracker.ceph.com/issues/51520
> 
> that seems to fix this issue – I cannot tell if this was actually
> backported or not. Does anyone know?
>
>
>
> Thanks!
>
>
>
> --
>
> Mark Selby
>
> Sr Linux Administrator, The Voleon Group
>
> mse...@voleon.com
>
>
>
>  This email is subject to important conditions and disclosures that are
> listed on this web page: https://voleon.com/disclaimer/
> 
> .
>
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: CephFS standby-replay has more dns/inos/dirs than the active mds

2022-07-19 Thread Patrick Donnelly
You're probably seeing this bug: https://tracker.ceph.com/issues/48673

Sorry I've not had time to finish a fix for it yet. Hopefully soon...

On Tue, Jul 19, 2022 at 5:43 PM Bryan Stillwell  wrote:
>
> We have a cluster using multiple filesystems on Pacific (16.2.7) and even 
> though we have mds_cache_memory_limit set to 80 GiB one of the MDS daemons is 
> using 123.1 GiB.  This MDS is actually the standby-replay MDS and I'm 
> wondering if it's because it's using more dns/inos/dirs than the active MDS?:
>
> $ sudo ceph fs status cephfs19
> cephfs19 - 28 clients
> 
> RANK  STATE   MDS  ACTIVITY DNSINOS   DIRS   CAPS
>  0active  ceph006b  Reqs: 2879 /s  27.8M  27.8M  3490k  7767k
> 0-s   standby-replay  ceph008a  Evts: 1446 /s  40.1M  40.0M  6259k 0
>
> Shouldn't the standby-replay MDS daemons have similar stats to the active MDS 
> they're protecting?  What could be causing this to happen?
>
> Thanks,
> Bryan
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat, Inc.
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Quincy recovery load

2022-07-19 Thread Daniel Williams
Do you think maybe you should issue an immediate change/patch/update to
quincy to change the default to wpq, given the cluster-ending nature of the
problem?

On Wed, Jul 20, 2022 at 4:01 AM Sridhar Seshasayee 
wrote:

> Hi Daniel,
>
>
> And further to my theory about the spin lock or similar, increasing my
>> recovery by 4-16x using wpq sees my cpu rise to 10-15% ( from 3% )...
>> but using mclock, even at very very conservative recovery settings sees a
>> median CPU usage of some multiple of 100% (eg. a multiple of a machine
>> core/thread usage per osd).
>>
>>
> The issue has been narrowed down to waiting threads of a work queue shard
> being unblocked prematurely during the wait period prior to dequeuing a
> future work item from the mclock queue. This is leading to high CPU usage
> as you have observed. WPQ uses sleep to throttle various operations, but
> mclock, based on the set QoS parameters, could schedule operations in the
> future. The work queue threads should ideally relinquish the CPU and block
> until the set time duration, but this is evidently not happening currently.
> While the fix is being worked on, do continue to provide us your feedback
> and use wpq in the interim. Thanks!
>
> -Sridhar
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Quincy recovery load

2022-07-19 Thread Sridhar Seshasayee
Hi Daniel,


And further to my theory about the spin lock or similar, increasing my
> recovery by 4-16x using wpq sees my cpu rise to 10-15% ( from 3% )...
> but using mclock, even at very very conservative recovery settings sees a
> median CPU usage of some multiple of 100% (eg. a multiple of a machine
> core/thread usage per osd).
>
>
The issue has been narrowed down to waiting threads of a work queue shard
being unblocked prematurely during the wait period prior to dequeuing a
future work item from the mclock queue. This is leading to high CPU usage
as you have observed. WPQ uses sleep to throttle various operations, but
mclock, based on the set QoS parameters, could schedule operations in the
future. The work queue threads should ideally relinquish the CPU and block
until the set time duration, but this is evidently not happening currently.
While the fix is being worked on, do continue to provide us your feedback
and use wpq in the interim. Thanks!
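
For anyone needing the interim workaround, a minimal sketch (note the
scheduler change only takes effect after the OSDs restart; the restart
command depends on how your cluster is deployed):

ceph config set osd osd_op_queue wpq
# e.g. on a package-based install, per host:
systemctl restart ceph-osd.target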

-Sridhar
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rh8 krbd mapping causes no match of type 1 in addrvec problem decoding monmap, -2

2022-07-19 Thread Wesley Dillingham
Thanks.

Interestingly the older kernel did not have a problem with it but the newer
kernel does.


Respectfully,

*Wes Dillingham*
w...@wesdillingham.com
LinkedIn 


On Tue, Jul 19, 2022 at 3:35 PM Ilya Dryomov  wrote:

> On Tue, Jul 19, 2022 at 9:12 PM Wesley Dillingham 
> wrote:
> >
> >
> > from ceph.conf:
> >
> > mon_host = 10.26.42.172,10.26.42.173,10.26.42.174
> >
> > map command:
> > rbd --id profilerbd device map win-rbd-test/originalrbdfromsnap
> >
> > [root@a2tlomon002 ~]# ceph mon dump
> > dumped monmap epoch 44
> > epoch 44
> > fsid 227623f8-b67e-4168-8a15-2ff2a4a68567
> > last_changed 2022-05-18 15:35:39.385763
> > created 2016-08-09 10:02:28.325333
> > min_mon_release 14 (nautilus)
> > 0: [v2:10.26.42.173:3300/0,v1:10.26.42.173:6789/0] mon.a2tlomon003
> > 1: v2:10.26.42.174:3300/0 mon.a2tlomon004
> > 2: [v2:10.26.42.172:3300/0,v1:10.26.42.172:6789/0] mon.a2tlomon002
> >
> > Looks like something is up with mon:1 only listening on a v2 addr. Not sure
> if that's the root cause, but it seems likely, though I would think the map
> should still succeed.
>
> Yes, this is the root cause.  Theoretically the kernel client could
> ignore it and attempt to proceed but it doesn't, on purpose.  This is
> a clear configuration/user error which is better fixed than worked
> around.
>
> You need to either amend mon1 addresses or tell the kernel client to
> use v2 addresses with e.g. "rbd device map -o ms_mode=prefer-crc ...".
>
> Thanks,
>
> Ilya
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rh8 krbd mapping causes no match of type 1 in addrvec problem decoding monmap, -2

2022-07-19 Thread Ilya Dryomov
On Tue, Jul 19, 2022 at 9:12 PM Wesley Dillingham  
wrote:
>
>
> from ceph.conf:
>
> mon_host = 10.26.42.172,10.26.42.173,10.26.42.174
>
> map command:
> rbd --id profilerbd device map win-rbd-test/originalrbdfromsnap
>
> [root@a2tlomon002 ~]# ceph mon dump
> dumped monmap epoch 44
> epoch 44
> fsid 227623f8-b67e-4168-8a15-2ff2a4a68567
> last_changed 2022-05-18 15:35:39.385763
> created 2016-08-09 10:02:28.325333
> min_mon_release 14 (nautilus)
> 0: [v2:10.26.42.173:3300/0,v1:10.26.42.173:6789/0] mon.a2tlomon003
> 1: v2:10.26.42.174:3300/0 mon.a2tlomon004
> 2: [v2:10.26.42.172:3300/0,v1:10.26.42.172:6789/0] mon.a2tlomon002
>
> Looks like something is up with mon:1 only listening on a v2 addr. Not sure if
> that's the root cause, but it seems likely, though I would think the map
> should still succeed.

Yes, this is the root cause.  Theoretically the kernel client could
ignore it and attempt to proceed but it doesn't, on purpose.  This is
a clear configuration/user error which is better fixed than worked
around.

You need to either amend mon1 addresses or tell the kernel client to
use v2 addresses with e.g. "rbd device map -o ms_mode=prefer-crc ...".
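
For the archives, a rough sketch of both options (mon name and addresses
taken from the dump above; double-check against your own monmap before
running anything):

# option 1: give mon.a2tlomon004 a v1 address alongside v2
ceph mon set-addrs a2tlomon004 [v2:10.26.42.174:3300,v1:10.26.42.174:6789]

# option 2: have the kernel client use v2 only
rbd --id profilerbd device map -o ms_mode=prefer-crc win-rbd-test/originalrbdfromsnap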

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rh8 krbd mapping causes no match of type 1 in addrvec problem decoding monmap, -2

2022-07-19 Thread Wesley Dillingham
from ceph.conf:

mon_host = 10.26.42.172,10.26.42.173,10.26.42.174

map command:
rbd --id profilerbd device map win-rbd-test/originalrbdfromsnap

[root@a2tlomon002 ~]# ceph mon dump
dumped monmap epoch 44
epoch 44
fsid 227623f8-b67e-4168-8a15-2ff2a4a68567
last_changed 2022-05-18 15:35:39.385763
created 2016-08-09 10:02:28.325333
min_mon_release 14 (nautilus)
0: [v2:10.26.42.173:3300/0,v1:10.26.42.173:6789/0] mon.a2tlomon003
1: v2:10.26.42.174:3300/0 mon.a2tlomon004
2: [v2:10.26.42.172:3300/0,v1:10.26.42.172:6789/0] mon.a2tlomon002

Looks like something is up with mon:1 only listening on a v2 addr. Not sure if
that's the root cause, but it seems likely, though I would think the map should
still succeed.

As a note i tried with 16.2.9 client as well and it also failed in the same
manner.






Respectfully,

*Wes Dillingham*
w...@wesdillingham.com
LinkedIn 


On Tue, Jul 19, 2022 at 12:51 PM Ilya Dryomov  wrote:

> On Tue, Jul 19, 2022 at 5:01 PM Wesley Dillingham 
> wrote:
> >
> > I have a strange error when trying to map via krbd on a RH (alma8)
> release
> > / kernel 4.18.0-372.13.1.el8_6.x86_64 using ceph client version 14.2.22
> > (cluster is 14.2.16)
> >
> > the rbd map causes the following error in dmesg:
> >
> > [Tue Jul 19 07:45:00 2022] libceph: no match of type 1 in addrvec
> > [Tue Jul 19 07:45:00 2022] libceph: problem decoding monmap, -2
> >
> > I am able to map this rbd to a cent7 / 3.10.0-1160.71.1.el7.x86_64
> machine
> > using the same client and commands.
> >
> > Of note, on the RH8 node I can fetch info about the rbd and list rbds in
> > the pool check ceph status etc. It seems purely limited to the mapping of
> > the RBD:
> >
> > Info about the RBD:
> >
> > [root@alma8rbdtest ~]# rbd --id profilerbd info
> > win-rbd-test/originalrbdfromsnap
> > rbd image 'originalrbdfromsnap':
> > size 5 GiB in 1280 objects
> > order 22 (4 MiB objects)
> > snapshot_count: 0
> > id: 2c5f465fa134c0
> > block_name_prefix: rbd_data.2c5f465fa134c0
> > format: 2
> > features: layering, exclusive-lock
> > op_features:
> > flags:
> > create_timestamp: Mon Jul 18 13:58:39 2022
> > access_timestamp: Mon Jul 18 13:58:39 2022
> > modify_timestamp: Mon Jul 18 13:58:39 2022
> >
> > anybody seen something like this
>
> Hi Wesley,
>
> Could you please provide:
>
> - full "rbd map" ("rbd device map") command
>
> - "mon host = XYZ" line from ceph.conf file
>
> - "ceph mon dump" output
>
> Thanks,
>
> Ilya
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Quincy recovery load

2022-07-19 Thread Daniel Williams
Just in case people don't know,
osd_op_queue = "wpq"
requires an OSD restart.

And further to my theory about the spin lock or similar, increasing my
recovery by 4-16x using wpq sees my cpu rise to 10-15% ( from 3% )...
but using mclock, even at very very conservative recovery settings sees a
median CPU usage of some multiple of 100% (eg. a multiple of a machine
core/thread usage per osd).


On Tue, Jul 19, 2022 at 4:18 PM Daniel Williams  wrote:

> Also never had problems with backfill / rebalance / recovery but now seen
> runaway CPU usage even with very conservative recovery settings after
> upgrading to quincy from pacific.
>
> osd_recovery_sleep_hdd = 0.1
> osd_max_backfills = 1
> osd_recovery_max_active = 1
> osd_recovery_delay_start = 600
>
> Tried:
> osd_mclock_profile = "high_recovery_ops"
> It did not help.
>
> The CPU eventually runs away so much (regardless of config) that the OSD
> gets health check problems, and causes even more problems, so I tried
> nodown,noout,noscrub,nodeep-scrub
> But none of that helped progress the recovery forward either.
>
> The only way back to a healthy cluster for now seems to be
> ceph osd set norebalance
>
> When toggling off rebalance, as the cluster slowly finishes the
> rebalances in progress, I noticed that the whole cluster has almost no IO
> on the disks, except that on one of the hosts 100% utilization on a single
> disk is bouncing around from disk to disk.
>
> Example of the host with the bouncing load:
> root@ceph-server-04:~# !dstat
> dstat -cd --disk-util --disk-tps --net
> total-usage -dsk/total-
> nvme-sdb--sda--sdc--sdd--sde--sdf--sdg--sdh--sdi--sdj--sdk- -dsk/total-
> -net/total-
> usr sys idl wai stl| read
>  writ|util:util:util:util:util:util:util:util:util:util:util:util|#read
> #writ| recv  send
>  74  12   9   3   0|2542k  246M|7.49:99.3:   0:   0:   0:27.2:   0:99.3:
> 0:   0:   0:   0|   9   636 |1251k  829k
>  75  11  10   3   0|  29M  254M|7.65: 101:   0:   0:74.1:20.1:   0: 101:
> 0:   0:   0:   0| 205   686 |4246k 7841k
>  61  26   9   3   0|6340k  250M|2.81: 101:   0:   0:12.9:   0:   0:99.7:
> 0:   0:   0:   0|  45   660 |  35M   35M
>  69  20   8   2   0|   0   243M|5.20:98.5:   0:   0:   0:   0:   0:99.7:
> 0:   0:   0:   0|   0   649 | 650k  442k
>  71  20   8   0   0|   0   150M|5.13:87.9:   0:   0:   0:   0:   0:68.2:
> 0:   0:   0:   0|   0   360 | 703k  443k
>  72  16  11  57   0|8168B   51M|5.18:   0:   0:   0:   0:   0:   0:1.99:
> 0:   0:86.5:   0|   2   129 | 702k  524k
>  72  16  11   1   0|   0  5865k|7.28:   0:   0:   0:   0:   0:   0:   0:
> 0:   0:90.6:   0|   036 |1578k 1184k
>  71  16  12   0   0|   0  6519k|7.25:   0:   0:   0:   0:   0:   0:   0:
> 0:   0: 112:   0|   038 | 904k  553k
>  75  11  11   2   0| 522k   32M|1.96:   0:   0:   0:1.96:   0:   0:   0:
> 0:   0:98.5:   0|   281 |1022k  847k
>  72  14  12   1   0|   060M|5.72:   0:   0:   0:   0:   0:   0:   0:
> 0:   0: 102:   0|   0   160 | 826k  550k
>  65  19  13   2   0|   0   124M|5.57:   0:   0:99.1:   0:   0:   0:   0:
> 0:   0:   0:   0|   0   339 | 648k  340k
>  69  17  11   2   0|   0   125M|2.82:   0:   0: 101:   0:   0:   0:   0:
> 0:   0:   0:   0|   0   333 | 694k  482k
>  75  15   9   1   0|   0   123M|3.56:   0:   0:99.3:   0:   0:   0:   0:
> 0:   0:   0:   0|   0   331 |1760k 1368k
>  79  10   9   1   0|   0   114M|2.01:   0:   0: 101:   0:   0:   0:   0:
> 0:   0:   0:   0|   0   335 | 893k  636k
>  77  14   8   0   0| 685k   72M|4.41:   0:   0:82.9:   0:   0:   0:   0:
> 0:1.20:   0:   0|   1   195 |1590k 1482k
>
> You can see that the "active" io host is not doing much network traffic.
>
> The weird part is the osds on the idle machines see huge CPU load even
> during periods of no IO. There are "some" explanations for that since
> the cluster is entirely jerasure-coded HDDs with k=6, m=3, but it seems
> weird that such a small amount of data would be so CPU-intensive to
> recover when there is no performance degradation to client operations.
>
> My best guess is some sort of weird spin lock or equivalent waiting for
> contended io on OSDs due to a changed behaviour in responses for queued
> recovery operations?
>
>
> Setting just:
> osd_op_queue = "wpq"
> fixes my cluster, now recovery going at the same speed is using on average
> 3-6% cpu per OSD down from 100-300%.
>
>
>
>
>
>
> On Tue, Jul 12, 2022 at 7:56 PM Sridhar Seshasayee 
> wrote:
>
>> Hi Chris,
>>
>> While we look into this, I have a couple of questions:
>>
>> 1. Did the recovery rate stay at 1 object/sec throughout? In our tests we
>> have seen that
>> the rate is higher during the starting phase of recovery and
>> eventually
>> tapers off due
>> to throttling by mclock.
>>
>> 2. Can you try speeding up the recovery by changing to "high_recovery_ops"
>> profile on
>> all the OSDs to see if it improves things (both CPU load and recovery
>> rate)?
>>
>> 3. On the OSDs that showed high CPU usage, can you run the following
>> command and
>> revert back? 

[ceph-users] Re: rh8 krbd mapping causes no match of type 1 in addrvec problem decoding monmap, -2

2022-07-19 Thread Ilya Dryomov
On Tue, Jul 19, 2022 at 5:01 PM Wesley Dillingham  
wrote:
>
> I have a strange error when trying to map via krbd on a RH (alma8) release
> / kernel 4.18.0-372.13.1.el8_6.x86_64 using ceph client version 14.2.22
> (cluster is 14.2.16)
>
> the rbd map causes the following error in dmesg:
>
> [Tue Jul 19 07:45:00 2022] libceph: no match of type 1 in addrvec
> [Tue Jul 19 07:45:00 2022] libceph: problem decoding monmap, -2
>
> I am able to map this rbd to a cent7 / 3.10.0-1160.71.1.el7.x86_64 machine
> using the same client and commands.
>
> Of note, on the RH8 node I can fetch info about the rbd and list rbds in
> the pool check ceph status etc. It seems purely limited to the mapping of
> the RBD:
>
> Info about the RBD:
>
> [root@alma8rbdtest ~]# rbd --id profilerbd info
> win-rbd-test/originalrbdfromsnap
> rbd image 'originalrbdfromsnap':
> size 5 GiB in 1280 objects
> order 22 (4 MiB objects)
> snapshot_count: 0
> id: 2c5f465fa134c0
> block_name_prefix: rbd_data.2c5f465fa134c0
> format: 2
> features: layering, exclusive-lock
> op_features:
> flags:
> create_timestamp: Mon Jul 18 13:58:39 2022
> access_timestamp: Mon Jul 18 13:58:39 2022
> modify_timestamp: Mon Jul 18 13:58:39 2022
>
> anybody seen something like this

Hi Wesley,

Could you please provide:

- full "rbd map" ("rbd device map") command

- "mon host = XYZ" line from ceph.conf file

- "ceph mon dump" output

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Single vs multiple cephfs file systems pros and cons

2022-07-19 Thread Patrick Donnelly
On Fri, Jul 15, 2022 at 1:46 PM Vladimir Brik
 wrote:
>
> Hello
>
> When would it be a good idea to use multiple smaller cephfs
> filesystems (in the same cluster) instead a big single one
> with active-active MDSs?
>
> I am migrating about 900M files from Lustre to Ceph and I am
> wondering if I should use a single file system or two
> filesystems. Right now the only significant benefit of using
> multiple cephfs filesystems I see is that a metadata scrub
> wouldn't take as long.
>
> Do people have other thoughts about single vs multiple
> filesystems?

Major consideration points: cost of having multiple MDS running (more
memory/cpu used), inability to move files between the two hierarchies
without full copies, and straightforward scaling w/ different file
systems.

Active-active file systems can often function in a similar way with
subtree pinning without the drawbacks.
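
For reference, a minimal sketch of subtree pinning (mount point and
directory names are illustrative):

# pin two top-level trees to different active MDS ranks
setfattr -n ceph.dir.pin -v 0 /mnt/cephfs/projectA
setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/projectB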

-- 
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat, Inc.
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: librbd leaks memory on crushmap updates

2022-07-19 Thread Ilya Dryomov
On Tue, Jul 19, 2022 at 5:10 PM Peter Lieven  wrote:
>
> Am 24.06.22 um 16:13 schrieb Peter Lieven:
> > Am 23.06.22 um 12:59 schrieb Ilya Dryomov:
> >> On Thu, Jun 23, 2022 at 11:32 AM Peter Lieven  wrote:
> >>> Am 22.06.22 um 15:46 schrieb Josh Baergen:
>  Hey Peter,
> 
> > I found relatively large allocations in the qemu smaps and checked the 
> > contents. It contained several hundred repetitions of osd and pool 
> > names. We use the default builds on Ubuntu 20.04. Is there a special 
> > memory allocator in place that might not clean up properly?
>  I'm sure you would have noticed this and mentioned it if it was so -
>  any chance the contents of these regions look like log messages of
>  some kind? I recently tracked down a high client memory usage that
>  looked like a leak that turned out to be a broken config option
>  resulting in higher in-memory log retention:
>  https://tracker.ceph.com/issues/56093. AFAICT it affects Nautilus+.
> >>> Hi Josh, hi Ilya,
> >>>
> >>>
> >>> it seems we were in fact facing 2 leaks with 14.x. Our long running VMs 
> >>> with librbd 14.x have several million items in the osdmap mempool.
> >>>
> >>> In our testing environment with 15.x I see no unlimited increase in the 
> >>> osdmap mempool (compared this to a second dev host with 14.x client where 
> >>> I see the increase wiht my tests),
> >>>
> >>> but I still see leaking memory when I generate a lot of osdmap changes, 
> >>> but this in fact seem to be log messages - thanks Josh.
> >>>
> >>>
> >>> So I would appreciate if #56093 would be backported to Octopus before its 
> >>> final release.
> >> I picked up Josh's PR that was sitting there unnoticed but I'm not sure
> >> it is the issue you are hitting.  I think Josh's change just resurrects
> >> the behavior where clients stored only up to 500 log entries instead of
> >> up to 1 (the default for daemons).  There is no memory leak there,
> >> just a difference in how much memory is legitimately consumed.  The
> >> usage is bounded either way.
> >>
> >> However in your case, the usage is slowly but constantly growing.
> >> In the original post you said that it was observed both on 14.2.22 and
> >> 15.2.16.  Are you saying that you are no longer seeing it in 15.x?
> >
> > After I understood what's behind Josh's issue, I can confirm that I
> > still see increasing memory which is not caused
> >
> > by osdmap items and also not by log entries. There must be something else
> > going on.
>
>
> I still see increased memory (heap) usage. Might it be that it is just heap 
> fragmentation?

Hi Peter,

It could be but you never quantified the issue.  What is the actual
heap usage you are seeing, how fast is it growing?  Is it specific to
some particular VMs or does it affect the entire fleet?

>
> We mainly see data from inside the VM in these memory areas (this might be 
> data from buffered writes), but also librbd data.
>
> Is it possible that data from buffered writes is not always freed properly?

If it's a bug, anything is possible ;)

More seriously, I think we need to start from scratch.  Initially you
suspected the osdmap handling code based on the contents of some QEMU
process mappings dumped out with gdb, but now it's VM data?  Are osdmap
pieces no longer there?

What librbd version are you testing on?

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rh8 krbd mapping causes no match of type 1 in addrvec problem decoding monmap, -2

2022-07-19 Thread Wesley Dillingham
Tried with rh8/14.2.16 package version and same issue.
dmesg shows the error in email subject, stdout shows: rbd: map failed:
(110) Connection timed out

Respectfully,

*Wes Dillingham*
w...@wesdillingham.com
LinkedIn 


On Tue, Jul 19, 2022 at 11:00 AM Wesley Dillingham 
wrote:

> I have a strange error when trying to map via krbd on a RH (alma8) release
> / kernel 4.18.0-372.13.1.el8_6.x86_64 using ceph client version 14.2.22
> (cluster is 14.2.16)
>
> the rbd map causes the following error in dmesg:
>
> [Tue Jul 19 07:45:00 2022] libceph: no match of type 1 in addrvec
> [Tue Jul 19 07:45:00 2022] libceph: problem decoding monmap, -2
>
> I am able to map this rbd to a cent7 / 3.10.0-1160.71.1.el7.x86_64 machine
> using the same client and commands.
>
> Of note, on the RH8 node I can fetch info about the rbd and list rbds in
> the pool check ceph status etc. It seems purely limited to the mapping of
> the RBD:
>
> Info about the RBD:
>
> [root@alma8rbdtest ~]# rbd --id profilerbd info
> win-rbd-test/originalrbdfromsnap
> rbd image 'originalrbdfromsnap':
> size 5 GiB in 1280 objects
> order 22 (4 MiB objects)
> snapshot_count: 0
> id: 2c5f465fa134c0
> block_name_prefix: rbd_data.2c5f465fa134c0
> format: 2
> features: layering, exclusive-lock
> op_features:
> flags:
> create_timestamp: Mon Jul 18 13:58:39 2022
> access_timestamp: Mon Jul 18 13:58:39 2022
> modify_timestamp: Mon Jul 18 13:58:39 2022
>
> anybody seen something like this
>
>
> Respectfully,
>
> *Wes Dillingham*
> w...@wesdillingham.com
> LinkedIn 
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: librbd leaks memory on crushmap updates

2022-07-19 Thread Peter Lieven

Am 24.06.22 um 16:13 schrieb Peter Lieven:

Am 23.06.22 um 12:59 schrieb Ilya Dryomov:

On Thu, Jun 23, 2022 at 11:32 AM Peter Lieven  wrote:

Am 22.06.22 um 15:46 schrieb Josh Baergen:

Hey Peter,


I found relatively large allocations in the qemu smaps and checked the 
contents. It contained several hundred repetitions of osd and pool names. We 
use the default builds on Ubuntu 20.04. Is there a special memory allocator in 
place that might not clean up properly?

I'm sure you would have noticed this and mentioned it if it was so -
any chance the contents of these regions look like log messages of
some kind? I recently tracked down a high client memory usage that
looked like a leak that turned out to be a broken config option
resulting in higher in-memory log retention:
https://tracker.ceph.com/issues/56093. AFAICT it affects Nautilus+.

Hi Josh, hi Ilya,


it seems we were in fact facing 2 leaks with 14.x. Our long running VMs with 
librbd 14.x have several million items in the osdmap mempool.

In our testing environment with 15.x I see no unlimited increase in the osdmap 
mempool (compared this to a second dev host with 14.x client where I see the 
increase wiht my tests),

but I still see leaking memory when I generate a lot of osdmap changes, but 
this in fact seem to be log messages - thanks Josh.


So I would appreciate if #56093 would be backported to Octopus before its final 
release.

I picked up Josh's PR that was sitting there unnoticed but I'm not sure
it is the issue you are hitting.  I think Josh's change just resurrects
the behavior where clients stored only up to 500 log entries instead of
up to 1 (the default for daemons).  There is no memory leak there,
just a difference in how much memory is legitimately consumed.  The
usage is bounded either way.

However in your case, the usage is slowly but constantly growing.
In the original post you said that it was observed both on 14.2.22 and
15.2.16.  Are you saying that you are no longer seeing it in 15.x?


After I understood what's behind Josh's issue, I can confirm that I
still see increasing memory which is not caused

by osdmap items and also not by log entries. There must be something else going
on.



I still see increased memory (heap) usage. Might it be that it is just heap 
fragmentation?

We mainly see data from inside the VM in these memory areas (this might be data 
from buffered writes), but also librbd data.

Is it possible that data from buffered writes is not always freed properly?

The dirty areas I see are all in the area of several MB up to 64MB.
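
For anyone wanting to reproduce the earlier mempool check, a rough sketch,
assuming the librbd client was started with an admin socket configured (the
.asok path below is illustrative):

# dump mempool usage, including osdmap items, from the running client
ceph --admin-daemon /var/run/ceph/ceph-client.qemu.12345.asok dump_mempools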


Thanks

Peter


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [EXTERNAL] Re: RGW Bucket Notifications and MultiPart Uploads

2022-07-19 Thread Mark Selby
If you can, that would be great, as it is likely to be a while before we make
the 16 -> 17 switch.

 

Question: If I receive the 1st put notification for the partial object and I 
check RGW to see if the object actually exists – do you know if RGW will tell 
me that it is missing until the ObjectCreated:CompleteMultipartUpload is sent?

 

I was thinking that as a stop-gap I could add some code to just check for the
object's existence in RGW before taking any action?

 

Thanks very much for taking the time out to answer this.

 

-- 

Mark Selby

Sr Linux Administrator, The Voleon Group

mse...@voleon.com 

 

 This email is subject to important conditions and disclosures that are listed 
on this web page: https://voleon.com/disclaimer/.

 

 

From: Yuval Lifshitz 
Date: Monday, July 18, 2022 at 9:33 PM
To: Mark Selby 
Cc: "ceph-users@ceph.io" 
Subject: [EXTERNAL] Re: [ceph-users] RGW Bucket Notifications and MultiPart 
Uploads

 

CAUTION: This email originated from outside of the organization. Use caution 
when opening attachments or links.

 

Hi Mark,

It is in quincy but wasn't backported to pacific yet.

I can do this backport, but I'm not sure when the next pacific release is.

 

Yuval

 

On Tue, Jul 19, 2022 at 5:04 AM Mark Selby  wrote:

I am trying to use RGW Bucket Notifications to trigger events on object 
creation and have run into a bit of an issue when multipart uploads come into play
for large objects.



With a small object only a single notification is generated -> ObjectCreated:Put



When a multipart upload is performed a string of Notifications are sent:

ObjectCreated:Post

ObjectCreated:Put

ObjectCreated:Put

ObjectCreated:Put

…

ObjectCreated:CompleteMultipartUpload



I can ignore the Post, but all of the Put notifications look the same as a
single-part upload message, and the object will not actually be created until
the CompleteMultipartUpload notification happens.



There is https://tracker.ceph.com/issues/51520 that seems to fix this
issue – I cannot tell if this was actually backported or not. Does anyone
know?



Thanks!



-- 

Mark Selby

Sr Linux Administrator, The Voleon Group

mse...@voleon.com 



 This email is subject to important conditions and disclosures that are listed 
on this web page: https://voleon.com/disclaimer/.



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] rh8 krbd mapping causes no match of type 1 in addrvec problem decoding monmap, -2

2022-07-19 Thread Wesley Dillingham
I have a strange error when trying to map via krbd on a RH (alma8) release
/ kernel 4.18.0-372.13.1.el8_6.x86_64 using ceph client version 14.2.22
(cluster is 14.2.16)

the rbd map causes the following error in dmesg:

[Tue Jul 19 07:45:00 2022] libceph: no match of type 1 in addrvec
[Tue Jul 19 07:45:00 2022] libceph: problem decoding monmap, -2

I am able to map this rbd to a cent7 / 3.10.0-1160.71.1.el7.x86_64 machine
using the same client and commands.

Of note, on the RH8 node I can fetch info about the rbd and list rbds in
the pool check ceph status etc. It seems purely limited to the mapping of
the RBD:

Info about the RBD:

[root@alma8rbdtest ~]# rbd --id profilerbd info
win-rbd-test/originalrbdfromsnap
rbd image 'originalrbdfromsnap':
size 5 GiB in 1280 objects
order 22 (4 MiB objects)
snapshot_count: 0
id: 2c5f465fa134c0
block_name_prefix: rbd_data.2c5f465fa134c0
format: 2
features: layering, exclusive-lock
op_features:
flags:
create_timestamp: Mon Jul 18 13:58:39 2022
access_timestamp: Mon Jul 18 13:58:39 2022
modify_timestamp: Mon Jul 18 13:58:39 2022

anybody seen something like this


Respectfully,

*Wes Dillingham*
w...@wesdillingham.com
LinkedIn 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [cephadm] ceph config as yaml

2022-07-19 Thread Ali Akil

We would need something similar to etcd to track the state of the services.
The spec configuration files should always mirror the state of the Ceph
cluster.
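
For reference, a minimal sketch of a service spec carrying such a config
section (service and option names here are only examples):

service_type: osd
service_id: default_osds
placement:
  host_pattern: '*'
spec:
  data_devices:
    all: true
config:
  osd_max_backfills: 2
  osd_recovery_max_active: 2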

On 19.07.22 14:02, Redouane Kachach Elhichou wrote:

Hi Luis,

I'm not aware of any option on the specs to remove config entries. I'm
afraid you'd need to do it yourself by using the rm command.

Redo.

On Tue, Jul 19, 2022 at 1:53 PM Luis Domingues
 wrote:

Hi,

Yes, we tried that, and it's working. That's the way we do it when
I refer to removing them manually. Sorry if my first message was
not clear enough.

My question is: is there a way to remove a config key through some
declaration in the orch's spec file, or not?

Luis Domingues
Proton AG


--- Original Message ---
On Tuesday, July 19th, 2022 at 13:47, Redouane Kachach Elhichou
 wrote:


> Did you try the *rm *option? Both ceph config and ceph
config-key support
> removing config keys:
>
> From:
>
https://docs.ceph.com/en/quincy/man/8/ceph/#ceph-ceph-administration-tool
>
> ceph config-key [ rm | exists | get | ls | dump | set ] …
> ceph config [ dump | ls | help | get | show | show-with-defaults
> | set | rm | log | reset | assimilate-conf |
> generate-minimal-conf ] …
>
> Best,
> Redo.
>
> On Tue, Jul 19, 2022 at 1:24 PM Luis Domingues
luis.doming...@proton.ch
>
> wrote:
>
> > Hi,
> >
> > We are interested in the exact same feature. I was able to
test it.
> >
> > I have a question regarding the removal of some configuration.
If I add
> > some config, I can see it using ceph config dump. But if I
remove it from
> > my spec file, it stays on the ceph config db.
> >
> > Is there a way to remove some configuration from the ceph db?
Or do we
> > need to do the removal manually?
> >
> > Best,
> >
> > Luis Domingues
> > Proton AG
> >
> > --- Original Message ---
> > On Friday, July 15th, 2022 at 17:06, Redouane Kachach Elhichou <
> > rkach...@redhat.com> wrote:
> >
> > > This section could be added to any service spec. cephadm
will parse it
> > > and
> > > apply all the values included in the same.
> > >
> > > There's no documentation because this wasn't documented so
far. I've just
> > > created a PR for that purpose:
> > >
> > > https://github.com/ceph/ceph/pull/46926
> > >
> > > Best,
> > > Redo.
> > >
> > > On Fri, Jul 15, 2022 at 4:47 PM Ali Akil ali-a...@gmx.de wrote:
> > >
> > > > > Where do I add this section exactly? In the osd service
> > > > specification
> > > > section
https://docs.ceph.com/en/latest/cephadm/services/osd/#examples
> > > > > there is no mention of config.
> > > > Also cephadm doesn't seem to apply changes added to ceph.conf.
> > > >
> > > > Best Regards,
> > > > Ali
> > > > On 15.07.22 15:21, Redouane Kachach Elhichou wrote:
> > > >
> > > > Hello Ali,
> > > >
> > > > You can set configuration by including a config section in
our yaml as
> > > > > follows:
> > > >
> > > > config:
> > > > param_1: val_1
> > > > ...
> > > > param_N: val_N
> > > >
> > > > > this is equivalent to calling the following ceph cmd:
> > > >
> > > > > > ceph config set <who> <option> <value>
> > > >
> > > > Best Regards,
> > > > Redo.
> > > >
> > > > On Fri, Jul 15, 2022 at 2:45 PM Ali Akil ali-a...@gmx.de
wrote:
> > > >
> > > > > Hallo,
> > > > >
> > > > > i used to set the configuration for Ceph using the cli
aka `ceph config set global osd_deep_scrub_interval `. I
would like though to
> > > > > store these configuration in my git repository. Is there
a way to
> > > > > apply
> > > > > these configurations as yaml file?
> > > > >
> > > > > I am using Quincy ceph cluster provisioned by cephadm.
> > > > >
> > > > > Best Regards,
> > > > > Ali Akil
> > > > >
> > > > > ___
> > > > > ceph-users mailing list -- ceph-users@ceph.io
> > > > > To unsubscribe send an email to ceph-users-le...@ceph.io
> > >
> > > ___
> > > ceph-users mailing list -- ceph-users@ceph.io
> > > To unsubscribe send an email to ceph-users-le...@ceph.io
> > > ___
> > > ceph-users mailing list -- ceph-users@ceph.io
> > > To unsubscribe send an email to ceph-users-le...@ceph.io
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Can't setup Basic Ceph Client

2022-07-19 Thread Jean-Marc FONTANA

Hello Iban,

Thanks for answering! We finally managed to connect with the admin
keyring,
but we think that is not the best practice. We shall try your conf and
let you know the result.


Best regards

JM

Le 19/07/2022 à 11:08, Iban Cabrillo a écrit :

Hi Jean,

   If you do not want to use the admin user, which is the most logical thing to 
do, you must create a client with rbd access to the pool on which you are going 
to perform the I/O actions.
For example in our case it is the user cinder:
client.cinder
key: 

caps: [mgr] allow r
caps: [mon] profile rbd
caps: [osd] profile rbd pool=vol1, profile rbd pool=vol2 ... profile
rbd pool=volx

   And the install the client keyring on the client node:

cephclient:~ # ls -la /etc/ceph/
total 28
drwxr-xr-x 2 root root 4096 Jul 18 11:37 .
drwxr-xr-x 132 root root 12288 Jul 18 11:37 ..
-rw-r--r-- 1 root root 64 Oct 19 2017 ceph.client.cinder.keyring
-rw-r--r-- 1 root root 2018 Jul 18 11:37 ceph.conf

In our case we have added

cat /etc/profile.d/ceph-cinder.sh
export CEPH_ARGS="--keyring /etc/ceph/ceph.client.cinder.keyring --id cinder"

so that it picks it up automatically.

cephclient:~ # rbd ls -p volumes
image01_to_remove
volume-01bbf2ee-198c-446d-80bf-f68292130f5c
volume-036865ad-6f9b-4966-b2ea-ce10bf09b6a9
volume-04445a86-a032-4731-8bff-203dfc5d02e1
..
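
For completeness, a rough sketch of creating such a user in the first place
(the pool name below is illustrative):

ceph auth get-or-create client.cinder mon 'profile rbd' mgr 'allow r' \
    osd 'profile rbd pool=volumes' -o /etc/ceph/ceph.client.cinder.keyring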

I hope this helps you.

Cheers, I



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Can't setup Basic Ceph Client

2022-07-19 Thread Jean-Marc FONTANA

Hello,

Thanks for answering! We finally managed to connect with the admin
keyring,
but we think that is not the best practice. A little after your
message, there was another one
which indicates a way to set up a real client. We shall try it and let you
know the result.


Best regards

JM

Le 19/07/2022 à 10:50, Kai Stian Olstad a écrit :

On 08.07.2022 16:18, Jean-Marc FONTANA wrote:

We're planning to use rbd too and get block device for a linux server.
In order to do that, we installed ceph-common packages
and created ceph.conf and ceph.keyring as explained at Basic Ceph
Client Setup — Ceph Documentation

(https://docs.ceph.com/en/pacific/cephadm/client-setup/)

This does not work.

Ceph seems to be installed

$ dpkg -l | grep ceph-common
ii  ceph-common   16.2.9-1~bpo11+1 amd64 common
utilities to mount and interact with a ceph storage cluster
ii  python3-ceph-common   16.2.9-1~bpo11+1 all Python
3 utility libraries for Ceph

$ ceph -v
ceph version 16.2.9 (4c3647a322c0ff5a1dd2344e039859dcbd28c830) 
pacific (stable)


But, when using commands that interact with the cluster, we get this 
message


$ ceph -s
2022-07-08T15:51:24.965+0200 7f773b7fe700 -1 monclient(hunting):
handle_auth_bad_method server allowed_methods [2] but i only support 
[2,1]

[errno 13] RADOS permission denied (error connecting to the cluster)


The default user for ceph is admin/client.admin; do you have that
key in your keyring?

And is the keyring file readable for the user running the ceph commands?
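
A quick sanity-check sketch, assuming the default admin user and standard
paths:

# the keyring must exist and be readable by the calling user
ls -l /etc/ceph/ceph.client.admin.keyring
# select user and keyring explicitly to rule out lookup problems
ceph -s --name client.admin --keyring /etc/ceph/ceph.client.admin.keyring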


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [cephadm] ceph config as yaml

2022-07-19 Thread Redouane Kachach Elhichou
Hi Luis,

I'm not aware of any option on the specs to remove config entries. I'm
afraid you'd need to do it yourself by using the rm command.

Redo.

On Tue, Jul 19, 2022 at 1:53 PM Luis Domingues 
wrote:

> Hi,
>
> Yes, we tried that, and it's working. That's the way we do it when I refer
> to removing them manually. Sorry if my first message was not clear enough.
>
> My question is: is there a way to remove a config key through some
> declaration in the orch's spec file, or not?
>
> Luis Domingues
> Proton AG
>
>
> --- Original Message ---
> On Tuesday, July 19th, 2022 at 13:47, Redouane Kachach Elhichou <
> rkach...@redhat.com> wrote:
>
>
> > Did you try the *rm *option? Both ceph config and ceph config-key support
> > removing config keys:
> >
> > From:
> >
> https://docs.ceph.com/en/quincy/man/8/ceph/#ceph-ceph-administration-tool
> >
> > ceph config-key [ rm | exists | get | ls | dump | set ] …
> > ceph config [ dump | ls | help | get | show | show-with-defaults
> > | set | rm | log | reset | assimilate-conf |
> > generate-minimal-conf ] …
> >
> > Best,
> > Redo.
> >
> > On Tue, Jul 19, 2022 at 1:24 PM Luis Domingues luis.doming...@proton.ch
> >
> > wrote:
> >
> > > Hi,
> > >
> > > We are interested in the exact same feature. I was able to test it.
> > >
> > > I have a question regarding the removal of some configuration. If I add
> > > some config, I can see it using ceph config dump. But if I remove it
> from
> > > my spec file, it stays on the ceph config db.
> > >
> > > Is there a way to remove some configuration from the ceph db? Or do we
> > > need to do the removal manually?
> > >
> > > Best,
> > >
> > > Luis Domingues
> > > Proton AG
> > >
> > > --- Original Message ---
> > > On Friday, July 15th, 2022 at 17:06, Redouane Kachach Elhichou <
> > > rkach...@redhat.com> wrote:
> > >
> > > > This section could be added to any service spec. cephadm will parse
> it
> > > > and
> > > > apply all the values included in it.
> > > >
> > > > There's no documentation because this wasn't documented so far. I've
> just
> > > > created a PR for that purpose:
> > > >
> > > > https://github.com/ceph/ceph/pull/46926
> > > >
> > > > Best,
> > > > Redo.
> > > >
> > > > On Fri, Jul 15, 2022 at 4:47 PM Ali Akil ali-a...@gmx.de wrote:
> > > >
> > > > > Where do I add this section exactly? In the osd service
> > > > > specification
> > > > > section
> https://docs.ceph.com/en/latest/cephadm/services/osd/#examples
> > > > > there is no mention of config.
> > > > > Also cephadm doesn't seem to apply changes added to ceph.conf.
> > > > >
> > > > > Best Regards,
> > > > > Ali
> > > > > On 15.07.22 15:21, Redouane Kachach Elhichou wrote:
> > > > >
> > > > > Hello Ali,
> > > > >
> > > > > You can set configuration by including a config section in our
> yaml as
> > > > > follows:
> > > > >
> > > > > config:
> > > > > param_1: val_1
> > > > > ...
> > > > > param_N: val_N
> > > > >
> > > > > this is equivalent to calling the following ceph cmd:
> > > > >
> > > > > > ceph config set <who> <option> <value>
> > > > >
> > > > > Best Regards,
> > > > > Redo.
> > > > >
> > > > > On Fri, Jul 15, 2022 at 2:45 PM Ali Akil ali-a...@gmx.de wrote:
> > > > >
> > > > > > Hallo,
> > > > > >
> > > > > > i used to set the configuration for Ceph using the cli aka `ceph
> config set global osd_deep_scrub_interval `. I would like though to
> > > > > > store these configuration in my git repository. Is there a way to
> > > > > > apply
> > > > > > these configurations as yaml file?
> > > > > >
> > > > > > I am using Quincy ceph cluster provisioned by cephadm.
> > > > > >
> > > > > > Best Regards,
> > > > > > Ali Akil
> > > > > >
> > > > > > ___
> > > > > > ceph-users mailing list -- ceph-users@ceph.io
> > > > > > To unsubscribe send an email to ceph-users-le...@ceph.io
> > > >
> > > > ___
> > > > ceph-users mailing list -- ceph-users@ceph.io
> > > > To unsubscribe send an email to ceph-users-le...@ceph.io
> > > > ___
> > > > ceph-users mailing list -- ceph-users@ceph.io
> > > > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [cephadm] ceph config as yaml

2022-07-19 Thread Redouane Kachach Elhichou
Did you try the *rm *option? Both ceph config and ceph config-key support
removing config keys:

From:
https://docs.ceph.com/en/quincy/man/8/ceph/#ceph-ceph-administration-tool

ceph config-key [ *rm* | *exists* | *get* | *ls* | *dump* | *set* ] …
ceph config [ *dump* | *ls* | *help* | *get* | *show* | *show-with-defaults*
 | *set* | *rm* | *log* | *reset* | *assimilate-conf* |
*generate-minimal-conf* ] …
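
A minimal sketch of both removal commands (the option and key names below
are just examples):

# drop a centralized config option set for a section
ceph config rm global osd_deep_scrub_interval
# drop a raw config-key entry
ceph config-key rm mgr/dashboard/SOME_KEY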

Best,
Redo.

On Tue, Jul 19, 2022 at 1:24 PM Luis Domingues 
wrote:

> Hi,
>
> We are interested in the exact same feature. I was able to test it.
>
> I have a question regarding the removal of some configuration. If I add
> some config, I can see it using ceph config dump. But if I remove it from
> my spec file, it stays on the ceph config db.
>
> Is there a way to remove some configuration from the ceph db? Or do we
> need to do the removal manually?
>
> Best,
>
> Luis Domingues
> Proton AG
>
>
> --- Original Message ---
> On Friday, July 15th, 2022 at 17:06, Redouane Kachach Elhichou <
> rkach...@redhat.com> wrote:
>
>
> > This section could be added to any service spec. cephadm will parse it
> and
> > apply all the values included in it.
> >
> > There's no documentation because this wasn't documented so far. I've just
> > created a PR for that purpose:
> >
> > https://github.com/ceph/ceph/pull/46926
> >
> > Best,
> > Redo.
> >
> >
> >
> > On Fri, Jul 15, 2022 at 4:47 PM Ali Akil ali-a...@gmx.de wrote:
> >
> > > Where do I add this section exactly? In the osd service
> specification
> > > section https://docs.ceph.com/en/latest/cephadm/services/osd/#examples
> > > there is no mention of config.
> > > Also cephadm doesn't seem to apply changes added to ceph.conf.
> > >
> > > Best Regards,
> > > Ali
> > > On 15.07.22 15:21, Redouane Kachach Elhichou wrote:
> > >
> > > Hello Ali,
> > >
> > > You can set configuration by including a config section in our yaml as
> > > follows:
> > >
> > > config:
> > > param_1: val_1
> > > ...
> > > param_N: val_N
> > >
> > > this is equivalent to calling the following ceph cmd:
> > >
> > > > ceph config set <who> <option> <value>
> > >
> > > Best Regards,
> > > Redo.
> > >
> > > On Fri, Jul 15, 2022 at 2:45 PM Ali Akil ali-a...@gmx.de wrote:
> > >
> > > > Hallo,
> > > >
> > > > i used to set the configuration for Ceph using the cli aka `ceph
> config set global osd_deep_scrub_interval `. I would like though to
> > > > store these configuration in my git repository. Is there a way to
> apply
> > > > these configurations as yaml file?
> > > >
> > > > I am using Quincy ceph cluster provisioned by cephadm.
> > > >
> > > > Best Regards,
> > > > Ali Akil
> > > >
> > > > ___
> > > > ceph-users mailing list -- ceph-users@ceph.io
> > > > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] replacing OSD nodes

2022-07-19 Thread Jesper Lykkegaard Karlsen
Hi all,

Setup: Octopus - erasure 8-3

I had gotten to the point where I had some rather old OSD nodes that I wanted
to replace with new ones.

The procedure was planned like this:

  *   add new replacement OSD nodes
  *   set all OSDs on the retiring nodes to out (a sketch of this step follows the list).
  *   wait for everything to rebalance
  *   remove retiring nodes
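
A minimal sketch of the "set out" step, assuming the retiring host's CRUSH
bucket name is old-node-01:

# mark every OSD under the retiring host out so its data drains off
for id in $(ceph osd ls-tree old-node-01); do
  ceph osd out "${id}"
done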

All this started out nicely, with about 62% of all objects needing to be
moved. Existing OSDs were at most 70% full, and with the newly added OSDs the
raw available size was 1.5 PiB (47% used).
A scenario that seemed feasible to run smoothly, at least to me.

After around 50% misplaced objects remaining, the OSDs started to complain 
about backfillfull OSDs and nearfull OSDs.
A bit of a surprise to me, as RAW size is only 47% used.

It seems that rebalancing does not happen in a prioritized manner, where planned
backfill starts with the OSD with the most available space, but
"alphabetically" according to pg-name.
Is this really true?

Anyway, I have tried to construct a script that makes a prioritized order for
rebalancing PGs that are stuck in "backfill_wait" and starts by
rebalancing PGs to the OSD with the most available space.
If more shards are moved in a PG, then the OSD with the least available space
will be used to rank the whole PG's backfill.

would this work?

#!/bin/bash
LC_NUMERIC=en_US.UTF-8

# OSD id plus available space, "#"-delimited so records can be split later
OSD_DF="$(ceph osd df | awk '{print $1,$15,$16}' | sed 's/\ TiB/T\#/g' | sed 's/\ GiB/G\#/g' | sed 's/\ B/\#/g' | grep ^[0-9])"
# largest available space on any single OSD, in bytes
OSD_AVAIL_MAX=$(ceph osd df | awk '{print $5,$6}' | grep B | grep ^[0-9] | sed 's/\ TiB/T/g' | sed 's/\ GiB/G/g' | sed 's/\ B//g' | numfmt --from=iec | sort -n | tail -n1)

for PG in $(ceph pg dump_stuck 2>/dev/null | grep wait | awk '{print $1}'); do
  CEPH_PG_MAP=$(ceph pg map ${PG})
  # up set (target OSDs) and acting set (current OSDs) of the PG
  PGS_NEW=$(echo ${CEPH_PG_MAP} | awk -F'[' '{print $2}' | awk -F']' '{print $1}')
  PGS_OLD=$(echo ${CEPH_PG_MAP} | awk -F'[' '{print $3}' | awk -F']' '{print $1}')

  NUM=1
  OSD_AVAIL=${OSD_AVAIL_MAX}
  OLD_SHARDS=$(echo ${PGS_OLD} | sed 's/\,/\ /g')
  for OLD_SHARD in $(echo ${OLD_SHARDS}); do
    NEW_SHARD=$(echo ${PGS_NEW} | awk -v a="${NUM}" -F',' '{print $a}')
    #echo "OLD_SHARD=${OLD_SHARD} NEW_SHARD=${NEW_SHARD}"
    if [[ ${OLD_SHARD} != ${NEW_SHARD} ]]; then
      PG_MV="${PG_MV_ALL}${OLD_SHARD} ${NEW_SHARD} "
      PG_MV_ALL=${PG_MV}
      # available space on this shard's target OSD, in bytes
      OSD_AVAIL_NEW=$(echo ${OSD_DF} | sed s/\#\ /\\n/g | grep ^"${NEW_SHARD} " | awk '{print $2}' | numfmt --from=iec)
      #echo "OSD_AVAIL_NEW=$OSD_AVAIL_NEW"
      # remember the fullest (least available) target OSD of this PG
      if [[ ${OSD_AVAIL_NEW} -lt ${OSD_AVAIL} ]]; then
        OSD_LEAST_AVAIL=${NEW_SHARD}
        OSD_AVAIL=${OSD_AVAIL_NEW}
      fi
    fi
    NEWNUM=$(( ${NUM} + 1 ))
    NUM=${NEWNUM}
  done
  echo "ceph osd pg-upmap-items ${PG} ${PG_MV_ALL}  #${OSD_AVAIL}# bytes available on most full OSD (${OSD_LEAST_AVAIL})"
  unset PG_MV_ALL
  unset OSD_AVAIL_NEW
done | grep ^ceph | sort -rn -t'#' -k2 | numfmt -d'#' --field 2 --to=iec | sed s/\#\ /\iB\ /g


The script does not do anything at this point; it only prints out "ceph osd
pg-upmap-items" commands that then need to be piped into bash.
They look like this:

ceph osd pg-upmap-items 20.6fa 281 364   #16TiB bytes available on most full 
OSD (364)
ceph osd pg-upmap-items 20.45 317 413 115 360 85 396 68 374 188 321   #6.2TiB 
bytes available on most full OSD (321)
ceph osd pg-upmap-items 20.6b9 334 380 110 404 84 347 161 362 6 391   #5.9TiB 
bytes available on most full OSD (347)
ceph osd pg-upmap-items 20.69e 315 388 148 366 250 404 118 319 102 354   
#5.9TiB bytes available on most full OSD (319)
ceph osd pg-upmap-items 20.56 259 368 120 319 52 384 31 349 329 414   #5.9TiB 
bytes available on most full OSD (319)
ceph osd pg-upmap-items 20.4d 338 410 329 370 93 388 29 351 290 326 64 346   
#5.9TiB bytes available on most full OSD (326)
ceph osd pg-upmap-items 20.58 152 332   #5.8TiB bytes available on most full 
OSD (332)
ceph osd pg-upmap-items 20.7bc 344 322 329 267 72 339 183 410 87 387 53 358 209 
177 98 375   #2.2TiB bytes available on most full OSD (267)
ceph osd pg-upmap-items 20.59 73 292 114 414 29 367 110 301 166 353 340 385 83 
208   #2.0TiB bytes available on most full OSD (301)
ceph osd pg-upmap-items 20.f 185 395 344 366 32 335 119 317 4 233 316 360 98 
408   #1.9TiB bytes available on most full OSD (233)
ceph osd pg-upmap-items 20.734 323 391 86 191 8 379 65 414 58 326 272 362 187 
160   #1.9TiB bytes available on most full OSD (191)
ceph osd pg-upmap-items 20.732 342 350 88 234 17 157 234 409 215 346 265 395 14 
265   #1.9TiB bytes available on most full OSD (265)
ceph osd pg-upmap-items 20.6fb 332 411 319 159 309 351 102 397 85 377 46 322 24 
306 53 200 240 338   #1.9TiB bytes available on most full OSD (306)
ceph osd pg-upmap-items 20.6c5 334 371 30 340 70 266 241 407 3 233 186 356 40 
312 294 391   #1.9TiB bytes available on most full OSD (233)
ceph osd pg-upmap-items 20.6b4 344 338 226 389 319 362 309 411 85 379 248 233 
121 318 0 254   #1.9TiB bytes 

[ceph-users] Re: crashes after upgrade from octopus to pacific

2022-07-19 Thread Ramin Najjarbashi
 ceph version 16.2.9 (4c3647a322c0ff5a1dd2344e039859dcbd28c830) pacific (stable)
 1: /lib64/libpthread.so.0(+0x12ce0) [0x7f4558e24ce0]
 2: (RGWHandler_REST_S3Website::retarget(RGWOp*, RGWOp**, 
optional_yield)+0x174) [0x7f4563e02684]
 3: (rgw_process_authenticated(RGWHandler_REST*, RGWOp*&, RGWRequest*, 
req_state*, optional_yield, bool)+0xf0a) [0x7f456399e6fa]
 4: (process_request(rgw::sal::RGWRadosStore*, RGWREST*, RGWRequest*, 
std::__cxx11::basic_string, std::allocator > 
const&, rgw::auth::StrategyRegistry const&, RGWRestfulIO*, OpsLogSink*, 
optional_yield, rgw::dmclock::Scheduler*, std::__cxx11::basic_string, std::allocator >*, std::chrono::duration >*, int*)+0x2891) [0x7f45639a21c1]
 5: /lib64/libradosgw.so.2(+0x43d640) [0x7f4563921640]
 6: /lib64/libradosgw.so.2(+0x43ef6a) [0x7f4563922f6a]
 7: make_fcontext()
 NOTE: a copy of the executable, or `objdump -rdS ` is needed to 
interpret this.


> On Jul 19, 2022, at 2:16 PM, Ramin Najjarbashi  
> wrote:
> 
> Hi
> My account in "tracker.ceph.com " was not approved
> after 5 days, but anyway I have some problems in Ceph v16.2.7.
> According to this issue [rgw: s3website crashes after upgrade from octopus to
> pacific](https://tracker.ceph.com/issues/53913
> ), RGW crashes when an s3website call comes in
> without a subdomain (`s->object= s->bucket=`)
> 
>  RGW LOG 
>   -638> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 
> HTTP_ACCEPT=text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
>   -637> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 
> HTTP_ACCEPT_ENCODING=gzip, deflate
>   -636> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 
> HTTP_ACCEPT_LANGUAGE=en-US,en;q=0.9,fr;q=0.8
>   -630> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 
> HTTP_CACHE_CONTROL=max-age=0
>   -628> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 
> HTTP_HOST=s3-website.XXX.com
>   -627> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 
> HTTP_UPGRADE_INSECURE_REQUESTS=1
>   -626> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 
> HTTP_USER_AGENT=Mozlila/5.0 (Linux; Android 7.0; SM-G892A Bulid/NRD90M; wv) 
> AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/60.0.3112.107 
> Mobile Safari/537.36
>   -625> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 HTTP_VERSION=1.1
>   -621> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 HTTP_X_SCHEME=http
>   -619> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 REQUEST_METHOD=GET
>   -618> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 REQUEST_URI=/
>   -617> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 SCRIPT_URI=/
>   -616> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 SERVER_PORT=1234
>   -615> 2022-07-16T06:05:24.159+0430 7fbd26b71700  1 == starting new 
> request req=0x7fbc4a034620 =
>   -614> 2022-07-16T06:05:24.159+0430 7fbd26b71700  2 req 14431852651046008579 
> 0.0s initializing for trans_id = 
> tx0c8483d85dc607703-0062d215dc-15359a41-default
>   -613> 2022-07-16T06:05:24.159+0430 7fbd26b71700 10 req 14431852651046008579 
> 0.0s rgw api priority: s3=8 s3website=7
>   -612> 2022-07-16T06:05:24.159+0430 7fbd26b71700 10 req 14431852651046008579 
> 0.0s host=s3-website.XXX.com
>   -611> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 req 14431852651046008579 
> 0.0s subdomain= domain=s3-website.XXX.com in_hosted_domain=1 
> in_hosted_domain_s3website=1
>   -610> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 req 14431852651046008579 
> 0.0s final domain/bucket subdomain= domain=s3-website.XXX.com 
> in_hosted_domain=1 in_hosted_domain_s3website=1 
> s->info.domain=s3-website.XXX.com s->info.request_uri=/
>   -609> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 req 14431852651046008579 
> 0.0s get_handler handler=33RGWHandler_REST_Service_S3Website
>   -608> 2022-07-16T06:05:24.159+0430 7fbd26b71700 10 req 14431852651046008579 
> 0.0s handler=33RGWHandler_REST_Service_S3Website
>   -607> 2022-07-16T06:05:24.159+0430 7fbd26b71700  2 req 14431852651046008579 
> 0.0s getting op 0
>   -606> 2022-07-16T06:05:24.159+0430 7fbd26b71700 10 req 14431852651046008579 
> 0.0s s3:get_obj scheduling with throttler client=2 cost=1
>   -605> 2022-07-16T06:05:24.159+0430 7fbd26b71700 10 req 14431852651046008579 
> 0.0s s3:get_obj op=28RGWGetObj_ObjStore_S3Website
>   -604> 2022-07-16T06:05:24.159+0430 7fbd26b71700  2 req 14431852651046008579 
> 0.0s s3:get_obj verifying requester
>   -603> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 req 14431852651046008579 
> 0.0s s3:get_obj rgw::auth::StrategyRegistry::s3_main_strategy_t: 
> trying rgw::auth::s3::AWSAuthStrategy
>   -602> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 req 14431852651046008579 
> 0.0s s3:get_obj rgw::auth::s3::AWSAuthStrategy: trying 
> rgw::auth::s3::S3AnonymousEngine
>   -601> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 req 14431852651046008579 
> 0.0s s3:get_obj 

[ceph-users] crashes after upgrade from octopus to pacific

2022-07-19 Thread Ramin Najjarbashi
Hi
My account in "tracker.ceph.com" was not approved after 5 days but anyway I 
have some problems in Ceph v16.2.7.
Accourding this issue [rgw: s3website crashes after upgrade from octopus to 
pacific](https://tracker.ceph.com/issues/53913), RGW crashed whene s3website 
calls without subdomain (`s->object= s->bucket=`)
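
Until the backport lands, one possible stop-gap (an assumption on my part, not
something from the tracker) is to reject bare-endpoint requests at the load
balancer before they reach RGW. A minimal haproxy sketch:

# hypothetical frontend snippet: refuse requests whose Host header is the
# bare s3website endpoint, i.e. carries no bucket subdomain
acl bare_website_host hdr(host) -i s3-website.XXX.com
http-request deny deny_status 400 if bare_website_host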

 RGW LOG 
  -638> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 
HTTP_ACCEPT=text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
  -637> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 HTTP_ACCEPT_ENCODING=gzip, 
deflate
  -636> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 
HTTP_ACCEPT_LANGUAGE=en-US,en;q=0.9,fr;q=0.8
  -630> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 
HTTP_CACHE_CONTROL=max-age=0
  -628> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 
HTTP_HOST=s3-website.XXX.com
  -627> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 
HTTP_UPGRADE_INSECURE_REQUESTS=1
  -626> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 
HTTP_USER_AGENT=Mozlila/5.0 (Linux; Android 7.0; SM-G892A Bulid/NRD90M; wv) 
AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/60.0.3112.107 Mobile 
Safari/537.36
  -625> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 HTTP_VERSION=1.1
  -621> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 HTTP_X_SCHEME=http
  -619> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 REQUEST_METHOD=GET
  -618> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 REQUEST_URI=/
  -617> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 SCRIPT_URI=/
  -616> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 SERVER_PORT=1234
  -615> 2022-07-16T06:05:24.159+0430 7fbd26b71700  1 == starting new 
request req=0x7fbc4a034620 =
  -614> 2022-07-16T06:05:24.159+0430 7fbd26b71700  2 req 14431852651046008579 
0.0s initializing for trans_id = 
tx0c8483d85dc607703-0062d215dc-15359a41-default
  -613> 2022-07-16T06:05:24.159+0430 7fbd26b71700 10 req 14431852651046008579 
0.0s rgw api priority: s3=8 s3website=7
  -612> 2022-07-16T06:05:24.159+0430 7fbd26b71700 10 req 14431852651046008579 
0.0s host=s3-website.XXX.com
  -611> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 req 14431852651046008579 
0.0s subdomain= domain=s3-website.XXX.com in_hosted_domain=1 
in_hosted_domain_s3website=1
  -610> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 req 14431852651046008579 
0.0s final domain/bucket subdomain= domain=s3-website.XXX.com 
in_hosted_domain=1 in_hosted_domain_s3website=1 
s->info.domain=s3-website.XXX.com s->info.request_uri=/
  -609> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 req 14431852651046008579 
0.0s get_handler handler=33RGWHandler_REST_Service_S3Website
  -608> 2022-07-16T06:05:24.159+0430 7fbd26b71700 10 req 14431852651046008579 
0.0s handler=33RGWHandler_REST_Service_S3Website
  -607> 2022-07-16T06:05:24.159+0430 7fbd26b71700  2 req 14431852651046008579 
0.0s getting op 0
  -606> 2022-07-16T06:05:24.159+0430 7fbd26b71700 10 req 14431852651046008579 
0.0s s3:get_obj scheduling with throttler client=2 cost=1
  -605> 2022-07-16T06:05:24.159+0430 7fbd26b71700 10 req 14431852651046008579 
0.0s s3:get_obj op=28RGWGetObj_ObjStore_S3Website
  -604> 2022-07-16T06:05:24.159+0430 7fbd26b71700  2 req 14431852651046008579 
0.0s s3:get_obj verifying requester
  -603> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 req 14431852651046008579 
0.0s s3:get_obj rgw::auth::StrategyRegistry::s3_main_strategy_t: trying 
rgw::auth::s3::AWSAuthStrategy
  -602> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 req 14431852651046008579 
0.0s s3:get_obj rgw::auth::s3::AWSAuthStrategy: trying 
rgw::auth::s3::S3AnonymousEngine
  -601> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 req 14431852651046008579 
0.0s s3:get_obj rgw::auth::s3::S3AnonymousEngine granted access
  -600> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 req 14431852651046008579 
0.0s s3:get_obj rgw::auth::s3::AWSAuthStrategy granted access
  -599> 2022-07-16T06:05:24.159+0430 7fbd26b71700  2 req 14431852651046008579 
0.0s s3:get_obj normalizing buckets and tenants
  -598> 2022-07-16T06:05:24.159+0430 7fbd26b71700 10 req 14431852651046008579 
0.0s s->object= s->bucket=
  -597> 2022-07-16T06:05:24.159+0430 7fbd26b71700  2 req 14431852651046008579 
0.0s s3:get_obj init permissions
  -596> 2022-07-16T06:05:24.159+0430 7fbd26b71700 20 req 14431852651046008579 
0.0s s3:get_obj RGWSI_User_RADOS::read_user_info(): anonymous user
  -595> 2022-07-16T06:05:24.159+0430 7fbd26b71700  2 req 14431852651046008579 
0.0s s3:get_obj recalculating target
  -594> 2022-07-16T06:05:24.159+0430 7fbd26b71700 10 req 14431852651046008579 
0.0s retarget Starting retarget
  -291> 2022-07-16T06:05:24.191+0430 7fbd26b71700 -1 *** Caught signal 
(Segmentation fault) 
   END OF LOG 

  Crash Info:

  {
"archived": "2022-07-15 12:56:38.444301",
"backtrace": [

[ceph-users] Re: Haproxy error for rgw service

2022-07-19 Thread Redouane Kachach Elhichou
Great, thanks for sharing your solution.

It would be great if you could open a tracker issue describing the problem so
it can be fixed later in the cephadm code.

Best,
Redo.

On Tue, Jul 19, 2022 at 9:28 AM Robert Reihs  wrote:

> Hi,
> I think I found the problem. We are using IPv6 only, and the config that
> cephadm creates only adds the IPv4 settings.
>  /etc/sysctl.d/90-ceph-FSID-keepalived.conf
> # created by cephadm
>
> # IP forwarding and non-local bind
> net.ipv4.ip_forward = 1
> net.ipv4.ip_nonlocal_bind = 1
>
> I added:
> net.ipv6.conf.bond1.forwarding = 1
> net.ipv6.conf.bond1.accept_source_route = 1
> net.ipv6.conf.bond1.accept_redirects = 1
> net.ipv6.ip_nonlocal_bind = 1
>
> and reloading the file: sysctl -f
> /etc/sysctl.d/90-ceph-FSID-keepalived.conf
> Restarting the service, everything starts up. The file gets overwritten
> again, so the added config does not persist.
>
> Best
> Robert Reihs
>
> On Mon, Jul 18, 2022 at 3:33 PM Robert Reihs 
> wrote:
>
> > Hi everyone,
> > I have a problem with the haproxy settings for the rgw service. I
> > specified the service in the service specification:
> > ---
> > service_type: rgw
> > service_id: rgw
> > placement:
> >   count: 3
> >   label: "rgw"
> > ---
> > service_type: ingress
> > service_id: rgw.rgw
> > placement:
> >   count: 3
> >   label: "ingress"
> > spec:
> >   backend_service: rgw.rgw
> >   virtual_ip: :::404::dd:ff:10/64
> >   virtual_interface_networks: :::404/64
> >   frontend_port: 8998
> >   monitor_port: 8999
> >
> > The keepalived services are all started, the haproxy, only one is
> started,
> > the other two are in error state:
> > systemd[1]: Starting Ceph haproxy.rgw.rgw.fsn1-ceph-01.ulnhyo for
> > 40ddf3a6-36f1-42d2-9bf7-2fd50045e5dc...
> > podman[3616202]: 2022-07-18 13:03:25.738014313 + UTC m=+0.052607969
> > container create
> > 25f90c4e26ebf6fc44efe12eae2c6b9d54811bfde744a78f756469e32c3f461f (image=
> > docker.io/library/haproxy:2.3, name=ceph-40ddf3>
> > podman[3616202]: 2022-07-18 13:03:25.787788203 + UTC m=+0.102381880
> > container init
> > 25f90c4e26ebf6fc44efe12eae2c6b9d54811bfde744a78f756469e32c3f461f (image=
> > docker.io/library/haproxy:2.3, name=ceph-40ddf3a6>
> > podman[3616202]: 2022-07-18 13:03:25.790577637 + UTC m=+0.105171323
> > container start
> > 25f90c4e26ebf6fc44efe12eae2c6b9d54811bfde744a78f756469e32c3f461f (image=
> > docker.io/library/haproxy:2.3, name=ceph-40ddf3a>
> > bash[3616202]:
> > 25f90c4e26ebf6fc44efe12eae2c6b9d54811bfde744a78f756469e32c3f461f
> > conmon[3616235]: [NOTICE] 198/130325 (2) : haproxy version is
> > 2.3.20-2c8082e
> > conmon[3616235]: [NOTICE] 198/130325 (2) : path to executable is
> > /usr/local/sbin/haproxy
> > conmon[3616235]: [ALERT] 198/130325 (2) : Starting frontend stats: cannot
> > bind socket (Cannot assign requested address)
> > [:::404::dd:ff:10:8999]
> > conmon[3616235]: [ALERT] 198/130325 (2) : Starting frontend frontend:
> > cannot bind socket (Cannot assign requested address)
> > [:::404::dd:ff:10:8998]
> > conmon[3616235]: [ALERT] 198/130325 (2) : [haproxy.main()] Some protocols
> > failed to start their listeners! Exiting.
> >
> > I can access the IP in the browser and get the XML S3 response.
> > ceph version 17.2.1 (ec95624474b1871a821a912b8c3af68f8f8e7aa1) quincy
> > (stable) installed with cephadm.
> >
> > Any idea where the problem could be?
> > Thanks
> > Robert Reihs
> >
>
>
> --
> Robert Reihs
> Jakobsweg 22
> 8046 Stattegg
> AUSTRIA
>
> mobile: +43 (664) 51 035 90
> robert.re...@gmail.com
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Can't setup Basic Ceph Client

2022-07-19 Thread Iban Cabrillo
Hi Jean,

  If you do not want to use the admin user (which is the sensible choice), you
must create a client with rbd access to the pools on which you are going to
perform the I/O actions.
For example in our case it is the user cinder:
client.cinder
key: 

caps: [mgr] allow r
caps: [mon] profile rbd
caps: [osd] profile rbd pool=vol1, profile rbd pool=vol2 ... profile
rbd pool=volx
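
For reference, a client with caps like these can be created in one step; a
minimal sketch, using the pool names shown above, run from a node with admin
privileges:

ceph auth get-or-create client.cinder \
    mon 'profile rbd' \
    mgr 'allow r' \
    osd 'profile rbd pool=vol1, profile rbd pool=vol2' \
    -o /etc/ceph/ceph.client.cinder.keyring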

  And then install the client keyring on the client node:

cephclient:~ # ls -la /etc/ceph/
total 28
drwxr-xr-x 2 root root 4096 Jul 18 11:37 .
drwxr-xr-x 132 root root 12288 Jul 18 11:37 ..
-rw-r--r--   1 root root    64 Oct 19  2017 ceph.client.cinder.keyring
-rw-r--r--   1 root root  2018 Jul 18 11:37 ceph.conf

In our case we have added 

cat /etc/profile.d/ceph-cinder.sh 
export CEPH_ARGS="--keyring /etc/ceph/ceph.client.cinder.keyring --id cinder"

so that it picks it up automatically.

cephclient:~ # rbd ls -p volumes
image01_to_remove
volume-01bbf2ee-198c-446d-80bf-f68292130f5c
volume-036865ad-6f9b-4966-b2ea-ce10bf09b6a9
volume-04445a86-a032-4731-8bff-203dfc5d02e1
..

I hope this help you.

Cheers, I


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: new crush map requires client version hammer

2022-07-19 Thread Iban Cabrillo
Hi,
  Looking deeper at my configuration I see:

  [root@cephmon03 ~]# ceph osd dump | grep min_compat_client
  require_min_compat_client firefly  
  min_compat_client hammer


  Is it safe to run:
  ceph osd set-require-min-compat-client hammer

  in order to enable straw2?
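
For reference, the sequence described in the Nautilus upgrade notes is roughly
the one below; checking `ceph features` first, to confirm that no pre-hammer
clients are still connected, is the usual precaution:

ceph features                                    # inspect connected client feature levels
ceph osd set-require-min-compat-client hammer
ceph osd getcrushmap -o backup-crushmap          # keep a backup in case a revert is needed
ceph osd crush set-all-straw-buckets-to-straw2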

regards I,


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Can't setup Basic Ceph Client

2022-07-19 Thread Kai Stian Olstad

On 08.07.2022 16:18, Jean-Marc FONTANA wrote:

We're planning to use rbd too, to get a block device for a Linux server.
In order to do that, we installed ceph-common packages
and created ceph.conf and ceph.keyring as explained at Basic Ceph
Client Setup — Ceph Documentation

(https://docs.ceph.com/en/pacific/cephadm/client-setup/)

This does not work.

Ceph seems to be installed

$ dpkg -l | grep ceph-common
ii  ceph-common   16.2.9-1~bpo11+1 amd64    common
utilities to mount and interact with a ceph storage cluster
ii  python3-ceph-common   16.2.9-1~bpo11+1 all  Python
3 utility libraries for Ceph

$ ceph -v
ceph version 16.2.9 (4c3647a322c0ff5a1dd2344e039859dcbd28c830) pacific 
(stable)


But, when using commands that interact with the cluster, we get this 
message


$ ceph -s
2022-07-08T15:51:24.965+0200 7f773b7fe700 -1 monclient(hunting):
handle_auth_bad_method server allowed_methods [2] but i only support 
[2,1]

[errno 13] RADOS permission denied (error connecting to the cluster)


The default user for ceph is the admin (client.admin); do you have that key
in your keyring?

And is the keyring file readable for the user running the ceph commands?
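
For reference, one quick way to rule out keyring problems is to point the
client at the key explicitly; a sketch, assuming the admin key sits in the
default location:

ls -l /etc/ceph/ceph.client.admin.keyring
ceph -s --name client.admin --keyring /etc/ceph/ceph.client.admin.keyring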

--
Kai Stian Olstad
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] new crush map requires client version hammer

2022-07-19 Thread Iban Cabrillo
Dear cephers, 
The upgrade has been successful and all cluster elements are running version 
14.2.22 (including clients), and right now the cluster is HEALTH_OK, msgr2 is 
enabled and working properly. 

Following the upgrade guide from mimic to nautilus 
https://docs.ceph.com/en/latest/releases/nautilus/#upgrading-from-mimic-or-luminous
 

Item 12: the change from straw buckets to straw2.

but it seems that my cluster's required client version is older than hammer:

Error EINVAL: new crush map requires client version hammer but 
require_min_compat_client is firefly 

And I can't enable this new straw2 enhancement. Is there any way to update the
CRUSH map so the change can be made before the upgrade to Octopus?

Best regards 

-- 
= 
Ibán Cabrillo Bartolomé 
Instituto de Fisica de Cantabria (IFCA-CSIC) 
Santander, Spain 
Tel: +34942200969/+34669930421 
Responsable del Servicio de Computación Avanzada 
== 


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Quincy recovery load

2022-07-19 Thread Daniel Williams
Also, I never had problems with backfill / rebalance / recovery, but I now see
runaway CPU usage even with very conservative recovery settings after
upgrading from Pacific to Quincy.

osd_recovery_sleep_hdd = 0.1
osd_max_backfills = 1
osd_recovery_max_active = 1
osd_recovery_delay_start = 600

Tried:
osd_mclock_profile = "high_recovery_ops"
It did not help.

The CPU usage eventually runs away so much (regardless of config) that the OSD
fails health checks and causes even more problems, so I set
nodown,noout,noscrub,nodeep-scrub
But none of that helped the recovery progress either.

The only way back to a healthy cluster for now seems to be
ceph osd set norebalance

When toggling off rebalance, while the cluster slowly finishes the rebalances
in progress, I noticed that the whole cluster has almost no IO on the disks,
except on one of the hosts, where 100% utilization on a single disk bounces
around from disk to disk.

Example of the host with the bouncing load:
root@ceph-server-04:~# !dstat
dstat -cd --disk-util --disk-tps --net
total-usage -dsk/total-
nvme-sdb--sda--sdc--sdd--sde--sdf--sdg--sdh--sdi--sdj--sdk- -dsk/total-
-net/total-
usr sys idl wai stl| read
 writ|util:util:util:util:util:util:util:util:util:util:util:util|#read
#writ| recv  send
 74  12   9   3   0|2542k  246M|7.49:99.3:   0:   0:   0:27.2:   0:99.3:
0:   0:   0:   0|   9   636 |1251k  829k
 75  11  10   3   0|  29M  254M|7.65: 101:   0:   0:74.1:20.1:   0: 101:
0:   0:   0:   0| 205   686 |4246k 7841k
 61  26   9   3   0|6340k  250M|2.81: 101:   0:   0:12.9:   0:   0:99.7:
0:   0:   0:   0|  45   660 |  35M   35M
 69  20   8   2   0|   0   243M|5.20:98.5:   0:   0:   0:   0:   0:99.7:
0:   0:   0:   0|   0   649 | 650k  442k
 71  20   8   0   0|   0   150M|5.13:87.9:   0:   0:   0:   0:   0:68.2:
0:   0:   0:   0|   0   360 | 703k  443k
 72  16  11  57   0|8168B   51M|5.18:   0:   0:   0:   0:   0:   0:1.99:
0:   0:86.5:   0|   2   129 | 702k  524k
 72  16  11   1   0|   0  5865k|7.28:   0:   0:   0:   0:   0:   0:   0:
0:   0:90.6:   0|   036 |1578k 1184k
 71  16  12   0   0|   0  6519k|7.25:   0:   0:   0:   0:   0:   0:   0:
0:   0: 112:   0|   038 | 904k  553k
 75  11  11   2   0| 522k   32M|1.96:   0:   0:   0:1.96:   0:   0:   0:
0:   0:98.5:   0|   281 |1022k  847k
 72  14  12   1   0|   060M|5.72:   0:   0:   0:   0:   0:   0:   0:
0:   0: 102:   0|   0   160 | 826k  550k
 65  19  13   2   0|   0   124M|5.57:   0:   0:99.1:   0:   0:   0:   0:
0:   0:   0:   0|   0   339 | 648k  340k
 69  17  11   2   0|   0   125M|2.82:   0:   0: 101:   0:   0:   0:   0:
0:   0:   0:   0|   0   333 | 694k  482k
 75  15   9   1   0|   0   123M|3.56:   0:   0:99.3:   0:   0:   0:   0:
0:   0:   0:   0|   0   331 |1760k 1368k
 79  10   9   1   0|   0   114M|2.01:   0:   0: 101:   0:   0:   0:   0:
0:   0:   0:   0|   0   335 | 893k  636k
 77  14   8   0   0| 685k   72M|4.41:   0:   0:82.9:   0:   0:   0:   0:
0:1.20:   0:   0|   1   195 |1590k 1482k

You can see that the "active" IO host is not doing much network traffic.

The weird part is that the OSDs on the idle machines show huge CPU load even
during periods of no IO. There are "some" explanations for that, since the
cluster is entirely jerasure-coded HDDs with k=6, m=3, but it seems weird
that such a small amount of data would be so CPU-intensive to recover when
there is no performance degradation to client operations.

My best guess is some sort of spin lock (or equivalent) waiting on contended
IO in the OSDs, due to changed behaviour in how queued recovery operations
are handled?


Setting just:
osd_op_queue = "wpq"
fixes my cluster, now recovery going at the same speed is using on average
3-6% cpu per OSD down from 100-300%.
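
For anyone wanting to try the same, a sketch; note that osd_op_queue is only
read at OSD startup, so the daemons need a restart afterwards:

ceph config set osd osd_op_queue wpq
# then restart the OSDs, e.g. per host on a non-cephadm deployment:
systemctl restart ceph-osd.target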






On Tue, Jul 12, 2022 at 7:56 PM Sridhar Seshasayee 
wrote:

> Hi Chris,
>
> While we look into this, I have a couple of questions:
>
> 1. Did the recovery rate stay at 1 object/sec throughout? In our tests we
> have seen that
> the rate is higher during the starting phase of recovery and eventually
> tapers off due
> to throttling by mclock.
>
> 2. Can you try speeding up the recovery by changing to "high_recovery_ops"
> profile on
> all the OSDs to see if it improves things (both CPU load and recovery
> rate)?
>
> 3. On the OSDs that showed high CPU usage, can you run the following
> command and
> revert back? This just dumps the mclock settings on the OSDs.
>
> sudo ceph daemon osd.N config show | grep osd_mclock
>
> I will update the tracker with these questions as well so that the
> discussion can
> continue there.
>
> Thanks,
> -Sridhar
>
> On Tue, Jul 12, 2022 at 4:49 PM Chris Palmer 
> wrote:
>
> > I've created tracker https://tracker.ceph.com/issues/56530 for this,
> > including info on replicating it on another cluster.
> >
> >
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>

[ceph-users] Re: Haproxy error for rgw service

2022-07-19 Thread Robert Reihs
Hi,
I think I found the problem. We are using IPv6 only, and the config that
cephadm creates only adds the IPv4 settings.
 /etc/sysctl.d/90-ceph-FSID-keepalived.conf
# created by cephadm

# IP forwarding and non-local bind
net.ipv4.ip_forward = 1
net.ipv4.ip_nonlocal_bind = 1

I added:
net.ipv6.conf.bond1.forwarding = 1
net.ipv6.conf.bond1.accept_source_route = 1
net.ipv6.conf.bond1.accept_redirects = 1
net.ipv6.ip_nonlocal_bind = 1

and reloading the file: sysctl -f
/etc/sysctl.d/90-ceph-FSID-keepalived.conf
Restarting the service, everything starts up. The file gets overwritten
again, so the added config does not persist.
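
One way to keep these settings across cephadm rewrites, sketched here with a
hypothetical file name, is a separate sysctl.d file that cephadm does not
manage:

cat > /etc/sysctl.d/99-ceph-ipv6-override.conf <<'EOF'
# not managed by cephadm: IPv6 equivalents of the keepalived settings
net.ipv6.conf.bond1.forwarding = 1
net.ipv6.ip_nonlocal_bind = 1
EOF
sysctl --system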

Best
Robert Reihs

On Mon, Jul 18, 2022 at 3:33 PM Robert Reihs  wrote:

> Hi everyone,
> I have a problem with the haproxy settings for the rgw service. I
> specified the service in the service specification:
> ---
> service_type: rgw
> service_id: rgw
> placement:
>   count: 3
>   label: "rgw"
> ---
> service_type: ingress
> service_id: rgw.rgw
> placement:
>   count: 3
>   label: "ingress"
> spec:
>   backend_service: rgw.rgw
>   virtual_ip: :::404::dd:ff:10/64
>   virtual_interface_networks: :::404/64
>   frontend_port: 8998
>   monitor_port: 8999
>
> The keepalived services are all started, the haproxy, only one is started,
> the other two are in error state:
> systemd[1]: Starting Ceph haproxy.rgw.rgw.fsn1-ceph-01.ulnhyo for
> 40ddf3a6-36f1-42d2-9bf7-2fd50045e5dc...
> podman[3616202]: 2022-07-18 13:03:25.738014313 + UTC m=+0.052607969
> container create
> 25f90c4e26ebf6fc44efe12eae2c6b9d54811bfde744a78f756469e32c3f461f (image=
> docker.io/library/haproxy:2.3, name=ceph-40ddf3>
> podman[3616202]: 2022-07-18 13:03:25.787788203 + UTC m=+0.102381880
> container init
> 25f90c4e26ebf6fc44efe12eae2c6b9d54811bfde744a78f756469e32c3f461f (image=
> docker.io/library/haproxy:2.3, name=ceph-40ddf3a6>
> podman[3616202]: 2022-07-18 13:03:25.790577637 + UTC m=+0.105171323
> container start
> 25f90c4e26ebf6fc44efe12eae2c6b9d54811bfde744a78f756469e32c3f461f (image=
> docker.io/library/haproxy:2.3, name=ceph-40ddf3a>
> bash[3616202]:
> 25f90c4e26ebf6fc44efe12eae2c6b9d54811bfde744a78f756469e32c3f461f
> conmon[3616235]: [NOTICE] 198/130325 (2) : haproxy version is
> 2.3.20-2c8082e
> conmon[3616235]: [NOTICE] 198/130325 (2) : path to executable is
> /usr/local/sbin/haproxy
> conmon[3616235]: [ALERT] 198/130325 (2) : Starting frontend stats: cannot
> bind socket (Cannot assign requested address)
> [:::404::dd:ff:10:8999]
> conmon[3616235]: [ALERT] 198/130325 (2) : Starting frontend frontend:
> cannot bind socket (Cannot assign requested address)
> [:::404::dd:ff:10:8998]
> conmon[3616235]: [ALERT] 198/130325 (2) : [haproxy.main()] Some protocols
> failed to start their listeners! Exiting.
>
> I can access the IP in the browser and get the XML S3 response.
> ceph version 17.2.1 (ec95624474b1871a821a912b8c3af68f8f8e7aa1) quincy
> (stable) installed with cephadm.
>
> Any idea where the problem could be?
> Thanks
> Robert Reihs
>


-- 
Robert Reihs
Jakobsweg 22
8046 Stattegg
AUSTRIA

mobile: +43 (664) 51 035 90
robert.re...@gmail.com
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RGW error Coundn't init storage provider (RADOS)

2022-07-19 Thread Robert Reihs
Yes, I checked pg_num, pgp_num and mon_max_pg_per_osd. I also set up a
single-node cluster with the same Ansible script we have, using cephadm for
setting up and managing the cluster. I had the same problem on the new
single-node cluster without setting up any other services. When I created the
pools manually, the service started and the dashboard connection also worked
right away.
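
For anyone hitting the same thing, a sketch of the manual creation for the
pools listed in the quote below; pg counts are left at the defaults, and the
application tag is set to avoid pool-application health warnings:

for p in .rgw.root default.rgw.log default.rgw.control default.rgw.meta \
         default.rgw.buckets.index default.rgw.buckets.data \
         default.rgw.buckets.non-ec; do
    ceph osd pool create "$p"
    ceph osd pool application enable "$p" rgw
done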

On Mon, Jul 18, 2022 at 10:20 AM Janne Johansson 
wrote:

> No, rgw should have the ability to create its own pools. Check the caps on
> the keys used by the rgw daemon.
>
> Den mån 18 juli 2022 09:59Robert Reihs  skrev:
>
>> Hi,
>> I had to manually create the pools, then the service automatically started
>> and is now available.
>> pools:
>> .rgw.root
>> default.rgw.log
>> default.rgw.control
>> default.rgw.meta
>> default.rgw.buckets.index
>> default.rgw.buckets.data
>> default.rgw.buckets.non-ec
>>
>> Is this normal behavior? If so, should the error message be changed? Or is
>> this a bug?
>> Best
>> Robert Reihs
>>
>>
>> On Fri, Jul 15, 2022 at 3:47 PM Robert Reihs 
>> wrote:
>>
>> > Hi,
>> > When I have no luck yet solving the issue, but I can add some
>> > more information. The system pools ".rgw.root" and "default.rgw.log" are
>> > not created. I have created them manually, Now there is more log
>> activity,
>> > but still getting the same error message in the log:
>> > rgw main: rgw_init_ioctx ERROR: librados::Rados::pool_create returned
>> (34)
>> > Numerical result out of range (this can be due to a pool or placement
>> group
>> > misconfiguration, e.g. pg_num < pgp_num or mon_max_pg_per_osd exceeded)
>> > I can't find the correct pool to create manually.
>> > Thanks for any help
>> > Best
>> > Robert
>> >
>> > On Tue, Jul 12, 2022 at 5:22 PM Robert Reihs 
>> > wrote:
>> >
>> >> Hi,
>> >>
>> >> We have a problem with deploying radosgw via cephadm. We have a Ceph
>> cluster
>> >> with 3 nodes deployed via cephadm. Pool creation, cephfs and block
>> storage
>> >> are working.
>> >>
>> >> ceph version 17.2.1 (ec95624474b1871a821a912b8c3af68f8f8e7aa1) quincy
>> >> (stable)
>> >>
>> >> The service specs is like this for the rgw:
>> >>
>> >> ---
>> >>
>> >> service_type: rgw
>> >>
>> >> service_id: rgw
>> >>
>> >> placement:
>> >>
>> >>   count: 3
>> >>
>> >>   label: "rgw"
>> >>
>> >> ---
>> >>
>> >> service_type: ingress
>> >>
>> >> service_id: rgw.rgw
>> >>
>> >> placement:
>> >>
>> >>   count: 3
>> >>
>> >>   label: "ingress"
>> >>
>> >> spec:
>> >>
>> >>   backend_service: rgw.rgw
>> >>
>> >>   virtual_ip: [IPV6]
>> >>
>> >>   virtual_interface_networks: [IPV6 CIDR]
>> >>
>> >>   frontend_port: 8080
>> >>
>> >>   monitor_port: 1967
>> >>
>> >> The error I get in the logfiles:
>> >>
>> >> 0 deferred set uid:gid to 167:167 (ceph:ceph)
>> >>
>> >> 0 ceph version 17.2.1 (ec95624474b1871a821a912b8c3af68f8f8e7aa1) quincy
>> >> (stable), process radosgw, pid 2
>> >>
>> >> 0 framework: beast
>> >>
>> >> 0 framework conf key: port, val: 80
>> >>
>> >> 1 radosgw_Main not setting numa affinity
>> >>
>> >> 1 rgw_d3n: rgw_d3n_l1_local_datacache_enabled=0
>> >>
>> >> 1 D3N datacache enabled: 0
>> >>
>> >> 0 rgw main: rgw_init_ioctx ERROR: librados::Rados::pool_create returned
>> >> (34) Numerical result out of range (this can be due to a pool or
>> placement
>> >> group misconfiguration, e.g. pg_num < pgp_num or mon_max_pg_per_osd
>> >> exceeded)
>> >>
>> >> 0 rgw main: failed reading realm info: ret -34 (34) Numerical result
>> out
>> >> of range
>> >>
>> >> 0 rgw main: ERROR: failed to start notify service ((34) Numerical
>> result
>> >> out of range
>> >>
>> >> 0 rgw main: ERROR: failed to init services (ret=(34) Numerical result
>> out
>> >> of range)
>> >>
>> >> -1 Couldn't init storage provider (RADOS)
>> >>
>> >> I have for testing set the pg_num and pgp_num to 16 and the
>> >> mon_max_pg_per_osd to 1000 and still getting the same error. I have
>> also
>> >> tried creating the rgw with ceph command, same error. Pool creation is
>> >> working, I created multiple other pools and there was no problem.
>> >>
>> >> Thanks for any help.
>> >>
>> >> Best
>> >>
>> >> Robert
>> >>
>> >> The 5 fails services are 3 from the rgw and 2 haproxy for the rgw,
>> there
>> >> is only one running:
>> >>
>> >> ceph -s
>> >>
>> >>   cluster:
>> >>
>> >> id: 40ddf
>> >>
>> >> health: HEALTH_WARN
>> >>
>> >> 5 failed cephadm daemon(s)
>> >>
>> >>
>> >>
>> >>   services:
>> >>
>> >> mon: 3 daemons, quorum ceph-01,ceph-02,ceph-03 (age 4d)
>> >>
>> >> mgr: ceph-01.hbvyqi(active, since 4d), standbys: ceph-02.pqtxbv
>> >>
>> >> mds: 1/1 daemons up, 3 standby
>> >>
>> >> osd: 6 osds: 6 up (since 4d), 6 in (since 4d)
>> >>
>> >>
>> >>
>> >>   data:
>> >>
>> >> volumes: 1/1 healthy
>> >>
>> >> pools:   5 pools, 65 pgs
>> >>
>> >> objects: 87 objects, 170 MiB
>> >>
>> >> usage:   1.4 GiB used, 19 TiB / 19 TiB avail
>> >>
>> >> pgs: 65 active+clean
>> >>
>> >>
>> >
>> > --
>> > Robert Reihs
>> >