[ceph-users] Problem: Upgrading CEPH Pacific to Quincy caused the CEPH storage pool to stop functioning.

2023-10-09 Thread Waywatcher
I upgraded my CEPH cluster without properly following the mon upgrade
procedure, so the mons were never migrated off leveldb.

Proxmox and CEPH were updated to the latest for the current release.
https://pve.proxmox.com/wiki/Ceph_Pacific_to_Quincy

   1. The Quincy upgrade notes recommend that the mons be using RocksDB.
   2. Leveldb support has been removed from Quincy.


The monitors were still running as leveldb.

   1. Does this mean the mons cannot work at all, since they are still on leveldb?


I upgraded all nodes to the Quincy release 17.2.6 and restarted the mons.

At this point the cluster stopped responding.
`ceph` commands do not work since the service fails to start.

Are there steps for recovery?

1) Roll back to Pacific without being able to use ceph commands (ceph orch
upgrade start --ceph-version ).
2) Rebuild the monitors using data from the OSDs while staying on the Quincy
release (rough sketch below).
3) Is this actually related to the bug about 17.2.6 (which is what
Proxmox/CEPH upgrades to): https://tracker.ceph.com/issues/58156 ?
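
For option 2, the closest I have found is the mon store rebuild procedure from
the Ceph disaster recovery docs. Is something along these lines the right
direction (the paths and keyring below are placeholders, not something I have
actually run)?

ms=/root/mon-store
mkdir $ms
# on each OSD host, with the OSDs stopped, pull the cluster map out of every OSD
for osd in /var/lib/ceph/osd/ceph-*; do
  ceph-objectstore-tool --data-path $osd --no-mon-config \
    --op update-mon-db --mon-store-path $ms
done
# (carry $ms from host to host so it accumulates the data from all OSDs)
# then rebuild the mon store and swap it in for the broken store.db of one mon
ceph-monstore-tool $ms rebuild -- --keyring /etc/pve/priv/ceph.client.admin.keyring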


I ran the upgrade on another cluster prior to this without issue. The Mons
were set with RocksDB and running on Quincy 17.2.6.

I appreciate any suggestions.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph 18: Unable to delete image after incomplete migration "image being migrated"

2023-10-09 Thread Rhys Goodwin
Apologies, it's Ceph 17.2.6, not 18! I did try to deploy 18 on Ubuntu 22.04, but 
that's a bit beyond me until the packages are available.
I'll wait to hear any suggestions; otherwise I'll proceed with migrating the good 
images off, deleting the pool, and reporting back.
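
For reference, the plan I have in mind for that is roughly the following
(pool/image names are only examples):

# live-migrate each good image to a new pool
rbd migration prepare infra-pool/goodimage infra-pool-new/goodimage
rbd migration execute infra-pool-new/goodimage
rbd migration commit infra-pool-new/goodimage
# once only the stuck image is left, drop the old pool
# (needs mon_allow_pool_delete=true)
ceph osd pool rm infra-pool infra-pool --yes-i-really-really-mean-it
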
Ngā mihi
Rhys

--- Original Message ---
On Monday, October 9th, 2023 at 9:52 AM, Rhys Goodwin  
wrote:

> Hi Folks,
>
> I'm running Ceph 18 with OpenStack for my lab (and home services) in a 3 node 
> cluster on Ubuntu 22.04. I'm quite new to these platforms. Just learning. 
> This is my build, for what it's worth: 
> https://blog.rhysgoodwin.com/it/openstack-ceph-hyperconverged/
>
> I got myself into some trouble as follows. This is the sequence of events:
>
> I don't recall when, but at some stage I must have tried an image migration 
> from one pool to another. The source pool/image is infra-pool/sophosbuild; I 
> don't know what the target would have been. In any case, on my travels I 
> found the infra-pool/sophosbuild image in the trash:
> rhys@hcn03:/imagework# rbd trash ls --all infra-pool
> 65a87bb2472fe sophosbuild
>
> I tried to delete it but got the following:
>
> rhys@hcn03:/imagework# rbd trash rm infra-pool/65a87bb2472fe
> 2023-10-06T04:23:13.775+ 7f28bbfff640 -1 librbd::image::RefreshRequest: 
> image being migrated
> 2023-10-06T04:23:13.775+ 7f28bbfff640 -1 librbd::image::OpenRequest: 
> failed to refresh image: (30) Read-only file system
> 2023-10-06T04:23:13.775+ 7f28bbfff640 -1 librbd::ImageState: 
> 0x7f28a804b600 failed to open image: (30) Read-only file system
> 2023-10-06T04:23:13.775+ 7f28a2ffd640 -1 librbd::image::RemoveRequest: 
> 0x7f28a8000b90 handle_open_image: error opening image: (30) Read-only file 
> system
> rbd: remove error: (30) Read-only file systemRemoving image: 0% 
> complete...failed.
>
> Next, I tried to restore the image, and this also failed:
> rhys@hcn03:/imagework:# rbd trash restore infra-pool/65a87bb2472fe
> librbd::api::Trash: restore: Current trash source 'migration' does not match 
> expected: user,mirroring,unknown (4)
>
> Probably stupidly, I followed the steps in this post: 
> https://www.spinics.net/lists/ceph-users/msg72786.html to change offset 07 
> in the omap value from 02 (TRASH_IMAGE_SOURCE_MIGRATION) to 
> 00 (TRASH_IMAGE_SOURCE_USER).
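>
> (For anyone else hitting this: the value lives in the omap of the rbd_trash
> object in the pool, so the inspection side looks roughly like the following;
> the exact write-back syntax is as per the linked post.)
>
> rados -p infra-pool listomapvals rbd_trash
> # dump the value for this image id to a file, then flip byte 0x07 from 02 to 00
> rados -p infra-pool getomapval rbd_trash id_65a87bb2472fe /tmp/trash_val
> printf '\x00' | dd of=/tmp/trash_val bs=1 seek=7 count=1 conv=notrunc
> # write the modified value back with `rados setomapval` as described in the post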
>
> After this I was able to restore the image successfully.
> However, I still could not delete it:
> rhys@hcn03:/imagework:# rbd rm infra-pool/sophosbuild
> 2023-10-06T05:52:30.708+ 7ff5937fe640 -1 librbd::image::RefreshRequest: 
> image being migrated
> 2023-10-06T05:52:30.708+ 7ff5937fe640 -1 librbd::image::OpenRequest: 
> failed to refresh image: (30) Read-only file system
> 2023-10-06T05:52:30.708+ 7ff5937fe640 -1 librbd::ImageState: 
> 0x564d3f83d680 failed to open image: (30) Read-only file system
> Removing image: 0% complete...failed.rbd: delete error: (30) Read-only file 
> system
>
> I tried to abort the migration with: root@hcn03:/imagework# rbd migration 
> abort infra-pool/sophosbuild
> This took a few minutes but failed at 99% (sorry, terminal scrollback lost).
>
> So now I'm stuck: I don't know how to get rid of this image, and while 
> everything else in the cluster is healthy, the dashboard is throwing 
> errors when it tries to enumerate the images in that pool.
>
> I'm considering migrating the good images off this pool and deleting the pool, 
> but I don't even know whether I'll be allowed to delete the pool while this issue 
> is present.
>
> Any advice would be much appreciated.
>
> Kind regards,
> Rhys
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [RGW] Is there a way for a user to change his secret key or create other keys?

2023-10-09 Thread Gilles Mocellin
On Monday, 9 October 2023 at 17:12:53 CEST, Casey Bodley wrote:
> On Mon, Oct 9, 2023 at 9:16 AM Gilles Mocellin
> 
>  wrote:
> > Hello Cephers,
> > 
> > I was using Ceph with OpenStack, and users could add and remove credentials
> > with `openstack ec2 credentials` commands.
> > But we are moving our Object Storage service to a new cluster, and we
> > don't want to tie it to OpenStack.
> > 
> > Is there a way to have a bit of self-service for Rados Gateway, at least
> > for creating, deleting, and changing S3 keys?
> > 
> > It does not seem to be part of the S3 APIs.
> 
> right, user/role/key management is part of the IAM service in AWS, not
> S3. IAM exposes APIs like
> https://docs.aws.amazon.com/IAM/latest/APIReference/API_CreateAccessKey.html
> , etc
> 
> radosgw supports some of the IAM APIs related to roles and role/user
> policy, but not the ones for self-service user/key management. i'd
> love to add those eventually once we have an s3 'account' feature to
> base them on, but development there has been slow
> (https://github.com/ceph/ceph/pull/46373 tracks the most recent
> progress)
> 
> i'd agree that the radosgw admin APIs aren't a great fit because
> they're targeted at admins, rather than delegating self-service
> features to end users

Thank you, glad to see it's something that someone is already thinking about.

I've looked at other authentication mechanisms, like LDAP, Keycloak, STS...
and I don't think I understand everything.

As an end goal, I'd like to build an IAM service based on, for example, 
Keycloak.
But from what I understand of the documentation, everything around STS and 
Keycloak seems to be geared toward applications, which can negotiate tokens 
and use short-lived credentials?
That makes it impossible to use tools like rclone, restic, s3cmd, or existing 
apps that need just one pair of static S3 access and secret keys?

LDAP can do that, but there is no way to add new keys, only to modify the one 
we have, based on the login/password.
That's already one step, but it takes some unintuitive tricks to encode the 
login/password into an S3 access key... with an empty S3 secret key...
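
For now, the only way I see is to do it admin-side for each user, with something 
like the following (so not self-service at all; "johndoe" is just an example uid):

radosgw-admin key create --uid=johndoe --key-type=s3 --gen-access-key --gen-secret
radosgw-admin key rm --uid=johndoe --access-key=OLDACCESSKEY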


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Next quincy point release 17.2.7

2023-10-09 Thread Venky Shankar
FYI - Added 5 cephfs related PRs - those are under test in Yuri's branch.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Next quincy point release 17.2.7

2023-10-09 Thread Yuri Weinstein
We've merged some PRs, and some are in testing now.

At this point, we will not add more PRs to 17.2.7.

Thank you for your help!

On Wed, Oct 4, 2023 at 1:57 PM Yuri Weinstein  wrote:
>
> Hello
>
> We are getting very close to the next Quincy point release 17.2.7
>
> Here is the list of must-have PRs https://pad.ceph.com/p/quincy_17.2.7_prs
> We will start the release testing/review/approval process as soon as
> all PRs from this list are merged.
>
> If you see something missing please speak up and the dev leads will
> make a decision on including it in this release.
>
> TIA
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [RGW] Is there a way for a user to change his secret key or create other keys?

2023-10-09 Thread Casey Bodley
On Mon, Oct 9, 2023 at 9:16 AM Gilles Mocellin
 wrote:
>
> Hello Cephers,
>
> I was using Ceph with OpenStack, and users could add and remove credentials
> with `openstack ec2 credentials` commands.
> But we are moving our Object Storage service to a new cluster, and we
> don't want to tie it to OpenStack.
>
> Is there a way to have a bit of self-service for Rados Gateway, at least
> for creating, deleting, and changing S3 keys?
>
> It does not seem to be part of the S3 APIs.

right, user/role/key management is part of the IAM service in AWS, not
S3. IAM exposes APIs like
https://docs.aws.amazon.com/IAM/latest/APIReference/API_CreateAccessKey.html,
etc

radosgw supports some of the IAM APIs related to roles and role/user
policy, but not the ones for self-service user/key management. i'd
love to add those eventually once we have an s3 'account' feature to
base them on, but development there has been slow
(https://github.com/ceph/ceph/pull/46373 tracks the most recent
progress)

i'd agree that the radosgw admin APIs aren't a great fit because
they're targeted at admins, rather than delegating self-service
features to end users
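
(for context, the self-service call i have in mind is what awscli already does
against IAM, e.g. `aws iam create-access-key --user-name testuser --endpoint-url
https://rgw.example.com` -- radosgw just doesn't implement that action for user
keys yet)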

> It's certainly doable with the Ceph RGW admin API, but with which tool
> could a standard user do it?
>
> The Ceph Dashboard does not seem like a good idea: roles are global, and
> nothing can be scoped to a tenant.
>
> Some S3 browsers exist (https://github.com/nimbis/s3commander), but
> none with key management such as changing S3 keys.
> Certainly because it's not in the "standard" S3 API.
>
> Perhaps Ceph could provide a client-side dashboard that can be exposed
> externally, alongside the actual admin dashboard, which would stay internal?
>
> Regards,
> --
> Gilles
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Random issues with Reef

2023-10-09 Thread Eugen Block
It seems like the default value for mgr/cephadm/container_image_base  
has been changed from quay.io/ceph/ceph to
quay.io/ceph/ceph:v18 between Quincy and Reef. So a quick fix would be  
to set it to the previous default:


ceph config set mgr mgr/cephadm/container_image_base quay.io/ceph/ceph
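
You can check the currently effective value with:

ceph config get mgr mgr/cephadm/container_image_base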

Apparently, this is the responsible change:

# Quincy
grep "DEFAULT_IMAGE\ =" /usr/share/ceph/mgr/cephadm/module.py
DEFAULT_IMAGE = 'quay.io/ceph/ceph'

# Reef
grep "DEFAULT_IMAGE\ =" /usr/share/ceph/mgr/cephadm/module.py
DEFAULT_IMAGE = 'quay.io/ceph/ceph:v18'

Although the description clearly states:


desc='Container image name, without the tag',


I created a tracker issue for that:
https://tracker.ceph.com/issues/63150

Zitat von Eugen Block :


Hi,

either the cephadm version installed on the host should be updated  
as well so that it matches the cluster version, or you can use the  
one that the orchestrator uses, which stores its different versions  
in this path (@Mykola, thanks again for pointing that out); the  
latest one matches the current ceph version:


/var/lib/ceph/${fsid}/cephadm.*

If you set the executable bit you can use it as usual:

# pacific package version
$ rpm -qf /usr/sbin/cephadm
cephadm-16.2.11.65+g8b7e6fc0182-lp154.3872.1.noarch

$ chmod +x  
/var/lib/ceph/201a2fbc-ce7b-44a3-9ed7-39427972083b/cephadm.7dcbd4aab60af3e83970c60d4a8a2cc6ea7b997ecc2f4de0a47eeacbb88dde46


$ python3  
/var/lib/ceph/201a2fbc-ce7b-44a3-9ed7-39427972083b/cephadm.7dcbd4aab60af3e83970c60d4a8a2cc6ea7b997ecc2f4de0a47eeacbb88dde46  
ls

[
{
"style": "cephadm:v1",
...
}
]



Also the command:

ceph orch upgrade start -ceph_version v18.2.0


That looks like a bug to me, it's reproducible:

$ ceph orch upgrade check --ceph-version 18.2.0
Error EINVAL: host ceph01 `cephadm pull` failed: cephadm exited with  
an error code: 1, stderr: Pulling container image  
quay.io/ceph/ceph:v18:v18.2.0...
Non-zero exit code 125 from /usr/bin/podman pull  
quay.io/ceph/ceph:v18:v18.2.0 --authfile=/etc/ceph/podman-auth.json

/usr/bin/podman: stderr Error: invalid reference format
ERROR: Failed command: /usr/bin/podman pull  
quay.io/ceph/ceph:v18:v18.2.0 --authfile=/etc/ceph/podman-auth.json


It works correctly with 17.2.6:

# ceph orch upgrade check --ceph-version 18.2.0
{
"needs_update": {
"crash.soc9-ceph": {
"current_id":  
"2d45278716053f92517e447bc1a7b64945cc4ecbaff4fe57aa0f21632a0b9930",
"current_name":  
"quay.io/ceph/ceph@sha256:1e442b0018e6dc7445c3afa7c307bc61a06189ebd90580a1bb8b3d0866c0d8ae",

"current_version": "17.2.6"
...

I haven't checked for existing tracker issues yet. I'd recommend checking  
and creating a bug report:


https://tracker.ceph.com/

Regards,
Eugen

Zitat von Martin Conway :


Hi

I have been using Ceph for many years now, and recently upgraded to Reef.

Seems I made the jump too quickly, as I have been hitting a few  
issues. I can't find any mention of them in the bug reports. I  
thought I would share them here in case it is something to do with  
my setup.


On V18.2.0

cephadm version

Fails with the following output:

Traceback (most recent call last):
 File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main
   "__main__", mod_spec)
 File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
   exec(code, run_globals)
 File "/usr/sbin/cephadm/__main__.py", line 10096, in 
 File "/usr/sbin/cephadm/__main__.py", line 10084, in main
 File "/usr/sbin/cephadm/__main__.py", line 2240, in _infer_image
 File "/usr/sbin/cephadm/__main__.py", line 2338, in infer_local_ceph_image
 File "/usr/sbin/cephadm/__main__.py", line 2301, in get_container_info
 File "/usr/sbin/cephadm/__main__.py", line 2301, in 
 File "/usr/sbin/cephadm/__main__.py", line 222, in __getattr__
AttributeError: 'CephadmContext' object has no attribute 'fsid'

I don't know if it is related, but

cephadm adopt --style legacy --name osd.X

Tries to use a V15 image which then fails to start after being  
imported. The OSD in question has an SSD device for block.db, if  
that is relevant.


Using the latest head version of cephadm from github let me work  
around this issue, but the adopted OSDs were running  
18.0.0-6603-g6c4ed58a and needed to be upgraded to 18.2.0.


Also the command:

ceph orch upgrade start -ceph_version v18.2.0

Does not work, it fails to find the right image. From memory I  
think it tried to pull quay.io/ceph/ceph:v18:v18.2.0


ceph orch upgrade start quay.io/ceph/ceph:v18.2.0

Does work as expected.

Let me know if there is any other information that would be  
helpful, but I have since worked around these issues and have my  
ceph back in a happy state.


Regards,
Martin Conway
IT and Digital Media Manager
Research School of Physics
Australian National University
Canberra ACT 2601

+61 2 6125 1599
https://physics.anu.edu.au

___
ceph-users mailing list -- 

[ceph-users] [RGW] Is there a way for a user to change his secret key or create other keys?

2023-10-09 Thread Gilles Mocellin

Hello Cephers,

I was using Ceph with OpenStack, and users could add and remove credentials 
with `openstack ec2 credentials` commands.
But we are moving our Object Storage service to a new cluster, and we 
don't want to tie it to OpenStack.


Is there a way to have a bit of self-service for Rados Gateway, at least 
for creating, deleting, and changing S3 keys?


It does not seem to be part of the S3 APIs.
It's certainly doable with the Ceph RGW admin API, but with which tool 
could a standard user do it?


The Ceph Dashboard does not seem like a good idea: roles are global, and 
nothing can be scoped to a tenant.


Some S3 browsers exist (https://github.com/nimbis/s3commander), but 
none with key management such as changing S3 keys.

Certainly because it's not in the "standard" S3 API.

Perhaps Ceph could provide a client-side dashboard that can be exposed 
externally, alongside the actual admin dashboard, which would stay internal?


Regards,
--
Gilles
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Received signal: Hangup from killall

2023-10-09 Thread Rok Jaklič
After looking through the documentation, the soft log kills (SIGHUP) appear to be
"normal"; however, in the radosgw logs we found:
2023-10-06T01:31:32.920+0200 7fb6f440b700  0 INFO: RGWReshardLock::lock
found lock on reshard.02 to be held by another RGW process;
skipping for now
2023-10-06T01:31:33.371+0200 7fb6f440b700  0 INFO: RGWReshardLock::lock
found lock on reshard.04 to be held by another RGW process;
skipping for now
2023-10-06T01:31:33.521+0200 7fb6f440b700  0 INFO: RGWReshardLock::lock
found lock on reshard.06 to be held by another RGW process;
skipping for now
2023-10-06T01:31:33.853+0200 7fb6f440b700  0 INFO: RGWReshardLock::lock
found lock on reshard.08 to be held by another RGW process;
skipping for now
2023-10-06T01:31:34.598+0200 7fb6f440b700  0 INFO: RGWReshardLock::lock
found lock on reshard.12 to be held by another RGW process;
skipping for now
2023-10-06T01:31:34.740+0200 7fb6f440b700  0 INFO: RGWReshardLock::lock
found lock on reshard.14 to be held by another RGW process;
skipping for now
...
after this line ... it seems that rgw stopped responding.

And the next day it stopped again at almost the same time:
2023-10-07T01:27:26.299+0200 7f6216651700  0 INFO: RGWReshardLock::lock
found lock on reshard.05 to be held by another RGW process;
skipping for now
2023-10-07T01:37:28.077+0200 7f6216651700  0 INFO: RGWReshardLock::lock
found lock on reshard.14 to be held by another RGW process;
skipping for now
2023-10-07T01:47:27.333+0200 7f6216651700  0 INFO: RGWReshardLock::lock
found lock on reshard.01 to be held by another RGW process;
skipping for now
2023-10-07T02:47:29.863+0200 7f6216651700  0 INFO: RGWReshardLock::lock
found lock on reshard.06 to be held by another RGW process;
skipping for now
...
after this line ... rgw stopped responding. We had to restart it.

We were just about to upgrade to Ceph 17.x... but we have postponed it
because of this.
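
In case it is useful for comparison, we are checking the reshard queue and
per-bucket state with commands along these lines (the bucket name is a placeholder):

radosgw-admin reshard list
radosgw-admin reshard status --bucket=somebucket
# and, if a stale queue entry has to be removed:
radosgw-admin reshard cancel --bucket=somebucket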

Rok




On Fri, Oct 6, 2023 at 9:30 AM Rok Jaklič  wrote:

> Hi,
>
> yesterday we changed RGW from civetweb to beast and at 04:02 RGW stopped
> working; we had to restart it in the morning.
>
> In one rgw log for the previous day we can see:
> 2023-10-06T04:02:01.105+0200 7fb71d45d700 -1 received  signal: Hangup from
> killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw
> rbd-mirror cephfs-mirror  (PID: 3202663) UID: 0
> and in the next day log we can see:
> 2023-10-06T04:02:01.133+0200 7fb71d45d700 -1 received  signal: Hangup from
>  (PID: 3202664) UID: 0
>
> and after that no requests came. We had to restart rgw.
>
> In ceph.conf we have something like
>
> [client.radosgw.ctplmon2]
> host = ctplmon2
> log_file = /var/log/ceph/client.radosgw.ctplmon2.log
> rgw_dns_name = ctplmon2
> rgw_frontends = "beast ssl_endpoint=0.0.0.0:4443 ssl_certificate=..."
> rgw_max_put_param_size = 15728640
>
> We assume it has something to do with logrotate.
>
> /etc/logrotate.d/ceph:
> /var/log/ceph/*.log {
> rotate 90
> daily
> compress
> sharedscripts
> postrotate
> killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse
> radosgw rbd-mirror cephfs-mirror || pkill -1 -x
> "ceph-mon|ceph-mgr|ceph-mds|ceph-osd|ceph-fuse|radosgw|rbd-mirror|cephfs-mirror"
> || true
> endscript
> missingok
> notifempty
> su root ceph
> }
>
> ceph version 16.2.14 (238ba602515df21ea7ffc75c88db29f9e5ef12c9) pacific
> (stable)
>
> Any ideas why this happened?
>
> Kind regards,
> Rok
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph 16.2.x excessive logging, how to reduce?

2023-10-09 Thread Zakhar Kirpichenko
Thanks for the suggestion. That PID belongs to the mon process, i.e. the
monitor is logging all client connections and commands.

/Z

On Mon, 9 Oct 2023 at 14:24, Kai Stian Olstad  wrote:

> On 09.10.2023 10:05, Zakhar Kirpichenko wrote:
> > I did try to play with various debug settings. The issue is that mons
> > produce logs of all commands issued by clients, not just mgr. For
> > example,
> > an Openstack Cinder node asking for space it can use:
> >
> > Oct  9 07:59:01 ceph03 bash[4019]: debug 2023-10-09T07:59:01.303+
>
> This log says that it's bash with PID 4019 that is creating the log
> entry.
> Maybe start there and check what other things you are running on the
> server that create these messages.
>
> --
> Kai Stian Olstad
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph 16.2.x excessive logging, how to reduce?

2023-10-09 Thread Kai Stian Olstad

On 09.10.2023 10:05, Zakhar Kirpichenko wrote:

I did try to play with various debug settings. The issue is that mons
produce logs of all commands issued by clients, not just mgr. For 
example,

an Openstack Cinder node asking for space it can use:

Oct  9 07:59:01 ceph03 bash[4019]: debug 2023-10-09T07:59:01.303+


This log says that it's bash with PID 4019 that is creating the log 
entry.
Maybe start there and check what other things you are running on the 
server that create these messages.
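
For example, something like this should show what that PID actually is:

ps -fp 4019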


--
Kai Stian Olstad
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph 16.2.x excessive logging, how to reduce?

2023-10-09 Thread Zakhar Kirpichenko
I have `mon debug_client`, but it's already at 0/5 by default.

/Z

On Mon, 9 Oct 2023 at 11:24, Marc  wrote:

> Hi Zakhar,
>
> >
> > I did try to play with various debug settings. The issue is that mons
> > produce logs of all commands issued by clients, not just mgr. For
> example,
> > an Openstack Cinder node asking for space it can use:
> >
> > Oct  9 07:59:01 ceph03 bash[4019]: debug 2023-10-09T07:59:01.303+
> > 7f489da8f700  0 log_channel(audit) log [DBG] : from='client.?
> > 10.208.1.11:0/3286277243  '
> > entity='client.cinder' cmd=[{"prefix":"osd pool get-quota", "pool":
> > "volumes-ssd", "format":"json"}]: dispatch
>
> I am on an older version of ceph still, so I am not sure if I even have
> these. There is also an option in ceph.conf to do client logging:
>
>   [client]
>   #debug client = 5
>
>
> >
> > It is unclear which particular mon debug option out of many controls this
> > particular type of debug. I tried searching for documentation of mon
> debug
> > options to no avail.
> >
>
> Maybe there is something equal to this for logging?
> ceph daemon mon.a perf schema|less
> ceph daemon osd.0 perf schema|less
>
>
> >
> >
> >   Did you do something like this
> >
> >   Getting keys with
> >   ceph daemon mon.a config show | grep debug_ | grep mgr
> >
> >   ceph tell mon.* injectargs --$monk=0/0
> >
> >   >
> >   > Any input from anyone, please?
> >   >
> >   > This part of Ceph is very poorly documented. Perhaps there's a
> > better place
> >   > to ask this question? Please let me know.
> >   >
> >   > /Z
> >   >
> >   > On Sat, 7 Oct 2023 at 22:00, Zakhar Kirpichenko <
> zak...@gmail.com
> >  > wrote:
> >   >
> >   > > Hi!
> >   > >
> >   > > I am still fighting excessive logging. I've reduced unnecessary
> > logging
> >   > > from most components except for mon audit:
> > https://pastebin.com/jjWvUEcQ
> >   > >
> >   > > How can I stop logging this particular type of messages?
> >   > >
> >   > > I would appreciate your help and advice.
> >   > >
> >   > > /Z
> >   > >
> >   > >
> >   > >> Thank you for your response, Igor.
> >   > >>
> >   > >> Currently debug_rocksdb is set to 4/5:
> >   > >>
> >   > >> # ceph config get osd debug_rocksdb
> >   > >> 4/5
> >   > >>
> >   > >> This setting seems to be default. Is my understanding correct
> > that
> >   > you're
> >   > >> suggesting setting it to 3/5 or even 0/5? Would setting it to
> > 0/5 have
> >   > any
> >   > >> negative effects on the cluster?
> >   > >>
> >   > >> /Z
> >   > >>
> >   > >> On Wed, 4 Oct 2023 at 21:23, Igor Fedotov <
> igor.fedo...@croit.io
> >  > wrote:
> >   > >>
> >   > >>> Hi Zakhar,
> >   > >>>
> >   > >>> to reduce rocksdb logging verbosity you might want to set
> > debug_rocksdb
> >   > >>> to 3 (or 0).
> >   > >>>
> >   > >>> I presume it produces a  significant part of the logging
> > traffic.
> >   > >>>
> >   > >>>
> >   > >>> Thanks,
> >   > >>>
> >   > >>> Igor
> >   > >>>
> >   > >>> On 04/10/2023 20:51, Zakhar Kirpichenko wrote:
> >   > >>> > Any input from anyone, please?
> >   > >>> >
> >   > >>> > On Tue, 19 Sept 2023 at 09:01, Zakhar Kirpichenko
> > mailto:zak...@gmail.com> >
> >   > >>> wrote:
> >   > >>> >
> >   > >>> >> Hi,
> >   > >>> >>
> >   > >>> >> Our Ceph 16.2.x cluster managed by cephadm is logging a
> lot
> > of very
> >   > >>> >> detailed messages, Ceph logs alone on hosts with monitors
> > and
> >   > several
> >   > >>> OSDs
> >   > >>> >> has already eaten through 50% of the endurance of the
> flash
> > system
> >   > >>> drives
> >   > >>> >> over a couple of years.
> >   > >>> >>
> >   > >>> >> Cluster logging settings are default, and it seems that
> all
> > daemons
> >   > >>> are
> >   > >>> >> writing lots and lots of debug information to the logs,
> such
> > as for
> >   > >>> >> example: https://pastebin.com/ebZq8KZk (it's just a
> snippet,
> > but
> >   > >>> there's
> >   > >>> >> lots and lots of various information).
> >   > >>> >>
> >   > >>> >> Is there a way to reduce the amount of logging and, for
> > example,
> >   > >>> limit the
> >   > >>> >> logging to warnings or important messages so that it
> doesn't
> > include
> >   > >>> every
> >   > >>> >> successful authentication attempt, compaction etc, etc,
> when
> > the
> >   > >>> cluster is
> >   > >>> >> healthy and operating normally?
> >   > >>> >>
> >   > >>> >> I would very much appreciate your advice on this.
> >   > >>> >>
> >   > >>> >> Best regards,
> >   > >>> >> Zakhar
> >   > >>> >>
> >   > >>> >>
> >   > >>> >>
> >   > >>> > 

[ceph-users] Ceph 16.2.x mon compactions, disk writes

2023-10-09 Thread Zakhar Kirpichenko
Hi,

Monitors in our 16.2.14 cluster appear to quite often run "manual
compaction" tasks:

debug 2023-10-09T09:30:53.888+ 7f48a329a700  4 rocksdb: EVENT_LOG_v1
{"time_micros": 1696843853892760, "job": 64225, "event": "flush_started",
"num_memtables": 1, "num_entries": 715, "num_deletes": 251,
"total_data_size": 3870352, "memory_usage": 3886744, "flush_reason":
"Manual Compaction"}
debug 2023-10-09T09:30:53.904+ 7f4899286700  4 rocksdb:
[db_impl/db_impl_compaction_flush.cc:1443] [default] Manual compaction
starting
debug 2023-10-09T09:30:53.908+ 7f48a3a9b700  4 rocksdb: (Original Log
Time 2023/10/09-09:30:53.910204) [db_impl/db_impl_compaction_flush.cc:2516]
[default] Manual compaction from level-0 to level-5 from 'paxos .. 'paxos;
will stop at (end)
debug 2023-10-09T09:30:53.908+ 7f4899286700  4 rocksdb:
[db_impl/db_impl_compaction_flush.cc:1443] [default] Manual compaction
starting
debug 2023-10-09T09:30:53.908+ 7f4899286700  4 rocksdb:
[db_impl/db_impl_compaction_flush.cc:1443] [default] Manual compaction
starting
debug 2023-10-09T09:30:53.908+ 7f4899286700  4 rocksdb:
[db_impl/db_impl_compaction_flush.cc:1443] [default] Manual compaction
starting
debug 2023-10-09T09:30:53.908+ 7f4899286700  4 rocksdb:
[db_impl/db_impl_compaction_flush.cc:1443] [default] Manual compaction
starting
debug 2023-10-09T09:30:53.908+ 7f4899286700  4 rocksdb:
[db_impl/db_impl_compaction_flush.cc:1443] [default] Manual compaction
starting
debug 2023-10-09T09:30:53.908+ 7f48a3a9b700  4 rocksdb: (Original Log
Time 2023/10/09-09:30:53.911004) [db_impl/db_impl_compaction_flush.cc:2516]
[default] Manual compaction from level-5 to level-6 from 'paxos .. 'paxos;
will stop at (end)
debug 2023-10-09T09:32:08.956+ 7f48a329a700  4 rocksdb: EVENT_LOG_v1
{"time_micros": 1696843928961390, "job": 64228, "event": "flush_started",
"num_memtables": 1, "num_entries": 1580, "num_deletes": 502,
"total_data_size": 8404605, "memory_usage": 8465840, "flush_reason":
"Manual Compaction"}
debug 2023-10-09T09:32:08.972+ 7f4899286700  4 rocksdb:
[db_impl/db_impl_compaction_flush.cc:1443] [default] Manual compaction
starting
debug 2023-10-09T09:32:08.976+ 7f48a3a9b700  4 rocksdb: (Original Log
Time 2023/10/09-09:32:08.977739) [db_impl/db_impl_compaction_flush.cc:2516]
[default] Manual compaction from level-0 to level-5 from 'logm .. 'logm;
will stop at (end)
debug 2023-10-09T09:32:08.976+ 7f4899286700  4 rocksdb:
[db_impl/db_impl_compaction_flush.cc:1443] [default] Manual compaction
starting
debug 2023-10-09T09:32:08.976+ 7f4899286700  4 rocksdb:
[db_impl/db_impl_compaction_flush.cc:1443] [default] Manual compaction
starting
debug 2023-10-09T09:32:08.976+ 7f4899286700  4 rocksdb:
[db_impl/db_impl_compaction_flush.cc:1443] [default] Manual compaction
starting
debug 2023-10-09T09:32:08.976+ 7f4899286700  4 rocksdb:
[db_impl/db_impl_compaction_flush.cc:1443] [default] Manual compaction
starting
debug 2023-10-09T09:32:08.976+ 7f4899286700  4 rocksdb:
[db_impl/db_impl_compaction_flush.cc:1443] [default] Manual compaction
starting
debug 2023-10-09T09:32:08.976+ 7f48a3a9b700  4 rocksdb: (Original Log
Time 2023/10/09-09:32:08.978512) [db_impl/db_impl_compaction_flush.cc:2516]
[default] Manual compaction from level-5 to level-6 from 'logm .. 'logm;
will stop at (end)
debug 2023-10-09T09:32:12.764+ 7f4899286700  4 rocksdb:
[db_impl/db_impl_compaction_flush.cc:1443] [default] Manual compaction
starting
debug 2023-10-09T09:32:12.764+ 7f4899286700  4 rocksdb:
[db_impl/db_impl_compaction_flush.cc:1443] [default] Manual compaction
starting
debug 2023-10-09T09:32:12.764+ 7f4899286700  4 rocksdb:
[db_impl/db_impl_compaction_flush.cc:1443] [default] Manual compaction
starting
debug 2023-10-09T09:32:12.764+ 7f4899286700  4 rocksdb:
[db_impl/db_impl_compaction_flush.cc:1443] [default] Manual compaction
starting
debug 2023-10-09T09:32:12.764+ 7f4899286700  4 rocksdb:
[db_impl/db_impl_compaction_flush.cc:1443] [default] Manual compaction
starting
debug 2023-10-09T09:32:12.764+ 7f4899286700  4 rocksdb:
[db_impl/db_impl_compaction_flush.cc:1443] [default] Manual compaction
starting
debug 2023-10-09T09:33:29.028+ 7f48a329a700  4 rocksdb: EVENT_LOG_v1
{"time_micros": 1696844009033151, "job": 64231, "event": "flush_started",
"num_memtables": 1, "num_entries": 1430, "num_deletes": 251,
"total_data_size": 8975535, "memory_usage": 9035920, "flush_reason":
"Manual Compaction"}
debug 2023-10-09T09:33:29.044+ 7f4899286700  4 rocksdb:
[db_impl/db_impl_compaction_flush.cc:1443] [default] Manual compaction
starting
debug 2023-10-09T09:33:29.048+ 7f48a3a9b700  4 rocksdb: (Original Log
Time 2023/10/09-09:33:29.049585) [db_impl/db_impl_compaction_flush.cc:2516]
[default] Manual compaction from level-0 to level-5 from 'paxos .. 'paxos;
will stop at (end)
debug 2023-10-09T09:33:29.048+ 7f4899286700  4 rocksdb:
[db_impl/db_impl_compaction_flush.cc:1443] [default] Manual compaction

[ceph-users] Re: Manual resharding with multisite

2023-10-09 Thread Danny Webb
This only works if you reshard on the primary zone. Like Yixin, we've tried 
resharding on the primary zone where the data is held on a secondary zone, and all 
that results in is a complete loss of all index data for the resharded bucket on 
the secondary zone. The only way to use multisite resharding pre-Reef that I 
know of, where the data is going to sit in the secondary zone, is to pre-reshard 
the bucket before any data is put in.
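
If you know up front that buckets will grow, the default shard count for newly
created buckets can also be raised via rgw_override_bucket_index_max_shards,
e.g. something like the following (the exact config section depends on how your
RGW daemons are named):

ceph config set client.rgw rgw_override_bucket_index_max_shards 101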

From: Richard Bade 
Sent: 09 October 2023 01:05
To: Yixin Jin 
Cc: Ceph-users 
Subject: [ceph-users] Re: Manual resharding with multisite

Hi Yixin,
I am interested in the answers to your questions also but I think I
can provide some useful information for you.
We have a multisite setup also where we need to reshard sometimes as
the bucket have grown. However we have bucket sync turned off for
these buckets as they only reside on one gateway and not the other.
For these buckets I have been able to manually reshard using this command:
radosgw-admin bucket reshard --rgw-zone={zone_name}
--bucket={bucket_name} --num-shards {new_shard_number}
--yes-i-really-mean-it

I have not seen any issues with this, but like I said I only have data
on that one zone and not the other. This may not be useful for your
situation but I thought I'd mention it anyway.
I would really like to know what the correct procedure is for buckets
that have more than 100k objects per shard in a multisite environment.

Regards,
Rich

On Thu, 5 Oct 2023 at 06:51, Yixin Jin  wrote:
>
> Hi folks,
>
> I am aware that dynamic resharding isn't supported before Reef with 
> multisite. However, does manual resharding work? It doesn't seem to, 
> either. First of all, running "bucket reshard" has to be done in the master zone. 
> But if the objects of that bucket aren't in the master zone, resharding in the 
> master zone seems to render those objects inaccessible in the zone that 
> actually has them. So, what is the recommended practice for resharding with 
> multisite? No resharding at all?
>
> Thanks,
> Yixin
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph 16.2.x excessive logging, how to reduce?

2023-10-09 Thread Marc
Hi Zakhar,

> 
> I did try to play with various debug settings. The issue is that mons
> produce logs of all commands issued by clients, not just mgr. For example,
> an Openstack Cinder node asking for space it can use:
> 
> Oct  9 07:59:01 ceph03 bash[4019]: debug 2023-10-09T07:59:01.303+
> 7f489da8f700  0 log_channel(audit) log [DBG] : from='client.?
> 10.208.1.11:0/3286277243  '
> entity='client.cinder' cmd=[{"prefix":"osd pool get-quota", "pool":
> "volumes-ssd", "format":"json"}]: dispatch

I am on an older version of ceph still, so I am not sure if I even have these. 
There is also an option in ceph.conf to do client logging:

  [client]
  #debug client = 5

 
> 
> It is unclear which particular mon debug option out of many controls this
> particular type of debug. I tried searching for documentation of mon debug
> options to no avail.
> 

Maybe there is something equal to this for logging?
ceph daemon mon.a perf schema|less 
ceph daemon osd.0 perf schema|less


> 
> 
>   Did you do something like this
> 
>   Getting keys with
>   ceph daemon mon.a config show | grep debug_ | grep mgr
> 
>   ceph tell mon.* injectargs --$monk=0/0
> 
>   >
>   > Any input from anyone, please?
>   >
>   > This part of Ceph is very poorly documented. Perhaps there's a
> better place
>   > to ask this question? Please let me know.
>   >
>   > /Z
>   >
>   > On Sat, 7 Oct 2023 at 22:00, Zakhar Kirpichenko   > wrote:
>   >
>   > > Hi!
>   > >
>   > > I am still fighting excessive logging. I've reduced unnecessary
> logging
>   > > from most components except for mon audit:
> https://pastebin.com/jjWvUEcQ
>   > >
>   > > How can I stop logging this particular type of messages?
>   > >
>   > > I would appreciate your help and advice.
>   > >
>   > > /Z
>   > >
>   > >
>   > >> Thank you for your response, Igor.
>   > >>
>   > >> Currently debug_rocksdb is set to 4/5:
>   > >>
>   > >> # ceph config get osd debug_rocksdb
>   > >> 4/5
>   > >>
>   > >> This setting seems to be default. Is my understanding correct
> that
>   > you're
>   > >> suggesting setting it to 3/5 or even 0/5? Would setting it to
> 0/5 have
>   > any
>   > >> negative effects on the cluster?
>   > >>
>   > >> /Z
>   > >>
>   > >> On Wed, 4 Oct 2023 at 21:23, Igor Fedotov   > wrote:
>   > >>
>   > >>> Hi Zakhar,
>   > >>>
>   > >>> to reduce rocksdb logging verbosity you might want to set
> debug_rocksdb
>   > >>> to 3 (or 0).
>   > >>>
>   > >>> I presume it produces a  significant part of the logging
> traffic.
>   > >>>
>   > >>>
>   > >>> Thanks,
>   > >>>
>   > >>> Igor
>   > >>>
>   > >>> On 04/10/2023 20:51, Zakhar Kirpichenko wrote:
>   > >>> > Any input from anyone, please?
>   > >>> >
>   > >>> > On Tue, 19 Sept 2023 at 09:01, Zakhar Kirpichenko
> mailto:zak...@gmail.com> >
>   > >>> wrote:
>   > >>> >
>   > >>> >> Hi,
>   > >>> >>
>   > >>> >> Our Ceph 16.2.x cluster managed by cephadm is logging a lot
> of very
>   > >>> >> detailed messages, Ceph logs alone on hosts with monitors
> and
>   > several
>   > >>> OSDs
>   > >>> >> has already eaten through 50% of the endurance of the flash
> system
>   > >>> drives
>   > >>> >> over a couple of years.
>   > >>> >>
>   > >>> >> Cluster logging settings are default, and it seems that all
> daemons
>   > >>> are
>   > >>> >> writing lots and lots of debug information to the logs, such
> as for
>   > >>> >> example: https://pastebin.com/ebZq8KZk (it's just a snippet,
> but
>   > >>> there's
>   > >>> >> lots and lots of various information).
>   > >>> >>
>   > >>> >> Is there a way to reduce the amount of logging and, for
> example,
>   > >>> limit the
>   > >>> >> logging to warnings or important messages so that it doesn't
> include
>   > >>> every
>   > >>> >> successful authentication attempt, compaction etc, etc, when
> the
>   > >>> cluster is
>   > >>> >> healthy and operating normally?
>   > >>> >>
>   > >>> >> I would very much appreciate your advice on this.
>   > >>> >>
>   > >>> >> Best regards,
>   > >>> >> Zakhar
>   > >>> >>
>   > >>> >>
>   > >>> >>
>   > >>> > ___
>   > >>> > ceph-users mailing list -- ceph-users@ceph.io  us...@ceph.io>
>   > >>> > To unsubscribe send an email to ceph-users-le...@ceph.io
> 
>   > >>>
>   > >>
>   > ___
>   > ceph-users mailing list -- ceph-users@ceph.io  us...@ceph.io>
>   > To unsubscribe send an email to 

[ceph-users] Re: Ceph 16.2.x excessive logging, how to reduce?

2023-10-09 Thread Zakhar Kirpichenko
Thanks for your reply, Marc!

I did try to play with various debug settings. The issue is that mons
produce logs of all commands issued by clients, not just mgr. For example,
an Openstack Cinder node asking for space it can use:

Oct  9 07:59:01 ceph03 bash[4019]: debug 2023-10-09T07:59:01.303+
7f489da8f700  0 log_channel(audit) log [DBG] : from='client.?
10.208.1.11:0/3286277243' entity='client.cinder' cmd=[{"prefix":"osd pool
get-quota", "pool": "volumes-ssd", "format":"json"}]: dispatch

It is unclear which particular mon debug option out of many controls this
particular type of debug. I tried searching for documentation of mon debug
options to no avail.
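
So far the best I have come up with is grepping through everything, e.g.:

ceph daemon mon.ceph03 config show | grep -Ei 'audit|cluster_log'

but nothing there obviously maps to these messages.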

/Z


On Mon, 9 Oct 2023 at 10:03, Marc  wrote:

>
> Did you do something like this
>
> Getting keys with
> ceph daemon mon.a config show | grep debug_ | grep mgr
>
> ceph tell mon.* injectargs --$monk=0/0
>
> >
> > Any input from anyone, please?
> >
> > This part of Ceph is very poorly documented. Perhaps there's a better
> place
> > to ask this question? Please let me know.
> >
> > /Z
> >
> > On Sat, 7 Oct 2023 at 22:00, Zakhar Kirpichenko 
> wrote:
> >
> > > Hi!
> > >
> > > I am still fighting excessive logging. I've reduced unnecessary logging
> > > from most components except for mon audit:
> https://pastebin.com/jjWvUEcQ
> > >
> > > How can I stop logging this particular type of messages?
> > >
> > > I would appreciate your help and advice.
> > >
> > > /Z
> > >
> > > On Thu, 5 Oct 2023 at 06:47, Zakhar Kirpichenko 
> wrote:
> > >
> > >> Thank you for your response, Igor.
> > >>
> > >> Currently debug_rocksdb is set to 4/5:
> > >>
> > >> # ceph config get osd debug_rocksdb
> > >> 4/5
> > >>
> > >> This setting seems to be default. Is my understanding correct that
> > you're
> > >> suggesting setting it to 3/5 or even 0/5? Would setting it to 0/5 have
> > any
> > >> negative effects on the cluster?
> > >>
> > >> /Z
> > >>
> > >> On Wed, 4 Oct 2023 at 21:23, Igor Fedotov 
> wrote:
> > >>
> > >>> Hi Zakhar,
> > >>>
> > >>> to reduce rocksdb logging verbosity you might want to set
> debug_rocksdb
> > >>> to 3 (or 0).
> > >>>
> > >>> I presume it produces a  significant part of the logging traffic.
> > >>>
> > >>>
> > >>> Thanks,
> > >>>
> > >>> Igor
> > >>>
> > >>> On 04/10/2023 20:51, Zakhar Kirpichenko wrote:
> > >>> > Any input from anyone, please?
> > >>> >
> > >>> > On Tue, 19 Sept 2023 at 09:01, Zakhar Kirpichenko <
> zak...@gmail.com>
> > >>> wrote:
> > >>> >
> > >>> >> Hi,
> > >>> >>
> > >>> >> Our Ceph 16.2.x cluster managed by cephadm is logging a lot of
> very
> > >>> >> detailed messages, Ceph logs alone on hosts with monitors and
> > several
> > >>> OSDs
> > >>> >> has already eaten through 50% of the endurance of the flash system
> > >>> drives
> > >>> >> over a couple of years.
> > >>> >>
> > >>> >> Cluster logging settings are default, and it seems that all
> daemons
> > >>> are
> > >>> >> writing lots and lots of debug information to the logs, such as
> for
> > >>> >> example: https://pastebin.com/ebZq8KZk (it's just a snippet, but
> > >>> there's
> > >>> >> lots and lots of various information).
> > >>> >>
> > >>> >> Is there a way to reduce the amount of logging and, for example,
> > >>> limit the
> > >>> >> logging to warnings or important messages so that it doesn't
> include
> > >>> every
> > >>> >> successful authentication attempt, compaction etc, etc, when the
> > >>> cluster is
> > >>> >> healthy and operating normally?
> > >>> >>
> > >>> >> I would very much appreciate your advice on this.
> > >>> >>
> > >>> >> Best regards,
> > >>> >> Zakhar
> > >>> >>
> > >>> >>
> > >>> >>
> > >>> > ___
> > >>> > ceph-users mailing list -- ceph-users@ceph.io
> > >>> > To unsubscribe send an email to ceph-users-le...@ceph.io
> > >>>
> > >>
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph 16.2.x excessive logging, how to reduce?

2023-10-09 Thread Marc


Did you do something like this

Getting keys with
ceph daemon mon.a config show | grep debug_ | grep mgr

ceph tell mon.* injectargs --$monk=0/0

> 
> Any input from anyone, please?
> 
> This part of Ceph is very poorly documented. Perhaps there's a better place
> to ask this question? Please let me know.
> 
> /Z
> 
> On Sat, 7 Oct 2023 at 22:00, Zakhar Kirpichenko  wrote:
> 
> > Hi!
> >
> > I am still fighting excessive logging. I've reduced unnecessary logging
> > from most components except for mon audit: https://pastebin.com/jjWvUEcQ
> >
> > How can I stop logging this particular type of messages?
> >
> > I would appreciate your help and advice.
> >
> > /Z
> >
> > On Thu, 5 Oct 2023 at 06:47, Zakhar Kirpichenko  wrote:
> >
> >> Thank you for your response, Igor.
> >>
> >> Currently debug_rocksdb is set to 4/5:
> >>
> >> # ceph config get osd debug_rocksdb
> >> 4/5
> >>
> >> This setting seems to be default. Is my understanding correct that
> you're
> >> suggesting setting it to 3/5 or even 0/5? Would setting it to 0/5 have
> any
> >> negative effects on the cluster?
> >>
> >> /Z
> >>
> >> On Wed, 4 Oct 2023 at 21:23, Igor Fedotov  wrote:
> >>
> >>> Hi Zakhar,
> >>>
> >>> to reduce rocksdb logging verbosity you might want to set debug_rocksdb
> >>> to 3 (or 0).
> >>>
> >>> I presume it produces a  significant part of the logging traffic.
> >>>
> >>>
> >>> Thanks,
> >>>
> >>> Igor
> >>>
> >>> On 04/10/2023 20:51, Zakhar Kirpichenko wrote:
> >>> > Any input from anyone, please?
> >>> >
> >>> > On Tue, 19 Sept 2023 at 09:01, Zakhar Kirpichenko 
> >>> wrote:
> >>> >
> >>> >> Hi,
> >>> >>
> >>> >> Our Ceph 16.2.x cluster managed by cephadm is logging a lot of very
> >>> >> detailed messages, Ceph logs alone on hosts with monitors and
> several
> >>> OSDs
> >>> >> has already eaten through 50% of the endurance of the flash system
> >>> drives
> >>> >> over a couple of years.
> >>> >>
> >>> >> Cluster logging settings are default, and it seems that all daemons
> >>> are
> >>> >> writing lots and lots of debug information to the logs, such as for
> >>> >> example: https://pastebin.com/ebZq8KZk (it's just a snippet, but
> >>> there's
> >>> >> lots and lots of various information).
> >>> >>
> >>> >> Is there a way to reduce the amount of logging and, for example,
> >>> limit the
> >>> >> logging to warnings or important messages so that it doesn't include
> >>> every
> >>> >> successful authentication attempt, compaction etc, etc, when the
> >>> cluster is
> >>> >> healthy and operating normally?
> >>> >>
> >>> >> I would very much appreciate your advice on this.
> >>> >>
> >>> >> Best regards,
> >>> >> Zakhar
> >>> >>
> >>> >>
> >>> >>
> >>> > ___
> >>> > ceph-users mailing list -- ceph-users@ceph.io
> >>> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >>>
> >>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph 16.2.x excessive logging, how to reduce?

2023-10-09 Thread Zakhar Kirpichenko
Any input from anyone, please?

This part of Ceph is very poorly documented. Perhaps there's a better place
to ask this question? Please let me know.

/Z

On Sat, 7 Oct 2023 at 22:00, Zakhar Kirpichenko  wrote:

> Hi!
>
> I am still fighting excessive logging. I've reduced unnecessary logging
> from most components except for mon audit: https://pastebin.com/jjWvUEcQ
>
> How can I stop logging this particular type of messages?
>
> I would appreciate your help and advice.
>
> /Z
>
> On Thu, 5 Oct 2023 at 06:47, Zakhar Kirpichenko  wrote:
>
>> Thank you for your response, Igor.
>>
>> Currently debug_rocksdb is set to 4/5:
>>
>> # ceph config get osd debug_rocksdb
>> 4/5
>>
>> This setting seems to be default. Is my understanding correct that you're
>> suggesting setting it to 3/5 or even 0/5? Would setting it to 0/5 have any
>> negative effects on the cluster?
>>
>> /Z
>>
>> On Wed, 4 Oct 2023 at 21:23, Igor Fedotov  wrote:
>>
>>> Hi Zakhar,
>>>
>>> to reduce rocksdb logging verbosity you might want to set debug_rocksdb
>>> to 3 (or 0).
>>>
>>> I presume it produces a  significant part of the logging traffic.
>>>
>>>
>>> Thanks,
>>>
>>> Igor
>>>
>>> On 04/10/2023 20:51, Zakhar Kirpichenko wrote:
>>> > Any input from anyone, please?
>>> >
>>> > On Tue, 19 Sept 2023 at 09:01, Zakhar Kirpichenko 
>>> wrote:
>>> >
>>> >> Hi,
>>> >>
>>> >> Our Ceph 16.2.x cluster managed by cephadm is logging a lot of very
>>> >> detailed messages, Ceph logs alone on hosts with monitors and several
>>> OSDs
>>> >> has already eaten through 50% of the endurance of the flash system
>>> drives
>>> >> over a couple of years.
>>> >>
>>> >> Cluster logging settings are default, and it seems that all daemons
>>> are
>>> >> writing lots and lots of debug information to the logs, such as for
>>> >> example: https://pastebin.com/ebZq8KZk (it's just a snippet, but
>>> there's
>>> >> lots and lots of various information).
>>> >>
>>> >> Is there a way to reduce the amount of logging and, for example,
>>> limit the
>>> >> logging to warnings or important messages so that it doesn't include
>>> every
>>> >> successful authentication attempt, compaction etc, etc, when the
>>> cluster is
>>> >> healthy and operating normally?
>>> >>
>>> >> I would very much appreciate your advice on this.
>>> >>
>>> >> Best regards,
>>> >> Zakhar
>>> >>
>>> >>
>>> >>
>>> > ___
>>> > ceph-users mailing list -- ceph-users@ceph.io
>>> > To unsubscribe send an email to ceph-users-le...@ceph.io
>>>
>>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io