[ceph-users] Re: something wrong with my monitor database ?
On 13/06/2022 at 18:37, Stefan Kooman wrote:
> On 6/13/22 18:21, Eric Le Lay wrote:
>> Those objects are deleted but have snapshots, even if the pool itself
>> doesn't have snapshots. What could cause that?
>>
>> root@hpc1a:~# rados -p storage stat rbd_data.5b423b48a4643f.0006a4e5
>> error stat-ing storage/rbd_data.5b423b48a4643f.0006a4e5: (2) No such file or directory
>> root@hpc1a:~# rados -p storage lssnap
>> 0 snaps
>> root@hpc1a:~# rados -p storage listsnaps rbd_data.5b423b48a4643f.0006a4e5
>> rbd_data.5b423b48a4643f.0006a4e5:
>> cloneid  snaps  size     overlap
>> 1160     1160   4194304  [1048576~32768,1097728~16384,1228800~16384,1409024~16384,1441792~16384,1572864~16384,1720320~16384,1900544~16384,2310144~16384]
>> 1364     1364   4194304  []
>
> Do the OSDs still need to trim the snapshots? Does data usage decline
> over time?
>
> Gr. Stefan

Thanks Stefan for your time!

Snaptrims were re-enabled a week ago, but the OSDs only snaptrim newly
deleted snapshots. Restarting or outing an OSD doesn't trigger them
either. Crush-reweighting an OSD to 0 indeed results in more storage
being used!

I'll drop the cluster and start again from scratch.

Best,
Eric
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
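[Editor's note: whether the OSDs still have trimming queued can be read from the per-PG snap-trim queue length in `ceph pg dump`. A total of 0 while clones remain matches the symptom above. A minimal sketch, assuming the JSON field `snaptrimq_len`; the two-PG sample input is hypothetical and stands in for `ceph pg dump pgs --format json` on a live cluster:]

```shell
# Sum the per-PG snapshot-trim queue lengths. A total of 0 while
# leftover clones remain means the OSDs no longer consider those
# snapshots trimmable. Hypothetical sample stands in for:
#   ceph pg dump pgs --format json
pg_dump_sample='{"pg_stats":[
  {"pgid":"2.0","snaptrimq_len":0},
  {"pgid":"2.1","snaptrimq_len":5}
]}'

printf '%s\n' "$pg_dump_sample" |
  grep -o '"snaptrimq_len":[0-9]*' |
  cut -d: -f2 |
  awk '{ total += $1 } END { print total }'
```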
[ceph-users] Re: something wrong with my monitor database ?
On 13/06/2022 at 17:54, Eric Le Lay wrote:
> On 10/06/2022 at 11:58, Stefan Kooman wrote:
>> On 6/10/22 11:41, Eric Le Lay wrote:
>>> Hello list,
>>>
>>> my ceph cluster was upgraded from nautilus to octopus last October,
>>> causing snaptrims to overload OSDs, so I had to disable them
>>> (bluefs_buffered_io=false|true didn't help). Now I've copied the data
>>> elsewhere, removed all clients, and am trying to fix the cluster.
>>> Scrapping it and starting over is possible, but it would be wonderful
>>> if we could figure out what's wrong with it...
>>
>> FYI: osd snap trim sleep <- adding some sleep might help alleviate
>> the impact on the cluster. If HEALTH is OK I would not expect
>> anything wrong with your cluster.
>>
>> Does "ceph osd dump | grep require_osd_release" give you
>> require_osd_release octopus?
>>
>> Gr. Stefan
>
> Hi Stefan, thank you for your answer.
>
> Even osd_snap_trim_sleep=10 was not sustainable with normal cluster load.
>
> Following your email I've tested bluefs_buffered_io=true again and it
> indeed dramatically reduces disk load, but not CPU load nor slow ceph IO.
>
> Yes, require_osd_release=octopus.
>
> What worries me is that the pool is now void of RBD images but still
> holds 14TiB of object data. Here are my pool contents; rbd_directory
> and rbd_trash are empty.
>
> rados -p storage ls | sed 's/\(.*\..*\)\..*/\1/'|sort|uniq -c
>       1 rbd_children
>       6 rbd_data.13fc0d1d63c52b
>    2634 rbd_data.15ab844f62d5
>     258 rbd_data.15f1f2e2398dc7
>     133 rbd_data.17d93e1c5a4855
>     258 rbd_data.1af03e352ec460
>    2987 rbd_data.236cfc2474b020
>  206872 rbd_data.31c55ee49f0abb
>  604593 rbd_data.5b423b48a4643f
>      90 rbd_data.7b06b7abcc9441
>   81576 rbd_data.913b398f28d1
>      18 rbd_data.9662ade11235a
>   16051 rbd_data.e01609a7a07e20
>     278 rbd_data.e6b6f855b5172c
>      90 rbd_data.e85da37e044922
>       1 rbd_directory
>       1 rbd_info
>       1 rbd_trash
>
> Eric

Those objects are deleted but have snapshots, even if the pool itself
doesn't have snapshots. What could cause that?

root@hpc1a:~# rados -p storage stat rbd_data.5b423b48a4643f.0006a4e5
error stat-ing storage/rbd_data.5b423b48a4643f.0006a4e5: (2) No such file or directory
root@hpc1a:~# rados -p storage lssnap
0 snaps
root@hpc1a:~# rados -p storage listsnaps rbd_data.5b423b48a4643f.0006a4e5
rbd_data.5b423b48a4643f.0006a4e5:
cloneid  snaps  size     overlap
1160     1160   4194304  [1048576~32768,1097728~16384,1228800~16384,1409024~16384,1441792~16384,1572864~16384,1720320~16384,1900544~16384,2310144~16384]
1364     1364   4194304  []
[ceph-users] Re: something wrong with my monitor database ?
On 10/06/2022 at 11:58, Stefan Kooman wrote:
> On 6/10/22 11:41, Eric Le Lay wrote:
>> Hello list,
>>
>> my ceph cluster was upgraded from nautilus to octopus last October,
>> causing snaptrims to overload OSDs, so I had to disable them
>> (bluefs_buffered_io=false|true didn't help). Now I've copied the data
>> elsewhere, removed all clients, and am trying to fix the cluster.
>> Scrapping it and starting over is possible, but it would be wonderful
>> if we could figure out what's wrong with it...
>
> FYI: osd snap trim sleep <- adding some sleep might help alleviate the
> impact on the cluster. If HEALTH is OK I would not expect anything
> wrong with your cluster.
>
> Does "ceph osd dump | grep require_osd_release" give you
> require_osd_release octopus?
>
> Gr. Stefan

Hi Stefan, thank you for your answer.

Even osd_snap_trim_sleep=10 was not sustainable with normal cluster load.

Following your email I've tested bluefs_buffered_io=true again and it
indeed dramatically reduces disk load, but not CPU load nor slow ceph IO.

Yes, require_osd_release=octopus.

What worries me is that the pool is now void of RBD images but still
holds 14TiB of object data. Here are my pool contents; rbd_directory
and rbd_trash are empty.

rados -p storage ls | sed 's/\(.*\..*\)\..*/\1/'|sort|uniq -c
      1 rbd_children
      6 rbd_data.13fc0d1d63c52b
   2634 rbd_data.15ab844f62d5
    258 rbd_data.15f1f2e2398dc7
    133 rbd_data.17d93e1c5a4855
    258 rbd_data.1af03e352ec460
   2987 rbd_data.236cfc2474b020
 206872 rbd_data.31c55ee49f0abb
 604593 rbd_data.5b423b48a4643f
     90 rbd_data.7b06b7abcc9441
  81576 rbd_data.913b398f28d1
     18 rbd_data.9662ade11235a
  16051 rbd_data.e01609a7a07e20
    278 rbd_data.e6b6f855b5172c
     90 rbd_data.e85da37e044922
      1 rbd_directory
      1 rbd_info
      1 rbd_trash

Eric
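[Editor's note: the per-prefix object counts above can be turned into a rough upper bound on the space each leftover image pins, since RBD data objects in this pool are 4 MiB (the default object size, consistent with the `size 4194304` in the `listsnaps` output elsewhere in the thread). A sketch extending the sed|sort|uniq pipeline from the message; the short sample input is hypothetical and stands in for `rados -p storage ls`:]

```shell
# Rough upper bound on space pinned per leftover RBD image prefix:
# group object names by prefix (strip the trailing chunk index), then
# multiply each count by the 4 MiB default RBD object size.
# Hypothetical sample stands in for `rados -p storage ls`.
rados_ls_sample='rbd_data.5b423b48a4643f.0006a4e5
rbd_data.5b423b48a4643f.0006a4e6
rbd_data.913b398f28d1.0000000f
rbd_info'

printf '%s\n' "$rados_ls_sample" |
  grep '^rbd_data' |
  sed 's/\(.*\..*\)\..*/\1/' | sort | uniq -c |
  awk '{ printf "%s: %d objects, ~%d MiB\n", $2, $1, $1 * 4 }'
```

This is only an upper bound: clones that share extents with other clones (non-empty overlap lists) occupy less than a full 4 MiB each.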