[ceph-users] Re: rbd map error: couldn't connect to the cluster!

2023-02-24 Thread Thomas Schneider

Actually I didn't try other caps.

The setup of RBD images and authorizations is automated with a bash script that has worked without issues in the past.
I need to understand the root cause in order to adapt the script accordingly.




On 23.02.2023 at 17:55, Eugen Block wrote:

And did you already try the other caps? Do those work?

Quoting Thomas Schneider <74cmo...@gmail.com>:


Confirmed.

# ceph versions
{
    "mon": {
        "ceph version 14.2.22 (877fa256043e4743620f4677e72dee5e738d1226) nautilus (stable)": 3
    },
    "mgr": {
        "ceph version 14.2.22 (877fa256043e4743620f4677e72dee5e738d1226) nautilus (stable)": 3
    },
    "osd": {
        "ceph version 14.2.22 (877fa256043e4743620f4677e72dee5e738d1226) nautilus (stable)": 437
    },
    "mds": {
        "ceph version 14.2.22 (877fa256043e4743620f4677e72dee5e738d1226) nautilus (stable)": 7
    },
    "overall": {
        "ceph version 14.2.22 (877fa256043e4743620f4677e72dee5e738d1226) nautilus (stable)": 450
    }
}



On 23.02.2023 at 17:33, Eugen Block wrote:
And the ceph cluster has the same version? ‚ceph versions‘ shows all 
daemons. If the cluster is also 14.2.X the caps should work with 
lower-case rbd_id. Can you confirm?



Quoting Thomas Schneider <74cmo...@gmail.com>:


This is
# ceph --version
ceph version 14.2.22 (877fa256043e4743620f4677e72dee5e738d1226) 
nautilus (stable)




On 23.02.2023 at 16:47, Eugen Block wrote:
Which ceph version is this? In a Nautilus cluster it works for me 
with the lower-case rbd_id, in Pacific it doesn't. I don't have an 
Octopus cluster at hand.


Quoting Eugen Block :

I tried to recreate this restrictive client access; one thing I noticed is that the rbd_id object is lower-case. I created a test client named "TEST":

storage01:~ # rados -p pool ls | grep -vE "5473cdeb5c62c|1f553ba0f6222" | grep test
rbd_id.test

But after adding all necessary caps I'm still not allowed to get the image info:

client:~ # rbd -p pool info test --id TEST --keyring /etc/ceph/ceph.client.TEST.keyring
2023-02-23T16:35:16.740+0100 7faebaffd700 -1 librbd::mirror::GetInfoRequest: 0x556072a66560 handle_get_mirror_image: failed to retrieve mirroring state: (1) Operation not permitted
rbd: info: (1) Operation not permitted

And I don't have rbd-mirror enabled in this cluster, so that's 
kind of strange... I'll try to find out which other caps it 
requires. I already disabled all image features but to no avail.


Quoting Thomas Schneider <74cmo...@gmail.com>:

I'll delete the existing authentication and its caps for "VCT" and recreate it.

Just to be sure: there's no ingress communication to the client (i.e. initiated from the Ceph servers), right?


On 23.02.2023 at 16:01, Eugen Block wrote:
For rbd commands you don't specify the "client." prefix for the --id parameter, just the client name, in your case "VCT". Your second approach shows a different error message, so it can connect as "VCT" successfully, but the permissions don't seem to be sufficient. Those caps look very restrictive; I'm not sure which one prevents the map command, though.
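
As an aside, a minimal sketch of the two equivalent ways to pass the client identity (same keyring path as in this thread; --id takes the bare name, --name the full entity name):

rbd map hdb_backup/VCT --id VCT --keyring /etc/ceph/ceph.client.VCT.keyring            # bare name
rbd map hdb_backup/VCT --name client.VCT --keyring /etc/ceph/ceph.client.VCT.keyring   # full entity name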


Quoting Thomas Schneider <74cmo...@gmail.com>:

Hm... I'm not sure about the correct rbd command syntax, but I thought it was correct.


Anyway, using a different ID fails, too:
# rbd map hdb_backup/VCT --id client.VCT --keyring /etc/ceph/ceph.client.VCT.keyring
rbd: couldn't connect to the cluster!

# rbd map hdb_backup/VCT --id VCT --keyring /etc/ceph/ceph.client.VCT.keyring
2023-02-23T15:46:16.848+0100 7f222d19d700 -1 librbd::image::GetMetadataRequest: 0x7f220c001ef0 handle_metadata_list: failed to retrieve image metadata: (1) Operation not permitted
2023-02-23T15:46:16.848+0100 7f222d19d700 -1 librbd::image::RefreshRequest: failed to retrieve pool metadata: (1) Operation not permitted
2023-02-23T15:46:16.848+0100 7f222d19d700 -1 librbd::image::OpenRequest: failed to refresh image: (1) Operation not permitted
2023-02-23T15:46:16.848+0100 7f222c99c700 -1 librbd::ImageState: 0x5569d8a16ba0 failed to open image: (1) Operation not permitted
rbd: error opening image VCT: (1) Operation not permitted


On 23.02.2023 at 15:30, Eugen Block wrote:

You don't specify which client in your rbd command:

rbd map hdb_backup/VCT --id client --keyring /etc/ceph/ceph.client.VCT.keyring

Have you tried this (not sure about upper-case client names, haven't tried that)?

rbd map hdb_backup/VCT --id VCT --keyring /etc/ceph/ceph.client.VCT.keyring



Quoting Thomas Schneider <74cmo...@gmail.com>:


Hello,

I'm trying to map an RBD image using rbd map, but I get this error message:
# rbd map hdb_backup/VCT --id client --keyring /etc/ceph/ceph.client.VCT.keyring
rbd: couldn't connect to the cluster!

Checking on the Ceph server, the required permission for the relevant keyring exists:

# ceph-authtool -l /etc/ceph/ceph.client.VCT.keyring
[client.VCT]
    key = AQBj3LZjNGn/BhAAG8IqMyH0WLKi4kTlbjiW7g==

# ceph auth get client.VCT
[client.VCT]
    key = AQBj3LZjNGn/BhAAG8IqMyH0WLKi4kTlbjiW7g==
    caps mon 

[ceph-users] Re: rbd map error: couldn't connect to the cluster!

2023-02-24 Thread Eugen Block
Just one addition from my test: I believe I misinterpreted my results because my test image was named "test" and the client "TEST", so the rbd_id.<image name> object is indeed upper-case for an image that has an upper-case name. So forget my comment about that.
Another question, though: does the image you're trying to map actually use the object_prefix you have in your caps? Can you paste the output of 'rbd info hdb_backup/VCT'?
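
For reference, a rough sketch of what I would compare here (the full cap string didn't make it into the thread, so the object names below are just the ones an image-scoped cap typically has to cover, plus the simpler pool-scoped alternative):

ceph auth get client.VCT                                                         # show the current mon/osd caps in full
# an image-scoped cap has to cover rbd_id.<image name>, rbd_header.<image id> and rbd_data.<image id>.*
ceph auth caps client.VCT mon 'profile rbd' osd 'profile rbd pool=hdb_backup'    # pool-scoped alternative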



[ceph-users] mons excessive writes to local disk and SSD wearout

2023-02-24 Thread Andrej Filipcic


Hi,

on our large Ceph cluster with 60 servers and 1600 OSDs, we have observed that the small system NVMes are wearing out rapidly. Our monitoring shows the mons writing about 10 MB/s on average to store.db. For small 250 GB system NVMes with a DWPD of ~1 this is too much: 0.8 TB/day, or roughly 1.5 PB in 5 years, which is too much even for a 3-DWPD drive of the same capacity.


Apart from replacing the drives with larger and/or more durable ones, do you have any suggestions on whether these writes can be reduced? The mon write rate matches the creation of 64 MB .sst files at a rate of about 0.15 Hz.
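
In case it helps to narrow it down, a small sketch of how one could confirm that it really is the ceph-mon process producing these writes (assuming a single, non-containerized mon per host; paths will differ for containerized deployments):

pidstat -d -p $(pidof ceph-mon) 10               # kB written per second by the mon process
ls -lrt /var/lib/ceph/mon/*/store.db/ | tail     # watch how quickly new .sst files appear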


Best regards,
Andrej

--
_
   prof. dr. Andrej Filipcic,   E-mail: andrej.filip...@ijs.si
   Department of Experimental High Energy Physics - F9
   Jozef Stefan Institute, Jamova 39, P.o.Box 3000
   SI-1001 Ljubljana, Slovenia
   Tel.: +386-1-477-3674Fax: +386-1-425-7074
-


[ceph-users] Re: rbd map error: couldn't connect to the cluster!

2023-02-24 Thread Thomas Schneider

Please check the output here:
# rbd info hdb_backup/VCT
rbd image 'VCT':
    size 800 GiB in 204800 objects
    order 22 (4 MiB objects)
    snapshot_count: 0
    id: b768d4baac048b
    block_name_prefix: rbd_data.b768d4baac048b
    format: 2
    features: layering
    op_features:
    flags:
    create_timestamp: Thu Jan  5 15:19:14 2023
    access_timestamp: Thu Jan  5 15:19:14 2023
    modify_timestamp: Thu Jan  5 15:19:14 2023
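
For what it's worth, a quick sketch of how the caps could be cross-checked against the objects this image actually uses (object names derived from the id above; a format-2 image keeps its header in rbd_header.<id> and its data in rbd_data.<id>.*):

rados -p hdb_backup stat rbd_id.VCT                              # name-to-id object
rados -p hdb_backup stat rbd_header.b768d4baac048b               # header object the client must be able to read
rados -p hdb_backup ls | grep rbd_data.b768d4baac048b | head     # data objects share this prefix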




On 24.02.2023 at 11:38, Eugen Block wrote:

rbd info hdb_backup/VCT



[ceph-users] Accessing OSD objects

2023-02-24 Thread Geoffrey Rhodes
Hello all,  I'd really appreciate some input from the more knowledgeable
here.

Is there a way I can access OSD objects if I have a BlueFS replay error?
This error prevents me from starting the OSD and also throws an error if I try using the bluestore or objectstore tools - I can, however, run ceph-bluestore-tool show-label without issue.

I'm hoping there is another way to access the objects on this OSD, or possibly a way to purge this log.
If deleting this replay log would help (even with some data loss), I'm happy to try it.

This has caused a PG to go inactive, and I'm considering deleting the PG and force re-creating it - I saw this mentioned as a last-resort option.
Below is a snippet of where things go wrong - I don't know whether there is even a chance, or if this is an unrecoverable state.

2023-01-25T10:05:26.543+ 7fa773a14240 20 bluefs _replay 0x0:
op_dir_link db/031664.sst to 29549
2023-01-25T10:05:26.543+ 7fa773a14240 20 bluefs _replay 0x0:
op_dir_link db/031665.sst to 29550
2023-01-25T10:05:26.543+ 7fa773a14240 20 bluefs _replay 0x0:
op_dir_link db/031666.sst to 29551
2023-01-25T10:05:26.543+ 7fa773a14240 20 bluefs _replay 0x0:
op_dir_link db/CURRENT to 29543
2023-01-25T10:05:26.543+ 7fa773a14240 20 bluefs _replay 0x0:
op_dir_link db/IDENTITY to 5
2023-01-25T10:05:26.543+ 7fa773a14240 20 bluefs _replay 0x0:
op_dir_link db/LOCK to 2
2023-01-25T10:05:26.543+ 7fa773a14240 20 bluefs _replay 0x0:
op_dir_link db/MANIFEST-031657 to 29542
2023-01-25T10:05:26.543+ 7fa773a14240 20 bluefs _replay 0x0:
op_dir_link db/OPTIONS-031645 to 29529
2023-01-25T10:05:26.543+ 7fa773a14240 20 bluefs _replay 0x0:
op_dir_link db/OPTIONS-031660 to 29545
2023-01-25T10:05:26.543+ 7fa773a14240 20 bluefs _replay 0x0:
op_dir_create db.slow
2023-01-25T10:05:26.543+ 7fa773a14240 20 bluefs _replay 0x0: op_jump
seq 5204712 offset 0x2
2023-01-25T10:05:26.543+ 7fa773a14240 10 bluefs _read h 0x55d2f1cfdb80
0x1~1 from file(ino 1 size 0x0 mtime
2022-10-07T17:55:34.189440+ allocated 42 alloc_commit 42
extents [1:0x177017~2,1:0x53d1e90~40])
2023-01-25T10:05:26.543+ 7fa773a14240 20 bluefs _read left 0x1 len
0x1
2023-01-25T10:05:26.543+ 7fa773a14240 20 bluefs _read got 65536
2023-01-25T10:05:26.543+ 7fa773a14240 10 bluefs _read h 0x55d2f1cfdb80
0x2~1000 from file(ino 1 size 0x2 mtime
2022-10-07T17:55:34.189440+ allocated 42 alloc_commit 42
extents [1:0x177017~2,1:0x53d1e90~40])
2023-01-25T10:05:26.543+ 7fa773a14240 20 bluefs _read fetching
0x0~10 of 1:0x53d1e90~40
2023-01-25T10:05:26.547+ 7fa773a14240 20 bluefs _read left 0x10 len
0x1000
2023-01-25T10:05:26.547+ 7fa773a14240 20 bluefs _read got 4096
2023-01-25T10:05:26.547+ 7fa773a14240 10 bluefs _replay 0x2:
txn(seq 5204713 len 0x55 crc 0x81f48b1c)
2023-01-25T10:05:26.547+ 7fa773a14240 20 bluefs _replay 0x2:
op_file_update file(ino 29551 size 0x0 mtime
2022-10-07T17:55:34.151007+ allocated 0 alloc_commit 0 extents [])
2023-01-25T10:05:26.547+ 7fa773a14240 20 bluefs _replay 0x2:
op_dir_link db/031666.sst to 29551
2023-01-25T10:05:26.555+ 7fa773a14240 -1
/build/ceph-17.2.5/src/os/bluestore/BlueFS.cc: In function 'int
BlueFS::_replay(bool, bool)' thread 7fa773a14240 time
2023-01-25T10:05:26.551808+
/build/ceph-17.2.5/src/os/bluestore/BlueFS.cc: 1419: FAILED ceph_assert(r
== q->second->file_map.end())
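
For reference, a very rough sketch of the escalation path mentioned above, with placeholders (NN for the OSD id, X.YY for the PG id); this is only a sketch, and force-create-pg discards whatever is left of the PG:

ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-NN          # see whether fsck gets further than the replay
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-NN --op export --pgid X.YY --file pg-X.YY.export   # only if the store opens at all
ceph osd force-create-pg X.YY --yes-i-really-mean-it               # last resort: recreate the PG empty (data loss)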


Kind regards
Geoff


[ceph-users] Re: mons excessive writes to local disk and SSD wearout

2023-02-24 Thread Dan van der Ster
Hi Andrej,

That doesn't sound right -- I checked a couple of our clusters just now and the mon filesystem is writing at just a few hundred kB/s.

debug_mon = 10 should clarify the root cause. Perhaps it's logm from some persistent slow ops?
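
A small sketch of how to bump that temporarily (assuming a release with the centralized config store; debug_mon 10 is chatty, so drop it again afterwards):

ceph config set mon debug_mon 10     # then watch the mon log to see what is being written so often
ceph config rm mon debug_mon         # remove the override once done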

Cheers, Dan





[ceph-users] OpenSSL in librados

2023-02-24 Thread Patrick Schlangen
Hi,

please forgive me if this has been asked before - I could not find any 
information on this topic.

I am using Ceph with librados via the phprados extension. Since upgrading to the current Ceph versions, where OpenSSL is used in librados, I observe that PHP's libcurl integration and other features which rely on OpenSSL randomly fail when opening a TLS connection. I suspect that librados somehow initializes or uninitializes OpenSSL in a way that interferes with the OpenSSL usage of libcurl / PHP's fsockopen.

Did anybody make a similar experience?

Thanks,

Patrick


[ceph-users] Large STDDEV in pg per osd

2023-02-24 Thread Joe Ryner
I have been digging for a while into how to minimize the STDDEV of the data distribution across my OSDs, and I can't seem to get it below 12.

I have other clusters with a STDDEV of 1, which is my goal, but this cluster is really giving me fits. This cluster started off on Emperor and might even have run Dumpling originally. I'm not sure whether I have some crud lying around that is preventing better balancing.

I have tried the ceph balancer and I even tried the jj balancer.

As you can see below, my lowest OSD is 37% used and the highest is 64%.

If anyone has any ideas on how to crack this nut, I would greatly appreciate it.
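
For reference, a short sketch of the standard upmap balancer checklist for a cluster with this much history (assuming all clients can be required to be Luminous or newer):

ceph osd get-require-min-compat-client        # upmap needs "luminous" or later here
ceph osd set-require-min-compat-client luminous
ceph balancer mode upmap
ceph balancer on
ceph balancer eval                            # lower score = more even distribution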

Thanks,
Joe


Here is my data:
ceph osd tree
ID   CLASS  WEIGHT     TYPE NAME         STATUS  REWEIGHT  PRI-AFF
 -1         167.63680  root default
 -3          57.62895      rack marack
 -4          10.47656          host gallo
  0  ssd       3.49219            osd.0       up   1.0      1.0
  1  ssd       3.49219            osd.1       up   1.0      1.0
  8  ssd       3.49219            osd.8       up   1.0      1.0
 -8           7.85712          host jeep
 18  ssd       0.87299            osd.18      up   1.0      1.0
 19  ssd       0.87299            osd.19      up   1.0      1.0
 20  ssd       0.87299            osd.20      up   1.0      1.0
 21  ssd       0.87299            osd.21      up   1.0      1.0
 22  ssd       3.49219            osd.22      up   1.0      1.0
 45  ssd       0.87299            osd.45      up   1.0      1.0
 -5           7.85712          host joy
  5  ssd       0.87299            osd.5       up   1.0      1.0
 12  ssd       0.87299            osd.12      up   1.0      1.0
 13  ssd       0.87299            osd.13      up   1.0      1.0
 16  ssd       0.87299            osd.16      up   1.0      1.0
 17  ssd       3.49219            osd.17      up   1.0      1.0
 43  ssd       0.87299            osd.43      up   1.0      1.0
-37           6.98625          host kc01
 82  ssd       0.87329            osd.82      up   1.0      1.0
 83  ssd       0.87329            osd.83      up   1.0      1.0
 84  ssd       0.87329            osd.84      up   1.0      1.0
 85  ssd       3.49309            osd.85      up   1.0      1.0
 86  ssd       0.87329            osd.86      up   1.0      1.0
-50           6.98625          host ks02
 97  ssd       0.87329            osd.97      up   1.0      1.0
 98  ssd       0.87329            osd.98      up   1.0      1.0
 99  ssd       0.87329            osd.99      up   1.0      1.0
100  ssd       0.87329            osd.100     up   1.0      1.0
101  ssd       3.49309            osd.101     up   1.0      1.0
-27           3.49319          host lc02
 52  ssd       1.74660            osd.52      up   1.0      1.0
 73  ssd       1.74660            osd.73      up   1.0      1.0
-55          13.97246          host lx01
 14  ssd       3.49309            osd.14      up   1.0      1.0
 27  ssd       3.49309            osd.27      up   1.0      1.0
 29  ssd       3.49309            osd.29      up   1.0      1.0
 48  ssd       1.74660            osd.48      up   1.0      1.0
 49  ssd       1.74660            osd.49      up   1.0      1.0
 -2          55.00392      rack marack2
-13          13.96875          host helm
 11  ssd       3.49219            osd.11      up   1.0      1.0
 28  ssd       3.49219            osd.28      up   1.0      1.0
 30  ssd       3.49219            osd.30      up   1.0      1.0
 31  ssd       3.49219            osd.31      up   1.0      1.0
-17           7.85712          host jazz
 23  ssd       0.87299            osd.23      up   1.0      1.0
 24  ssd       0.87299            osd.24      up   1.0      1.0
 25  ssd       0.87299            osd.25      up   1.0      1.0
 26  ssd       0.87299            osd.26      up   1.0      1.0
 36  ssd       3.49219            osd.36      up   1.0      1.0
 44  ssd       0.87299            osd.44      up   1.0      1.0
 -6           6.98438          host john
  2  ssd       3.49219            osd.2       up   1.0      1.0
  3  ssd       3.49219            osd.3       up   1.0      1.0
-49           7.85712          host jolt
 34  ssd       3.49219            osd.34      up   1.0      1.0
 63  ssd       0.87299            osd.63      up   1.0      1.0
 64  ssd       0.87299            osd.64      up   1.0      1.0
 65  ssd       0.87299            osd.65      up   1.0      1.0
 66  ssd       0.87299            osd.66      up   1.0      1.0
 67  ssd       0.87299            osd.67      up   1.0      1.0
-19           7.85712          host juju
 37  ssd       0.87299            osd.37      up   1.0      1.0
 38  ssd       0.87299            osd.38      up   1.0      1.0
 39  ssd       0.87299            osd.39      up   1.0      1.00

[ceph-users] Accessing OSD objects

2023-02-24 Thread Geoffrey Rhodes
Hi Anthony, thanks for reaching out.

Erasure-coded data pool (k=4, m=2), but I had more than two disk failures around the same time and the data had not yet fully recovered elsewhere in the cluster.
They are big 12 TB Exos drives, so it usually takes a few weeks to backfill / recover, plus I had snaptrimming on the go.

FYI - the journal is co-located on the drive.

Kind regards
Geoff


On Fri, 24 Feb 2023 at 18:30, Anthony D'Atri  wrote:

> Are you only doing 2 replicas?
>
> On Feb 24, 2023, at 08:20, Geoffrey Rhodes  wrote:
>
> This has caused a PG to go inactive