Hi,

I don't have an answer as to why the image is in an unknown state, but I'd be concerned about the pool's pg_num. You have terabytes in a pool with a single PG? That's awful and should be increased to a more suitable value. I can't say whether that would fix anything regarding the unknown issue, but it's definitely not good at all.
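
Something along these lines would be a starting point; the pool name is taken from your output, and the target PG count is only an example, the right value depends on your cluster:

   # what does the autoscaler currently recommend?
   ceph osd pool autoscale-status

   # current value
   ceph osd pool get replicated_xxxx pg_num

   # with the autoscaler enabled, a lower bound is usually enough;
   # otherwise set pg_num directly and let pgp_num follow
   ceph osd pool set replicated_xxxx pg_num_min 32
   ceph osd pool set replicated_xxxx pg_num 32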

What is the overall Ceph status (ceph -s)?
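
You could also compare what rbd reports for the online image and the new one, just to see whether the gateways hold a watch on both (not a known fix, just a data point):

   rbd status replicated_xxxx/xxxx_lun0
   rbd status replicated_xxxx/xxxx_lun_new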

Regards,
Eugen


Quoting Kardos László <[email protected]>:

Hello,

We have encountered the following issue in our production environment:

A new RBD Image was created within an existing pool, and its status is
reported as "unknown" in GWCLI. Based on our tests, this does not appear
to cause operational issues, but we would like to investigate the root
cause. No relevant information regarding this issue was found in the logs.

GWCLI output:



o- / .......... [...]
  o- cluster .......... [Clusters: 1]
  | o- ceph .......... [HEALTH_OK]
  |   o- pools .......... [Pools: 11]
  |   | o- .mgr .......... [(x3), Commit: 0.00Y/15591725M (0%), Used: 194124K]
  |   | o- .nfs .......... [(x3), Commit: 0.00Y/15591725M (0%), Used: 16924b]
  |   | o- xxxx-test .......... [(2+1), Commit: 0.00Y/23727198M (0%), Used: 0.00Y]
  |   | o- xxxxx-erasure-0 .......... [(2+1), Commit: 0.00Y/23727198M (0%), Used: 61519257668K]
  |   | o- xxxxxx-repl .......... [(x3), Commit: 0.00Y/15591725M (0%), Used: 130084b]
  |   | o- cephfs.cephfs-test.data .......... [(x3), Commit: 0.00Y/15591725M (0%), Used: 9090444K]
  |   | o- cephfs.cephfs-test.meta .......... [(x3), Commit: 0.00Y/15591725M (0%), Used: 516415713b]
  |   | o- xxxxx-data .......... [(3+1), Commit: 0.00Y/9604386M (0%), Used: 7547753556K]
  |   | o- xxxxx-rpl .......... [(x3), Commit: 12.0T/4268616M (294%), Used: 85265b]
  |   | o- xxxxx-data .......... [(3+1), Commit: 0.00Y/5011626M (0%), Used: 10955179612K]
  |   | o- replicated_xxxx .......... [(x3), Commit: 25.0T/2280846592K (1176%), Used: 46912b]
  |   o- topology .......... [OSDs: 42,MONs: 5]
  o- disks .......... [37.0T, Disks: 3]
  | o- xxxx-rpl .......... [xxxx-rpl (12.0T)]
  | | o- xxxxx_lun0 .......... [xxxx-rpl/xxxxx_lun0 (Online, 12.0T)]
  | o- replicated_xxxx .......... [replicated_xxxx (25.0T)]
  |   o- xxxx_lun0 .......... [replicated_xxxx/xxxx_lun0 (Online, 12.0T)]
  |   o- xxxx_lun_new .......... [replicated_xxxx/xxxx_lun_new (Unknown, 13.0T)]



The image (xxxx_lun_new) is provisioned to multiple ESXi hosts, mounted,
and formatted with VMFS6. The datastore is writable and readable by the
hosts.

One difference is the object size of the RBD images: the older RBD images use a 4 MiB object size (order 22), while the new RBD image uses a 512 KiB object size (order 19).
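
(For reference: the object size is fixed when an image is created. We assume the new image was created with something along the lines of the command below; the exact options are an assumption, but the sizes mirror the rbd info output further down:

   rbd create replicated_xxxx/xxxx_lun_new --size 13T --object-size 512K --data-pool xxxx0-data

With 512 KiB objects, a 13 TiB image maps to 13 TiB / 512 KiB = 27262976 objects, which matches the object count reported below.)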

RBD Image Parameters:

For replicated_xxxx / xxxx_lun0 (Online status in GWCLI):



rbd image 'xxxx_lun0':
        size 12 TiB in 3145728 objects
        order 22 (4 MiB objects)
        snapshot_count: 0
        id: 5c1b5ecfdfa46
        data_pool: xxxx0-data
        block_name_prefix: rbd_data.14.5c1b5ecfdfa46
        format: 2
        features: exclusive-lock, data-pool
        op_features:
        flags:
        create_timestamp: Tue Jul  8 13:02:11 2025
        access_timestamp: Thu Sep 25 13:49:47 2025
        modify_timestamp: Thu Sep 25 13:50:05 2025

For replicated_xxxx / xxxx_lun_new (Unknown status in GWCLI):

rbd image 'xxxx_lun_new':
        size 13 TiB in 27262976 objects
        order 19 (512 KiB objects)
        snapshot_count: 0
        id: 1945d9cf9f41ab
        data_pool: xxxx0-data
        block_name_prefix: rbd_data.14.1945d9cf9f41ab
        format: 2
        features: exclusive-lock, data-pool
        op_features:
        flags:
        create_timestamp: Wed Sep 24 11:21:21 2025
        access_timestamp: Thu Sep 25 13:50:42 2025
        modify_timestamp: Thu Sep 25 13:49:48 2025



Pool Parameters:

pool 14 'replicated_xxxx' replicated size 3 min_size 2 crush_rule 7
object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change
30743 flags hashpspool stripe_width 0 application rbd,rgw

Ceph version:

ceph --version

ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)



Question:

What could be causing the RBD Image (xxxx_lun_new) to appear in an
"unknown" state in GWCLI?

