Hello,
I apologize for sending the wrong pool details earlier.
We store the data in the following data pool: xxxx0-data
pool 15 'xxxx0-data' erasure profile laurel_ec size 4 min_size 3 crush_rule 8
    object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode off
    last_change 30830 lfor 0/0/30825
    flags hashpspool,ec_overwrites,selfmanaged_snaps
    stripe_width 12288 application rbd,rgw
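
(For reference, the pool line above should be reproducible with something
like the following, adjusting the grep pattern to the actual pool name:)

ceph osd pool ls detail | grep xxxx0-data

The current cluster status (ceph -s) follows:
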
  cluster:
    id:     c404fafe-767c-11ee-bc37-0509d00921ba
    health: HEALTH_OK

  services:
    mon:         5 daemons, quorum v188-ceph-mgr0,v188-ceph-mgr1,v188-ceph-iscsigw2,v188-ceph6,v188-ceph5 (age 5d)
    mgr:         v188-ceph-mgr0.rxcecw(active, since 11w), standbys: v188-ceph-mgr1.hmbuma
    mds:         1/1 daemons up, 1 standby
    osd:         42 osds: 42 up (since 2M), 42 in (since 3M)
    tcmu-runner: 10 portals active (4 hosts)

  data:
    volumes: 1/1 healthy
    pools:   11 pools, 614 pgs
    objects: 13.63M objects, 51 TiB
    usage:   75 TiB used, 71 TiB / 147 TiB avail
    pgs:     613 active+clean
             1   active+clean+scrubbing+deep

  io:
    client: 8.1 MiB/s rd, 105 MiB/s wr, 320 op/s rd, 2.31k op/s wr
Best Regards,
Laszlo Kardos
-----Original Message-----
From: Eugen Block <[email protected]>
Sent: Tuesday, September 30, 2025 9:03 AM
To: [email protected]
Subject: [ceph-users] Re: Ceph GWCLI issue
Hi,
I don't have an answer as to why the image is in an unknown state, but I'd be
concerned about the pool's pg_num. You have terabytes in a pool with a single
PG? That is awful, and the pg_num should be increased to a more suitable value.
I can't say whether that would fix anything regarding the unknown status, but
it's definitely not good at all.
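
If you decide to bump it, a minimal sketch would be something like the
following (32 is only an example target, and since that pool has
autoscale_mode on, the autoscaler may override a manual value):

ceph osd pool set replicated_xxxx pg_num 32
ceph osd pool set replicated_xxxx pgp_num 32
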
What is the overall Ceph status (ceph -s)?
Regards,
Eugen
Quoting Kardos László <[email protected]>:
> Hello,
>
> We have encountered the following issue in our production environment:
>
> A new RBD Image was created within an existing pool, and its status is
> reported as "unknown" in GWCLI. Based on our tests, this does not
> appear to cause operational issues, but we would like to investigate
> the root cause. No relevant information regarding this issue was found in
> the logs.
>
> GWCLI output:
>
> o- / .................................................................. [...]
>   o- cluster ................................................. [Clusters: 1]
>   | o- ceph .................................................... [HEALTH_OK]
>   | o- pools ................................................... [Pools: 11]
>   | | o- .mgr .......... [(x3), Commit: 0.00Y/15591725M (0%), Used: 194124K]
>   | | o- .nfs ........... [(x3), Commit: 0.00Y/15591725M (0%), Used: 16924b]
>   | | o- xxxx-test ...... [(2+1), Commit: 0.00Y/23727198M (0%), Used: 0.00Y]
>   | | o- xxxxx-erasure-0  [(2+1), Commit: 0.00Y/23727198M (0%), Used: 61519257668K]
>   | | o- xxxxxx-repl .... [(x3), Commit: 0.00Y/15591725M (0%), Used: 130084b]
>   | | o- cephfs.cephfs-test.data  [(x3), Commit: 0.00Y/15591725M (0%), Used: 9090444K]
>   | | o- cephfs.cephfs-test.meta  [(x3), Commit: 0.00Y/15591725M (0%), Used: 516415713b]
>   | | o- xxxxx-data .. [(3+1), Commit: 0.00Y/9604386M (0%), Used: 7547753556K]
>   | | o- xxxxx-rpl ..... [(x3), Commit: 12.0T/4268616M (294%), Used: 85265b]
>   | | o- xxxxx-data .. [(3+1), Commit: 0.00Y/5011626M (0%), Used: 10955179612K]
>   | | o- replicated_xxxx  [(x3), Commit: 25.0T/2280846592K (1176%), Used: 46912b]
>   | o- topology ..................................... [OSDs: 42,MONs: 5]
>   o- disks ........................................... [37.0T, Disks: 3]
>   | o- xxxx-rpl ....................................... [xxxx-rpl (12.0T)]
>   | | o- xxxxx_lun0 ..................... [xxxx-rpl/xxxxx_lun0 (Online, 12.0T)]
>   | o- replicated_xxxx ......................... [replicated_xxxx (25.0T)]
>   | | o- xxxx_lun0 ................ [replicated_xxxx/xxxx_lun0 (Online, 12.0T)]
>   | | o- xxxx_lun_new ......... [replicated_xxxx/xxxx_lun_new (Unknown, 13.0T)]
>
>
>
> The image (xxxx_lun_new) is provisioned to multiple ESXi hosts,
> mounted, and formatted with VMFS6. The datastore is writable and
> readable by the hosts.
>
> One difference is the object size of the RBD images: the older RBD
> images use a 4 MiB object size (order 22), while the new RBD image uses
> a 512 KiB object size (order 19).
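>
> (The object size is fixed when an image is created; at the rbd level it
> is chosen with --object-size, so a comparable 4 MiB-object image could
> be created roughly like this, where the image name and size are only
> placeholders:)
>
> rbd create replicated_xxxx/test_lun --size 1T --data-pool xxxx0-data --object-size 4M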
>
> RBD Image Parameters:
>
> For replicated_xxxx / xxxx_lun0 (Online status in GWCLI):
>
> rbd image 'xxxx_lun0':
> size 12 TiB in 3145728 objects
> order 22 (4 MiB objects)
> snapshot_count: 0
> id: 5c1b5ecfdfa46
> data_pool: xxxx0-data
> block_name_prefix: rbd_data.14.5c1b5ecfdfa46
> format: 2
> features: exclusive-lock, data-pool
> op_features:
> flags:
> create_timestamp: Tue Jul 8 13:02:11 2025
> access_timestamp: Thu Sep 25 13:49:47 2025
> modify_timestamp: Thu Sep 25 13:50:05 2025
>
> For replicated_xxxx / xxxx_lun_new (Unknown status in GWCLI):
>
> rbd image 'xxxx_lun_new':
> size 13 TiB in 27262976 objects
> order 19 (512 KiB objects)
> snapshot_count: 0
> id: 1945d9cf9f41ab
> data_pool: xxxx0-data
> block_name_prefix: rbd_data.14.1945d9cf9f41ab
> format: 2
> features: exclusive-lock, data-pool
> op_features:
> flags:
> create_timestamp: Wed Sep 24 11:21:21 2025
> access_timestamp: Thu Sep 25 13:50:42 2025
> modify_timestamp: Thu Sep 25 13:49:48 2025
>
>
>
> Pool Parameters:
>
> pool 14 'replicated_xxxx' replicated size 3 min_size 2 crush_rule 7
> object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change
> 30743 flags hashpspool stripe_width 0 application rbd,rgw
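>
> (With autoscale_mode on and pg_num still at 1, the autoscaler's view of
> this pool can be checked with:)
>
> ceph osd pool autoscale-status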
>
> Ceph version:
>
> ceph --version
>
> ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy
> (stable)
>
>
>
> Question:
>
> What could be causing the RBD Image (xxxx_lun_new) to appear in an
> "unknown" state in GWCLI?
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]