[ceph-users] Re: v18.2.0 Reef released

2023-08-07 Thread Konstantin Shalygin
Hi,

Thanks for the release!

Please update the OS Platform docs; Reef is currently missing from the ABC 
tests: https://tracker.ceph.com/issues/62354


Thanks,
k

> On 7 Aug 2023, at 21:37, Yuri Weinstein  wrote:
> 
> We're very happy to announce the first stable release of the Reef series.
> 
> We express our gratitude to all members of the Ceph community who
> contributed by proposing pull requests, testing this release,
> providing feedback, and offering valuable suggestions.
> 
> Major Changes from Quincy:
> - RADOS: RocksDB has been upgraded to version 7.9.2.
> - RADOS: There have been significant improvements to RocksDB iteration
> overhead and performance.
> - RADOS: The perf dump and perf schema commands have been deprecated
> in favor of the new counter dump and counter schema commands.
> - RADOS: Cache tiering is now deprecated.
> - RADOS: A new feature, the "read balancer", is now available, which
> allows users to balance primary PGs per pool on their clusters.
> - RGW: Bucket resharding is now supported for multi-site configurations.
> - RGW: There have been significant improvements to the stability and
> consistency of multi-site replication.
> - RGW: Compression is now supported for objects uploaded with
> Server-Side Encryption.
> - Dashboard: There is a new Dashboard page with improved layout.
> Active alerts and some important charts are now displayed inside
> cards.
> - RBD: Support for layered client-side encryption has been added.
> - Telemetry: Users can now opt in to participate in a leaderboard in
> the telemetry public dashboards.
> 
> We encourage you to read the full release notes at
> https://ceph.io/en/news/blog/2023/v18-2-0-reef-released/
> 
> Getting Ceph
> 
> * Git at git://github.com/ceph/ceph.git
> * Tarball at https://download.ceph.com/tarballs/ceph-18.2.0.tar.gz
> * Containers at https://quay.io/repository/ceph/ceph
> * For packages, see https://docs.ceph.com/docs/master/install/get-packages/
> * Release git sha1: 5dd24139a1eada541a3bc16b6941c5dde975e26d
> 
> Did you know? Every Ceph release is built and tested on resources
> funded directly by the non-profit Ceph Foundation.
> If you would like to support this and our other efforts, please
> consider joining now https://ceph.io/en/foundation/.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

2023-08-07 Thread Adiga, Anantha
Hi Venky,

Here, should I send the mgr log?

root@fl31ca104ja0201:/etc/ceph# ceph -c remote_ceph.conf --id=mirror_remote  
status --verbose
parsed_args: Namespace(admin_socket=None, block=False, 
cephconf='remote_ceph.conf', client_id='mirror_remote', client_name=None, 
cluster=None, cluster_timeout=None, completion=False, help=False, 
input_file=None, output_file=None, output_format=None, period=1, setgroup=None, 
setuser=None, status=False, verbose=True, version=False, watch=False, 
watch_channel=None, watch_debug=False, watch_error=False, watch_info=False, 
watch_sec=False, watch_warn=False), childargs: ['status']
^CCluster connection aborted

root@fl31ca104ja0201:/etc/ceph#  cat remote_ceph.client.mirror_remote.keyring
[client.mirror_remote]
key = AQCfwMlkM90pLBAAwXtvpp8j04IvC8tqpAG9bA==
caps mds = "allow rwps fsname=cephfs"
caps mon = "allow r fsname=cephfs"
caps osd = "allow rw tag cephfs data=cephfs"

root@fl31ca104ja0201:/etc/ceph# cat remote_ceph.conf
[client.libvirt]
admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok # must be 
writable by QEMU and allowed by SELinux or AppArmor
log file = /var/log/ceph/qemu-guest-$pid.log # must be writable by QEMU and 
allowed by SELinux or AppArmor

[client.rgw.cr21meg16ba0101.rgw0]
host = cr21meg16ba0101
keyring = /var/lib/ceph/radosgw/ceph-rgw.cr21meg16ba0101.rgw0/keyring
log file = /var/log/ceph/ceph-rgw-cr21meg16ba0101.rgw0.log
rgw frontends = beast endpoint=172.18.55.71:8080
rgw thread pool size = 512

# Please do not change this file directly since it is managed by Ansible and 
will be overwritten
[global]
cluster network = 172.18.55.71/24
fsid = a6f52598-e5cd-4a08-8422-7b6fdb1d5dbe
mon host = 
[v2:172.18.55.71:3300,v1:172.18.55.71:6789],[v2:172.18.55.72:3300,v1:172.18.55.72:6789],[v2:172.18.55.73:3300,v1:172.18.55.73:6789]
mon initial members = cr21meg16ba0101,cr21meg16ba0102,cr21meg16ba0103
osd pool default crush rule = -1
public network = 172.18.55.0/24

[mon]
auth_allow_insecure_global_id_reclaim = False
auth_expose_insecure_global_id_reclaim = False

[osd]
osd memory target = 23630132019

-Original Message-
From: Venky Shankar  
Sent: Monday, August 7, 2023 9:26 PM
To: Adiga, Anantha 
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

On Tue, Aug 8, 2023 at 9:16 AM Adiga, Anantha  wrote:
>
> Hi Venky,
>
> Is this correct?
> (copied ceph.conf from the secondary cluster to the /etc/ceph/crsite directory in 
> the primary cluster, and copied ceph.mon.keyring from the secondary as 
> ceph.client.crsite.mon.keyring in /etc/ceph on the primary)
> root@fl31ca104ja0201:/etc/ceph# ls
> ceph.client.admin.keyring  ceph.client.crsite.admin.keyring  
> ceph.client.mirror_remote.keying  crsite  fio-fs.test   fs-mnt   rbdmap
> ceph.client.crash.keyring  ceph.client.crsite.mon.keyring  ceph.conf
>  fio-bsd.test  fio-nfs.test  nfs-mnt  remote_ceph.conf
> root@fl31ca104ja0201:/etc/ceph# ls crsite
> ceph.conf  ceph.mon.keyring
>
> root@fl31ca104ja0201:/etc/ceph/crsite# ceph -c ceph.conf 
> --id=crsite.mon --cluster=ceph --verbose
> parsed_args: Namespace(admin_socket=None, block=False, 
> cephconf='ceph.conf', client_id='crsite.mon', client_name=None, 
> cluster='ceph', cluster_timeout=None, completion=False, help=False, 
> input_file=None, output_file=None, output_format=None, period=1, 
> setgroup=None, setuser=None, status=False, verbose=True, 
> version=False, watch=False, watch_channel=None, watch_debug=False, 
> watch_error=False, watch_info=False, watch_sec=False, 
> watch_warn=False), childargs: []
> ^CCluster connection aborted
>
> Not sure if the --id (CLIENT_ID) is correct.. not able to connect

use `remote_ceph.conf` and the id `mirror_remote` (since, judging by the 
names, those belong to the secondary cluster).

>
> Thank you,
> Anantha
>
> -Original Message-
> From: Venky Shankar 
> Sent: Monday, August 7, 2023 7:05 PM
> To: Adiga, Anantha 
> Cc: ceph-users@ceph.io
> Subject: Re: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap 
> import hung
>
> Hi Anantha,
>
> On Tue, Aug 8, 2023 at 6:29 AM Adiga, Anantha  wrote:
> >
> > Hi Venky,
> >
> > The primary and secondary clusters both have the same cluster name "ceph" 
> > and both have a single filesystem by name "cephfs".
>
> That's not an issue.
>
> > How do I check the connection from primary to secondary using mon addr and 
> > key? What is the command line?
>
> A quick way to check this would be to place the secondary cluster ceph 
> config file and the user key on one of the primary nodes (preferably 
> the ceph-mgr host, just for tests, so purge these when done) and then 
> run
>
> ceph -c /path/to/secondary/ceph.conf --id <> status
>
> If that runs all fine, then the mirror daemon is probably hitting some bug.
>
> > These two clusters are configured for rgw multisite and it is functional.
> >
> > Thank you,
> > Anantha
> >
> > -Original Message-
> > 

[ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

2023-08-07 Thread Venky Shankar
On Tue, Aug 8, 2023 at 9:16 AM Adiga, Anantha  wrote:
>
> Hi Venky,
>
> Is this correct?
> (copied ceph.conf from the secondary cluster to the /etc/ceph/crsite directory in 
> the primary cluster, and copied ceph.mon.keyring from the secondary as 
> ceph.client.crsite.mon.keyring in /etc/ceph on the primary)
> root@fl31ca104ja0201:/etc/ceph# ls
> ceph.client.admin.keyring  ceph.client.crsite.admin.keyring  
> ceph.client.mirror_remote.keying  crsite  fio-fs.test   fs-mnt   rbdmap
> ceph.client.crash.keyring  ceph.client.crsite.mon.keyring  ceph.conf
>  fio-bsd.test  fio-nfs.test  nfs-mnt  remote_ceph.conf
> root@fl31ca104ja0201:/etc/ceph# ls crsite
> ceph.conf  ceph.mon.keyring
>
> root@fl31ca104ja0201:/etc/ceph/crsite# ceph -c ceph.conf --id=crsite.mon 
> --cluster=ceph --verbose
> parsed_args: Namespace(admin_socket=None, block=False, cephconf='ceph.conf', 
> client_id='crsite.mon', client_name=None, cluster='ceph', 
> cluster_timeout=None, completion=False, help=False, input_file=None, 
> output_file=None, output_format=None, period=1, setgroup=None, setuser=None, 
> status=False, verbose=True, version=False, watch=False, watch_channel=None, 
> watch_debug=False, watch_error=False, watch_info=False, watch_sec=False, 
> watch_warn=False), childargs: []
> ^CCluster connection aborted
>
> Not sure if the --id (CLIENT_ID) is correct.. not able to connect

use `remote_ceph.conf` and the id `mirror_remote` (since, judging by the
names, those belong to the secondary cluster).

>
> Thank you,
> Anantha
>
> -Original Message-
> From: Venky Shankar 
> Sent: Monday, August 7, 2023 7:05 PM
> To: Adiga, Anantha 
> Cc: ceph-users@ceph.io
> Subject: Re: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap import 
> hung
>
> Hi Anantha,
>
> On Tue, Aug 8, 2023 at 6:29 AM Adiga, Anantha  wrote:
> >
> > Hi Venky,
> >
> > The primary and secondary clusters both have the same cluster name "ceph" 
> > and both have a single filesystem by name "cephfs".
>
> That's not an issue.
>
> > How do I check the connection from primary to secondary using mon addr and 
> > key? What is the command line?
>
> A quick way to check this would be to place the secondary cluster ceph config 
> file and the user key on one of the primary nodes (preferably the ceph-mgr 
> host, just for tests, so purge these when done) and then run
>
> ceph -c /path/to/secondary/ceph.conf --id <> status
>
> If that runs all fine, then the mirror daemon is probably hitting some bug.
>
> > These two clusters are configured for rgw multisite and it is functional.
> >
> > Thank you,
> > Anantha
> >
> > -Original Message-
> > From: Venky Shankar 
> > Sent: Monday, August 7, 2023 5:46 PM
> > To: Adiga, Anantha 
> > Cc: ceph-users@ceph.io
> > Subject: Re: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap
> > import hung
> >
> > Hi Anantha,
> >
> > On Mon, Aug 7, 2023 at 11:52 PM Adiga, Anantha  
> > wrote:
> > >
> > > Hi Venky,
> > >
> > >
> > >
> > > I tried on another secondary Quincy cluster and it is the same problem. 
> > > The peer_bootstrap import command hangs.
> >
> > A pacific cluster generated peer token should be importable in a quincy 
> > source cluster. Looking at the logs, I suspect that the perceived hang is 
> > the mirroring module blocked on connecting to the secondary cluster (to set 
> > mirror info xattr). Are you able to connect to the secondary cluster from 
> > the host running ceph-mgr on the primary cluster using its monitor address 
> > (and a key)?
> >
> > The primary and secondary clusters both have the same cluster name "ceph" 
> > and both have a single filesystem by name "cephfs".  How do I check that 
> > connection from primary to secondary using mon addr and key?
> > These two clusters are configured for rgw multisite and it is functional.
> >
> > >
> > >
> > >
> > >
> > >
> > > root@fl31ca104ja0201:/# ceph fs  snapshot mirror peer_bootstrap
> > > import cephfs
> > > eyJmc2lkIjogIjJlYWMwZWEwLTYwNDgtNDQ0Zi04NGIyLThjZWVmZWQyN2E1YiIsICJm
> > > aW
> > > xlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIi
> > > wg
> > > InNpdGVfbmFtZSI6ICJzaGdSLXNpdGUiLCAia2V5IjogIkFRQ0lGdEZrSStTTE5oQUFX
> > > bW
> > > V6MkRKcEg5ZUdyYnhBOWVmZG9BPT0iLCAibW9uX2hvc3QiOiAiW3YyOjEwLjIzOS4xNT
> > > Uu
> > > MTg6MzMwMC8wLHYxOjEwLjIzOS4xNTUuMTg6Njc4OS8wXSBbdjI6MTAuMjM5LjE1NS4x
> > > OT
> > > ozMzAwLzAsdjE6MTAuMjM5LjE1NS4xOTo2Nzg5LzBdIFt2MjoxMC4yMzkuMTU1LjIwOj
> > > Mz MDAvMCx2MToxMC4yMzkuMTU1LjIwOjY3ODkvMF0ifQ==
> > >
> > > ……
> > >
> > > …….
> > >
> > > ..command does not complete..waits here
> > >
> > > ^C  to exit.
> > >
> > > Thereafter some commands do not complete…
> > >
> > > root@fl31ca104ja0201:/# ceph -s
> > >
> > >   cluster:
> > >
> > > id: d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e
> > >
> > > health: HEALTH_OK
> > >
> > >
> > >
> > >   services:
> > >
> > > mon:   3 daemons, quorum 
> > > fl31ca104ja0202,fl31ca104ja0203,fl31ca104ja0201 (age 2d)
> > >
> > > mgr:

[ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

2023-08-07 Thread Adiga, Anantha
Hi Venky,

Is this correct? 
(copied ceph.conf from the secondary cluster to the /etc/ceph/crsite directory in 
the primary cluster, and copied ceph.mon.keyring from the secondary as 
ceph.client.crsite.mon.keyring in /etc/ceph on the primary)
root@fl31ca104ja0201:/etc/ceph# ls
ceph.client.admin.keyring  ceph.client.crsite.admin.keyring  
ceph.client.mirror_remote.keying  crsite  fio-fs.test   fs-mnt   rbdmap
ceph.client.crash.keyring  ceph.client.crsite.mon.keyring  ceph.conf  
   fio-bsd.test  fio-nfs.test  nfs-mnt  remote_ceph.conf
root@fl31ca104ja0201:/etc/ceph# ls crsite
ceph.conf  ceph.mon.keyring

root@fl31ca104ja0201:/etc/ceph/crsite# ceph -c ceph.conf --id=crsite.mon 
--cluster=ceph --verbose
parsed_args: Namespace(admin_socket=None, block=False, cephconf='ceph.conf', 
client_id='crsite.mon', client_name=None, cluster='ceph', cluster_timeout=None, 
completion=False, help=False, input_file=None, output_file=None, 
output_format=None, period=1, setgroup=None, setuser=None, status=False, 
verbose=True, version=False, watch=False, watch_channel=None, 
watch_debug=False, watch_error=False, watch_info=False, watch_sec=False, 
watch_warn=False), childargs: []
^CCluster connection aborted

Not sure if the --id (CLIENT_ID) is correct.. not able to connect

Thank you,
Anantha

-Original Message-
From: Venky Shankar  
Sent: Monday, August 7, 2023 7:05 PM
To: Adiga, Anantha 
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

Hi Anantha,

On Tue, Aug 8, 2023 at 6:29 AM Adiga, Anantha  wrote:
>
> Hi Venky,
>
> The primary and secondary clusters both have the same cluster name "ceph" and 
> both have a single filesystem by name "cephfs".

That's not an issue.

> How do I check the connection from primary to secondary using mon addr and 
> key? What is the command line?

A quick way to check this would be to place the secondary cluster ceph config 
file and the user key on one of the primary nodes (preferably the ceph-mgr 
host, just for tests, so purge these when done) and then run

ceph -c /path/to/secondary/ceph.conf --id <> status

If that runs all fine, then the mirror daemon is probably hitting some bug.

> These two clusters are configured for rgw multisite and it is functional.
>
> Thank you,
> Anantha
>
> -Original Message-
> From: Venky Shankar 
> Sent: Monday, August 7, 2023 5:46 PM
> To: Adiga, Anantha 
> Cc: ceph-users@ceph.io
> Subject: Re: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap 
> import hung
>
> Hi Anantha,
>
> On Mon, Aug 7, 2023 at 11:52 PM Adiga, Anantha  
> wrote:
> >
> > Hi Venky,
> >
> >
> >
> > I tried on another secondary Quincy cluster and it is the same problem. The 
> > peer_bootstrap import command hangs.
>
> A pacific cluster generated peer token should be importable in a quincy 
> source cluster. Looking at the logs, I suspect that the perceived hang is the 
> mirroring module blocked on connecting to the secondary cluster (to set 
> mirror info xattr). Are you able to connect to the secondary cluster from the 
> host running ceph-mgr on the primary cluster using its monitor address (and a 
> key)?
>
> The primary and secondary clusters both have the same cluster name "ceph" and 
> both have a single filesystem by name "cephfs".  How do I check that 
> connection from primary to secondary using mon addr and key?
> These two clusters are configured for rgw multisite and it is functional.
>
> >
> >
> >
> >
> >
> > root@fl31ca104ja0201:/# ceph fs  snapshot mirror peer_bootstrap 
> > import cephfs 
> > eyJmc2lkIjogIjJlYWMwZWEwLTYwNDgtNDQ0Zi04NGIyLThjZWVmZWQyN2E1YiIsICJm
> > aW 
> > xlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIi
> > wg 
> > InNpdGVfbmFtZSI6ICJzaGdSLXNpdGUiLCAia2V5IjogIkFRQ0lGdEZrSStTTE5oQUFX
> > bW 
> > V6MkRKcEg5ZUdyYnhBOWVmZG9BPT0iLCAibW9uX2hvc3QiOiAiW3YyOjEwLjIzOS4xNT
> > Uu 
> > MTg6MzMwMC8wLHYxOjEwLjIzOS4xNTUuMTg6Njc4OS8wXSBbdjI6MTAuMjM5LjE1NS4x
> > OT 
> > ozMzAwLzAsdjE6MTAuMjM5LjE1NS4xOTo2Nzg5LzBdIFt2MjoxMC4yMzkuMTU1LjIwOj
> > Mz MDAvMCx2MToxMC4yMzkuMTU1LjIwOjY3ODkvMF0ifQ==
> >
> > ……
> >
> > …….
> >
> > ..command does not complete..waits here
> >
> > ^C  to exit.
> >
> > Thereafter some commands do not complete…
> >
> > root@fl31ca104ja0201:/# ceph -s
> >
> >   cluster:
> >
> > id: d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e
> >
> > health: HEALTH_OK
> >
> >
> >
> >   services:
> >
> > mon:   3 daemons, quorum 
> > fl31ca104ja0202,fl31ca104ja0203,fl31ca104ja0201 (age 2d)
> >
> > mgr:   fl31ca104ja0201.kkoono(active, since 3d), standbys: 
> > fl31ca104ja0202, fl31ca104ja0203
> >
> > mds:   1/1 daemons up, 2 standby
> >
> > osd:   44 osds: 44 up (since 2d), 44 in (since 5w)
> >
> > cephfs-mirror: 1 daemon active (1 hosts)
> >
> > rgw:   3 daemons active (3 hosts, 1 zones)
> >
> >
> >
> >   data:
> >
> > volumes: 1/1 healthy
> >
> > pools:   25 pools, 769 pgs
> >
> > 

[ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

2023-08-07 Thread Venky Shankar
Hi Anantha,

On Tue, Aug 8, 2023 at 6:29 AM Adiga, Anantha  wrote:
>
> Hi Venky,
>
> The primary and secondary clusters both have the same cluster name "ceph" and 
> both have a single filesystem by name "cephfs".

That's not an issue.

> How do I check the connection from primary to secondary using mon addr and 
> key? What is the command line?

A quick way to check this would be to place the secondary cluster ceph
config file and the user key on one of the primary nodes (preferably
the ceph-mgr host, just for tests, so purge these when done) and then
run

ceph -c /path/to/secondary/ceph.conf --id <> status

If that runs all fine, then the mirror daemon is probably hitting some bug.
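
For example, with the files named as in this thread, the check would look
something like the following (a sketch only; the paths, the id and the
explicit --keyring are assumptions based on the file names shown earlier,
not something verified against this setup):

    ceph -c /etc/ceph/remote_ceph.conf \
         --keyring /etc/ceph/remote_ceph.client.mirror_remote.keyring \
         --id mirror_remote -s

The explicit --keyring may matter here because that keyring's file name does
not follow the default $cluster.client.$id.keyring pattern, so the client may
not find it on its own.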

> These two clusters are configured for rgw multisite and it is functional.
>
> Thank you,
> Anantha
>
> -Original Message-
> From: Venky Shankar 
> Sent: Monday, August 7, 2023 5:46 PM
> To: Adiga, Anantha 
> Cc: ceph-users@ceph.io
> Subject: Re: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap import 
> hung
>
> Hi Anantha,
>
> On Mon, Aug 7, 2023 at 11:52 PM Adiga, Anantha  
> wrote:
> >
> > Hi Venky,
> >
> >
> >
> > I tried on another secondary Quincy cluster and it is the same problem. The 
> > peer_bootstrap import command hangs.
>
> A pacific cluster generated peer token should be importable in a quincy 
> source cluster. Looking at the logs, I suspect that the perceived hang is the 
> mirroring module blocked on connecting to the secondary cluster (to set 
> mirror info xattr). Are you able to connect to the secondary cluster from the 
> host running ceph-mgr on the primary cluster using its monitor address (and a 
> key)?
>
> The primary and secondary clusters both have the same cluster name "ceph" and 
> both have a single filesystem by name "cephfs".  How do I check that 
> connection from primary to secondary using mon addr and key?
> These two clusters are configured for rgw multisite and it is functional.
>
> >
> >
> >
> >
> >
> > root@fl31ca104ja0201:/# ceph fs  snapshot mirror peer_bootstrap import
> > cephfs
> > eyJmc2lkIjogIjJlYWMwZWEwLTYwNDgtNDQ0Zi04NGIyLThjZWVmZWQyN2E1YiIsICJmaW
> > xlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiwg
> > InNpdGVfbmFtZSI6ICJzaGdSLXNpdGUiLCAia2V5IjogIkFRQ0lGdEZrSStTTE5oQUFXbW
> > V6MkRKcEg5ZUdyYnhBOWVmZG9BPT0iLCAibW9uX2hvc3QiOiAiW3YyOjEwLjIzOS4xNTUu
> > MTg6MzMwMC8wLHYxOjEwLjIzOS4xNTUuMTg6Njc4OS8wXSBbdjI6MTAuMjM5LjE1NS4xOT
> > ozMzAwLzAsdjE6MTAuMjM5LjE1NS4xOTo2Nzg5LzBdIFt2MjoxMC4yMzkuMTU1LjIwOjMz
> > MDAvMCx2MToxMC4yMzkuMTU1LjIwOjY3ODkvMF0ifQ==
> >
> > ……
> >
> > …….
> >
> > ..command does not complete..waits here
> >
> > ^C  to exit.
> >
> > Thereafter some commands do not complete…
> >
> > root@fl31ca104ja0201:/# ceph -s
> >
> >   cluster:
> >
> > id: d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e
> >
> > health: HEALTH_OK
> >
> >
> >
> >   services:
> >
> > mon:   3 daemons, quorum 
> > fl31ca104ja0202,fl31ca104ja0203,fl31ca104ja0201 (age 2d)
> >
> > mgr:   fl31ca104ja0201.kkoono(active, since 3d), standbys: 
> > fl31ca104ja0202, fl31ca104ja0203
> >
> > mds:   1/1 daemons up, 2 standby
> >
> > osd:   44 osds: 44 up (since 2d), 44 in (since 5w)
> >
> > cephfs-mirror: 1 daemon active (1 hosts)
> >
> > rgw:   3 daemons active (3 hosts, 1 zones)
> >
> >
> >
> >   data:
> >
> > volumes: 1/1 healthy
> >
> > pools:   25 pools, 769 pgs
> >
> > objects: 614.40k objects, 1.9 TiB
> >
> > usage:   2.9 TiB used, 292 TiB / 295 TiB avail
> >
> > pgs: 769 active+clean
> >
> >
> >
> >   io:
> >
> > client:   32 KiB/s rd, 0 B/s wr, 33 op/s rd, 1 op/s wr
> >
> >
> >
> > root@fl31ca104ja0201:/#
> >
> > root@fl31ca104ja0201:/# ceph fs status cephfs
> >
> > This command also waits. ……
> >
> >
> >
> > I have attached the mgr log
> >
> > root@fl31ca104ja0201:/# ceph service status
> >
> > {
> >
> > "cephfs-mirror": {
> >
> > "5306346": {
> >
> > "status_stamp": "2023-08-07T17:35:56.884907+",
> >
> > "last_beacon": "2023-08-07T17:45:01.903540+",
> >
> > "status": {
> >
> > "status_json": 
> > "{\"1\":{\"name\":\"cephfs\",\"directory_count\":0,\"peers\":{}}}"
> >
> > }
> >
> > }
> >
> >
> >
> > Quincy secondary cluster
> >
> >
> >
> > root@a001s008-zz14l47008:/# ceph mgr module enable mirroring
> >
> > root@a001s008-zz14l47008:/# ceph fs authorize cephfs
> > client.mirror_remote / rwps
> >
> > [client.mirror_remote]
> >
> > key = AQCIFtFkI+SLNhAAWmez2DJpH9eGrbxA9efdoA==
> >
> > root@a001s008-zz14l47008:/# ceph auth get client.mirror_remote
> >
> > [client.mirror_remote]
> >
> > key = AQCIFtFkI+SLNhAAWmez2DJpH9eGrbxA9efdoA==
> >
> > caps mds = "allow rwps fsname=cephfs"
> >
> > caps mon = "allow r fsname=cephfs"
> >
> > caps osd = "allow rw tag cephfs data=cephfs"
> >
> > root@a001s008-zz14l47008:/#
> >
> > root@a001s008-zz14l47008:/# ceph 

[ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

2023-08-07 Thread Adiga, Anantha
Hi Venky, 

The primary and secondary clusters both have the same cluster name "ceph" and 
both have a single filesystem by name "cephfs".  How do I check the connection 
from primary to secondary using mon addr and key? What is the command line?
These two clusters are configured for rgw multisite and it is functional.  

Thank you,
Anantha

-Original Message-
From: Venky Shankar  
Sent: Monday, August 7, 2023 5:46 PM
To: Adiga, Anantha 
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

Hi Anantha,

On Mon, Aug 7, 2023 at 11:52 PM Adiga, Anantha  wrote:
>
> Hi Venky,
>
>
>
> I tried on another secondary Quincy cluster and it is the same problem. The 
> peer_bootstrap import command hangs.

A pacific cluster generated peer token should be importable in a quincy source 
cluster. Looking at the logs, I suspect that the perceived hang is the 
mirroring module blocked on connecting to the secondary cluster (to set mirror 
info xattr). Are you able to connect to the secondary cluster from the host 
running ceph-mgr on the primary cluster using its monitor address (and a key)?

The primary and secondary clusters both have the same cluster name "ceph" and 
both have a single filesystem by name "cephfs".  How do I check that connection 
from primary to secondary using mon addr and key?   
These two clusters are configured for rgw multisite and it is functional.  

>
>
>
>
>
> root@fl31ca104ja0201:/# ceph fs  snapshot mirror peer_bootstrap import 
> cephfs 
> eyJmc2lkIjogIjJlYWMwZWEwLTYwNDgtNDQ0Zi04NGIyLThjZWVmZWQyN2E1YiIsICJmaW
> xlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiwg
> InNpdGVfbmFtZSI6ICJzaGdSLXNpdGUiLCAia2V5IjogIkFRQ0lGdEZrSStTTE5oQUFXbW
> V6MkRKcEg5ZUdyYnhBOWVmZG9BPT0iLCAibW9uX2hvc3QiOiAiW3YyOjEwLjIzOS4xNTUu
> MTg6MzMwMC8wLHYxOjEwLjIzOS4xNTUuMTg6Njc4OS8wXSBbdjI6MTAuMjM5LjE1NS4xOT
> ozMzAwLzAsdjE6MTAuMjM5LjE1NS4xOTo2Nzg5LzBdIFt2MjoxMC4yMzkuMTU1LjIwOjMz
> MDAvMCx2MToxMC4yMzkuMTU1LjIwOjY3ODkvMF0ifQ==
>
> ……
>
> …….
>
> ..command does not complete..waits here
>
> ^C  to exit.
>
> Thereafter some commands do not complete…
>
> root@fl31ca104ja0201:/# ceph -s
>
>   cluster:
>
> id: d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e
>
> health: HEALTH_OK
>
>
>
>   services:
>
> mon:   3 daemons, quorum 
> fl31ca104ja0202,fl31ca104ja0203,fl31ca104ja0201 (age 2d)
>
> mgr:   fl31ca104ja0201.kkoono(active, since 3d), standbys: 
> fl31ca104ja0202, fl31ca104ja0203
>
> mds:   1/1 daemons up, 2 standby
>
> osd:   44 osds: 44 up (since 2d), 44 in (since 5w)
>
> cephfs-mirror: 1 daemon active (1 hosts)
>
> rgw:   3 daemons active (3 hosts, 1 zones)
>
>
>
>   data:
>
> volumes: 1/1 healthy
>
> pools:   25 pools, 769 pgs
>
> objects: 614.40k objects, 1.9 TiB
>
> usage:   2.9 TiB used, 292 TiB / 295 TiB avail
>
> pgs: 769 active+clean
>
>
>
>   io:
>
> client:   32 KiB/s rd, 0 B/s wr, 33 op/s rd, 1 op/s wr
>
>
>
> root@fl31ca104ja0201:/#
>
> root@fl31ca104ja0201:/# ceph fs status cephfs
>
> This command also waits. ……
>
>
>
> I have attached the mgr log
>
> root@fl31ca104ja0201:/# ceph service status
>
> {
>
> "cephfs-mirror": {
>
> "5306346": {
>
> "status_stamp": "2023-08-07T17:35:56.884907+",
>
> "last_beacon": "2023-08-07T17:45:01.903540+",
>
> "status": {
>
> "status_json": 
> "{\"1\":{\"name\":\"cephfs\",\"directory_count\":0,\"peers\":{}}}"
>
> }
>
> }
>
>
>
> Quincy secondary cluster
>
>
>
> root@a001s008-zz14l47008:/# ceph mgr module enable mirroring
>
> root@a001s008-zz14l47008:/# ceph fs authorize cephfs 
> client.mirror_remote / rwps
>
> [client.mirror_remote]
>
> key = AQCIFtFkI+SLNhAAWmez2DJpH9eGrbxA9efdoA==
>
> root@a001s008-zz14l47008:/# ceph auth get client.mirror_remote
>
> [client.mirror_remote]
>
> key = AQCIFtFkI+SLNhAAWmez2DJpH9eGrbxA9efdoA==
>
> caps mds = "allow rwps fsname=cephfs"
>
> caps mon = "allow r fsname=cephfs"
>
> caps osd = "allow rw tag cephfs data=cephfs"
>
> root@a001s008-zz14l47008:/#
>
> root@a001s008-zz14l47008:/# ceph fs snapshot mirror peer_bootstrap 
> create cephfs client.mirror_remote shgR-site
>
> {"token": 
> "eyJmc2lkIjogIjJlYWMwZWEwLTYwNDgtNDQ0Zi04NGIyLThjZWVmZWQyN2E1YiIsICJma
> Wxlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiw
> gInNpdGVfbmFtZSI6ICJzaGdSLXNpdGUiLCAia2V5IjogIkFRQ0lGdEZrSStTTE5oQUFXb
> WV6MkRKcEg5ZUdyYnhBOWVmZG9BPT0iLCAibW9uX2hvc3QiOiAiW3YyOjEwLjIzOS4xNTU
> uMTg6MzMwMC8wLHYxOjEwLjIzOS4xNTUuMTg6Njc4OS8wXSBbdjI6MTAuMjM5LjE1NS4xO
> TozMzAwLzAsdjE6MTAuMjM5LjE1NS4xOTo2Nzg5LzBdIFt2MjoxMC4yMzkuMTU1LjIwOjM
> zMDAvMCx2MToxMC4yMzkuMTU1LjIwOjY3ODkvMF0ifQ=="}
>
> root@a001s008-zz14l47008:/#
>
>
>
> Thank you,
>
> Anantha
>
>
>
> From: Adiga, Anantha
> Sent: Friday, August 4, 2023 11:55 AM
> To: Venky Shankar ; ceph-users@ceph.io
> Subject: RE: [ceph-users] 

[ceph-users] Re: MDS stuck in rejoin

2023-08-07 Thread Xiubo Li


On 8/7/23 21:54, Frank Schilder wrote:

Dear Xiubo,

I managed to collect some information. It looks like there is nothing in the 
dmesg log around the time the client failed to advance its TID. I collected 
short snippets around the critical time below. I have full logs in case you are 
interested. They are large files, so I will need to do an upload for that.

I also have a dump of "mds session ls" output for clients that showed the same 
issue later. Unfortunately, no consistent log information for a single incident.

Here the summary, please let me know if uploading the full package makes sense:

- Status:

On July 29, 2023

ceph status/df/pool stats/health detail at 01:05:14:
   cluster:
 health: HEALTH_WARN
 1 pools nearfull

ceph status/df/pool stats/health detail at 01:05:28:
   cluster:
 health: HEALTH_WARN
 1 clients failing to advance oldest client/flush tid
 1 pools nearfull


Okay, then this could be the root cause.

If the pool is nearfull, it could block flushing the journal logs to the pool, 
and then the MDS couldn't safely reply to the requests, which would block them 
like this.


Could you fix the pool nearfull issue first and then check whether you see 
it again?
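
For reference, a few standard commands to inspect the situation (the ratio
value below is only an example; raising it is a stopgap, and the real fix is
freeing or adding capacity):

    ceph df                              # per-pool usage, shows which pool is close to full
    ceph osd dump | grep full_ratio      # current full / backfillfull / nearfull ratios
    ceph osd set-nearfull-ratio 0.90     # example only: temporarily relax the nearfull threshold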




[...]

On July 31, 2023

ceph status/df/pool stats/health detail at 10:36:16:
   cluster:
 health: HEALTH_WARN
 1 clients failing to advance oldest client/flush tid
 1 pools nearfull

   cluster:
 health: HEALTH_WARN
 1 pools nearfull

- client evict command (date, time, command):

2023-07-31 10:36  ceph tell mds.ceph-11 client evict id=145678457

We have a 1h time difference between the date stamp of the command and the 
dmesg date stamps. However, there seems to be a weird 10min delay from issuing 
the evict command until it shows up in dmesg on the client.

- dmesg:

[Fri Jul 28 12:59:14 2023] beegfs: enabling unsafe global rkey
[Fri Jul 28 12:59:14 2023] beegfs: enabling unsafe global rkey
[Fri Jul 28 12:59:14 2023] beegfs: enabling unsafe global rkey
[Fri Jul 28 12:59:14 2023] beegfs: enabling unsafe global rkey
[Fri Jul 28 12:59:14 2023] beegfs: enabling unsafe global rkey
[Fri Jul 28 12:59:14 2023] beegfs: enabling unsafe global rkey
[Fri Jul 28 16:07:47 2023] slurm.epilog.cl (24175): drop_caches: 3
[Sat Jul 29 18:21:30 2023] libceph: mds2 192.168.32.75:6801 socket closed (con 
state OPEN)
[Sat Jul 29 18:21:30 2023] libceph: mds2 192.168.32.75:6801 socket closed (con 
state OPEN)
[Sat Jul 29 18:21:30 2023] libceph: mds2 192.168.32.75:6801 socket closed (con 
state OPEN)
[Sat Jul 29 18:21:42 2023] ceph: mds2 reconnect start
[Sat Jul 29 18:21:42 2023] ceph: mds2 reconnect start
[Sat Jul 29 18:21:43 2023] ceph: mds2 reconnect start
[Sat Jul 29 18:21:43 2023] ceph: mds2 reconnect success
[Sat Jul 29 18:21:43 2023] ceph: mds2 reconnect success
[Sat Jul 29 18:21:43 2023] ceph: mds2 reconnect success
[Sat Jul 29 18:26:39 2023] ceph: mds2 reconnect start
[Sat Jul 29 18:26:39 2023] ceph: mds2 reconnect start
[Sat Jul 29 18:26:39 2023] ceph: mds2 reconnect start
[Sat Jul 29 18:26:40 2023] ceph: mds2 reconnect success
[Sat Jul 29 18:26:40 2023] ceph: mds2 reconnect success
[Sat Jul 29 18:26:40 2023] ceph: mds2 reconnect success
[Sat Jul 29 18:26:49 2023] ceph: update_snap_trace error -22


This is a known bug and we have fixed it on both the kclient and ceph sides:

https://tracker.ceph.com/issues/61200

https://tracker.ceph.com/issues/61217

Thanks

- Xiubo



[Sat Jul 29 18:26:49 2023] ceph: mds2 recovery completed
[Sat Jul 29 18:26:49 2023] ceph: mds2 recovery completed
[Sat Jul 29 18:26:49 2023] ceph: mds2 recovery completed
[Sun Jul 30 16:37:55 2023] slurm.epilog.cl (43668): drop_caches: 3
[Mon Jul 31 01:00:20 2023] slurm.epilog.cl (73347): drop_caches: 3
[Mon Jul 31 09:46:41 2023] libceph: mds0 192.168.32.81:6801 socket closed (con 
state OPEN)
[Mon Jul 31 09:46:41 2023] libceph: mds3 192.168.32.87:6801 socket closed (con 
state OPEN)
[Mon Jul 31 09:46:41 2023] libceph: mds7 192.168.32.88:6801 socket closed (con 
state OPEN)
[Mon Jul 31 09:46:41 2023] libceph: mds5 192.168.32.78:6801 socket closed (con 
state OPEN)
[Mon Jul 31 09:46:41 2023] libceph: mds4 192.168.32.73:6801 socket closed (con 
state OPEN)
[Mon Jul 31 09:46:41 2023] libceph: mds1 192.168.32.80:6801 socket closed (con 
state OPEN)
[Mon Jul 31 09:46:41 2023] libceph: mds2 192.168.32.75:6801 socket closed (con 
state OPEN)
[Mon Jul 31 09:46:41 2023] libceph: mds3 192.168.32.87:6801 connection reset
[Mon Jul 31 09:46:41 2023] libceph: reset on mds3
[Mon Jul 31 09:46:41 2023] ceph: mds3 closed our session
[Mon Jul 31 09:46:41 2023] ceph: mds3 reconnect start
[Mon Jul 31 09:46:41 2023] libceph: mds7 192.168.32.88:6801 connection reset
[Mon Jul 31 09:46:41 2023] libceph: reset on mds7
[Mon Jul 31 09:46:41 2023] ceph: mds7 closed our session
[Mon Jul 31 09:46:41 2023] ceph: mds7 reconnect start
[Mon Jul 31 09:46:41 2023] libceph: mds2 192.168.32.75:6801 connection reset
[Mon Jul 31 09:46:41 2023] 

[ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

2023-08-07 Thread Venky Shankar
Hi Anantha,

On Mon, Aug 7, 2023 at 11:52 PM Adiga, Anantha  wrote:
>
> Hi Venky,
>
>
>
> I tried on another secondary Quincy cluster and it is the same problem. The 
> peer_bootstrap import command hangs.

A pacific cluster generated peer token should be importable in a
quincy source cluster. Looking at the logs, I suspect that the
perceived hang is the mirroring module blocked on connecting to the
secondary cluster (to set mirror info xattr). Are you able to connect
to the secondary cluster from the host running ceph-mgr on the primary
cluster using its monitor address (and a key)?

>
>
>
>
>
> root@fl31ca104ja0201:/# ceph fs  snapshot mirror peer_bootstrap import cephfs 
> eyJmc2lkIjogIjJlYWMwZWEwLTYwNDgtNDQ0Zi04NGIyLThjZWVmZWQyN2E1YiIsICJmaWxlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiwgInNpdGVfbmFtZSI6ICJzaGdSLXNpdGUiLCAia2V5IjogIkFRQ0lGdEZrSStTTE5oQUFXbWV6MkRKcEg5ZUdyYnhBOWVmZG9BPT0iLCAibW9uX2hvc3QiOiAiW3YyOjEwLjIzOS4xNTUuMTg6MzMwMC8wLHYxOjEwLjIzOS4xNTUuMTg6Njc4OS8wXSBbdjI6MTAuMjM5LjE1NS4xOTozMzAwLzAsdjE6MTAuMjM5LjE1NS4xOTo2Nzg5LzBdIFt2MjoxMC4yMzkuMTU1LjIwOjMzMDAvMCx2MToxMC4yMzkuMTU1LjIwOjY3ODkvMF0ifQ==
>
> ……
>
> …….
>
> ..command does not complete..waits here
>
> ^C  to exit.
>
> Thereafter some commands do not complete…
>
> root@fl31ca104ja0201:/# ceph -s
>
>   cluster:
>
> id: d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e
>
> health: HEALTH_OK
>
>
>
>   services:
>
> mon:   3 daemons, quorum 
> fl31ca104ja0202,fl31ca104ja0203,fl31ca104ja0201 (age 2d)
>
> mgr:   fl31ca104ja0201.kkoono(active, since 3d), standbys: 
> fl31ca104ja0202, fl31ca104ja0203
>
> mds:   1/1 daemons up, 2 standby
>
> osd:   44 osds: 44 up (since 2d), 44 in (since 5w)
>
> cephfs-mirror: 1 daemon active (1 hosts)
>
> rgw:   3 daemons active (3 hosts, 1 zones)
>
>
>
>   data:
>
> volumes: 1/1 healthy
>
> pools:   25 pools, 769 pgs
>
> objects: 614.40k objects, 1.9 TiB
>
> usage:   2.9 TiB used, 292 TiB / 295 TiB avail
>
> pgs: 769 active+clean
>
>
>
>   io:
>
> client:   32 KiB/s rd, 0 B/s wr, 33 op/s rd, 1 op/s wr
>
>
>
> root@fl31ca104ja0201:/#
>
> root@fl31ca104ja0201:/# ceph fs status cephfs
>
> This command also waits. ……
>
>
>
> I have attached the mgr log
>
> root@fl31ca104ja0201:/# ceph service status
>
> {
>
> "cephfs-mirror": {
>
> "5306346": {
>
> "status_stamp": "2023-08-07T17:35:56.884907+",
>
> "last_beacon": "2023-08-07T17:45:01.903540+",
>
> "status": {
>
> "status_json": 
> "{\"1\":{\"name\":\"cephfs\",\"directory_count\":0,\"peers\":{}}}"
>
> }
>
> }
>
>
>
> Quincy secondary cluster
>
>
>
> root@a001s008-zz14l47008:/# ceph mgr module enable mirroring
>
> root@a001s008-zz14l47008:/# ceph fs authorize cephfs client.mirror_remote / 
> rwps
>
> [client.mirror_remote]
>
> key = AQCIFtFkI+SLNhAAWmez2DJpH9eGrbxA9efdoA==
>
> root@a001s008-zz14l47008:/# ceph auth get client.mirror_remote
>
> [client.mirror_remote]
>
> key = AQCIFtFkI+SLNhAAWmez2DJpH9eGrbxA9efdoA==
>
> caps mds = "allow rwps fsname=cephfs"
>
> caps mon = "allow r fsname=cephfs"
>
> caps osd = "allow rw tag cephfs data=cephfs"
>
> root@a001s008-zz14l47008:/#
>
> root@a001s008-zz14l47008:/# ceph fs snapshot mirror peer_bootstrap create 
> cephfs client.mirror_remote shgR-site
>
> {"token": 
> "eyJmc2lkIjogIjJlYWMwZWEwLTYwNDgtNDQ0Zi04NGIyLThjZWVmZWQyN2E1YiIsICJmaWxlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiwgInNpdGVfbmFtZSI6ICJzaGdSLXNpdGUiLCAia2V5IjogIkFRQ0lGdEZrSStTTE5oQUFXbWV6MkRKcEg5ZUdyYnhBOWVmZG9BPT0iLCAibW9uX2hvc3QiOiAiW3YyOjEwLjIzOS4xNTUuMTg6MzMwMC8wLHYxOjEwLjIzOS4xNTUuMTg6Njc4OS8wXSBbdjI6MTAuMjM5LjE1NS4xOTozMzAwLzAsdjE6MTAuMjM5LjE1NS4xOTo2Nzg5LzBdIFt2MjoxMC4yMzkuMTU1LjIwOjMzMDAvMCx2MToxMC4yMzkuMTU1LjIwOjY3ODkvMF0ifQ=="}
>
> root@a001s008-zz14l47008:/#
>
>
>
> Thank you,
>
> Anantha
>
>
>
> From: Adiga, Anantha
> Sent: Friday, August 4, 2023 11:55 AM
> To: Venky Shankar ; ceph-users@ceph.io
> Subject: RE: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap import 
> hung
>
>
>
> Hi Venky,
>
>
>
> Thank you so much for the guidance. Attached is the mgr log.
>
>
>
> Note: the 4th node in the primary cluster has smaller-capacity drives; the 
> other 3 nodes have larger-capacity drives.
>
> 32ssd6.98630   1.0  7.0 TiB   44 GiB   44 GiB   183 KiB  148 MiB  
> 6.9 TiB  0.62  0.64   40  up  osd.32
>
> -7  76.84927 -   77 TiB  652 GiB  648 GiB20 MiB  3.0 GiB  
>  76 TiB  0.83  0.86-  host fl31ca104ja0203
>
>   1ssd6.98630   1.0  7.0 TiB   73 GiB   73 GiB   8.0 MiB  333 MiB 
>  6.9 TiB  1.02  1.06   54  up  osd.1
>
>   4ssd6.98630   1.0  7.0 TiB   77 GiB   77 GiB   1.1 MiB  174 MiB 
>  6.9 TiB  1.07  1.11   55  up  osd.4
>
>   7ssd6.98630   1.0  7.0 TiB   47 GiB  

[ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

2023-08-07 Thread Adiga, Anantha
Hi Venky,

Thank you very much.

Anantha

-Original Message-
From: Venky Shankar  
Sent: Monday, August 7, 2023 5:23 PM
To: Adiga, Anantha 
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

Hi Anantha,

On Tue, Aug 8, 2023 at 1:59 AM Adiga, Anantha  wrote:
>
> Hi Venky,
>
>
>
> Could this be the reason that the peer-bootstrap import is hanging? How do I 
> upgrade cephfs-mirror to Quincy?

I was on leave yesterday -- will have a look at the log and update.

>
> root@fl31ca104ja0201:/# cephfs-mirror --version
>
> ceph version 16.2.13 (5378749ba6be3a0868b51803968ee9cde4833a3e) pacific 
> (stable)
>
> root@fl31ca104ja0201:/# ceph version
>
> ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)
>
> root@fl31ca104ja0201:/#
>
>
>
>
>
> Thank you,
>
> Anantha
>
> From: Adiga, Anantha
> Sent: Monday, August 7, 2023 11:21 AM
> To: 'Venky Shankar' ; 'ceph-users@ceph.io' 
> 
> Subject: RE: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap import 
> hung
>
>
>
> Hi Venky,
>
>
>
> I tried on another secondary Quincy cluster and it is the same problem. The 
> peer_bootstrap import command hangs.
>
>
>
>
>
> root@fl31ca104ja0201:/# ceph fs  snapshot mirror peer_bootstrap import cephfs 
> eyJmc2lkIjogIjJlYWMwZWEwLTYwNDgtNDQ0Zi04NGIyLThjZWVmZWQyN2E1YiIsICJmaWxlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiwgInNpdGVfbmFtZSI6ICJzaGdSLXNpdGUiLCAia2V5IjogIkFRQ0lGdEZrSStTTE5oQUFXbWV6MkRKcEg5ZUdyYnhBOWVmZG9BPT0iLCAibW9uX2hvc3QiOiAiW3YyOjEwLjIzOS4xNTUuMTg6MzMwMC8wLHYxOjEwLjIzOS4xNTUuMTg6Njc4OS8wXSBbdjI6MTAuMjM5LjE1NS4xOTozMzAwLzAsdjE6MTAuMjM5LjE1NS4xOTo2Nzg5LzBdIFt2MjoxMC4yMzkuMTU1LjIwOjMzMDAvMCx2MToxMC4yMzkuMTU1LjIwOjY3ODkvMF0ifQ==
>
> ……
>
> …….
>
> ..command does not complete..waits here
>
> ^C  to exit.
>
> Thereafter some commands do not complete…
>
> root@fl31ca104ja0201:/# ceph -s
>
>   cluster:
>
> id: d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e
>
> health: HEALTH_OK
>
>
>
>   services:
>
> mon:   3 daemons, quorum 
> fl31ca104ja0202,fl31ca104ja0203,fl31ca104ja0201 (age 2d)
>
> mgr:   fl31ca104ja0201.kkoono(active, since 3d), standbys: 
> fl31ca104ja0202, fl31ca104ja0203
>
> mds:   1/1 daemons up, 2 standby
>
> osd:   44 osds: 44 up (since 2d), 44 in (since 5w)
>
> cephfs-mirror: 1 daemon active (1 hosts)
>
> rgw:   3 daemons active (3 hosts, 1 zones)
>
>
>
>   data:
>
> volumes: 1/1 healthy
>
> pools:   25 pools, 769 pgs
>
> objects: 614.40k objects, 1.9 TiB
>
> usage:   2.9 TiB used, 292 TiB / 295 TiB avail
>
> pgs: 769 active+clean
>
>
>
>   io:
>
> client:   32 KiB/s rd, 0 B/s wr, 33 op/s rd, 1 op/s wr
>
>
>
> root@fl31ca104ja0201:/#
>
> root@fl31ca104ja0201:/# ceph fs status cephfs
>
> This command also waits. ……
>
>
>
> I have attached the mgr log
>
> root@fl31ca104ja0201:/# ceph service status
>
> {
>
> "cephfs-mirror": {
>
> "5306346": {
>
> "status_stamp": "2023-08-07T17:35:56.884907+",
>
> "last_beacon": "2023-08-07T17:45:01.903540+",
>
> "status": {
>
> "status_json": 
> "{\"1\":{\"name\":\"cephfs\",\"directory_count\":0,\"peers\":{}}}"
>
> }
>
> }
>
>
>
> Quincy secondary cluster
>
>
>
> root@a001s008-zz14l47008:/# ceph mgr module enable mirroring
>
> root@a001s008-zz14l47008:/# ceph fs authorize cephfs client.mirror_remote / 
> rwps
>
> [client.mirror_remote]
>
> key = AQCIFtFkI+SLNhAAWmez2DJpH9eGrbxA9efdoA==
>
> root@a001s008-zz14l47008:/# ceph auth get client.mirror_remote
>
> [client.mirror_remote]
>
> key = AQCIFtFkI+SLNhAAWmez2DJpH9eGrbxA9efdoA==
>
> caps mds = "allow rwps fsname=cephfs"
>
> caps mon = "allow r fsname=cephfs"
>
> caps osd = "allow rw tag cephfs data=cephfs"
>
> root@a001s008-zz14l47008:/#
>
> root@a001s008-zz14l47008:/# ceph fs snapshot mirror peer_bootstrap create 
> cephfs client.mirror_remote shgR-site
>
> {"token": 
> "eyJmc2lkIjogIjJlYWMwZWEwLTYwNDgtNDQ0Zi04NGIyLThjZWVmZWQyN2E1YiIsICJmaWxlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiwgInNpdGVfbmFtZSI6ICJzaGdSLXNpdGUiLCAia2V5IjogIkFRQ0lGdEZrSStTTE5oQUFXbWV6MkRKcEg5ZUdyYnhBOWVmZG9BPT0iLCAibW9uX2hvc3QiOiAiW3YyOjEwLjIzOS4xNTUuMTg6MzMwMC8wLHYxOjEwLjIzOS4xNTUuMTg6Njc4OS8wXSBbdjI6MTAuMjM5LjE1NS4xOTozMzAwLzAsdjE6MTAuMjM5LjE1NS4xOTo2Nzg5LzBdIFt2MjoxMC4yMzkuMTU1LjIwOjMzMDAvMCx2MToxMC4yMzkuMTU1LjIwOjY3ODkvMF0ifQ=="}
>
> root@a001s008-zz14l47008:/#
>
>
>
> Thank you,
>
> Anantha
>
>
>
> From: Adiga, Anantha
> Sent: Friday, August 4, 2023 11:55 AM
> To: Venky Shankar ; ceph-users@ceph.io
> Subject: RE: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap import 
> hung
>
>
>
> Hi Venky,
>
>
>
> Thank you so much for the guidance. Attached is the mgr log.
>
>
>
> Note: the 4th node in the primary cluster has smaller-capacity drives; the 
> other 3 nodes have larger-capacity drives.

[ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

2023-08-07 Thread Venky Shankar
Hi Anantha,

On Tue, Aug 8, 2023 at 1:59 AM Adiga, Anantha  wrote:
>
> Hi Venky,
>
>
>
> Could this be the reason that the peer-bootstrap import is hanging? How do I 
> upgrade cephfs-mirror to Quincy?

I was on leave yesterday -- will have a look at the log and update.

>
> root@fl31ca104ja0201:/# cephfs-mirror --version
>
> ceph version 16.2.13 (5378749ba6be3a0868b51803968ee9cde4833a3e) pacific 
> (stable)
>
> root@fl31ca104ja0201:/# ceph version
>
> ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)
>
> root@fl31ca104ja0201:/#
>
>
>
>
>
> Thank you,
>
> Anantha
>
> From: Adiga, Anantha
> Sent: Monday, August 7, 2023 11:21 AM
> To: 'Venky Shankar' ; 'ceph-users@ceph.io' 
> 
> Subject: RE: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap import 
> hung
>
>
>
> Hi Venky,
>
>
>
> I tried on another secondary Quincy cluster and it is the same problem. The 
> peer_bootstrap import command hangs.
>
>
>
>
>
> root@fl31ca104ja0201:/# ceph fs  snapshot mirror peer_bootstrap import cephfs 
> eyJmc2lkIjogIjJlYWMwZWEwLTYwNDgtNDQ0Zi04NGIyLThjZWVmZWQyN2E1YiIsICJmaWxlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiwgInNpdGVfbmFtZSI6ICJzaGdSLXNpdGUiLCAia2V5IjogIkFRQ0lGdEZrSStTTE5oQUFXbWV6MkRKcEg5ZUdyYnhBOWVmZG9BPT0iLCAibW9uX2hvc3QiOiAiW3YyOjEwLjIzOS4xNTUuMTg6MzMwMC8wLHYxOjEwLjIzOS4xNTUuMTg6Njc4OS8wXSBbdjI6MTAuMjM5LjE1NS4xOTozMzAwLzAsdjE6MTAuMjM5LjE1NS4xOTo2Nzg5LzBdIFt2MjoxMC4yMzkuMTU1LjIwOjMzMDAvMCx2MToxMC4yMzkuMTU1LjIwOjY3ODkvMF0ifQ==
>
> ……
>
> …….
>
> ..command does not complete..waits here
>
> ^C  to exit.
>
> Thereafter some commands do not complete…
>
> root@fl31ca104ja0201:/# ceph -s
>
>   cluster:
>
> id: d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e
>
> health: HEALTH_OK
>
>
>
>   services:
>
> mon:   3 daemons, quorum 
> fl31ca104ja0202,fl31ca104ja0203,fl31ca104ja0201 (age 2d)
>
> mgr:   fl31ca104ja0201.kkoono(active, since 3d), standbys: 
> fl31ca104ja0202, fl31ca104ja0203
>
> mds:   1/1 daemons up, 2 standby
>
> osd:   44 osds: 44 up (since 2d), 44 in (since 5w)
>
> cephfs-mirror: 1 daemon active (1 hosts)
>
> rgw:   3 daemons active (3 hosts, 1 zones)
>
>
>
>   data:
>
> volumes: 1/1 healthy
>
> pools:   25 pools, 769 pgs
>
> objects: 614.40k objects, 1.9 TiB
>
> usage:   2.9 TiB used, 292 TiB / 295 TiB avail
>
> pgs: 769 active+clean
>
>
>
>   io:
>
> client:   32 KiB/s rd, 0 B/s wr, 33 op/s rd, 1 op/s wr
>
>
>
> root@fl31ca104ja0201:/#
>
> root@fl31ca104ja0201:/# ceph fs status cephfs
>
> This command also waits. ……
>
>
>
> I have attached the mgr log
>
> root@fl31ca104ja0201:/# ceph service status
>
> {
>
> "cephfs-mirror": {
>
> "5306346": {
>
> "status_stamp": "2023-08-07T17:35:56.884907+",
>
> "last_beacon": "2023-08-07T17:45:01.903540+",
>
> "status": {
>
> "status_json": 
> "{\"1\":{\"name\":\"cephfs\",\"directory_count\":0,\"peers\":{}}}"
>
> }
>
> }
>
>
>
> Quincy secondary cluster
>
>
>
> root@a001s008-zz14l47008:/# ceph mgr module enable mirroring
>
> root@a001s008-zz14l47008:/# ceph fs authorize cephfs client.mirror_remote / 
> rwps
>
> [client.mirror_remote]
>
> key = AQCIFtFkI+SLNhAAWmez2DJpH9eGrbxA9efdoA==
>
> root@a001s008-zz14l47008:/# ceph auth get client.mirror_remote
>
> [client.mirror_remote]
>
> key = AQCIFtFkI+SLNhAAWmez2DJpH9eGrbxA9efdoA==
>
> caps mds = "allow rwps fsname=cephfs"
>
> caps mon = "allow r fsname=cephfs"
>
> caps osd = "allow rw tag cephfs data=cephfs"
>
> root@a001s008-zz14l47008:/#
>
> root@a001s008-zz14l47008:/# ceph fs snapshot mirror peer_bootstrap create 
> cephfs client.mirror_remote shgR-site
>
> {"token": 
> "eyJmc2lkIjogIjJlYWMwZWEwLTYwNDgtNDQ0Zi04NGIyLThjZWVmZWQyN2E1YiIsICJmaWxlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiwgInNpdGVfbmFtZSI6ICJzaGdSLXNpdGUiLCAia2V5IjogIkFRQ0lGdEZrSStTTE5oQUFXbWV6MkRKcEg5ZUdyYnhBOWVmZG9BPT0iLCAibW9uX2hvc3QiOiAiW3YyOjEwLjIzOS4xNTUuMTg6MzMwMC8wLHYxOjEwLjIzOS4xNTUuMTg6Njc4OS8wXSBbdjI6MTAuMjM5LjE1NS4xOTozMzAwLzAsdjE6MTAuMjM5LjE1NS4xOTo2Nzg5LzBdIFt2MjoxMC4yMzkuMTU1LjIwOjMzMDAvMCx2MToxMC4yMzkuMTU1LjIwOjY3ODkvMF0ifQ=="}
>
> root@a001s008-zz14l47008:/#
>
>
>
> Thank you,
>
> Anantha
>
>
>
> From: Adiga, Anantha
> Sent: Friday, August 4, 2023 11:55 AM
> To: Venky Shankar ; ceph-users@ceph.io
> Subject: RE: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap import 
> hung
>
>
>
> Hi Venky,
>
>
>
> Thank you so much for the guidance. Attached is the mgr log.
>
>
>
> Note: the 4th node in the primary cluster has smaller-capacity drives; the 
> other 3 nodes have larger-capacity drives.
>
> 32ssd6.98630   1.0  7.0 TiB   44 GiB   44 GiB   183 KiB  148 MiB  
> 6.9 TiB  0.62  0.64   40  up  osd.32
>
> -7  76.84927 -   77 TiB  652 GiB  648 GiB20 MiB  3.0 GiB  
>  76 TiB  0.83  0.86- 

[ceph-users] Re: RBD Disk Usage

2023-08-07 Thread Danny Webb
Worth also mentioning that there are several ways to discard data 
(automatically, timed, manually), all with their own caveats. We find it's 
easiest to simply mount with the discard option and take the penalty up front 
on deletion. Red Hat has a good explanation of all the options: 
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/managing_file_systems/discarding-unused-blocks_managing-file-systems
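
As a concrete illustration of the two approaches (a sketch; the device,
mount point and filesystem type are made-up placeholders, not taken from
this thread):

    # continuous discard: deletes are passed straight down to the RBD image
    /dev/rbd0  /mnt/rbd  ext4  defaults,discard  0  2    # example fstab entry

    # or periodic discard instead, using the systemd timer
    systemctl enable --now fstrim.timer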


From: mahnoosh shahidi 
Sent: 07 August 2023 20:21
To: Simon Ironside 
Cc: Ceph Users 
Subject: [ceph-users] Re: RBD Disk Usage

CAUTION: This email originates from outside THG

Thanks for your explanation!


On Mon, 7 Aug 2023, 18:45 Simon Ironside,  wrote:

> When you delete files they're not normally scrubbed from the disk, the
> file system just forgets the deleted files are there. To fully remove
> the data you need something like TRIM:
>
> fstrim -v /the_file_system
>
> Simon
>
> On 07/08/2023 15:15, mahnoosh shahidi wrote:
> > Hi all,
> >
> > I have an rbd image that `rbd disk-usage` shows it has 31GB usage but in
> > the filesystem `du` shows its usage is 40KB.
> >
> > Does anyone know the reason for this difference?
> >
> > Best Regards,
> > Mahnoosh
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

2023-08-07 Thread Adiga, Anantha
Hi Venky,

Could this be the reason that the peer-bootstrap import is hanging? How do I 
upgrade cephfs-mirror to Quincy?
root@fl31ca104ja0201:/# cephfs-mirror --version
ceph version 16.2.13 (5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)
root@fl31ca104ja0201:/# ceph version
ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)
root@fl31ca104ja0201:/#
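
If this is a cephadm-managed cluster (the containerised prompt suggests it
is), a sketch of how one might confirm and then converge the daemon versions;
treat these as illustrative rather than prescriptive, and check them against
your deployment first:

    ceph versions                                  # per-daemon-type version summary
    ceph orch ps | grep cephfs-mirror              # image/version the cephfs-mirror daemon is running
    ceph orch upgrade start --ceph-version 17.2.6  # redeploys any daemon still on an older image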


Thank you,
Anantha
From: Adiga, Anantha
Sent: Monday, August 7, 2023 11:21 AM
To: 'Venky Shankar' ; 'ceph-users@ceph.io' 

Subject: RE: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

Hi Venky,

I tried on another secondary Quincy cluster and it is the same problem. The 
peer_bootstrap import command hangs.



root@fl31ca104ja0201:/# ceph fs  snapshot mirror peer_bootstrap import cephfs 
eyJmc2lkIjogIjJlYWMwZWEwLTYwNDgtNDQ0Zi04NGIyLThjZWVmZWQyN2E1YiIsICJmaWxlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiwgInNpdGVfbmFtZSI6ICJzaGdSLXNpdGUiLCAia2V5IjogIkFRQ0lGdEZrSStTTE5oQUFXbWV6MkRKcEg5ZUdyYnhBOWVmZG9BPT0iLCAibW9uX2hvc3QiOiAiW3YyOjEwLjIzOS4xNTUuMTg6MzMwMC8wLHYxOjEwLjIzOS4xNTUuMTg6Njc4OS8wXSBbdjI6MTAuMjM5LjE1NS4xOTozMzAwLzAsdjE6MTAuMjM5LjE1NS4xOTo2Nzg5LzBdIFt2MjoxMC4yMzkuMTU1LjIwOjMzMDAvMCx2MToxMC4yMzkuMTU1LjIwOjY3ODkvMF0ifQ==

……

…….

..command does not complete..waits here
^C  to exit.
Thereafter some commands do not complete…
root@fl31ca104ja0201:/# ceph -s
  cluster:
id: d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e
health: HEALTH_OK

  services:
mon:   3 daemons, quorum 
fl31ca104ja0202,fl31ca104ja0203,fl31ca104ja0201 (age 2d)
mgr:   fl31ca104ja0201.kkoono(active, since 3d), standbys: 
fl31ca104ja0202, fl31ca104ja0203
mds:   1/1 daemons up, 2 standby
osd:   44 osds: 44 up (since 2d), 44 in (since 5w)
cephfs-mirror: 1 daemon active (1 hosts)
rgw:   3 daemons active (3 hosts, 1 zones)

  data:
volumes: 1/1 healthy
pools:   25 pools, 769 pgs
objects: 614.40k objects, 1.9 TiB
usage:   2.9 TiB used, 292 TiB / 295 TiB avail
pgs: 769 active+clean

  io:
client:   32 KiB/s rd, 0 B/s wr, 33 op/s rd, 1 op/s wr

root@fl31ca104ja0201:/#
root@fl31ca104ja0201:/# ceph fs status cephfs
This command also waits. ……

I have attached the mgr log
root@fl31ca104ja0201:/# ceph service status
{
"cephfs-mirror": {
"5306346": {
"status_stamp": "2023-08-07T17:35:56.884907+",
"last_beacon": "2023-08-07T17:45:01.903540+",
"status": {
"status_json": 
"{\"1\":{\"name\":\"cephfs\",\"directory_count\":0,\"peers\":{}}}"
}
}

Quincy secondary cluster


root@a001s008-zz14l47008:/# ceph mgr module enable mirroring

root@a001s008-zz14l47008:/# ceph fs authorize cephfs client.mirror_remote / rwps

[client.mirror_remote]

key = AQCIFtFkI+SLNhAAWmez2DJpH9eGrbxA9efdoA==

root@a001s008-zz14l47008:/# ceph auth get client.mirror_remote

[client.mirror_remote]

key = AQCIFtFkI+SLNhAAWmez2DJpH9eGrbxA9efdoA==

caps mds = "allow rwps fsname=cephfs"

caps mon = "allow r fsname=cephfs"

caps osd = "allow rw tag cephfs data=cephfs"

root@a001s008-zz14l47008:/#

root@a001s008-zz14l47008:/# ceph fs snapshot mirror peer_bootstrap create 
cephfs client.mirror_remote shgR-site

{"token": 
"eyJmc2lkIjogIjJlYWMwZWEwLTYwNDgtNDQ0Zi04NGIyLThjZWVmZWQyN2E1YiIsICJmaWxlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiwgInNpdGVfbmFtZSI6ICJzaGdSLXNpdGUiLCAia2V5IjogIkFRQ0lGdEZrSStTTE5oQUFXbWV6MkRKcEg5ZUdyYnhBOWVmZG9BPT0iLCAibW9uX2hvc3QiOiAiW3YyOjEwLjIzOS4xNTUuMTg6MzMwMC8wLHYxOjEwLjIzOS4xNTUuMTg6Njc4OS8wXSBbdjI6MTAuMjM5LjE1NS4xOTozMzAwLzAsdjE6MTAuMjM5LjE1NS4xOTo2Nzg5LzBdIFt2MjoxMC4yMzkuMTU1LjIwOjMzMDAvMCx2MToxMC4yMzkuMTU1LjIwOjY3ODkvMF0ifQ=="}

root@a001s008-zz14l47008:/#

Thank you,
Anantha

From: Adiga, Anantha
Sent: Friday, August 4, 2023 11:55 AM
To: Venky Shankar <vshan...@redhat.com>; 
ceph-users@ceph.io
Subject: RE: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung


Hi Venky,



Thank you so much for the guidance. Attached is the mgr log.



Note: the 4th node in the primary cluster has smaller-capacity drives; the 
other 3 nodes have larger-capacity drives.

32ssd6.98630   1.0  7.0 TiB   44 GiB   44 GiB   183 KiB  148 MiB  
6.9 TiB  0.62  0.64   40  up  osd.32

-7  76.84927 -   77 TiB  652 GiB  648 GiB20 MiB  3.0 GiB   
76 TiB  0.83  0.86-  host fl31ca104ja0203

  1ssd6.98630   1.0  7.0 TiB   73 GiB   73 GiB   8.0 MiB  333 MiB  
6.9 TiB  1.02  1.06   54  up  osd.1

  4ssd6.98630   1.0  7.0 TiB   77 GiB   77 GiB   1.1 MiB  174 MiB  
6.9 TiB  1.07  1.11   55  up  osd.4

  7ssd6.98630   1.0  7.0 TiB   47 GiB   47 GiB   140 KiB  288 MiB  
6.9 TiB  0.66  0.68   51  up  osd.7

10ssd6.98630   1.0  7.0 TiB   

[ceph-users] Re: RBD Disk Usage

2023-08-07 Thread mahnoosh shahidi
Thanks for your explanation!


On Mon, 7 Aug 2023, 18:45 Simon Ironside,  wrote:

> When you delete files they're not normally scrubbed from the disk, the
> file system just forgets the deleted files are there. To fully remove
> the data you need something like TRIM:
>
> fstrim -v /the_file_system
>
> Simon
>
> On 07/08/2023 15:15, mahnoosh shahidi wrote:
> > Hi all,
> >
> > I have an rbd image that `rbd disk-usage` shows it has 31GB usage but in
> > the filesystem `du` shows its usage is 40KB.
> >
> > Does anyone know the reason for this difference?
> >
> > Best Regards,
> > Mahnoosh
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] v18.2.0 Reef released

2023-08-07 Thread Yuri Weinstein
We're very happy to announce the first stable release of the Reef series.

We express our gratitude to all members of the Ceph community who
contributed by proposing pull requests, testing this release,
providing feedback, and offering valuable suggestions.

Major Changes from Quincy:
- RADOS: RocksDB has been upgraded to version 7.9.2.
- RADOS: There have been significant improvements to RocksDB iteration
overhead and performance.
- RADOS: The perf dump and perf schema commands have been deprecated
in favor of the new counter dump and counter schema commands.
- RADOS: Cache tiering is now deprecated.
- RADOS: A new feature, the "read balancer", is now available, which
allows users to balance primary PGs per pool on their clusters.
- RGW: Bucket resharding is now supported for multi-site configurations.
- RGW: There have been significant improvements to the stability and
consistency of multi-site replication.
- RGW: Compression is now supported for objects uploaded with
Server-Side Encryption.
- Dashboard: There is a new Dashboard page with improved layout.
Active alerts and some important charts are now displayed inside
cards.
- RBD: Support for layered client-side encryption has been added.
- Telemetry: Users can now opt in to participate in a leaderboard in
the telemetry public dashboards.

We encourage you to read the full release notes at
https://ceph.io/en/news/blog/2023/v18-2-0-reef-released/

Getting Ceph

* Git at git://github.com/ceph/ceph.git
* Tarball at https://download.ceph.com/tarballs/ceph-18.2.0.tar.gz
* Containers at https://quay.io/repository/ceph/ceph
* For packages, see https://docs.ceph.com/docs/master/install/get-packages/
* Release git sha1: 5dd24139a1eada541a3bc16b6941c5dde975e26d

Did you know? Every Ceph release is built and tested on resources
funded directly by the non-profit Ceph Foundation.
If you would like to support this and our other efforts, please
consider joining now https://ceph.io/en/foundation/.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Is it safe to add different OS but same ceph version to the existing cluster?

2023-08-07 Thread Milind Changire
On Mon, Aug 7, 2023 at 8:23 AM Szabo, Istvan (Agoda)
 wrote:
>
> Hi,
>
> I have an octopus cluster on the latest octopus version with mgr/mon/rgw/osds 
> on centos 8.
> Is it safe to add an ubuntu osd host with the same octopus version?
>
> Thank you

Well, the ceph source bits surely remain the same. The binary bits
could be different due to better compiler support on the newer OS
version.
So assuming the new ceph is deployed on the same hardware platform,
things should be stable.
Also, assuming that relevant OS tunables and ceph features and config
options have been configured to match the older deployment, the new
ceph deployment should just work as expected.
Saying all this, I'd still recommend testing out the move one node at
a time rather than executing a bulk move.
Making a list of types of devices and checking driver support on the
new OS would also be a prudent thing to do.
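
A few sanity checks that may help when mixing OS versions (a rough sketch;
osd.12 is just a placeholder id):

ceph versions                                   # confirm every daemon reports the same ceph version
ceph config dump                                # snapshot of cluster-wide config to compare before/after the move
ceph osd metadata 12 | grep -E '"distro|"ceph_version'   # per-OSD view of OS and ceph build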
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: help, ceph fs status stuck with no response

2023-08-07 Thread Patrick Donnelly
On Mon, Aug 7, 2023 at 6:12 AM Zhang Bao  wrote:
>
> Hi,
>
> I have a ceph stucked at `ceph --verbose stats fs fsname`. And in  the
> monitor log, I can found something like `audit [DBG] from='client.431973 -'
> entity='client.admin' cmd=[{"prefix": "fs status", "fs": "fsname",
> "target": ["mon-mgr", ""]}]: dispatch`.

`ceph fs status` goes through the ceph-mgr. If there are slowdowns
with that daemon, the command may also be slow. You can share more
information about your cluster to help diagnose that.

You can use `ceph fs dump` to get most of the same information from
the mons directly.
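
If the mgr itself is the suspect, a couple of quick checks (a sketch, not a
full diagnosis):

ceph mgr stat        # which mgr is active and whether it is available
ceph fs dump         # FSMap straight from the mons, bypassing the mgr
ceph mgr fail        # last resort: fail over to a standby mgr (recent releases take no argument)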

-- 
Patrick Donnelly, Ph.D.
He / Him / His
Red Hat Partner Engineer
IBM, Inc.
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: snaptrim number of objects

2023-08-07 Thread Patrick Donnelly
On Fri, Aug 4, 2023 at 5:41 PM Angelo Höngens  wrote:
>
> Hey guys,
>
> I'm trying to figure out what's happening to my backup cluster that
> often grinds to a halt when cephfs automatically removes snapshots.

CephFS does not "automatically" remove snapshots. Do you mean the
snap_schedule mgr module?

> Almost all OSD's go to 100% CPU, ceph complains about slow ops, and
> CephFS stops doing client i/o.

What health warnings do you see? You can try configuring snap trim:

https://docs.ceph.com/en/latest/rados/configuration/osd-config-ref/#confval-osd_snap_trim_sleep
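
For example, to throttle trimming (values are illustrative, not recommendations):

ceph config set osd osd_snap_trim_sleep 2                 # seconds to sleep between snap trim ops
ceph config set osd osd_pg_max_concurrent_snap_trims 1    # limit concurrent snap trims per PG
ceph config set osd osd_snap_trim_priority 1              # deprioritize trimming vs. client I/O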

> I'm graphing the cumulative value of the snaptrimq_len value, and that
> slowly decreases over time. One night it takes an hour, but other
> days, like today, my cluster has been down for almost 20 hours, and I
> think we're half way. Funny thing is that in both cases, the
> snaptrimq_len value initially goes to the same value, around 3000, and
> then slowly decreases, but my guess is that the number of objects that
> need to be trimmed varies hugely every day.
>
> Is there a way to show the size of cephfs snapshots, or get the number
> of objects or bytes that need snaptrimming?

Unfortunately, no.

> Perhaps I can graph that
> and see where the differences are.
>
> That won't explain why my cluster bogs down, but at least it gives
> some visibility. Running 17.2.6 everywhere by the way.

Please let us know how configuring snaptrim helps or not.
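
For graphing the cumulative snaptrimq_len you mention, one rough approach
(a sketch; the JSON layout of `ceph pg dump` differs between releases, so the
query may need adjusting):

ceph pg dump pgs --format=json 2>/dev/null \
  | jq '[.. | objects | select(has("snaptrimq_len")) | .snaptrimq_len] | add'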

-- 
Patrick Donnelly, Ph.D.
He / Him / His
Red Hat Partner Engineer
IBM, Inc.
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RBD Disk Usage

2023-08-07 Thread Simon Ironside
When you delete files they're not normally scrubbed from the disk; the 
file system just forgets the deleted files are there. To fully remove 
the data you need something like TRIM:


fstrim -v /the_file_system
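
To keep the image and filesystem views roughly in step going forward (a sketch;
the device, pool and image names are placeholders, and it assumes the RBD
device supports discard):

mount -o discard /dev/rbd0 /the_file_system    # online discard instead of periodic fstrim
rbd du rbd_pool/the_image                      # compare provisioned vs. actual usage afterwards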

Simon

On 07/08/2023 15:15, mahnoosh shahidi wrote:

Hi all,

I have an rbd image that `rbd disk-usage` shows it has 31GB usage but in
the filesystem `du` shows its usage is 40KB.

Does anyone know the reason for this difference?

Best Regards,
Mahnoosh
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] RBD Disk Usage

2023-08-07 Thread mahnoosh shahidi
Hi all,

I have an rbd image for which `rbd disk-usage` shows 31GB usage, but in
the filesystem `du` shows its usage is 40KB.

Does anyone know the reason for this difference?

Best Regards,
Mahnoosh
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Problems with UFS / FreeBSD on rbd volumes?

2023-08-07 Thread Roland Giesler
We have a FreeBSD 12.3 guest machine that works well on an RBD volume 
until it is live migrated to another node (on Proxmox). After 
migration, almost all processes go into D state (waiting for the 
disk) and never come out of it (i.e. they never "get" the disk they 
requested).


I'm not sure what is causing this, so I'm asking here whether anyone has come 
across such a problem.


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: MDS stuck in rejoin

2023-08-07 Thread Frank Schilder
Dear Xiubo,

I managed to collect some information. It looks like there is nothing in the 
dmesg log around the time the client failed to advance its TID. I collected 
short snippets around the critical time below. I have full logs in case you are 
interested; they are large files, so I will need to do an upload for that.

I also have a dump of "mds session ls" output for clients that showed the same 
issue later. Unfortunately, no consistent log information for a single incident.

Here the summary, please let me know if uploading the full package makes sense:

- Status:

On July 29, 2023

ceph status/df/pool stats/health detail at 01:05:14:
  cluster:
health: HEALTH_WARN
1 pools nearfull

ceph status/df/pool stats/health detail at 01:05:28:
  cluster:
health: HEALTH_WARN
1 clients failing to advance oldest client/flush tid
1 pools nearfull

[...]

On July 31, 2023

ceph status/df/pool stats/health detail at 10:36:16:
  cluster:
health: HEALTH_WARN
1 clients failing to advance oldest client/flush tid
1 pools nearfull

  cluster:
health: HEALTH_WARN
1 pools nearfull

- client evict command (date, time, command):

2023-07-31 10:36  ceph tell mds.ceph-11 client evict id=145678457

We have a 1h time difference between the date stamp of the command and the 
dmesg date stamps. However, there seems to be a weird 10min delay from issuing 
the evict command until it shows up in dmesg on the client.

- dmesg:

[Fri Jul 28 12:59:14 2023] beegfs: enabling unsafe global rkey
[Fri Jul 28 12:59:14 2023] beegfs: enabling unsafe global rkey
[Fri Jul 28 12:59:14 2023] beegfs: enabling unsafe global rkey
[Fri Jul 28 12:59:14 2023] beegfs: enabling unsafe global rkey
[Fri Jul 28 12:59:14 2023] beegfs: enabling unsafe global rkey
[Fri Jul 28 12:59:14 2023] beegfs: enabling unsafe global rkey
[Fri Jul 28 16:07:47 2023] slurm.epilog.cl (24175): drop_caches: 3
[Sat Jul 29 18:21:30 2023] libceph: mds2 192.168.32.75:6801 socket closed (con 
state OPEN)
[Sat Jul 29 18:21:30 2023] libceph: mds2 192.168.32.75:6801 socket closed (con 
state OPEN)
[Sat Jul 29 18:21:30 2023] libceph: mds2 192.168.32.75:6801 socket closed (con 
state OPEN)
[Sat Jul 29 18:21:42 2023] ceph: mds2 reconnect start
[Sat Jul 29 18:21:42 2023] ceph: mds2 reconnect start
[Sat Jul 29 18:21:43 2023] ceph: mds2 reconnect start
[Sat Jul 29 18:21:43 2023] ceph: mds2 reconnect success
[Sat Jul 29 18:21:43 2023] ceph: mds2 reconnect success
[Sat Jul 29 18:21:43 2023] ceph: mds2 reconnect success
[Sat Jul 29 18:26:39 2023] ceph: mds2 reconnect start
[Sat Jul 29 18:26:39 2023] ceph: mds2 reconnect start
[Sat Jul 29 18:26:39 2023] ceph: mds2 reconnect start
[Sat Jul 29 18:26:40 2023] ceph: mds2 reconnect success
[Sat Jul 29 18:26:40 2023] ceph: mds2 reconnect success
[Sat Jul 29 18:26:40 2023] ceph: mds2 reconnect success
[Sat Jul 29 18:26:49 2023] ceph: update_snap_trace error -22
[Sat Jul 29 18:26:49 2023] ceph: mds2 recovery completed
[Sat Jul 29 18:26:49 2023] ceph: mds2 recovery completed
[Sat Jul 29 18:26:49 2023] ceph: mds2 recovery completed
[Sun Jul 30 16:37:55 2023] slurm.epilog.cl (43668): drop_caches: 3
[Mon Jul 31 01:00:20 2023] slurm.epilog.cl (73347): drop_caches: 3
[Mon Jul 31 09:46:41 2023] libceph: mds0 192.168.32.81:6801 socket closed (con 
state OPEN)
[Mon Jul 31 09:46:41 2023] libceph: mds3 192.168.32.87:6801 socket closed (con 
state OPEN)
[Mon Jul 31 09:46:41 2023] libceph: mds7 192.168.32.88:6801 socket closed (con 
state OPEN)
[Mon Jul 31 09:46:41 2023] libceph: mds5 192.168.32.78:6801 socket closed (con 
state OPEN)
[Mon Jul 31 09:46:41 2023] libceph: mds4 192.168.32.73:6801 socket closed (con 
state OPEN)
[Mon Jul 31 09:46:41 2023] libceph: mds1 192.168.32.80:6801 socket closed (con 
state OPEN)
[Mon Jul 31 09:46:41 2023] libceph: mds2 192.168.32.75:6801 socket closed (con 
state OPEN)
[Mon Jul 31 09:46:41 2023] libceph: mds3 192.168.32.87:6801 connection reset
[Mon Jul 31 09:46:41 2023] libceph: reset on mds3
[Mon Jul 31 09:46:41 2023] ceph: mds3 closed our session
[Mon Jul 31 09:46:41 2023] ceph: mds3 reconnect start
[Mon Jul 31 09:46:41 2023] libceph: mds7 192.168.32.88:6801 connection reset
[Mon Jul 31 09:46:41 2023] libceph: reset on mds7
[Mon Jul 31 09:46:41 2023] ceph: mds7 closed our session
[Mon Jul 31 09:46:41 2023] ceph: mds7 reconnect start
[Mon Jul 31 09:46:41 2023] libceph: mds2 192.168.32.75:6801 connection reset
[Mon Jul 31 09:46:41 2023] libceph: reset on mds2
[Mon Jul 31 09:46:41 2023] ceph: mds2 closed our session
[Mon Jul 31 09:46:41 2023] ceph: mds2 reconnect start
[Mon Jul 31 09:46:41 2023] libceph: mds4 192.168.32.73:6801 connection reset
[Mon Jul 31 09:46:41 2023] libceph: reset on mds4
[Mon Jul 31 09:46:41 2023] ceph: mds4 closed our session
[Mon Jul 31 09:46:41 2023] ceph: mds4 reconnect start
[Mon Jul 31 09:46:41 2023] libceph: mds1 192.168.32.80:6801 connection reset
[Mon Jul 31 09:46:41 2023] libceph: reset on mds1
[Mon Jul 31 09:46:41 

[ceph-users] Re: Is it safe to add different OS but same ceph version to the existing cluster?

2023-08-07 Thread Marc
> I have an octopus cluster on the latest octopus version with
> mgr/mon/rgw/osds on centos 8.
> Is it safe to add an ubuntu osd host with the same octopus version?
> 

I am also wondering a bit about such things, for instance having el9 Nautilus 
mixed with el7 Nautilus. If I remember correctly, the "bug of the year" issue was 
related to some external compression library. 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] help, ceph fs status stuck with no response

2023-08-07 Thread Zhang Bao
Hi,

I have a ceph command stuck at `ceph --verbose stats fs fsname`. In the
monitor log, I can find something like `audit [DBG] from='client.431973 -'
entity='client.admin' cmd=[{"prefix": "fs status", "fs": "fsname",
"target": ["mon-mgr", ""]}]: dispatch`.

What happened and what should I do?

-- 

ZhangBao
+6585021702
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: 64k buckets for 1 user

2023-08-07 Thread Eugen Block

Hi,

just last week there was a thread [1] about a large omap warning for a  
single user with 400k buckets. There's no resharding for that (but  
with 64k you would stay under the default 200k threshold), so that's  
the downside, I guess. I can't tell what other impacts that may have.
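
Two quick checks that may be useful before committing to the split (a sketch;
the user id is a placeholder):

ceph config get osd osd_deep_scrub_large_omap_object_key_threshold   # default 200000 keys
radosgw-admin bucket list --uid=someuser | jq length                 # how many buckets the user owns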


Regards,
Eugen

[1]  
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/7LTCHCLH5ACTP7TYDSWOW3S3RJPGWXIY/


Zitat von "Szabo, Istvan (Agoda)" :


Hi,

We are in a transition where I'd like to ask my user who stores 2B  
objects in 1 bucket to split it some way.
Thinking for the future we identified to make it future proof and  
don't store huge amount of objects in 1 bucket, we would need to  
create 65xxx buckets.


Is there anybody aware of any issue with this amount of buckets please?
I guess better to split to multiple buckets rather than have gigantic bucket.

Thank you the advises


This message is confidential and is for the sole use of the intended  
recipient(s). It may also be privileged or otherwise protected by  
copyright or other legal rules. If you have received it by mistake  
please let us know by reply email and delete it from your system. It  
is prohibited to copy this message or disclose its content to  
anyone. Any confidentiality or privilege is not waived or lost by  
any mistaken delivery or unauthorized disclosure of the message. All  
messages sent to and from Agoda may be monitored to ensure  
compliance with company policies, to protect the company's interests  
and to remove potential malware. Electronic messages may be  
intercepted, amended, lost or deleted, or contain viruses.

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Multiple CephFS mounts and FSCache

2023-08-07 Thread caskd
It turns out this limitation is a cachefiles limitation rather than a cephfs 
limitation.

Guess it's been solved. I'm interested in any alternatives available for 
caching.

-- 
Alex D.
RedXen System & Infrastructure Administration
https://redxen.eu/


signature.asc
Description: PGP signature
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Multiple CephFS mounts and FSCache

2023-08-07 Thread caskd
Hello everyone,

I've been trying to use CephFS together with fscache; however, I was never able 
to have multiple mounts with fscache enabled.
Is this a known, intentional limitation or a bug?
It would be possible to work around it by mounting the root of the filesystem 
and using bind mounts, but I have separate volumes that need to be separately 
mounted.

How to replicate:
mount -t ceph -o fsc admin@.filesystem1=/path1 /tmp/one # Succeeds
mount -t ceph -o fsc admin@.filesystem1=/path2 /tmp/two # Fails complaining 
about no mds being available

The alternative of using no fscache works just fine:
mount -t ceph admin@.filesystem1=/path1 /tmp/one # Succeeds
mount -t ceph admin@.filesystem1=/path2 /tmp/two # Succeeds
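
A possible bind-mount workaround along the lines described above (a sketch,
assuming the client is allowed to mount the filesystem root):

mount -t ceph -o fsc admin@.filesystem1=/ /mnt/cephfs   # single fscache-enabled mount of the root
mount --bind /mnt/cephfs/path1 /tmp/one
mount --bind /mnt/cephfs/path2 /tmp/two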

Versions:
- ceph quincy 17.2.6
- linux 6.4.6
- cachefilesd 0.10.10

-- 
Alex D.
RedXen System & Infrastructure Administration
https://redxen.eu/


signature.asc
Description: PGP signature
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] 64k buckets for 1 user

2023-08-07 Thread Szabo, Istvan (Agoda)
Hi,

We are in a transition where I'd like to ask my user, who stores 2B objects in 1 
bucket, to split it up in some way.
Thinking about the future, we identified that to make it future-proof and avoid 
storing a huge number of objects in 1 bucket, we would need to create 65xxx buckets.

Is anybody aware of any issues with this number of buckets?
I guess it is better to split into multiple buckets rather than have one gigantic bucket.

Thank you for the advice


This message is confidential and is for the sole use of the intended 
recipient(s). It may also be privileged or otherwise protected by copyright or 
other legal rules. If you have received it by mistake please let us know by 
reply email and delete it from your system. It is prohibited to copy this 
message or disclose its content to anyone. Any confidentiality or privilege is 
not waived or lost by any mistaken delivery or unauthorized disclosure of the 
message. All messages sent to and from Agoda may be monitored to ensure 
compliance with company policies, to protect the company's interests and to 
remove potential malware. Electronic messages may be intercepted, amended, lost 
or deleted, or contain viruses.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io