[ceph-users] Re: PGs and OSDs unknown

2022-04-01 Thread York Huang
Hi, how about this: "osd: 7 osds: 6 up (since 3h), 6 in (since 6w)" -- 1 OSD is missing? --Original-- From: "Konold, Martin"
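
A quick way to see which OSD that is, sketched with the standard Ceph CLI (no names from the thread assumed):

    # list only the OSDs currently down, with their place in the CRUSH tree
    ceph osd tree down
    # cross-check the up/in counters reported by "ceph -s"
    ceph osd stat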

[ceph-users] Re: PG down, due to 3 OSD failing

2022-04-01 Thread Dan van der Ster
We're on the right track! On Fri, Apr 1, 2022 at 6:57 PM Fulvio Galeazzi wrote: > > Ciao Dan, thanks for your messages! > > On 4/1/22 11:25, Dan van der Ster wrote: > > The PGs are stale, down, inactive *because* the OSDs don't start. > > Your main efforts should be to bring OSDs up, without
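
A minimal sketch of the corresponding commands, assuming the failing OSD id is 123 (a placeholder, not taken from the thread):

    # try to start the down OSD and follow its log to see why it aborts
    systemctl start ceph-osd@123.service
    journalctl -u ceph-osd@123.service -f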

[ceph-users] Re: PG down, due to 3 OSD failing

2022-04-01 Thread Fulvio Galeazzi
Ciao Dan, thanks for your messages! On 4/1/22 11:25, Dan van der Ster wrote: The PGs are stale, down, inactive *because* the OSDs don't start. Your main efforts should be to bring OSDs up, without purging or zapping or anything like that. (Currently your cluster is down, but there are hopes to
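
While the down OSDs are being repaired it is common to pause recovery traffic so the cluster does not reshuffle data in the meantime; a sketch, to be unset once the OSDs are back:

    ceph osd set noout
    ceph osd set norebalance
    # ... bring the OSDs back up, then:
    ceph osd unset norebalance
    ceph osd unset noout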

[ceph-users] Re: quincy v17.2.0 QE Validation status

2022-04-01 Thread Venky Shankar
45689 - I'll let you know when the backport is available. Smoke test passes with the above PR: https://pulpito.ceph.com/vshankar-2022-04-01_12:29:01-smoke-wip-vshankar-testing1-20220401-123425-testing-default-smithi/ Requested Yuri to run the FS suite w/ master (jobs were not getting scheduled in my run).

[ceph-users] Ceph rbd mirror journal pool

2022-04-01 Thread huxia...@horebdata.cn
Dear Cephers, Enabling Ceph mirroring means double writes on the same data pool, thus possibly degrading write performance dramatically. Searching Google, I found the following words (which apparently appeared several years ago): "The rbd CLI allows you to use the "--journal-pool" argument
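
For reference, a minimal sketch of the "--journal-pool" usage quoted above; the pool and image names are placeholders:

    # enable journaling on an existing image, placing the journal objects on a separate pool
    rbd feature enable rbd/myimage journaling --journal-pool rbd-journal-ssd
    # inspect the journal configuration of the image
    rbd journal info --pool rbd --image myimage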

[ceph-users] Recovery or recreation of a monitor rocksdb

2022-04-01 Thread Victor Rodriguez
Hello, I have a 3-node cluster using Proxmox + ceph version 14.2.22 (nautilus). After a power failure one of the monitors does not start. The log states some kind of problem with its rocksdb but I can't really pinpoint the issue. The log is available at https://pastebin.com/TZrFrZ1u. How can
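
Since the other two monitors still have quorum, the usual approach is not to repair the rocksdb but to discard the broken store and re-create that monitor from the surviving ones; a sketch, assuming the broken mon is named nodeC (placeholder):

    # on a healthy node: drop the broken monitor from the monmap
    ceph mon remove nodeC
    # on the broken node: move the damaged store aside and rebuild it
    mv /var/lib/ceph/mon/ceph-nodeC /var/lib/ceph/mon/ceph-nodeC.broken
    ceph mon getmap -o /tmp/monmap
    ceph auth get mon. -o /tmp/mon.keyring
    ceph-mon --mkfs -i nodeC --monmap /tmp/monmap --keyring /tmp/mon.keyring
    chown -R ceph:ceph /var/lib/ceph/mon/ceph-nodeC
    systemctl start ceph-mon@nodeC.service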

[ceph-users] Re: Ceph remote disaster recovery at PB scale

2022-04-01 Thread huxia...@horebdata.cn
Dear Arnaud, Thanks a lot for sharing your valuable experience; this method for cephfs disaster recovery is really unique and intriguing! Out of curiosity, how do you do the cephfs metadata backup? Should the backup be done very frequently in order to avoid much data loss? Do you need any special tool to

[ceph-users] Re: Ceph remote disaster recovery at PB scale

2022-04-01 Thread Arnaud M
Hello. I will speak about cephfs because that is what I am working on. Of course you can do some kind of rsync or rclone between two cephfs clusters, but at petabyte scale it will be really slow and cost a lot! There is another approach that we tested successfully (only in test, not in prod). We
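
For comparison, the rsync approach dismissed above would look roughly like this (both paths are placeholder mount points of the two cephfs file systems):

    # copy one cephfs tree to the other; fine for small trees, far too slow at PB scale
    rsync -aHAX --delete /mnt/cephfs-primary/ /mnt/cephfs-dr/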

[ceph-users] Re: PGs and OSDs unknown

2022-04-01 Thread Konold, Martin
Hi, restarting ceph managers did not change anything.

# systemctl status ceph-mgr@hbase10.service
● ceph-mgr@hbase10.service - Ceph cluster manager daemon
     Loaded: loaded (/lib/systemd/system/ceph-mgr@.service; enabled; vendor preset: enabled)
    Drop-In:
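
If a plain restart does not help, a common next step is to force a failover to a standby mgr and check that one actually becomes active; a sketch, reusing the mgr name from the unit above:

    # ask the mgr on hbase10 to step down so a standby takes over
    ceph mgr fail hbase10
    # verify which mgr is active now
    ceph mgr stat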

[ceph-users] Re: PG down, due to 3 OSD failing

2022-04-01 Thread Dan van der Ster
The PGs are stale, down, inactive *because* the OSDs don't start. Your main efforts should be to bring OSDs up, without purging or zapping or anything like that. (Currently your cluster is down, but there are hopes to recover. If you start purging things, that can result in permanent data loss.)

[ceph-users] Re: PGs and OSDs unknown

2022-04-01 Thread Janne Johansson
On Fri, 1 Apr 2022 at 11:15, Konold, Martin wrote: > Hi, > running Ceph 16.2.7 on a pure NVMe cluster with 9 nodes I am > experiencing "Reduced data availability: 448 pgs inactive". > > I cannot see any statistics or pool information with "ceph -s". Since the cluster seems operational, chances

[ceph-users] PGs and OSDs unknown

2022-04-01 Thread Konold, Martin
Hi, running Ceph 16.2.7 on a pure NVMe cluster with 9 nodes, I am experiencing "Reduced data availability: 448 pgs inactive". I cannot see any statistics or pool information with "ceph -s". The RBDs are still operational and "ceph report" shows the OSDs as expected. I am wondering how to
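
A couple of generic first checks for this kind of situation, sketched with the standard Ceph CLI:

    # show which PGs are stuck and in which state
    ceph health detail
    ceph pg dump_stuck inactive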

[ceph-users] Ceph remote disaster recovery at PB scale

2022-04-01 Thread huxia...@horebdata.cn
Dear Ceph experts, We are operating some Ceph clusters (both L and N versions) at PB scale, and are now planning remote disaster recovery solutions. Among these clusters, most are rbd volumes for OpenStack and K8s, a few are for S3 object storage, and very few are cephfs clusters. For rbd
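
For the rbd volumes, the built-in answer is usually rbd-mirror between the two sites; a minimal sketch of enabling per-image mirroring on one pool (pool and image names are placeholders, and the peer bootstrap step is omitted):

    # on both sites: allow mirroring on the pool, with per-image opt-in
    rbd mirror pool enable volumes image
    # on the primary site: enable mirroring for one image
    rbd mirror image enable volumes/vm-disk-1
    # check replication status from either site
    rbd mirror pool status volumes --verbose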

[ceph-users] Re: PG down, due to 3 OSD failing

2022-04-01 Thread Dan van der Ster
Don't purge anything! On Fri, Apr 1, 2022 at 9:38 AM Fulvio Galeazzi wrote: > > Ciao Dan, > thanks for your time! > > So you are suggesting that my problems with PG 85.25 may somehow resolve > if I manage to bring up the three OSDs currently "down" (possibly due to > PG 85.12, and other

[ceph-users] Re: PG down, due to 3 OSD failing

2022-04-01 Thread Fulvio Galeazzi
Ciao Dan, thanks for your time! So you are suggesting that my problems with PG 85.25 may somehow resolve if I manage to bring up the three OSDs currently "down" (possibly due to PG 85.12, and other PGs)? Looking for the string 'start interval does not contain the required bound' I found
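
A sketch of how to hunt for that string across the OSD logs on one host (default log locations assumed):

    grep -l 'start interval does not contain the required bound' /var/log/ceph/ceph-osd.*.log
    # or, when the daemons log to the journal:
    journalctl -u 'ceph-osd@*' | grep 'start interval does not contain the required bound'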