[ceph-users] Re: Cannot recreate monitor in upgrade from pacific to quincy (leveldb -> rocksdb)

2024-02-01 Thread Eugen Block
I might have a reproducer, the second rebuilt mon is not joining the cluster as well, I'll look into it and let you know if I find anything. Quoting Eugen Block: Hi, Can anyone confirm that ancient (2017) leveldb database mons should just accept ‘mon.$hostname’ names for mons, as well

[ceph-users] RADOSGW Multi-Site Sync Metrics

2024-02-01 Thread Rhys Powell
Hi All, I am in the process of implementing a multi-site RGW setup and have successfully set up a POC and confirmed the functionality. I am now working on metrics and alerting for this service, but I am not seeing metrics that correspond to the output shown by radosgw-admin sync status --rgw-realm=<>
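Until proper exporter metrics are identified, one hedged stopgap (not an official approach) is to poll the same command and alert on lagging shards; the realm, syslog tag, and threshold below are placeholders:

    # count lines reporting shards behind; non-zero means sync is lagging
    behind=$(radosgw-admin sync status --rgw-realm=<realm> 2>/dev/null | grep -c "behind shards")
    if [ "$behind" -gt 0 ]; then
        echo "RGW multisite sync lagging: $behind 'behind shards' lines" | logger -t rgw-sync-check
    fi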

[ceph-users] Re: Performance improvement suggestion

2024-02-01 Thread Anthony D'Atri
I'd totally defer to the RADOS folks. One issue might be adding a separate code path, which can have all sorts of problems. > On Feb 1, 2024, at 12:53, quag...@bol.com.br wrote: > > > > Ok Anthony, > > I understood what you said. I also believe in all the professional history > and

[ceph-users] Re: Performance improvement suggestion

2024-02-01 Thread quag...@bol.com.br
    Ok Anthony, I understood what you said. I also believe in all the professional history and experience you have. Anyway, could there be a configuration flag to enable this, like the ones that already exist (e.g. "--yes-i-really-mean-it")? This way, the default storage pattern would remain as it
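For reference, that kind of guard flag is already required by some destructive commands today, for example (the OSD id is just an example):

    ceph osd purge osd.7 --yes-i-really-mean-it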

[ceph-users] Re: pacific 16.2.15 QE validation status

2024-02-01 Thread Ilya Dryomov
On Thu, Feb 1, 2024 at 5:23 PM Yuri Weinstein wrote: > > Update. > Seeking approvals/reviews for: > > rados - Radek, Laura, Travis, Adam King (see Laura's comments below) > rgw - Casey approved > fs - Venky approved > rbd - Ilya No issues in RBD, formal approval is pending on [1] which also

[ceph-users] Re: pacific 16.2.15 QE validation status

2024-02-01 Thread Zakhar Kirpichenko
Hi, Please consider not leaving this behind: https://github.com/ceph/ceph/pull/55109 It's a serious bug, which can potentially affect the stability of a whole node if the affected mgr is colocated with OSDs. The bug has been known for quite a while and really shouldn't be left unfixed. /Z On Thu, 1 Feb 2024

[ceph-users] Re: pacific 16.2.15 QE validation status

2024-02-01 Thread Nizamudeen A
Thanks Laura, Raised a PR for https://tracker.ceph.com/issues/57386 https://github.com/ceph/ceph/pull/55415 On Thu, Feb 1, 2024 at 5:15 AM Laura Flores wrote: > I reviewed the rados suite. @Adam King , @Nizamudeen A > would appreciate a look from you, as there are some > orchestrator and

[ceph-users] Re: pacific 16.2.15 QE validation status

2024-02-01 Thread Yuri Weinstein
Update. Seeking approvals/reviews for: rados - Radek, Laura, Travis, Adam King (see Laura's comments below) rgw - Casey approved fs - Venky approved rbd - Ilya krbd - Ilya upgrade/nautilus-x (pacific) - fixed by Casey upgrade/octopus-x (pacific) - Adam King is looking

[ceph-users] Re: Cannot recreate monitor in upgrade from pacific to quincy (leveldb -> rocksdb)

2024-02-01 Thread Eugen Block
Hi, Can anyone confirm that ancient (2017) leveldb database mons should just accept ‘mon.$hostname’ names for mons, as well as ‘mon.$id’? At some point you had or have to remove one of the mons to recreate it with a rocksdb backend, so the mismatch should not be an issue here. I can
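For readers hitting the same leveldb-to-rocksdb rebuild, the cycle looks roughly like the sketch below (cephadm assumed; mon ID and hostname are placeholders, adjust for manual deployments):

    # drop the old (leveldb) mon from the monmap
    ceph mon remove <mon-id>
    # let the orchestrator redeploy it; newly created mons use a rocksdb backend
    ceph orch daemon add mon <hostname>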

[ceph-users] Re: 6 pgs not deep-scrubbed in time

2024-02-01 Thread Wesley Dillingham
I would just set noout for the duration of the reboot, no other flags really needed. There is a better option to limit that flag to just the host being rebooted, which is "set-group noout <host>", where <host> is the server's name in CRUSH. Just the global noout will suffice though. Anyways... you're not
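A minimal sketch of that host-scoped variant (the host/bucket name is a placeholder):

    # limit noout to one CRUSH bucket (the host being rebooted)
    ceph osd set-group noout <hostname>
    # ...reboot the host...
    ceph osd unset-group noout <hostname>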

[ceph-users] Re: Performance improvement suggestion

2024-02-01 Thread Anthony D'Atri
> I didn't say I would accept the risk of losing data. That's implicit in what you suggest, though. > I just said that it would be interesting if the objects were first > recorded only in the primary OSD. What happens when that host / drive smokes before it can replicate? What

[ceph-users] Re: cephfs inode backtrace information

2024-02-01 Thread Loïc Tortay
On 31/01/2024 20:13, Patrick Donnelly wrote: On Tue, Jan 30, 2024 at 5:03 AM Dietmar Rieder wrote: Hello, I have a question regarding the default pool of a cephfs. According to the docs it is recommended to use a fast ssd replicated pool as default pool for cephfs. I'm asking what are the
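As general background (not specific to this thread), the pattern the docs describe is a small replicated default data pool plus additional data pools attached afterwards, roughly (pool names are placeholders):

    # create the fs with replicated metadata and default data pools
    ceph fs new cephfs cephfs_metadata cephfs_data
    # attach an additional (e.g. EC) data pool and direct a directory to it via a file layout
    ceph fs add_data_pool cephfs cephfs_data_ec
    setfattr -n ceph.dir.layout.pool -v cephfs_data_ec /mnt/cephfs/bulk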

[ceph-users] Re: Performance improvement suggestion

2024-02-01 Thread quag...@bol.com.br
  Hi Janne, thanks for your reply. I think it would be good to maintain the number of configured replicas; I don't think it's interesting to decrease to size=1. However, I don't think it is necessary to write to all disks before releasing the client's request. Replicas could be recorded immediately

[ceph-users] Re: Performance improvement suggestion

2024-02-01 Thread quag...@bol.com.br
    Hi Anthony, Thanks for your reply. I didn't say I would accept the risk of losing data. I just said that it would be interesting if the objects were first recorded only in the primary OSD. This way it would greatly increase performance (both for iops and throughput).

[ceph-users] Re: 6 pgs not deep-scrubbed in time

2024-02-01 Thread Michel Niyoyita
As said before, it is still in a warning state with PGs not deep-scrubbed in time. I hope this can be ignored; I will set those two flags ("noout" and "nobackfill") and then reboot. Thank you again Sir On Thu, 1 Feb 2024, 16:11 Michel Niyoyita, wrote: > Thank you very much Janne. > > On Thu, 1 Feb 2024,

[ceph-users] Re: 6 pgs not deep-scrubbed in time

2024-02-01 Thread Michel Niyoyita
Thank you very much Janne. On Thu, 1 Feb 2024, 15:21 Janne Johansson, wrote: > pause and nodown is not a good option to set, that will certainly make > clients stop IO. Pause will stop it immediately, and nodown will stop > IO when the OSD processes stop running on this host. > > When we do

[ceph-users] Re: 6 pgs not deep-scrubbed in time

2024-02-01 Thread Janne Johansson
pause and nodown are not good options to set; they will certainly make clients stop IO. Pause will stop it immediately, and nodown will stop IO when the OSD processes stop running on this host. When we do service on a host, we set "noout" and "nobackfill", that is enough for reboots, OS upgrades
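In practice that advice translates to something like the following sketch, to be adapted to the cluster at hand:

    ceph osd set noout
    ceph osd set nobackfill
    # ...reboot / service the host, wait for its OSDs to come back up...
    ceph osd unset nobackfill
    ceph osd unset noout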

[ceph-users] Re: Syslog server log naming

2024-02-01 Thread Torkil Svensgaard
On 01/02/2024 12:47, Eugen Block wrote: Hi Torkil, Hi Eugen cephadm does regular checks, for example some 'ceph-volume' stuff to see if all assigned disks have actually been deployed as OSDs and so on. That's why there are "random" containers created and destroyed. I don't have a

[ceph-users] Re: Syslog server log naming

2024-02-01 Thread Eugen Block
Hi Torkil, cephadm does regular checks, for example some 'ceph-volume' stuff to see if all assigned disks have actually been deployed as OSDs and so on. That's why there are "random" containers created and destroyed. I don't have a complete list of checks, though. You should be able to
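cephadm logs its periodic checks to the cluster log's cephadm channel, so the activity behind those short-lived containers can usually be inspected with something like (count and level are just examples):

    ceph log last 100 debug cephadm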

[ceph-users] Re: Syslog server log naming

2024-02-01 Thread Torkil Svensgaard
So it seems some ceph housekeeping spawns containers without giving them a name, and that causes this in the journal: " Feb 01 04:10:07 dopey podman[766731]: 2024-02-01 04:10:07.967786606 +0100 CET m=+0.043987882 container create 95967a040795bd61588dcfdc6ba5daf92553cd2cb3ecd7318cd8b16c1b15782d

[ceph-users] Re: 6 pgs not deep-scrubbed in time

2024-02-01 Thread Michel Niyoyita
Thanks very much Wesley. We have decided to restart one host among the three OSD hosts. Before doing that I need the advice of the team. These are the flags I want to set before the restart: 'ceph osd set noout' 'ceph osd set nobackfill' 'ceph osd set norecover' 'ceph osd set norebalance' 'ceph osd

[ceph-users] Re: Understanding subvolumes

2024-02-01 Thread Kotresh Hiremath Ravishankar
Comments inline. On Thu, Feb 1, 2024 at 4:51 AM Matthew Melendy wrote: > In our department we're getting started with Ceph 'reef', using the Ceph FUSE > client for our Ubuntu workstations. > > So far so good, except I can't quite figure out one aspect of subvolumes. > > When I do the commands: > >

[ceph-users] Re: Understanding subvolumes

2024-02-01 Thread Neeraj Pratap Singh
Hi, In reply to your question about the UUID: yes, it is a requirement; we need it for cloning subvolumes. And the subvolume mount path is the entire directory path. On Thu, Feb 1, 2024 at 4:51 AM Matthew Melendy wrote: > In our department we're getting started with Ceph 'reef',
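A minimal sketch of that flow (volume/subvolume names are placeholders; the UUID directory is whatever getpath returns):

    ceph fs subvolume create cephfs staff
    ceph fs subvolume getpath cephfs staff
    # -> /volumes/_nogroup/staff/<uuid>
    # mount that full returned path with the FUSE client
    ceph-fuse -r /volumes/_nogroup/staff/<uuid> /mnt/staff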

[ceph-users] Snapshot automation/scheduling for rbd?

2024-02-01 Thread Jeremy Hansen
Can rbd image snapshotting be scheduled like CephFS snapshots? Maybe I missed it in the documentation but it looked like scheduling snapshots wasn’t a feature for block images. I’m still running Pacific. We’re trying to devise a sufficient backup plan for Cloudstack and other things residing in
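As far as I know, Pacific has no built-in scheduler for plain (non-mirror) RBD snapshots; `rbd mirror snapshot schedule` only covers mirrored images. A common cron-based workaround looks roughly like this (pool/image names are placeholders):

    # crontab sketch: hourly snapshot with a timestamp suffix (% must be escaped in cron)
    0 * * * *  rbd snap create rbd/vm-disk-1@auto-$(date +\%Y\%m\%d-\%H\%M)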

[ceph-users] Re: how can install latest dev release?

2024-02-01 Thread Christian Rohmann
On 31.01.24 11:33, garcetto wrote: thank you, but seems related to quincy, there is nothing on latest versions in the doc... maybe the doc is not updated? I don't understand what you are missing. I just used a documentation link pointing to the Quincy version of this page, yes. The "latest"
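As a hedged pointer only: development builds are also published as CI container images, so a containerized test could look roughly like the line below; the registry path, tag, and suitability for any real cluster are assumptions, so check the dev-release docs before relying on it:

    cephadm --image quay.ceph.io/ceph-ci/ceph:main bootstrap --mon-ip <ip>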

[ceph-users] Re: Throughput metrics missing when updating Ceph Quincy to Reef

2024-02-01 Thread Christian Rohmann
This change is documented at https://docs.ceph.com/en/latest/mgr/prometheus/#ceph-daemon-performance-counters-metrics, also mentioning the deployment of ceph-exporter which is now used to collect per-host metrics from the local daemons. While this deployment is handled by cephadm (if cephadm is used), I am
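With cephadm, that deployment is typically a single service apply; a sketch only, check the Reef docs for the exact service name on your release:

    ceph orch apply ceph-exporter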

[ceph-users] Merging two ceph clusters

2024-02-01 Thread Nico Schottelius
Good morning, in the spirit of the previous thread, I am wondering if anyone ever succeeded in merging two separate ceph clusters into one? Background from my side: we are running multiple ceph clusters in k8s/rook, but we still have some Nautilus/Devuan based clusters that are