[ceph-users] Re: Whether removing device_health_metrics pool is ok or not

2020-12-03 Thread Satoru Takeuchi
On Fri, Dec 4, 2020 at 9:53, Michael Thomas wrote: > > On 12/3/20 6:47 PM, Satoru Takeuchi wrote: > > Hi, > > > > Could you tell me whether it's OK to remove the device_health_metrics pool > > after disabling the device monitoring feature? > > > > I don't use the device monitoring feature because I capture hardware > >

[ceph-users] Re: Whether removing device_health_metrics pool is ok or not

2020-12-03 Thread Michael Thomas
On 12/3/20 6:47 PM, Satoru Takeuchi wrote: Hi, Could you tell me whether it's OK to remove the device_health_metrics pool after disabling the device monitoring feature? I don't use the device monitoring feature because I capture hardware information by other means. However, after disabling this feature,

[ceph-users] Whether removing device_health_metrics pool is ok or not

2020-12-03 Thread Satoru Takeuchi
Hi, Could you tell me whether it's OK to remove the device_health_metrics pool after disabling the device monitoring feature? I don't use the device monitoring feature because I capture hardware information by other means. However, after disabling this feature, the device_health_metrics pool still exists. I
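
For reference, the operations in question would look roughly like this (a sketch; whether the removal is actually safe is exactly what is being asked, and the mon_allow_pool_delete step assumes the default deletion guard is in place):

    # turn off the device monitoring feature
    ceph device monitoring off
    # pool deletion is guarded by default; allow it first
    ceph config set mon mon_allow_pool_delete true
    # remove the now-unused pool (pool name is given twice by design)
    ceph osd pool delete device_health_metrics device_health_metrics --yes-i-really-really-mean-it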

[ceph-users] Whether read I/O is accepted when the number of replicas is under the pool's min_size

2020-12-03 Thread Satoru Takeuchi
Hi, Could you tell me whether read I/O is accepted when the number of replicas is under the pool's min_size? I read the official documentation and found that the described effect of a pool's min_size differs between the pools document and the pool configuration document. Pool's document:
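
For context, a sketch of inspecting and changing the setting in question (<pool> is a placeholder):

    # minimum number of replicas a PG needs before serving I/O
    ceph osd pool get <pool> min_size
    # lowering it keeps I/O going with fewer replicas, at higher risk
    ceph osd pool set <pool> min_size 1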

[ceph-users] Re: add server in crush map before osd

2020-12-03 Thread Frank Schilder
I deploy a custom crush location hook on every server. This adds a new server in the correct location without any further ado. See https://docs.ceph.com/en/latest/rados/operations/crush-map/#custom-location-hooks . Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum
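
A minimal sketch of such a hook, assuming a fixed room per server (the path, room, and root values are illustrative, not from the post; see the linked documentation for the exact interface):

    #!/bin/sh
    # /usr/local/bin/crush-location-hook (illustrative path)
    # ceph-osd invokes this at startup; its stdout becomes the daemon's
    # CRUSH location, so a new host lands in the right place automatically.
    echo "host=$(hostname -s) room=room1 root=default"

and in ceph.conf:

    [osd]
    crush_location_hook = /usr/local/bin/crush-location-hook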

[ceph-users] Re: Increase number of objects in flight during recovery

2020-12-03 Thread Joachim Kraftmayer
Hi Frank, these are the values we used to reduce the recovery impact before Luminous. #reduce recovery impact: osd max backfills, osd recovery max active, osd recovery max single start, osd recovery op priority, osd recovery threads, osd backfill scan max, osd backfill scan min. I do not know how many osds and
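
As a sketch, these can be injected at runtime (values illustrative; note that some of them, e.g. osd recovery threads, no longer exist in recent releases, consistent with the "before Luminous" caveat above):

    ceph tell 'osd.*' injectargs '--osd_max_backfills 1 --osd_recovery_max_active 1 --osd_recovery_op_priority 1'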

[ceph-users] Re: add server in crush map before osd

2020-12-03 Thread Anthony D'Atri
This is what I do as well. > You can also just use a single command: > > ceph osd crush add-bucket host room=
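
Filled in with illustrative names (the host and room below are placeholders, not values from the thread):

    # create the host bucket and place it in the CRUSH map in one step
    ceph osd crush add-bucket ceph-node05 host room=room1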

[ceph-users] Re: Many ceph commands hang. broken mgr?

2020-12-03 Thread Paul Mezzanini
So I fixed it, but I still have no idea about the root cause. I did narrow it down: it affected any command that gets redirected to the manager. We unsuccessfully tried to get Python debugging working for the status module so we could watch the code path inside to see where it was hanging. I decided to take

[ceph-users] Re: High read throughput on BlueFS

2020-12-03 Thread Seena Fallah
My first question is about this metric: ceph_bluefs_read_prefetch_bytes. What operation is related to this metric? On Thu, Dec 3, 2020 at 7:49 PM Seena Fallah wrote: > Hi all, > > When my cluster gets into a recovery state (adding a new node) I see a huge > read throughput on
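
For anyone graphing it, a PromQL sketch of the prefetch read rate (assuming the metric comes from the mgr prometheus module and is exported as a counter):

    rate(ceph_bluefs_read_prefetch_bytes[5m])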

[ceph-users] High read throughput on BlueFS

2020-12-03 Thread Seena Fallah
Hi all, When my cluster gets into a recovery state (adding a new node) I see a huge read throughput on its disks, and it affects latency! The disks are SSDs and they don't have a separate WAL/DB. I'm using Nautilus 14.2.14, and bluefs_buffered_io is false by default. When this throughput came on my
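
A sketch for checking and flipping the setting mentioned above (osd.0 is illustrative; a runtime change may not take effect on all versions without an OSD restart):

    # run on the OSD's host to confirm the current value
    ceph daemon osd.0 config get bluefs_buffered_io
    # let BlueFS reads go through the kernel page cache
    ceph tell 'osd.*' injectargs '--bluefs_buffered_io true'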

[ceph-users] Re: Increase number of objects in flight during recovery

2020-12-03 Thread Frank Schilder
Sorry, I just tried "osd_recovery_sleep=0" (was 0.05) and the number of objects in flight did increase dramatically: recovery: 0 B/s, 8.64 kobjects/s. It would be nice if there were a way to set this per pool. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14
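
For reference, a sketch of applying that change at runtime to all OSDs (as noted, there is no per-pool equivalent of this option):

    ceph tell 'osd.*' injectargs '--osd_recovery_sleep 0'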

[ceph-users] Re: add server in crush map before osd

2020-12-03 Thread 胡 玮文
You can also just use a single command: ceph osd crush add-bucket host room= > On Dec 4, 2020, at 00:00, Francois Legrand wrote: > > Thanks for your advice. > > It was exactly what I needed. > > Indeed, I did: > > ceph osd crush add-bucket host > ceph osd crush move room= > > > But also set

[ceph-users] Re: add server in crush map before osd

2020-12-03 Thread Francois Legrand
Thanks for your advice. It was exactly what I needed. Indeed, I did: ceph osd crush add-bucket host ceph osd crush move room= But I also set the norecover, nobackfill and norebalance flags :-) It worked perfectly, as expected. F. On 03/12/2020 at 01:50, Reed Dier wrote: Just to
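
The flag sequence described above, sketched out (the add-bucket/move arguments are elided in the quoted mail, so they stay elided here):

    ceph osd set norecover
    ceph osd set nobackfill
    ceph osd set norebalance
    # ... add the host bucket and move it into place ...
    ceph osd unset norecover
    ceph osd unset nobackfill
    ceph osd unset norebalance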

[ceph-users] Re: Increase number of objects in flight during recovery

2020-12-03 Thread Frank Schilder
Did this already. It doesn't change the number of objects in flight. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: 胡 玮文 Sent: 03 December 2020 12:35:03 To: Frank Schilder Cc: ceph-users@ceph.io Subject: Re:

[ceph-users] Re: add OSDs to cluster

2020-12-03 Thread Jonas Jelten
Indeed, I think this is yet another incarnation of the "origin of misplaced data is no longer found"-bug. https://tracker.ceph.com/issues/37439 https://tracker.ceph.com/issues/46847 We also experience it regularly, but I haven't found the cause yet. Another bug that occurs when adding new OSDs

[ceph-users] Re: slow down keys/s in recovery

2020-12-03 Thread Seena Fallah
Thanks. It seems to be related to how the wpq implementation organizes priorities! I want to slow down the keys/s, and I've set all the recovery priorities to 1, but it doesn't slow down! On Thu, Dec 3, 2020 at 1:13 PM Anthony D'Atri wrote: > > >> If so why the client op priority is
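
For context, a sketch of confirming which op scheduler an OSD is using (osd.0 is illustrative; wpq is the default in this era):

    # run on the OSD's host
    ceph daemon osd.0 config get osd_op_queue
    ceph daemon osd.0 config get osd_op_queue_cut_off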

[ceph-users] Re: Increase number of objects in flight during recovery

2020-12-03 Thread Frank Schilder
[root@gnosis ~]# ceph status
  cluster:
    id:
    health: HEALTH_WARN
            8283238/3566503213 objects misplaced (0.232%)
            1 pools nearfull
  services:
    mon: 3 daemons, quorum ceph-01,ceph-02,ceph-03
    mgr: ceph-02(active), standbys: ceph-03, ceph-01
    mds: con-fs2-1/1/1

[ceph-users] Re: Increase number of objects in flight during recovery

2020-12-03 Thread Frank Schilder
Hi Janne, I looked at it already. The recovery rate is unbearably slow and I would like to increase it. The % of misplaced objects is decreasing unnecessarily slowly. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From:

[ceph-users] Increase number of objects in flight during recovery

2020-12-03 Thread Frank Schilder
Hi all, I have the opposite of the problem discussed in "slow down keys/s in recovery": I need to increase the number of objects in flight during rebalance. All remapped PGs are already in state backfilling, but it looks like no more than 8 objects/sec are transferred per PG at a time. The pools
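
A sketch of the usual knobs for raising backfill parallelism (values illustrative; higher values increase the impact on client I/O):

    ceph tell 'osd.*' injectargs '--osd_max_backfills 4 --osd_recovery_max_active 8'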

[ceph-users] Re: Increase number of objects in flight during recovery

2020-12-03 Thread 胡 玮文
Hi, there is an “OSD recovery priority” dialog box in the web dashboard. The configuration options it changes include: osd_max_backfills, osd_recovery_max_active, osd_recovery_max_single_start, osd_recovery_sleep. Tuning these may help. The “High” priority corresponds to 4, 4, 4, 0, respectively. Some of
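
Expressed as plain commands, the “High” profile described above would amount to something like this (a sketch using the config database; the dashboard may apply the values differently):

    ceph config set osd osd_max_backfills 4
    ceph config set osd osd_recovery_max_active 4
    ceph config set osd osd_recovery_max_single_start 4
    ceph config set osd osd_recovery_sleep 0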

[ceph-users] Re: Increase number of objects in flight during recovery

2020-12-03 Thread David Caro
Hi Frank, out of curiosity, can you share the recovery rates you are seeing? I would appreciate it, thanks! On 12/03 09:44, Frank Schilder wrote: > Hi Janne, > > looked at it already. The recovery rate is unbearably slow and I would like > to increase it. The % misplaced objects is decreasing

[ceph-users] Re: slow down keys/s in recovery

2020-12-03 Thread Anthony D'Atri
>> If so, why is the client op priority 63 by default and the recovery op priority 3? This >> means that by default recovery ops are prioritized over client ops! > > Exactly the opposite. Client ops take priority over recovery ops. And > various other ops have priorities as described in the document I
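
The defaults under discussion can be read off a running OSD (a sketch; osd.0 is illustrative, and a higher value means a higher priority):

    ceph daemon osd.0 config get osd_client_op_priority    # default 63
    ceph daemon osd.0 config get osd_recovery_op_priority  # default 3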

[ceph-users] Re: Increase number of objects in flight during recovery

2020-12-03 Thread Janne Johansson
On Thu, Dec 3, 2020 at 10:11, Frank Schilder wrote: > I have the opposite of the problem discussed in "slow down keys/s in > recovery": I need to increase the number of objects in flight during > rebalance. All remapped PGs are already in state backfilling, but it > looks like no more than 8

[ceph-users] Re: slow down keys/s in recovery

2020-12-03 Thread Anthony D'Atri
> > > Sorry I got confused! Do you mean that both recovery_op_priority and > recovery_priority should be 63 to have a slow recovery? recovery_priority is set per-pool to rank recovery relative to other pools. This is not related to your questions.
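
For completeness, a sketch of the per-pool knob mentioned here (<pool> and the value are placeholders):

    # rank this pool's recovery relative to other pools
    ceph osd pool set <pool> recovery_priority 5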

[ceph-users] Re: Monitors not starting, getting "e3 handle_auth_request failed to assign global_id"

2020-12-03 Thread Hoan Nguyen Van
I have the same issue. My cluster version is 14.2.1; I never met it before. I see some information in this tracker: https://tracker.ceph.com/issues/48033