Seems both of you are spot on. I injected the change and it's now moving at .080 instead of .002. I did fix the label on the drives from HDD to SSD, but I didn't restart the OSDs due to the recovery process. Seeing it fly now. I also restarted the stuck OSDs, but since they are where the data is, they keep going back to slow. I imagine this is part of the pg creation process and my failure to adjust these settings when I created them. Thanks!
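For anyone following the thread, a hedged sketch of the HDD-to-SSD relabel mentioned above (osd.0 is a placeholder id; in Luminous an existing device class has to be removed before a new one can be set):

```shell
# Clear the misdetected class, then set the correct one (Luminous 12.x):
ceph osd crush rm-device-class osd.0
ceph osd crush set-device-class ssd osd.0

# Verify the class column in the CRUSH tree:
ceph osd tree
```

Note that this only changes the CRUSH device class; as discussed below, settings detected at OSD startup (such as bluestore_bdev_type) may not follow it without a restart.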
I wasn't able to pull the config from the daemon (directory not found), but I used the web panel to look at the setting. I also found that "bluestore_bdev_type" was set to "hdd", so I am going to see if there is a way to change that, because when I restarted some of the stuck OSDs, the tag change I made doesn't seem to affect this setting. I use ceph-deploy to do the deployment (after Ansible server setup), so it could also be a switch I need to be using. This is our first SSD cluster.

Reed: If you don't mind me asking, what was the graphing tool you had in the post? I am using the ceph health web panel right now, but it doesn't go that deep.

Regards,
Brent

From: Reed Dier <reed.d...@focusvq.com>
Sent: Wednesday, March 20, 2019 11:01 AM
To: Brent Kennedy <bkenn...@cfl.rr.com>
Cc: ceph-users <ceph-users@lists.ceph.com>
Subject: Re: [ceph-users] SSD Recovery Settings

Not sure what your OSD config looks like, but when I was moving from Filestore to Bluestore on my SSD OSDs (and NVMe FS journal to NVMe Bluestore block.db), I had an issue where the OSD was incorrectly being reported as rotational in some part of the chain. Once I overcame that, I had a huge boost in recovery performance (repaving OSDs). Might be something useful in there.

http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-February/025039.html

Reed

On Mar 19, 2019, at 11:29 PM, Konstantin Shalygin <k0...@k0ste.ru> wrote:

I setup an SSD Luminous 12.2.11 cluster and realized after data had been added that pg_num was not set properly on the default.rgw.buckets.data pool (where all the data goes). I adjusted the settings up, but recovery is going really slow (like 56-110 MiB/s), ticking down at .002 per log entry (ceph -w). These are all SSDs on Luminous 12.2.11 (no journal drives) with a set of 2 10Gb fiber twinax in a bonded LACP config. There are six servers, 60 OSDs, each OSD is 2TB.
There was about 4TB of data (3 million objects) added to the cluster before I noticed the red blinking lights. I tried adjusting the recovery with:

```
ceph tell 'osd.*' injectargs '--osd-max-backfills 16'
ceph tell 'osd.*' injectargs '--osd-recovery-max-active 30'
```

Which did help a little, but didn't seem to have the impact I was looking for. I have used these settings on HDD clusters before to speed things up (using 8 backfills and 4 max active, though). Did I miss something, or is this part of the pg expansion process? Should I be doing something else with SSD clusters?

Regards,
-Brent

Existing Clusters:
Test: Luminous 12.2.11 with 3 osd servers, 1 mon/man, 1 gateway (all virtual on SSD)
US Production (HDD): Jewel 10.2.11 with 5 osd servers, 3 mons, 3 gateways behind haproxy LB
UK Production (HDD): Luminous 12.2.11 with 15 osd servers, 3 mons/man, 3 gateways behind haproxy LB
US Production (SSD): Luminous 12.2.11 with 6 osd servers, 3 mons/man, 3 gateways behind haproxy LB

Try to lower `osd_recovery_sleep*` options. You can get your current values from the ceph admin socket like this:

```
ceph daemon osd.0 config show | jq 'to_entries[] | if (.key|test("^(osd_recovery_sleep)(.*)")) then (.) else empty end'
```

k

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
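Following Konstantin's suggestion above, a hedged sketch of lowering the sleep values at runtime. The values shown are illustrative: in Luminous, osd_recovery_sleep_ssd defaults to 0 while osd_recovery_sleep_hdd defaults to 0.1, so an SSD OSD misdetected as rotational throttles recovery with the hdd value.

```shell
# Inject lower sleep values into all running OSDs (takes effect immediately,
# does not persist across restarts; add to ceph.conf to make permanent):
ceph tell 'osd.*' injectargs '--osd_recovery_sleep_hdd 0 --osd_recovery_sleep_ssd 0'

# Confirm on one daemon via the admin socket:
ceph daemon osd.0 config get osd_recovery_sleep_hdd
```

The per-device-class sleep options only matter while the OSD reports the wrong rotational type; once bluestore_bdev_type correctly reads "ssd", the default of 0 already applies.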