Yeah, that hit the nail on the head. Significantly reducing/eliminating the 
recovery sleep times brings the recovery speed back up to (and beyond!) the 
levels I was expecting to see - recovery is almost an order of magnitude faster 
now. Thanks for educating me about those changes!
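For anyone else who hits this, the gist of what I did is below. This is a sketch for a Luminous-era cluster - the exact option names can vary between releases, so check what your OSDs actually expose before copying it:

```shell
# Check the current recovery sleep values on a running OSD
# (option names may differ between releases):
ceph daemon osd.0 config show | grep recovery_sleep

# Zero the sleeps at runtime across all OSDs; injected values
# revert when the OSD daemons restart:
ceph tell osd.* injectargs '--osd_recovery_sleep_hdd 0 --osd_recovery_sleep_ssd 0'

# To persist across restarts, set them in the [osd] section of
# ceph.conf on each OSD host:
#   osd_recovery_sleep_hdd = 0
#   osd_recovery_sleep_ssd = 0
```

Obviously only sensible if, like me, you have no client I/O guarantees to honour - the sleeps exist precisely to keep recovery from starving clients.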

Rich

On 14/09/17 11:16, Richard Hesketh wrote:
> Hi Mark,
> 
> No, I wasn't familiar with that work. I am in fact comparing speed of 
> recovery to maintenance work I did while the cluster was in Jewel; I haven't 
> manually done anything to sleep settings, only adjusted max backfills OSD 
> settings. New options that introduce arbitrary slowdown to recovery 
> operations to preserve client performance would explain what I'm seeing! I'll 
> have a tinker with adjusting those values (in my particular case client load 
> on the cluster is very low and I don't have to honour any guarantees about 
> client performance - getting back into HEALTH_OK asap is preferable).
> 
> Rich
> 
> On 13/09/17 21:14, Mark Nelson wrote:
>> Hi Richard,
>>
>> Regarding recovery speed, have you looked through any of Neha's results on 
>> recovery sleep testing earlier this summer?
>>
>> https://www.spinics.net/lists/ceph-devel/msg37665.html
>>
>> She tested bluestore and filestore under a couple of different scenarios.  
>> The gist of it is that time to recover changes pretty dramatically depending 
>> on the sleep setting.
>>
>> I don't recall if you said earlier, but are you comparing filestore and 
>> bluestore recovery performance on the same version of ceph with the same 
>> sleep settings?
>>
>> Mark
>>
>> On 09/12/2017 05:24 AM, Richard Hesketh wrote:
>>> Thanks for the links. That does largely confirm that I haven't 
>>> horribly misunderstood anything and haven't been doing anything obviously 
>>> wrong while converting my disks: there's no point specifying separate 
>>> WAL/DB partitions if they're going to go on the same device; throw as much 
>>> space as you have available at the DB partitions and they'll use all the 
>>> space they can; and significantly reduced I/O on the DB/WAL device compared 
>>> to Filestore is expected, since Bluestore has nixed the write amplification 
>>> as much as possible.
>>>
>>> I'm still seeing much reduced recovery speed on my newly Bluestored 
>>> cluster, but I guess that's a tuning issue rather than evidence of 
>>> catastrophe.
>>>
>>> Rich
> 
> 
> 
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 


-- 
Richard Hesketh
Systems Engineer, Research Platforms
BBC Research & Development
