Thanks Nick. Looking at the script, it's something along the lines of what I was after.
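For anyone who finds this thread later, this is roughly how I plan to wire it up. The hook path, script name, and the "root=ssd" bucket below are placeholder examples of mine, not something taken from Wido's gist:

    # ceph.conf on the OSD hosts
    [osd]
    # Option A: pin placement by hand and stop OSDs from re-homing
    # themselves under root=default on startup:
    osd crush update on start = false

    # Option B: leave the startup update on, but let a custom script decide
    # where each OSD lives (this is how the gist is meant to be used):
    # crush location hook = /usr/local/bin/crush-location.sh

    # Ceph calls the hook roughly as:
    #   crush-location.sh --cluster ceph --id <osd-id> --type osd
    # and the single line it prints to stdout becomes the OSD's CRUSH
    # location, e.g. "host=nodeA-ssd root=ssd".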
I just realized that I could create multiple availability-group "hosts", but your point stands that the failure domain is an entire host. Thanks for all your help, everyone.

> On Sep 13, 2015, at 11:47, Nick Fisk <n...@fisk.me.uk> wrote:
>
>> -----Original Message-----
>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of deeepdish
>> Sent: 13 September 2015 02:47
>> To: Johannes Formann <mlm...@formann.de>
>> Cc: ceph-users@lists.ceph.com
>> Subject: Re: [ceph-users] CRUSH odd bucket affinity / persistence
>>
>> Johannes,
>>
>> Thank you, "osd crush update on start = false" did the trick. I wasn't aware
>> that Ceph has automatic placement logic for OSDs
>> (http://permalink.gmane.org/gmane.comp.file-systems.ceph.user/9035).
>> This brings up a best-practice question.
>>
>> How are OSD hosts with multiple storage types (e.g. spinners + flash/SSD)
>> typically laid out in the field, from a CRUSH map / device location
>> perspective? Preference is for a scale-out design.
>
> I use something based on this script:
>
> https://gist.github.com/wido/5d26d88366e28e25e23d
>
> together with the "crush location hook" config value in ceph.conf. You can
> pretty much place OSDs wherever you like with it.
>
>> In addition to the SSDs used for an EC cache tier, I'm also planning a 5:1
>> ratio of spinners to SSDs for journals. In this case I want to implement
>> availability groups within the OSD host itself.
>>
>> E.g. in a 26-drive chassis there will be 6 SSDs + 20 spinners: 2 SSDs for
>> the replicated cache tier, and 4 SSDs acting as journals for 4 availability
>> groups of 5 spinners each. The idea is to have CRUSH take an SSD journal
>> failure (which takes out 5 spinners) into account.
>
> By default Ceph makes the host the smallest failure domain, so I'm not sure
> there is any benefit in telling CRUSH that several OSDs share one journal.
> Whether you lose one OSD or all the OSDs in a server, there shouldn't be any
> difference in the risk of data loss. Or have I misunderstood your question?
>
>> Thanks.
>>
>> On Sep 12, 2015, at 19:11, Johannes Formann <mlm...@formann.de> wrote:
>>
>> Hi,
>>
>>> I'm having a (strange) issue with OSD bucket persistence / affinity on my
>>> test cluster.
>>>
>>> The cluster is PoC / test, by no means production. It consists of a single
>>> OSD / MON host plus another MON running in a KVM VM.
>>>
>>> Out of 12 OSDs I'm trying to get osd.10 and osd.11 to be part of the ssd
>>> bucket in my CRUSH map. This works fine when either editing the CRUSH map
>>> by hand (export, decompile, edit, compile, import) or via the ceph osd
>>> crush set command:
>>>
>>> "ceph osd crush set osd.11 0.140 root=ssd"
>>>
>>> I'm able to verify that the OSD / MON host and the other MON I have
>>> running see the same CRUSH map.
>>>
>>> After rebooting the OSD / MON host, both osd.10 and osd.11 become part of
>>> the default bucket. How can I ensure that OSDs persist in their configured
>>> buckets?
>>
>> I guess you have set "osd crush update on start = true"
>> (http://ceph.com/docs/master/rados/operations/crush-map/) and only the
>> default "root" entry.
>>
>> Either fix the "root" entry in ceph.conf or set "osd crush update on
>> start = false".
>>
>> greetings
>>
>> Johannes
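For the archive, this is the export / decompile / edit / compile / import cycle referenced above, as I've been running it; the filenames are arbitrary, and the osd.11 weight is just the example from earlier in the thread:

    ceph osd getcrushmap -o crushmap.bin        # export the compiled map
    crushtool -d crushmap.bin -o crushmap.txt   # decompile to editable text
    # edit crushmap.txt, e.g. move osd.10 / osd.11 under a "root ssd" bucket
    crushtool -c crushmap.txt -o crushmap.new   # recompile
    ceph osd setcrushmap -i crushmap.new        # import the new map

    # or, for a one-off move without the round trip:
    ceph osd crush set osd.11 0.140 root=ssd

Either way, without "osd crush update on start = false" (or a location hook that reports the right root) the OSDs put themselves back under root=default the next time they start, which is what I was originally seeing.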