Thanks Nick. Looking at the script, it's something along the lines of what I was after.
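For anyone who finds this thread later, this is roughly how I plan to wire it up. The hook path, script name, and the "root=ssd" bucket below are placeholder examples of mine, not something taken from Wido's gist:

    # ceph.conf on the OSD hosts
    [osd]
    # Option A: pin placement by hand and stop OSDs from re-homing
    # themselves under root=default on startup:
    osd crush update on start = false

    # Option B: leave the startup update on, but let a custom script decide
    # where each OSD lives (this is how the gist is meant to be used):
    # crush location hook = /usr/local/bin/crush-location.sh

    # Ceph calls the hook roughly as:
    #   crush-location.sh --cluster ceph --id <osd-id> --type osd
    # and the single line it prints to stdout becomes the OSD's CRUSH
    # location, e.g. "host=nodeA-ssd root=ssd".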
I just realized that I could create multiple availability-group "hosts", but your point stands that the failure domain is an entire host. Thanks for all your help, everyone.

> On Sep 13, 2015, at 11:47, Nick Fisk <n...@fisk.me.uk> wrote:
>
>> -----Original Message-----
>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of deeepdish
>> Sent: 13 September 2015 02:47
>> To: Johannes Formann <mlm...@formann.de>
>> Cc: ceph-users@lists.ceph.com
>> Subject: Re: [ceph-users] CRUSH odd bucket affinity / persistence
>>
>> Johannes,
>>
>> Thank you, "osd crush update on start = false" did the trick. I wasn't aware
>> that Ceph has automatic placement logic for OSDs
>> (http://permalink.gmane.org/gmane.comp.file-systems.ceph.user/9035).
>> This brings up a best-practice question.
>>
>> How are OSD hosts with multiple storage types (e.g. spinners + flash/SSD)
>> typically laid out in the field, from a CRUSH map / device location
>> perspective? Preference is for a scale-out design.
>
> I use something based on this script:
>
> https://gist.github.com/wido/5d26d88366e28e25e23d
>
> together with the "crush location hook" config value in ceph.conf. You can
> pretty much place OSDs wherever you like with it.
>
>> In addition to the SSDs used for an EC cache tier, I'm also planning a 5:1
>> ratio of spinners to SSDs for journals. In this case I want to implement
>> availability groups within the OSD host itself.
>>
>> E.g. in a 26-drive chassis there will be 6 SSDs + 20 spinners: 2 SSDs for
>> the replicated cache tier, and 4 SSDs acting as journals for 4 availability
>> groups of 5 spinners each. The idea is to have CRUSH take an SSD journal
>> failure (which takes out 5 spinners) into account.
>
> By default Ceph makes the host the smallest failure domain, so I'm not sure
> there is any benefit in telling CRUSH that several OSDs share one journal.
> Whether you lose one OSD or all the OSDs in a server, there shouldn't be any
> difference in the risk of data loss. Or have I misunderstood your question?
>
>> Thanks.
>>
>> On Sep 12, 2015, at 19:11, Johannes Formann <mlm...@formann.de> wrote:
>>
>> Hi,
>>
>>> I'm having a (strange) issue with OSD bucket persistence / affinity on my
>>> test cluster.
>>>
>>> The cluster is PoC / test, by no means production. It consists of a single
>>> OSD / MON host plus another MON running in a KVM VM.
>>>
>>> Out of 12 OSDs I'm trying to get osd.10 and osd.11 to be part of the ssd
>>> bucket in my CRUSH map. This works fine when either editing the CRUSH map
>>> by hand (export, decompile, edit, compile, import) or via the ceph osd
>>> crush set command:
>>>
>>> "ceph osd crush set osd.11 0.140 root=ssd"
>>>
>>> I'm able to verify that the OSD / MON host and the other MON I have
>>> running see the same CRUSH map.
>>>
>>> After rebooting the OSD / MON host, both osd.10 and osd.11 become part of
>>> the default bucket. How can I ensure that OSDs persist in their configured
>>> buckets?
>>
>> I guess you have set "osd crush update on start = true"
>> (http://ceph.com/docs/master/rados/operations/crush-map/) and only the
>> default "root" entry.
>>
>> Either fix the "root" entry in ceph.conf or set "osd crush update on
>> start = false".
>>
>> greetings
>>
>> Johannes
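For the archive, this is the export / decompile / edit / compile / import cycle referenced above, as I've been running it; the filenames are arbitrary, and the osd.11 weight is just the example from earlier in the thread:

    ceph osd getcrushmap -o crushmap.bin        # export the compiled map
    crushtool -d crushmap.bin -o crushmap.txt   # decompile to editable text
    # edit crushmap.txt, e.g. move osd.10 / osd.11 under a "root ssd" bucket
    crushtool -c crushmap.txt -o crushmap.new   # recompile
    ceph osd setcrushmap -i crushmap.new        # import the new map

    # or, for a one-off move without the round trip:
    ceph osd crush set osd.11 0.140 root=ssd

Either way, without "osd crush update on start = false" (or a location hook that reports the right root) the OSDs put themselves back under root=default the next time they start, which is what I was originally seeing.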