On Sat, Nov 29, 2014 at 11:25 AM, John McEleney
<john.mcele...@netservers.co.uk> wrote:
> Hi all,
>
> I've been working on the Ceph charm with the intention of making it much
> more powerful when it comes to the selection of OSD devices. I wanted to
> knock a few ideas around to see what might be possible.
>
> The main problem I'm trying to address is that with the existing
> implementation, when a new SAS controller is added, or drive caddies get
> swapped around, drive letters (/dev/sd[a-z]) get swapped around. As the
> current charm just asks for a list of devices, and that list of devices
> is global across the entire cluster, it pretty much requires all
> machines to be identical, and unchanging. I also looked into using
> /dev/disk/by-id, but found this to be too inflexible.
>
> Below I've pasted a patch I wrote as a stop-gap for myself. This patch
> allows you to list model numbers for your drives instead of /dev/XXXX
> devices. It then dynamically generates the list of /dev/ devices on each
> host. The patch is pretty unsophisticated, but it solves my immediate
> problem. However, I think we can do better than this.
>
> I've been thinking that XPath strings might be a better way to go. I
> played around with this idea a little. This should give some idea of how
> it could work:
>
> ==========================================
> root@ceph-store1:~# lshw -xml -class disk > /tmp/disk.xml
> root@ceph-store1:~# echo 'cat //node[contains(product,"MG03SCA400")]/logicalname/text()' | xmllint --shell /tmp/disk.xml | grep '^/dev/'
> /dev/sdc
> /dev/sdd
> /dev/sde
> /dev/sdf
> /dev/sdg
> /dev/sdh
> /dev/sdi
> /dev/sdj
> /dev/sdk
> /dev/sdl
> ==========================================
>
> So, that takes care of selecting by model number. How about selecting
> drives that are larger than 3TB?
>
> ==========================================
> root@ceph-store1:~# echo 'cat //node[size>3000000000000]/logicalname/text()' | xmllint --shell /tmp/disk.xml | grep '^/dev/'
> /dev/sdc
> /dev/sdd
> /dev/sde
> /dev/sdf
> /dev/sdg
> /dev/sdh
> /dev/sdi
> /dev/sdj
> /dev/sdk
> /dev/sdl
> ==========================================
>
> Just to give some idea of the power of this, take a look at the info
> lshw compiles:
>
> <node id="disk:3" claimed="true" class="disk"
>       handle="GUID:aaaaaaaa-a5c7-4657-924d-8ed94e1b1aaa">
>  <description>SCSI Disk</description>
>  <product>MG03SCA400</product>
>  <vendor>TOSHIBA</vendor>
>  <physid>0.3.0</physid>
>  <businfo>scsi@1:0.3.0</businfo>
>  <logicalname>/dev/sdf</logicalname>
>  <dev>8:80</dev>
>  <version>DG02</version>
>  <serial>X470A0XXXXXX</serial>
>  <size units="bytes">4000787030016</size>
>  <capacity units="bytes">5334969415680</capacity>
>  <configuration>
>   <setting id="ansiversion" value="6" />
>   <setting id="guid" value="aaaaaaaa-a5c7-4657-924d-8ed94e1b1aaa" />
>   <setting id="sectorsize" value="512" />
>  </configuration>
>  <capabilities>
>   <capability id="7200rpm">7200 rotations per minute</capability>
>   <capability id="gpt-1.00">GUID Partition Table version 1.00</capability>
>   <capability id="partitioned">Partitioned disk</capability>
>   <capability id="partitioned:gpt">GUID partition table</capability>
>  </capabilities>
> </node>
>
> So you could be selecting your drives by vendor, size, model, sector
> size, or any combination of these and other attributes.
>
> The only reason I didn't go any further with this idea yet is that
> "lshw -C disk" is incredibly slow. I tried messing around with disabling
> tests, but it still crawls along. I figure that this wouldn't be that
> big a deal if you could cache the resulting XML file, but that's not
> fully satisfactory either. What if I want to hot-plug a new hard drive
> into the system? lshw would need to be run again. I thought that maybe
> udev could be used for doing this, but I certainly don't want udev
> running lshw once per drive at boot time as the drives are detected.
>
> I'm really wondering if anyone else has any advice on either speeding up
> lshw, or whether there's another simple way of pulling this kind of
> functionality off. Maybe I'm worrying too much about this. As long as
> the charm only fires this hook rarely, and caches the data for the
> duration of the hook run, maybe I don't need to worry?
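For comparison, the same XPath selection can be driven from Python rather
than xmllint. A minimal sketch, not charm code: it assumes python3-lxml is
installed and that lshw emits a single well-formed document, as the xmllint
run above implies; osd_devices_by_xpath is a made-up name:

==========================================
import subprocess

from lxml import etree


def osd_devices_by_xpath(xpath):
    """Return /dev/* logical names for disk nodes matching an XPath query."""
    # Equivalent of: lshw -xml -class disk > /tmp/disk.xml
    xml = subprocess.check_output(['lshw', '-xml', '-class', 'disk'])
    tree = etree.fromstring(xml)
    # The filter mirrors the grep '^/dev/' in the shell pipeline above.
    return [str(name) for name in tree.xpath(xpath)
            if str(name).startswith('/dev/')]


# By model number:
print(osd_devices_by_xpath(
    '//node[contains(product,"MG03SCA400")]/logicalname/text()'))
# Larger than 3TB:
print(osd_devices_by_xpath('//node[size>3000000000000]/logicalname/text()'))
==========================================

This still pays the full lshw runtime cost on every call, though, so the
caching question above still applies.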
I'm wondering if, instead of lshw and the time consumption there, we could
continue with lsblk. There's a bit more information there (size, model,
rotational, etc.), which seems to satisfy most of the lshw examples you've
given, and it's relatively fast in comparison. e.g.:

https://gist.github.com/kapilt/d0485d6fac3be6caaed2
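A rough sketch of that route (the column names and flags are from lsblk(8):
-b bytes, -d skip partition rows, -n no header, -r raw output; the selection
logic at the end is just an illustration, not what the gist does):

==========================================
import subprocess


def list_disks():
    """Parse `lsblk -bdnr -o KNAME,SIZE,ROTA,MODEL` into dicts."""
    out = subprocess.check_output(
        ['lsblk', '-bdnr', '-o', 'KNAME,SIZE,ROTA,MODEL']).decode()
    disks = []
    for line in out.splitlines():
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue  # device reports no model (e.g. virtual disks)
        kname, size, rota, model = fields
        disks.append({'dev': '/dev/' + kname,
                      'size': int(size),          # -b gives bytes
                      'rotational': rota == '1',
                      'model': model.strip()})
    return disks


# e.g. rotational disks of a given model that are larger than 3TB:
print([d['dev'] for d in list_disks()
       if d['model'] == 'MG03SCA400' and d['size'] > 3 * 10**12])
==========================================

The speed difference comes from lsblk reading sysfs rather than probing the
hardware the way lshw does.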
Another option: here's a script around a similar use case that gives
hierarchical info on drives from the controller on down, and supports
layered block devices:

http://www.spinics.net/lists/raid/msg34460.html

Current implementation:

https://github.com/pturmel/lsdrv/blob/master/lsdrv

cheers,

Kapil

> John
>
> Patch to match against model number (NOT REGRESSION TESTED):
>
> === modified file 'config.yaml'
> --- config.yaml  2014-10-06 22:07:41 +0000
> +++ config.yaml  2014-11-29 15:42:41 +0000
> @@ -42,16 +42,35 @@
>        These devices are the range of devices that will be checked for and
>        used across all service units.
>        .
> +      This can be a list of devices, or a list of model numbers which will
> +      be used to automatically compile a list of matching devices.
> +      .
>        For ceph >= 0.56.6 these can also be directories instead of devices - the
>        charm assumes anything not starting with /dev is a directory instead.
> +      Any device not starting with a / is assumed to be a model number
>    osd-journal:
>      type: string
>      default:
>
> === modified file 'hooks/charmhelpers/contrib/storage/linux/utils.py'
> --- hooks/charmhelpers/contrib/storage/linux/utils.py  2014-09-22 08:51:15 +0000
> +++ hooks/charmhelpers/contrib/storage/linux/utils.py  2014-11-29 15:30:25 +0000
> @@ -1,5 +1,6 @@
>  import os
>  import re
> +import subprocess
>  from stat import S_ISBLK
>
>  from subprocess import (
> @@ -51,3 +52,7 @@
>      if is_partition:
>          return bool(re.search(device + r"\b", out))
>      return bool(re.search(device + r"[0-9]+\b", out))
> +
> +def devices_by_model(model):
> +    proc = subprocess.Popen(['lsblk', '-nio', 'KNAME,MODEL'], stdout=subprocess.PIPE)
> +    return ['/dev/' + dev.split()[0] for dev in [line.strip() for line in proc.stdout] if re.search(model + '$', dev)]
>
> === modified file 'hooks/hooks.py'
> --- hooks/hooks.py  2014-09-30 03:06:10 +0000
> +++ hooks/hooks.py  2014-11-29 15:22:48 +0000
> @@ -44,6 +44,9 @@
>      get_ipv6_addr,
>      format_ipv6_addr
>  )
> +from charmhelpers.contrib.storage.linux.utils import (
> +    devices_by_model
> +)
>
>  from utils import (
>      render_template,
> @@ -166,14 +169,18 @@
>      else:
>          return False
>
> -
>  def get_devices():
>      if config('osd-devices'):
> -        return config('osd-devices').split(' ')
> +        results = []
> +        for dev in config('osd-devices').split(' '):
> +            if dev.startswith('/'):
> +                results.append(dev)
> +            else:
> +                results += devices_by_model(dev)
> +        return results
>      else:
>          return []
>
> -
>  @hooks.hook('mon-relation-joined')
>  def mon_relation_joined():
>      for relid in relation_ids('mon'):
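For what it's worth, a patch along these lines lets the two styles be mixed
in a single config value; an illustrative (untested, per the note above)
example, assuming the service is deployed under the name "ceph", with
juju 1.x syntax:

==========================================
$ juju set ceph osd-devices="/dev/sdb MG03SCA400"
==========================================

Here /dev/sdb would be used verbatim on every unit, while the model number
would expand on each unit to whatever matching disks lsblk reports there.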
> --
> -----------------------------
> John McEleney
> Netservers Ltd.
> 21 Signet Court
> Cambridge
> CB5 8LA
> http://www.netservers.co.uk
> -----------------------------
> Tel. 01223 446000
> Fax. 0870 4861970
> -----------------------------
> Registered in England
> Number: 04028770
> -----------------------------

--
Juju mailing list
Juju@lists.ubuntu.com
Modify settings or unsubscribe at:
https://lists.ubuntu.com/mailman/listinfo/juju