Hi all,

I've been working on the Ceph charm with the intention of making it much
more powerful when it comes to the selection of OSD devices. I wanted to
knock a few ideas around to see what might be possible.

The main problem I'm trying to address is that with the existing
implementation, when a new SAS controller is added, or drive caddies get
swapped around, drive letters (/dev/sd[a-z]) get shuffled. As the
current charm just asks for a list of devices, and that list of devices
is global across the entire cluster, it pretty much requires all
machines to be identical, and unchanging. I also looked into using
/dev/disk/by-id, but found this to be too inflexible.

Below I've pasted a patch I wrote as a stop-gap for myself. This patch
allows you to list model numbers for your drives instead of /dev/XXXX
devices. It then dynamically generates the list of /dev/ devices on each
host. The patch is pretty unsophisticated, but it solves my immediate
problem. However, I think we can do better than this.

I've been thinking that XPath expressions might be a better way to go. I
played around with this idea a little. The following should give some
idea of how it could work:

==========================================
root@ceph-store1:~# lshw -xml -class disk > /tmp/disk.xml
root@ceph-store1:~# echo 'cat //node[contains(product,"MG03SCA400")]/logicalname/text()' | xmllint --shell /tmp/disk.xml | grep '^/dev/'
/dev/sdc
/dev/sdd
/dev/sde
/dev/sdf
/dev/sdg
/dev/sdh
/dev/sdi
/dev/sdj
/dev/sdk
/dev/sdl
==========================================

So, that takes care of selecting by model number. How about selecting
drives that are larger than 3TB?

==========================================
root@ceph-store1:~# echo 'cat //node[size>3000000000000]/logicalname/text()' | xmllint --shell /tmp/disk.xml | grep '^/dev/'
/dev/sdc
/dev/sdd
/dev/sde
/dev/sdf
/dev/sdg
/dev/sdh
/dev/sdi
/dev/sdj
/dev/sdk
/dev/sdl
==========================================

Just to give some idea of the power of this, take a look at the info
lshw compiles:

  <node id="disk:3" claimed="true" class="disk" 
handle="GUID:aaaaaaaa-a5c7-4657-924d-8ed94e1b1aaa">
   <description>SCSI Disk</description>
   <product>MG03SCA400</product>
   <vendor>TOSHIBA</vendor>
   <physid>0.3.0</physid>
   <businfo>scsi@1:0.3.0</businfo>
   <logicalname>/dev/sdf</logicalname>
   <dev>8:80</dev>
   <version>DG02</version>
   <serial>X470A0XXXXXX</serial>
   <size units="bytes">4000787030016</size>
   <capacity units="bytes">5334969415680</capacity>
   <configuration>
    <setting id="ansiversion" value="6" />
    <setting id="guid" value="aaaaaaaa-a5c7-4657-924d-8ed94e1b1aaa" />
    <setting id="sectorsize" value="512" />
   </configuration>
   <capabilities>
    <capability id="7200rpm" >7200 rotations per minute</capability>
    <capability id="gpt-1.00" >GUID Partition Table version 1.00</capability>
    <capability id="partitioned" >Partitioned disk</capability>
    <capability id="partitioned:gpt" >GUID partition table</capability>
   </capabilities>
  </node>

So, you could be selecting your drives by vendor, size, model, sector
size, or any combination of these and other attributes.
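If the charm ended up doing this itself rather than shelling out to
xmllint, here's a rough sketch of how the filtering could look in
Python using only the stdlib. The function name `select_disks` and the
sample XML are made up for illustration (the sample is a cut-down
version of the lshw output above), and stdlib ElementTree only supports
a limited XPath subset, so this filters on attributes in plain Python:

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for `lshw -xml -class disk` output.
SAMPLE = """<list>
  <node id="disk:3" class="disk">
   <product>MG03SCA400</product>
   <vendor>TOSHIBA</vendor>
   <logicalname>/dev/sdf</logicalname>
   <size units="bytes">4000787030016</size>
  </node>
  <node id="disk:4" class="disk">
   <product>ST500DM002</product>
   <vendor>SEAGATE</vendor>
   <logicalname>/dev/sda</logicalname>
   <size units="bytes">500107862016</size>
  </node>
</list>"""

def select_disks(xml_text, vendor=None, min_bytes=None):
    """Return /dev names of disk nodes matching the given criteria."""
    devices = []
    for node in ET.fromstring(xml_text).iter('node'):
        if node.get('class') != 'disk':
            continue
        if vendor is not None and node.findtext('vendor') != vendor:
            continue
        size = int(node.findtext('size') or 0)
        if min_bytes is not None and size < min_bytes:
            continue
        devices.append(node.findtext('logicalname'))
    return devices

print(select_disks(SAMPLE, min_bytes=3 * 10**12))  # ['/dev/sdf']
print(select_disks(SAMPLE, vendor='SEAGATE'))      # ['/dev/sda']
```

Extra criteria (model, sector size, capabilities) would just be more
keyword arguments on the same loop.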

The only reason I didn't go any further with this idea yet is that "lshw
-C disk" is incredibly slow. I tried messing around with disabling
tests, but it still crawls along. I figure that this wouldn't be that
big a deal if you could cache the resulting XML file, but that's not
fully satisfactory either. What if I want to hot-plug a new hard drive
into the system? lshw would need to be run again. I thought that maybe
udev could be used for doing this, but I certainly don't want udev
running lshw once per drive at boot time as the drives are detected.
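For the cache-per-hook-run idea, a minimal sketch (the helper name
`lshw_disk_xml` is made up) would be to memoise the lshw output in a
module-level variable, so however many times get_devices() is called
during a hook, lshw runs at most once; the `runner` parameter exists
purely so the slow call can be stubbed out in tests:

```python
import subprocess

# Holds the raw lshw XML for the lifetime of this hook invocation.
_lshw_cache = None

def lshw_disk_xml(runner=subprocess.check_output):
    """Run `lshw -xml -class disk` at most once per hook run."""
    global _lshw_cache
    if _lshw_cache is None:
        _lshw_cache = runner(['lshw', '-xml', '-class', 'disk'])
    return _lshw_cache
```

Hot-plug is still unsolved with this, of course, but since each hook
run starts a fresh process, the cache can never go stale across runs.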

I'm really wondering if anyone else has any advice on either speeding up
lshw, or if there's any other simple way of pulling this kind of
functionality off. Maybe I'm worrying too much about this. As long as
the charm only fires this hook rarely, and caches the data for the
duration of the hook run, maybe I don't need to worry?

John

Patch to match against model number (NOT REGRESSION TESTED):
=== modified file 'config.yaml'
--- config.yaml 2014-10-06 22:07:41 +0000
+++ config.yaml 2014-11-29 15:42:41 +0000
@@ -42,16 +42,35 @@
       These devices are the range of devices that will be checked for and
       used across all service units.
       .
+      This can be a list of devices, or a list of model numbers which will
+      be used to automatically compile a list of matching devices.
+      .
       For ceph >= 0.56.6 these can also be directories instead of devices - the
       charm assumes anything not starting with /dev is a directory instead.
+      Any device not starting with a / is assumed to be a model number
   osd-journal:
     type: string
     default:

=== modified file 'hooks/charmhelpers/contrib/storage/linux/utils.py'
--- hooks/charmhelpers/contrib/storage/linux/utils.py   2014-09-22 08:51:15 
+0000
+++ hooks/charmhelpers/contrib/storage/linux/utils.py   2014-11-29 15:30:25 
+0000
@@ -1,5 +1,6 @@
 import os
 import re
+import subprocess
 from stat import S_ISBLK
 
 from subprocess import (
@@ -51,3 +52,7 @@
     if is_partition:
         return bool(re.search(device + r"\b", out))
     return bool(re.search(device + r"[0-9]+\b", out))
+
+def devices_by_model(model):
+    proc = subprocess.Popen(['lsblk', '-nio', 'KNAME,MODEL'], stdout=subprocess.PIPE)
+    return ['/dev/' + dev.split()[0] for dev in [line.strip() for line in proc.stdout] if re.search(model + '$', dev)]

=== modified file 'hooks/hooks.py'
--- hooks/hooks.py      2014-09-30 03:06:10 +0000
+++ hooks/hooks.py      2014-11-29 15:22:48 +0000
@@ -44,6 +44,9 @@
     get_ipv6_addr,
     format_ipv6_addr
 )
+from charmhelpers.contrib.storage.linux.utils import (
+    devices_by_model
+)
 
 from utils import (
     render_template,
@@ -166,14 +169,18 @@
     else:
         return False
 
-
 def get_devices():
     if config('osd-devices'):
-        return config('osd-devices').split(' ')
+        results = []
+        for dev in config('osd-devices').split(' '):
+            if dev.startswith('/'):
+                results.append(dev)
+            else:
+                results += devices_by_model(dev)
+        return results
     else:
         return []
 
-
 @hooks.hook('mon-relation-joined')
 def mon_relation_joined():
     for relid in relation_ids('mon'):


-- 
-----------------------------
John McEleney
Netservers Ltd.
21 Signet Court
Cambridge
CB5 8LA
http://www.netservers.co.uk
-----------------------------
Tel. 01223 446000
Fax. 0870 4861970
-----------------------------
Registered in England
Number: 04028770
-----------------------------


-- 
Juju mailing list
Juju@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju
