On 01/04/2015 19:56, Handzik, Joe wrote:
1. Stick everything in Calamari via Salt calls, similar to what Gregory is
showing. I have concerns about this: I think I'd still need extra information
from the OSDs themselves, so I might need to implement the first half of option
#2 anyway.
2. Scatter it across the codebases (would probably require changes in Ceph,
Calamari, and Calamari-clients). Expose the storage target data via the OSDs,
and move that information upward via the RESTful API. Then, expose another
RESTful API behavior that allows a user to change the LED state. Implementing
as much as possible in the Ceph codebase itself has an added benefit (as far as
I see it, at least) if someone ever decides that the fault LED should be
toggled on based on the state of the OSD or backing storage device. It should
be easier for Ceph to hook into that kind of functionality if Calamari doesn't
need to be involved.
Dan mentioned something I had thought about too: not every OSD's backing
storage is going to be able to use this (Kinetic drives, NVDIMMs, M.2, etc.).
I'd need to implement some way to filter devices and communicate via the
Calamari GUI that a device doesn't have an LED to toggle or doesn't understand
SCSI Enclosure Services (I'm targeting industry-standard HBAs first, and I'll
deal with RAID controllers like Smart Array later).
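On the filtering question: one possible heuristic (a sketch, not a worked-out design) is to lean on the kernel's ses driver, which creates `enclosure_device:*` symlinks in a disk's sysfs device directory when the disk sits in a SES-managed enclosure. A device without such a link (NVDIMM, M.2, Kinetic, ...) has no slot LED that SES can drive. The function name and the injectable sysfs root below are illustrative, not anything that exists in Ceph or Calamari today:

```python
import glob
import os

def has_enclosure_slot(dev, sysfs_root="/sys"):
    """Return True if the kernel's ses driver has linked block device
    `dev` (e.g. "sdb") to an enclosure slot, i.e. an enclosure_device:*
    symlink exists under its sysfs device directory.  Devices without
    such a link have no SES-controllable fault LED and could be
    filtered out of the GUI's "toggle LED" action."""
    pattern = os.path.join(sysfs_root, "class", "block", dev,
                           "device", "enclosure_device:*")
    return len(glob.glob(pattern)) > 0
```

The `sysfs_root` parameter exists only so the check can be exercised against a fake sysfs tree; in practice it would always be `/sys`.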
I'm trying to get this out there early so anyone with particularly strong
implementation opinions can give feedback. Any advice would be appreciated! I'm
still new to the Ceph source base, and probably understand Calamari and
Calamari-clients better than Ceph proper at the moment.
Similar to Mark's comment, I would lean towards option 2 -- it would be
great to have a CLI-driven ability to flash the LEDs for an OSD, and
work on integrating that with a GUI afterwards.
Currently the OSD metadata on drives is pretty limited: it'll just tell
you the /var/lib/ceph/osd/ceph-X path for the data and journal -- the
task of resolving that to a physical device is left as an exercise for
the reader, so to speak.
I would suggest extending osd metadata to also report the block device,
but only for the simple case where an OSD is a GPT partition on a raw
/dev/sdX block device. Resolving block device to underlying disks in
configurations like LVM/MDRAID/multipath is complex in the general case
(I've done it, I don't recommend it), and most ceph clusters don't use
those layers. You could add a fallback ability for users to specify
their block device in ceph.conf, in case the simple GPT-assuming OSD
probing code can't find it from the mount point.
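For the simple case described above, the mount-point-to-disk resolution can be sketched as follows. This is an illustrative Python function (the real probing code would live in the OSD's C++), and it deliberately gives up on anything that isn't a plain /dev/sdX partition, which is where the ceph.conf fallback would kick in. Taking the /proc/mounts contents as a parameter is just for testability:

```python
import re

def osd_block_device(osd_path, mounts_text):
    """Resolve an OSD data mount point (/var/lib/ceph/osd/ceph-X) to its
    whole-disk block device, handling only the simple case of a GPT
    partition on a raw /dev/sdX device.  `mounts_text` is the contents
    of /proc/mounts.  Returns None when the mount point isn't found or
    the device isn't a plain sdX partition (LVM/MDRAID/multipath etc.),
    in which case an admin-supplied override would be consulted."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == osd_path:
            m = re.match(r"(/dev/sd[a-z]+)\d+$", fields[0])
            return m.group(1) if m else None
    return None
```

In production the caller would pass `open("/proc/mounts").read()`; a dm-mapper or md device simply falls through to None rather than attempting the general-case resolution.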
Once you have found the block device and reported it in the OSD
metadata, you can use that information to go poke its LEDs using
enclosure services hooks as you suggest, and wrap that in an OSD 'tell'
command (OSD::do_command). In a similar vein to finding the block
device, it would be a good thing to have a config option here so that
admins can optionally specify a custom command for flashing a particular
OSD's LED. Admins might not bother setting that, but it would mean a
system integrator could optionally configure ceph to work with whatever
exotic custom stuff they have.
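The custom-command idea could look something like the sketch below. The option name and the {dev}/{state} placeholder convention are hypothetical, not an existing Ceph config option; `ledctl` (from ledmon) is just one example of a tool an admin might plug in:

```python
import shlex

# Hypothetical ceph.conf option: a command template with {dev} and
# {state} placeholders, e.g.
#   osd led command = ledctl {state}={dev}
# Left unset, the OSD would fall back to its built-in SES handling.

def build_led_command(template, dev, state):
    """Expand an admin-supplied command template into an argv list
    suitable for subprocess.call().  Returns None when no custom
    command is configured, signalling the built-in path."""
    if not template:
        return None
    return [arg.format(dev=dev, state=state) for arg in shlex.split(template)]
```

Splitting with shlex before substituting keeps a device path containing spaces from being re-tokenized, and handing subprocess an argv list (rather than a shell string) avoids quoting surprises with whatever exotic tool the integrator configures.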
Hopefully that's some help, it sounds like you've already thought it
through a fair bit anyway.
Cheers,
John