Well, keep in mind that this isn't just for identification of failed disks. I 
can conceive of use cases where a user flips on one or more drive LEDs for 
identification or debugging purposes. That would be the distinction between 
identify and fail. We can give the user the ability to distinguish between the 
two and figure out which they'd want to use at any given time (also, keep in 
mind that the failure LED is not customer-controllable behind some storage 
controllers anyway...).

I was wondering if I'd need to carry along the last known disk state...guess 
I'll figure that nuance out as I go.
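The identify/fail split and the last-known-state bookkeeping might look 
roughly like this; note that `LedState`, `set_led`, and the in-memory mapping 
are hypothetical illustrations, not an existing Ceph or calamari API:

```python
from enum import Enum

class LedState(Enum):
    OFF = "off"
    IDENTIFY = "identify"   # user-driven: locate a drive for debugging
    FAIL = "fail"           # cluster-driven: disk is down+out, safe to pull

# Hypothetical tracker: remember the last state set per device path, so the
# mapping survives independently of the OSD id (which may be recreated or
# reused, as Sage notes below).
led_state = {}

def set_led(dev, state):
    """Record the requested LED state for a device (sketch only).

    A real agent would drive the enclosure LED here, e.g. through the
    SES/sgpio layer, after checking cluster health for FAIL requests.
    """
    led_state[dev] = state

set_led("/dev/sdb", LedState.IDENTIFY)
```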

Joe

> On Apr 1, 2015, at 6:17 PM, Sage Weil <[email protected]> wrote:
> 
> #2 really sounds safer to me.  In particular, you need to be really 
> careful not to flash an LED until you're sure you don't need the data on 
> the disk (i.e., it's down+out and the cluster state is healthy--no heroic 
> measures needed).  I think anything that triggers flashing that doesn't 
> have a holistic view of the cluster would be dangerous.
> 
> That, combined with the complications around ceph-osd possibly not 
> running, makes me think this would be the calamari agent that does the 
> flashing.
> 
> It also may be necessary for the disk -> last known state mapping to go 
> somewhere other than in just osd metadata; if the osd is recreated or the 
> id gets reused, that info goes away.  (We could also be careful to avoid 
> deallocating the id until the disk is removed, I guess, but it's another 
> constraint to worry about.)
> 
> sage
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to [email protected]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html