SMART monitoring

2013-12-26 Thread James Harper
What would be the best approach to integrate SMART with Ceph, for the predictive failure case? Assuming you agree with the SMART diagnosis of an impending failure, would it be better to automatically start migrating data off the OSD (reduce the weight to 0?), or to just prompt the user to replace t

Re: SMART monitoring

2013-12-26 Thread Sage Weil
Hi James,

On Fri, 27 Dec 2013, James Harper wrote:
> What would be the best approach to integrate SMART with ceph, for the
> predictive failure case?

Currently (as you know) we don't do anything with SMART. It is obviously important for the entire system, but I'm unsure whether it should be s

Re: SMART monitoring

2013-12-27 Thread Justin Erenkrantz
On Thu, Dec 26, 2013 at 9:17 PM, Sage Weil wrote:
> I think the question comes down to whether Ceph should take some internal
> action based on the information, or whether that is better handled by some
> external monitoring agent. For example, an external agent might collect
> SMART info into gr
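
As an illustration of the external-agent option discussed above, here is a minimal Python sketch, not anything Ceph ships. It assumes an ATA drive, where `smartctl -H` prints a line beginning "SMART overall-health self-assessment test result:", and it assumes that marking the OSD out with `ceph osd out` is an acceptable way to start moving data off it. The function names are hypothetical, and the sketch only builds the command; it never runs it.

```python
# Hypothetical helper names; only the smartctl verdict line and the
# `ceph osd out <id>` command come from the real tools.
# Assumption: ATA drives. SCSI/SAS drives report health in a different line.
HEALTH_PREFIX = "SMART overall-health self-assessment test result:"


def smart_health_passed(smartctl_output):
    """Return True or False for the health verdict, or None if no verdict line is found."""
    for line in smartctl_output.splitlines():
        line = line.strip()
        if line.startswith(HEALTH_PREFIX):
            return line[len(HEALTH_PREFIX):].strip() == "PASSED"
    return None


def drain_command(osd_id):
    """Build (but do not run) the command that marks an OSD out."""
    return ["ceph", "osd", "out", str(osd_id)]


def plan_action(osd_id, smartctl_output):
    """Return a drain command only when SMART explicitly reports a failure."""
    if smart_health_passed(smartctl_output) is False:
        return drain_command(osd_id)
    # PASSED or unknown: leave the decision to the operator.
    return None
```

Returning `None` when no verdict line is found keeps the agent from acting on output it cannot interpret, which leaves the final decision with the operator, as the thread's "just prompt the user" alternative suggests.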

Re: SMART monitoring

2013-12-27 Thread Andrey Korolyov
SMART failures can be dangerous if they are not bad enough to completely tear down an OSD; in that case the OSD will not flap and will not be marked as down in time, but cluster performance is greatly affected. I don't think that the SMART monitoring task is somehow related to Ceph because seper

Re: SMART monitoring

2014-05-22 Thread Andrey Korolyov
> Hi,
>
> Judging from my personal experience SMART failures can be dangerous if
> they are not bad enough to completely tear down an OSD therefore it will
> not flap and will not be mark