> -----Original Message-----
> From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-
> ow...@vger.kernel.org] On Behalf Of Mark Nelson
> Sent: Wednesday, December 02, 2015 11:04 AM
> To: Gregory Farnum; Vimal
> Cc: ceph-devel
> Subject: Re: Suggestions on tracker 13578
> 
> 
> On 12/02/2015 12:23 PM, Gregory Farnum wrote:
> > On Tue, Dec 1, 2015 at 5:23 AM, Vimal <viku...@redhat.com> wrote:
> >> Hello,
> >>
> >> This mail is to discuss the feature request at
> >> http://tracker.ceph.com/issues/13578.
> >>
> >> If done, such a tool should help point out several mis-configurations
> >> that may cause problems in a cluster later.
> >>
> >> Some of the suggestions are:
> >>
> >> a) A check to understand if the MONs and OSD nodes are on the same
> >> machines.
> >>
> >> b) If /var is a separate partition or not, to prevent the root
> >> filesystem from being filled up.
> >>
> >> c) If monitors are deployed in different failure domains or not.
> >>
> >> d) If the OSDs are deployed in different failure domains.
> >>
> >> e) If a journal disk is used for more than six OSDs. Right now, the
> >> documentation suggests up to six OSD journals on a single
> >> journal disk.
> >>
> >> f) Failure domains depending on the power source.
> >>
> >> There can be several more checks, and it can be a useful tool to test
> >> for problems in an existing cluster or a new installation.
> >>
> >> But I'd like to know how the engineering community sees this, whether it
> >> seems worth pursuing, and what suggestions you have for
> >> improving or adding to this.
> >
> > This is a user experience and support tool; I don't think the
> > engineering community can really judge its value. ;)
> >
> > So sure, sounds good to me. It'll need to get into the hands of users
> > before we find out if it's a good plan or not. I was at the SDI Summit
> > yesterday and was hearing about how some of our choices (like
> > HEALTH_WARN on pg counts) are *really* scary for users who think
> > they're in danger of losing data. I suspect the difficulty of a tool
> > like this will be more in the communication of issues and severity
> > than in what exactly we choose to check.
> 
> Frankly I've never been a big fan of how we report warnings like this through
> the health check.  It's important to let users know if they've set up things
> sub-optimally, but I don't think ceph health is the way to do it.  It's the
> difference between your doctor telling you that you should exercise more and
> lose a few pounds and telling you that you have Ebola and are going to suffer
> an incredibly gruesome and painful death in the next 48 hours. :)
> 

Since I was the one at the SDI Summit who took issue with some of these
warnings, I wholeheartedly agree with Greg's and Mark's comments. A warning in
the health check should tell the user that some corrective action should be
taken, besides turning the warning off :-) I do not have an issue with
reporting advisories, but they should be kept separate from true warnings. If
we want to notify the user of variances from best practices, I suggest a
separate method, e.g. "ceph advise", rather than constantly repeating them on
health checks.
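
To make the separation concrete, here is a rough sketch of how findings could
be tagged by severity so that only true warnings reach the health output while
advisories go to a separate report. Everything in it is hypothetical: the
names Severity, Finding, check_journal_sharing and split_report are made up
for illustration, "ceph advise" does not exist today, and the limit of six
OSDs per journal disk is simply the documentation figure quoted earlier in
this thread.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    WARNING = "warning"    # needs corrective action; belongs in health output
    ADVISORY = "advisory"  # variance from best practice; reported separately


@dataclass(frozen=True)
class Finding:
    severity: Severity
    message: str


# Assumption: the "up to six OSD journals per journal disk" figure from the
# thread, used here only as an illustrative threshold.
MAX_OSDS_PER_JOURNAL = 6


def check_journal_sharing(journal_to_osds):
    """Return one advisory per journal disk that serves too many OSDs.

    journal_to_osds is a hypothetical mapping of journal device -> list of
    OSD ids; a real tool would have to gather this from the cluster.
    """
    findings = []
    for disk, osds in sorted(journal_to_osds.items()):
        if len(osds) > MAX_OSDS_PER_JOURNAL:
            findings.append(Finding(
                Severity.ADVISORY,
                f"{disk} holds journals for {len(osds)} OSDs "
                f"(suggested maximum: {MAX_OSDS_PER_JOURNAL})",
            ))
    return findings


def split_report(findings):
    """Split findings into (warnings, advisories) message lists."""
    warnings = [f.message for f in findings if f.severity is Severity.WARNING]
    advisories = [f.message for f in findings if f.severity is Severity.ADVISORY]
    return warnings, advisories


if __name__ == "__main__":
    layout = {"/dev/sdb": list(range(7)), "/dev/sdc": list(range(4))}
    warnings, advisories = split_report(check_journal_sharing(layout))
    print("health warnings:", warnings)
    print("advisories:", advisories)
```

With a split like this, a shared journal disk would show up only when the user
asks for advice, and the health check would stay reserved for conditions that
actually call for action.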

> > -Greg
> > --
> > To unsubscribe from this list: send the line "unsubscribe ceph-devel"
> > in the body of a message to majord...@vger.kernel.org More majordomo
> > info at  http://vger.kernel.org/majordomo-info.html
> >