Thanks for answering even before I asked the questions :)

So, bottom line: is the HEALTH_ERR state simply part of taking one (or a bunch of) OSDs down? And is a HEALTH_ERR period of 2-4 seconds within normal bounds? For context, the CPUs are 2609v3, one per 4 OSDs. (I know; they're far from the fastest CPUs.)

On Thu, Aug 3, 2017 at 1:55 PM, Hans van den Bogert <hansbog...@gmail.com> wrote:

> What are the implications of this? Because I can see a lot of blocked
> requests piling up when using 'noout' and 'nodown'. That probably makes
> sense though.
> Another thing: now, when the OSDs come back online, I again see multiple
> periods of HEALTH_ERR state. Is that to be expected?
>
> On Thu, Aug 3, 2017 at 1:36 PM, linghucongsong <linghucongs...@163.com>
> wrote:
>
>> set the osd noout nodown
>>
>> At 2017-08-03 18:29:47, "Hans van den Bogert" <hansbog...@gmail.com>
>> wrote:
>>
>> Hi all,
>>
>> One thing which has bothered me since the beginning of using ceph is that a
>> reboot of a single OSD causes a HEALTH_ERR state for the cluster for at
>> least a couple of seconds.
>>
>> In the case of a planned reboot of an OSD node, should I do some extra
>> commands in order not to go to HEALTH_ERR state?
>>
>> Thanks,
>>
>> Hans
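
For reference, the flag-based procedure suggested in the quoted thread ("set the osd noout nodown") could be scripted roughly as below. This is only a sketch under stated assumptions, not a recommended or verified procedure: it assumes the `ceph` CLI is on the PATH with admin credentials, and the names `planned_maintenance`, `run_ceph` and the injectable `run` parameter are hypothetical helpers made up for illustration. It sets the flags, runs whatever maintenance step is passed in, and always unsets the flags again, even if the maintenance step fails.

```python
import subprocess
from typing import Callable, List, Sequence

# Flags named in the thread. 'noout' keeps down OSDs from being marked out
# (so no rebalancing starts); 'nodown' keeps OSDs from being marked down,
# which is likely why requests to the stopped OSDs pile up as blocked.
FLAGS = ("noout", "nodown")


def run_ceph(cmd: Sequence[str]) -> None:
    """Hypothetical default runner: execute the command, raise on failure."""
    subprocess.run(list(cmd), check=True)


def planned_maintenance(
    do_maintenance: Callable[[], None],
    flags: Sequence[str] = FLAGS,
    run: Callable[[Sequence[str]], None] = run_ceph,
) -> None:
    """Set the OSD flags, run the maintenance step, then unset the flags."""
    set_flags: List[str] = []
    try:
        for flag in flags:
            run(["ceph", "osd", "set", flag])
            set_flags.append(flag)
        do_maintenance()
    finally:
        # Unset in reverse order, and only the flags that were actually set.
        for flag in reversed(set_flags):
            run(["ceph", "osd", "unset", flag])
```

Passing only `flags=("noout",)` would avoid the blocked requests that come with 'nodown', at the cost of the OSDs being reported down while the node reboots; whether that also avoids the short HEALTH_ERR periods is exactly the open question in this thread.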

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com