Hello,
On Thu, 01 Jan 2015 18:25:47 +1300 Mark Kirkwood wrote:
The number of monitors recommended and the fact that a voting quorum is
the way it works is covered here:
http://ceph.com/docs/master/rados/deployment/ceph-deploy-mon/
but I agree that you should probably not get a HEALTH_OK status when you
have just set up 2 (or in fact any even number of) monitors...HEALTH_WARN
would make more sense, with a wee message suggesting adding at least one
more!
I think what Jiri meant is that when the whole cluster goes into a deadlock
due to losing monitor quorum, ceph -s etc. won't work anymore either.
And while the cluster rightfully shouldn't be doing anything in such a
state, querying the surviving/reachable monitor and being told as much
would indeed be a nice feature, as opposed to deafening silence.
As for your suggestion, while certainly helpful, it is my not so humble
opinion that the WARN state right now is totally overloaded and quite
frankly bogus.
This is particularly a problem with monitoring plugins that just pick up the
WARN state without further discrimination.
And some WARN states, like slow requests, are pretty much an ERR state for
most people: requests stalled for more than 30 seconds (or days!) are a
sign of something massively wrong and likely to have customer/client
impact.
I think a neat solution would be the ability to assign all possible
problem states a value like ERR, WARN, NOTE.
A cluster with just 1 or 2 monitors or having scrub disabled is (for me)
worth a NOTE, but not a WARN.
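To make the idea concrete, here is a minimal sketch in Python of what such a per-problem severity table could look like from the point of view of a monitoring plugin. Everything in it is an assumption for illustration: the check names (slow_requests, too_few_monitors, scrub_disabled), the Severity levels and the overall_severity helper are hypothetical and are not actual Ceph identifiers or an existing Ceph feature.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Proposed levels, ordered so that max() picks the most severe one."""
    NOTE = 0
    WARN = 1
    ERR = 2


# Hypothetical check names with example defaults; not real Ceph identifiers.
DEFAULT_SEVERITY = {
    "slow_requests": Severity.ERR,      # likely customer/client impact
    "too_few_monitors": Severity.NOTE,  # cluster with just 1 or 2 monitors
    "scrub_disabled": Severity.NOTE,
}


def overall_severity(active_checks, overrides=None, fallback=Severity.WARN):
    """Return the highest severity among the active checks.

    `overrides` lets an operator reassign the level of individual checks.
    Checks missing from the table fall back to WARN.
    Returns None when no check is active (i.e. the cluster is healthy).
    """
    table = {**DEFAULT_SEVERITY, **(overrides or {})}
    levels = [table.get(name, fallback) for name in active_checks]
    return max(levels, default=None)


if __name__ == "__main__":
    print(overall_severity(["scrub_disabled", "too_few_monitors"]))
    print(overall_severity(["scrub_disabled", "slow_requests"]))
```

With a table like this, a plugin would only page someone for ERR, and each site could move individual problem states up or down via overrides.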
Christian
Regards
Mark
On 01/01/15 18:06, Jiri Kanicky wrote:
Hi,
I think you are right. I was too focused on the following line in docs:
A cluster will run fine with a single monitor; however, *a single
monitor is a single-point-of-failure*. I will try to add another
monitor. Hopefully, this will fix my issue.
Anyway, I think that ceph status or ceph health should report at
least something in such a state. It's quite weird that everything stops...
Thank you
Jiri
On 1/01/2015 15:51, Lindsay Mathieson wrote:
On Thu, 1 Jan 2015 03:46:33 PM Jiri Kanicky wrote:
Hi,
I have:
- 2 monitors, one on each node
- 4 OSDs, two on each node
- 2 MDS, one on each node
POOMA U here, but I don't think you can reach quorum with one out of
two monitors; you need an odd number:
http://ceph.com/docs/master/rados/configuration/mon-config-ref/#monitor-quorum
Perhaps try removing one monitor, so you only have one left, then
take the node without a monitor down.
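The arithmetic behind this can be sketched in a few lines of Python, assuming the usual strict-majority rule for monitor quorum (more than half of the monitors must be reachable). The function names quorum_size and tolerated_failures are made up for this illustration and are not part of any Ceph tool.

```python
def quorum_size(num_monitors: int) -> int:
    """Smallest number of monitors that forms a strict majority."""
    if num_monitors < 1:
        raise ValueError("need at least one monitor")
    return num_monitors // 2 + 1


def tolerated_failures(num_monitors: int) -> int:
    """How many monitors can be lost while a majority is still reachable."""
    return num_monitors - quorum_size(num_monitors)


if __name__ == "__main__":
    for n in range(1, 6):
        print(f"{n} monitor(s): quorum needs {quorum_size(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
```

Under that assumption, two monitors need both present to form a quorum, so losing either one stalls the cluster, and a fourth monitor tolerates no more failures than three do. That is why odd counts are the usual recommendation.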
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
--
Christian Balzer        Network/Systems Engineer
ch...@gol.com Global OnLine Japan/Fusion Communications
http://www.gol.com/