On Wed, Aug 12, 2020 at 4:09 PM Anthony D'Atri <anthony.da...@gmail.com> wrote:
>
> My understanding is that the existing mon_clock_drift_allowed value of 50 ms 
> (default) is so that PAXOS among the mon quorum can function.  So OSDs (and 
> mgrs, and clients etc) are out of scope of that existing code.

This is correct — the monitors issue leases to each other, those
leases are in absolute clock time, and we need some level of coherency
to maintain our consistency guarantees there...
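
For anyone who wants to poke at the mon side of this: `ceph time-sync-status`
shows the leader's view of each mon's clock offset, and the threshold behind
the clock-skew health warning is mon_clock_drift_allowed. For example, on
Mimic or later:

  # the leader's view of how far each mon's clock is from its own
  ceph time-sync-status

  # the skew threshold behind the MON_CLOCK_SKEW warning (0.05 s by default)
  ceph config get mon mon_clock_drift_allowed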

>
> Things like this are why I like to ensure that the OS does `ntpdate -b` or 
> equivalent at boot-time *before* starting ntpd / chrony - and other daemons.
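
(For what it's worth, chrony on its own can give much the same effect; a
minimal sketch, assuming a stock chrony.conf:

  # /etc/chrony/chrony.conf
  # step the clock, rather than slewing it, if it is more than 1 second
  # out during any of the first 3 updates after the daemon starts
  makestep 1.0 3

The "before other daemons" part is then a matter of something like
chrony-wait.service or systemd-time-wait-sync.service holding back units
ordered after time-sync.target.)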
>
> Now, as to why Ceph doesn’t have analogous code to complain about other 
> daemons / clients - I’ve wondered that for some time myself.  Perhaps there’s 
> the idea that one’s monitoring infrastructure should detect that, but that’s 
> a guess.

...but the only part of the rest of the Ceph stack that needs clocks to
be anywhere near each other is the cephx rotating keys — and those need
to be correct on the order of "within an hour" rather than "we do
5-second leases that need to agree on absolute time".
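
The knob behind that "within an hour" is, if memory serves,
auth_service_ticket_ttl (3600 seconds by default): the rotating service
keys are regenerated on roughly that schedule, so a daemon whose clock is
hours out ends up presenting keys the rest of the cluster considers
expired, which is where the "bad authenticator" noise comes from.
Something like this will show what your cluster is using:

  # TTL of cephx service tickets / rotation period of the service keys, in seconds
  ceph config get mon auth_service_ticket_ttl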

Usually that kind of loose agreement isn't an issue, and in the core
RADOS code we try not to impose requirements on the environment, so
there's no clock sync check happening there. Perhaps at this point it
would be appropriate to add one; tracker tickets and PRs are welcome. ;)
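
In the meantime a crude operator-side check is easy enough to script. A
sketch (host names are placeholders, assumes passwordless ssh, and accepts
that ~1 s of ssh round-trip noise doesn't matter when the skew we care
about is measured in minutes or hours):

  #!/bin/sh
  # compare each host's clock against this machine's and flag big offsets
  for h in sto-1-1 sto-2-1 sto-3-1; do
      remote=$(ssh "$h" date +%s)
      here=$(date +%s)
      skew=$((remote - here))
      # ${skew#-} strips a leading minus sign, i.e. absolute value
      [ "${skew#-}" -gt 5 ] && echo "WARNING: $h clock is ${skew}s off"
  done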
-Greg

>
> > Yesterday, one of our OSD-only hosts came up with its clock about 8 hours 
> > wrong(!) having been out of the cluster for a week or so. Initially, Ceph 
> > seemed entirely happy, and then after an hour or so it all went south (OSDs 
> > started logging about bad authenticators, I/O pauses, general sadness).
> >
> > I know clock sync is important to Ceph, so "one system is 8 hours out, Ceph 
> > becomes sad" is not a surprise. It is perhaps a surprise that the OSDs were 
> > allowed in at all...
> >
> > What _is_ a surprise, though, is that at no point in all this did Ceph 
> > raise a peep about clock skew. Normally it's pretty sensitive to this - our 
> > test cluster has had clock skew complaints when a mon is only slightly out, 
> > and here we had a node 8 hours wrong.
> >
> > Is there some oddity like Ceph not warning on clock skew for OSD-only 
> > hosts? Or an upper bound on how high a discrepancy it will WARN about?
> >
> > Regards,
> >
> > Matthew
> >
> > example output from mid-outage:
> >
> > root@sto-3-1:~#  ceph -s
> >  cluster:
> >    id:     049fc780-8998-45a8-be12-d3b8b6f30e69
> >    health: HEALTH_ERR
> >            40755436/2702185683 objects misplaced (1.508%)
> >            Reduced data availability: 20 pgs inactive, 20 pgs peering
> >            Degraded data redundancy: 367431/2702185683 objects degraded (0.014%), 4549 pgs degraded
> >            481 slow requests are blocked > 32 sec. Implicated osds 188,284,795,1278,1981,2061,2648,2697
> >            644 stuck requests are blocked > 4096 sec. Implicated osds 22,31,33,35,101,116,120,130,132,140,150,159,201,211,228,263,327,541,561,566,585,589,636,643,649,654,743,785,790,806,865,1037,1040,1090,1100,1104,1115,1134,1135,1166,1193,1275,1277,1292,1494,1523,1598,1638,1746,2055,2069,2191,2210,2358,2399,2486,2487,2562,2589,2613,2627,2656,2713,2720,2837,2839,2863,2888,2908,2920,2928,2929,2947,2948,2963,2969,2972
> >