[ https://issues.apache.org/jira/browse/OAK-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14483105#comment-14483105 ]
Robert Munteanu commented on OAK-2682:
--------------------------------------

[~egli] - I've looked into this issue briefly, as I'm interested in contributing a patch.

You mention in the issue description 'all nodes of the cluster'. I assume that you mean an Oak cluster, not a MongoDB cluster.

When talking about clock skew in MongoDB, we actually have two situations:

- replica sets
- sharded clusters

For replica sets, the different MongoDB instances are actually visible to the DocumentNodeStore as cluster members. For sharded clusters, Oak would connect only to a {{mongos}} instance. We can of course find out the shards from the config database, and connect separately to those {{mongod}} instances to run the {{serverStatus}} command, but I find it unnecessarily cumbersome.

Furthermore, I see that MongoDB has its own clock skew detection for both replica sets (each replica set member does this check) and for sharded clusters (the {{mongos}} instances perform the check). MongoDB is also tolerant of some clock skew, but not too much ([Mongos throwing clock skew error?|https://groups.google.com/forum/#!topic/mongodb-user/SPi4Kqox16I]).

TBH I see this more as an operations issue than something that can/should be done in Oak, and would rather suggest dropping this.

Thoughts?

/CC [~chetanm], [~mreutegg]

> Introduce time difference detection for mongoMk
> -----------------------------------------------
>
>                 Key: OAK-2682
>                 URL: https://issues.apache.org/jira/browse/OAK-2682
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core, mongomk
>            Reporter: Stefan Egli
>             Fix For: 1.3.0
>
>
> Currently the lease mechanism in DocumentNodeStore/mongoMk is based on the
> assumption that the clocks are in perfect sync between all nodes of the
> cluster. The lease is valid for 60sec with a timeout of 30sec. If clocks are
> off by too much, and background operations happen to take a couple of seconds,
> you run the risk of timing out a lease. So introducing a check which WARNs if
> the clocks in a cluster are off by too much (1st threshold, e.g. 5sec?) would
> help increase awareness. A further, more drastic measure could be to prevent a
> startup of Oak at all if the difference is, for example, higher than a 2nd
> threshold (optional I guess, but could be 20sec?).
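
The two-threshold check proposed in the issue description could look roughly like the sketch below. It is not Oak code: the class name {{ClockSkewCheck}}, its methods and the {{Level}} enum are hypothetical, and the 5 s / 20 s values are only the examples floated in the description. The remote timestamp is passed in as a plain number; in practice it might be taken from the {{localTime}} field of a {{serverStatus}} reply, which is an assumption about how a patch would obtain it.

```java
// Hypothetical sketch, not part of Oak. All names here are illustrative.
final class ClockSkewCheck {

    enum Level { OK, WARN, REFUSE_STARTUP }

    // Example thresholds taken from the issue description (5 s and 20 s).
    static final long WARN_THRESHOLD_MILLIS = 5_000L;
    static final long REFUSE_THRESHOLD_MILLIS = 20_000L;

    /**
     * Estimates the remote-minus-local clock difference in milliseconds.
     * The remote timestamp is compared against the midpoint of the local
     * request window, which cancels out roughly half of the round-trip time.
     */
    static long estimateSkewMillis(long localBefore, long remoteTime, long localAfter) {
        long localMidpoint = localBefore + (localAfter - localBefore) / 2;
        return remoteTime - localMidpoint;
    }

    /** Maps a signed skew onto the two thresholds; the sign is ignored. */
    static Level classify(long skewMillis) {
        long abs = Math.abs(skewMillis);
        if (abs > REFUSE_THRESHOLD_MILLIS) {
            return Level.REFUSE_STARTUP;
        }
        if (abs > WARN_THRESHOLD_MILLIS) {
            return Level.WARN;
        }
        return Level.OK;
    }

    public static void main(String[] args) {
        long before = System.currentTimeMillis();
        // Stand-in for a timestamp read from the remote server.
        long remote = before + 6_500L;
        long after = System.currentTimeMillis();
        System.out.println(classify(estimateSkewMillis(before, remote, after)));
    }
}
```

Keeping the measurement ({{estimateSkewMillis}}) separate from the policy ({{classify}}) means the thresholds can be tuned, or the startup refusal left out as the description suggests is optional, without touching how the remote time is obtained.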