[ https://issues.apache.org/jira/browse/OAK-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stefan Egli updated OAK-3238:
-----------------------------
    Fix Version/s: 1.3.5

> fine tune clock-sync check vs lease-check settings
> --------------------------------------------------
>
>                 Key: OAK-3238
>                 URL: https://issues.apache.org/jira/browse/OAK-3238
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.3.4
>            Reporter: Stefan Egli
>             Fix For: 1.3.5
>
>
> There are now two components that try to ensure that 'discovery-lite'
> (OAK-2844) reports a coherent cluster view to the upper layers:
> * OAK-2682 : time difference detection: by default, startup fails if the
> clock is off by more than 2 seconds. That results in a 4 sec max margin in
> a document-cluster.
> * OAK-2739 : lease-checking: every instance checks whether the local lease
> is valid upon any document access. This check is done against the actual
> 'leaseEndTime', which is updated every 30 seconds (by default) to be valid
> for another 60 seconds (by default).
> With these two factors combined, in the worst case you could still end up
> with a 4 second time window in which the local instance fails to update the
> lease (e.g. the lease thread dies) but still considers itself the owner of
> a valid lease, while a remote instance might be those 4 seconds off and
> consider the lease timed out.
> So overall: the 3 factors 'lease duration', 'lease update frequency' and
> 'maximum allowed clock difference' must be better tuned to end up with a
> stable mechanism.
> Suggestion:
> * increase the 'lease duration' to 3 x 'lease update frequency', i.e. a
> 90 sec lease duration
> * reduce the lease check failure limit from 'lease duration' to 2 x 'lease
> update frequency', assuming that one 'lease update interval' is way larger
> than the 'maximum allowed clock difference'

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
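
The arithmetic behind the suggestion in the issue description can be checked with a small sketch. This is only an illustration of the numbers quoted in the issue, not Oak code: the class and function names are hypothetical, and it assumes the worst-case skew between two instances is twice the per-instance clock limit (2 sec each, 4 sec in total) and that the local check currently fails only once the full lease duration has passed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeaseSettings:
    """Hypothetical container for the tunables; not an Oak class."""
    update_interval_s: int    # how often the lease is renewed
    lease_duration_s: int     # how long a renewed lease is valid for other instances
    local_check_limit_s: int  # time after the last renewal at which the local check fails
    max_clock_diff_s: int     # allowed clock deviation per instance at startup


def max_cluster_skew_s(s: LeaseSettings) -> int:
    # Assumption: two instances may each be off by the limit, in opposite directions.
    return 2 * s.max_clock_diff_s


def safety_margin_s(s: LeaseSettings) -> int:
    """Seconds between the local check failing and the earliest moment a
    remote instance could treat the lease as timed out. A negative value
    means an unsafe window of that many seconds."""
    return s.lease_duration_s - max_cluster_skew_s(s) - s.local_check_limit_s


# Defaults described in the issue: 30 sec update, 60 sec lease, check against lease end.
current = LeaseSettings(update_interval_s=30, lease_duration_s=60,
                        local_check_limit_s=60, max_clock_diff_s=2)

# Suggested tuning: lease = 3 x update interval, local limit = 2 x update interval.
proposed = LeaseSettings(update_interval_s=30, lease_duration_s=3 * 30,
                         local_check_limit_s=2 * 30, max_clock_diff_s=2)

print(safety_margin_s(current))   # -4
print(safety_margin_s(proposed))  # 26
```

Under these assumptions the current defaults leave the 4 second unsafe window described in the issue, while the suggested values leave a margin of 26 seconds, which stays positive as long as one update interval is larger than the worst-case clock skew.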