Nice (modulo inverting the < in your text). Option 2 seems very simple. That always attracts me.
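
To make option 2 concrete, here is a minimal sketch of the "always wait at least half a tick" policy, pulled out of the loop so it can be exercised without threads or a real clock. The class HalfTickExpirer and its method names are hypothetical and not taken from the ZooKeeper code; only nextExpirationTime and expirationInterval come from the loop quoted below, and reading "at least half a tick" as a max() over the remaining wait is my assumption.

```java
/**
 * Hypothetical sketch of "option 2": never wait less than half a tick
 * between expiration buckets. Not ZooKeeper's actual implementation.
 */
final class HalfTickExpirer {
    private final long expirationInterval; // one tick, in milliseconds
    private long nextExpirationTime;       // deadline of the next bucket

    HalfTickExpirer(long startTime, long expirationInterval) {
        this.expirationInterval = expirationInterval;
        this.nextExpirationTime = startTime + expirationInterval;
    }

    /**
     * How long to wait before expiring the next bucket.
     * Normally this is the time left until the deadline. If the clock has
     * jumped ahead, that value is small or negative, so we fall back to
     * half a tick instead of expiring buckets back to back.
     */
    long waitBeforeNextBucket(long currentTime) {
        long untilDeadline = nextExpirationTime - currentTime;
        return Math.max(untilDeadline, expirationInterval / 2);
    }

    /** Marks the current bucket as handled; returns the deadline just expired. */
    long advance() {
        long expired = nextExpirationTime;
        nextExpirationTime += expirationInterval;
        return expired;
    }

    long nextExpirationTime() {
        return nextExpirationTime;
    }
}
```

After a forward jump the deadline lags the clock, so each pass waits only half a tick while nextExpirationTime advances by a full tick. The lag shrinks by half a tick per pass, which is where the worst case of sessions expiring in half their timeout comes from.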

On Thu, Aug 19, 2010 at 9:19 AM, Benjamin Reed <br...@yahoo-inc.com> wrote:

> yes, you are right. we could do this. it turns out that the expiration code
> is very simple:
>
>     while (running) {
>         currentTime = System.currentTimeMillis();
>         if (nextExpirationTime > currentTime) {
>             this.wait(nextExpirationTime - currentTime);
>             continue;
>         }
>         SessionSet set;
>         set = sessionSets.remove(nextExpirationTime);
>         if (set != null) {
>             for (SessionImpl s : set.sessions) {
>                 sessionsById.remove(s.sessionId);
>                 expirer.expire(s);
>             }
>         }
>         nextExpirationTime += expirationInterval;
>     }
>
> so we can detect a jump very easily: if nextExpirationTime > currentTime,
> we have jumped ahead in time.
>
> now the question is, what do we do with this information?
>
> option 1) we could figure out the jump (nextExpirationTime-currentTime is a
> good estimate) and move all of the sessions forward by that amount.
> option 2) we could converge on the time by having a policy to always wait
> at least a half a tick time.
>
> there probably are other options as well. i kind of like option 2. worst
> case is it will make the sessions expire in half the time that they should,
> but this shouldn't be too much of a problem since clients send a ping if
> they are idle for 1/3 of their session timeout.
>
> ben
>
> On 08/19/2010 08:39 AM, Ted Dunning wrote:
>
>> True. But it knows that there has been a jump.
>>
>> Quiet time can be distinguished from clock shift by assuming that members
>> of the cluster don't all jump at the same time.
>>
>> I would imagine that a "recent clock jump" estimate could be kept and
>> buckets that would otherwise expire due to such a jump could be given a
>> bit of a second lease on life, delaying all of their expiration. Since
>> time-outs are relatively short, the server would be able to forget about
>> the bump very shortly.
>>
>> On Thu, Aug 19, 2010 at 8:22 AM, Benjamin Reed <br...@yahoo-inc.com> wrote:
>>
>>> if we try to use network messages to detect and correct the situation, it
>>> seems like we would recreate the problem we are having with ntp, since
>>> that is exactly what it does.