yes, you are right. we could do this. it turns out that the expiration
code is very simple:
while (running) {
    currentTime = System.currentTimeMillis();
    if (nextExpirationTime > currentTime) {
        this.wait(nextExpirationTime - currentTime);
        continue;
    }
    SessionSet set;
    set = sessionSets.remove(nextExpirationTime);
    if (set != null) {
        for (SessionImpl s : set.sessions) {
            sessionsById.remove(s.sessionId);
            expirer.expire(s);
        }
    }
    nextExpirationTime += expirationInterval;
}
so we can detect a jump very easily: if we wake up and currentTime is
well past nextExpirationTime (by more than a tick, say), we have jumped
ahead in time.
now the question is, what do we do with this information?
option 1) we could figure out the jump (currentTime - nextExpirationTime
is a good estimate) and move all of the sessions forward by that amount.
option 2) we could converge on the time by having a policy to always
wait at least half a tick between expirations.
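as a rough illustration of the detection step and the option 1 estimate, here is a minimal sketch. it assumes a forward jump shows up as the expirer waking with currentTime well past nextExpirationTime. ClockJumpDetector, estimateForwardJump and slackMillis are made-up names for this example, not existing zookeeper code.

```java
// Hypothetical helper, not part of the existing expirer.
final class ClockJumpDetector {
    private ClockJumpDetector() {}

    /**
     * Returns the estimated forward jump in milliseconds, or 0 if the
     * expirer woke up within slackMillis of the time it was aiming for.
     * slackMillis is an assumed tolerance for ordinary scheduling delay.
     */
    static long estimateForwardJump(long nextExpirationTime,
                                    long currentTime,
                                    long slackMillis) {
        long overshoot = currentTime - nextExpirationTime;
        return overshoot > slackMillis ? overshoot : 0L;
    }
}
```

for option 1 the returned value would be the amount to add to every bucket key; picking slackMillis around one expirationInterval is only a guess at a reasonable threshold.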
there probably are other options as well. i kind of like option 2. worst
case is it will make the sessions expire in half the time that they
should, but this shouldn't be too much of a problem since clients send a
ping if they are idle for 1/3 of their session timeout.
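a minimal sketch of how option 2 might look, assuming "wait at least half a tick" means spacing consecutive bucket expirations at least expirationInterval / 2 apart on the local clock. HalfTickPolicy, waitMillis and lastExpiryAt are hypothetical names, not part of the existing expirer.

```java
// Hypothetical policy helper for option 2; names are illustrative only.
final class HalfTickPolicy {
    private HalfTickPolicy() {}

    /**
     * How long to wait before expiring the next bucket.
     *
     * nextExpirationTime: timestamp of the next bucket to expire
     * lastExpiryAt:       local clock reading when the previous bucket was expired
     * currentTime:        current local clock reading
     * expirationInterval: tick length in milliseconds
     */
    static long waitMillis(long nextExpirationTime,
                           long lastExpiryAt,
                           long currentTime,
                           long expirationInterval) {
        // Never expire two buckets less than half a tick apart,
        // even if the next bucket is already overdue.
        long earliest = Math.max(nextExpirationTime,
                                 lastExpiryAt + expirationInterval / 2);
        return Math.max(0L, earliest - currentTime);
    }
}
```

with this policy an expirer that finds itself behind after a forward jump would drain roughly one overdue bucket per half tick instead of all of them at once, which is where the "half the time" worst case comes from.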
ben
On 08/19/2010 08:39 AM, Ted Dunning wrote:
True. But it knows that there has been a jump.
Quiet time can be distinguished from clock shift by assuming that
members of the cluster don't all jump at the same time.
I would imagine that a "recent clock jump" estimate could be kept and
buckets that would otherwise expire due to such a jump could be given a
bit of a second lease on life, delaying all of their expiration. Since
time-outs are relatively short, the server would be able to forget
about the bump very shortly.
On Thu, Aug 19, 2010 at 8:22 AM, Benjamin Reed<br...@yahoo-inc.com> wrote:
if we try to use network messages to detect and correct the situation, it
seems like we would recreate the problem we are having with ntp, since that
is exactly what it does.