Hi,
In our testing of Red Hat Cluster, we could reproduce the NTP impact by
jumping the clock backwards and forwards, just using the date command in a
tight-ish loop:
use strict;
my $dir = 1;
while (1) {
    jump_time( $dir );
    $dir = $dir * -1;
}
sub jump_time {
    my ($dir) = @_;
    my
i put up a patch that should address the problem. now i need to write a
test case. the only way i can think of is to change the call to
System.currentTimeMillis to a utility class that calls
System.currentTimeMillis that i can mock for testing. any better ideas?
ben
On 08/19/2010 03:53 PM,
Mocking the time via a utility was my thought. Mocking system itself
is scary.
Sent from my iPhone
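A minimal sketch of the utility approach discussed above, assuming a small wrapper class that production code calls instead of System.currentTimeMillis directly. The names TimeSource, setSourceForTesting and reset are hypothetical and are not taken from the ZooKeeper code base or from the patch mentioned in this thread.

```java
import java.util.function.LongSupplier;

// Hypothetical wrapper: callers use TimeSource.currentTimeMillis() so that
// tests can substitute a fake clock without mocking java.lang.System.
final class TimeSource {
    // Default source delegates to the real wall clock.
    private static volatile LongSupplier source = System::currentTimeMillis;

    private TimeSource() {}

    static long currentTimeMillis() {
        return source.getAsLong();
    }

    // Test hook: replace the clock with any supplier of milliseconds.
    static void setSourceForTesting(LongSupplier fake) {
        source = fake;
    }

    // Restore the real wall clock after a test.
    static void reset() {
        source = System::currentTimeMillis;
    }
}

final class TimeSourceDemo {
    public static void main(String[] args) {
        long[] fakeNow = {10_000L};
        TimeSource.setSourceForTesting(() -> fakeNow[0]);
        long before = TimeSource.currentTimeMillis();
        fakeNow[0] -= 5_000L; // simulate the clock jumping backwards
        long after = TimeSource.currentTimeMillis();
        System.out.println("observed jump in ms: " + (after - before));
        TimeSource.reset();
    }
}
```

A test for the expiration logic could then move the fake clock forwards or backwards between iterations, which is roughly what the date loop at the top of the thread does to a real machine.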
On Aug 20, 2010, at 1:18 PM, Benjamin Reed br...@yahoo-inc.com wrote:
i put up a patch that should address the problem. now i need to
write a test case. the only way i can think of is to
You can always increase your timeouts a bit.
On Thu, Aug 19, 2010 at 12:52 AM, Qing Yan qing...@gmail.com wrote:
Oh.. our servers are also running in a virtualized environment.
On Thu, Aug 19, 2010 at 2:58 PM, Martin Waite waite@gmail.com wrote:
Hi,
I have tripped over similar
Hi,
I remember Ben had opened a jira for clock jumps earlier:
https://issues.apache.org/jira/browse/ZOOKEEPER-366. It is not uncommon to
have clocks jump forward in virtualized environments.
It is desirable to modify ZooKeeper to handle this situation (as much as
possible) internally. It would
Another option would be for the cluster to compare times and note when one
member seems to be lagging. Restoration of that
lag would then be less remarkable.
I believe that the pattern of these problems is a slow slippage behind and a
sudden jump forward.
On Thu, Aug 19, 2010 at 7:51 AM, Vishal
Hi,
I'm not sure if you mean the timers I was on about earlier. If so,
http://linux.die.net/man/3/clock_gettime
Sufficiently recent versions of GNU libc and the Linux kernel support the
following clocks:
...
*CLOCK_MONOTONIC* Clock that cannot be set and represents monotonic time
since some
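For Java code, the nearest standard equivalent is System.nanoTime(), which is documented as suitable only for measuring elapsed time and as unrelated to wall-clock time. The sketch below assumes the JVM backs it with a monotonic clock such as CLOCK_MONOTONIC, which is an assumption about the platform. MonotonicDeadline is a hypothetical name, not a ZooKeeper class.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper: tracks a timeout using elapsed nanoseconds, so that
// setting the wall clock (date, NTP step) does not affect expiry.
final class MonotonicDeadline {
    private final long startNanos;
    private final long timeoutNanos;

    MonotonicDeadline(long timeoutMillis) {
        this(timeoutMillis, System.nanoTime());
    }

    // Explicit start value, mainly so the logic can be exercised in tests.
    MonotonicDeadline(long timeoutMillis, long startNanos) {
        this.timeoutNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        this.startNanos = startNanos;
    }

    // Compare by subtraction: nanoTime values are only meaningful as differences.
    boolean isExpired(long nowNanos) {
        return nowNanos - startNanos >= timeoutNanos;
    }

    boolean isExpired() {
        return isExpired(System.nanoTime());
    }

    // Milliseconds left before expiry, never negative.
    long remainingMillis(long nowNanos) {
        long left = timeoutNanos - (nowNanos - startNanos);
        return Math.max(0L, TimeUnit.NANOSECONDS.toMillis(left));
    }
}
```

This only protects intervals measured within one process. It does not help when timestamps have to be compared across machines.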
True. But it knows that there has been a jump.
Quiet time can be distinguished from clock shift by assuming that members of
the cluster
don't all jump at the same time.
I would imagine that a recent clock jump estimate could be kept and
buckets that would
otherwise expire due to such a jump
yes, you are right. we could do this. it turns out that the expiration
code is very simple:
while (running) {
    currentTime = System.currentTimeMillis();
    if (nextExpirationTime > currentTime) {
        this.wait(nextExpirationTime -
Nice (modulo inverting the in your text).
Option 2 seems very simple. That always attracts me.
On Thu, Aug 19, 2010 at 9:19 AM, Benjamin Reed br...@yahoo-inc.com wrote:
yes, you are right. we could do this. it turns out that the expiration code
is very simple:
while (running) {
Hi Ted,
I haven't given it serious thought yet, but I don't think it is necessary
for the cluster to keep track of time.
A node can make its own decision. For the sake of argument, let's say that we
have a client and a server with the following policy:
1. Client is supposed to send a ping to server
if we can't rely on the clock, we cannot say things like if ... for 5
seconds.
also, clients connect to servers, not vice versa, so we cannot say
things like server can attempt to reconnect.
ben
On 08/19/2010 10:17 AM, Vishal K wrote:
Hi Ted,
I haven't given it serious thought yet, but I
Hi Ben,
Comments inline..
On Thu, Aug 19, 2010 at 5:33 PM, Benjamin Reed br...@yahoo-inc.com wrote:
if we can't rely on the clock, we cannot say things like if ... for 5
seconds.
if ... for 5 seconds indicates the timeout given by the socket library.
After the timeout we can verify that the
Ben's approach is really simpler. The client already sends keep-alive
messages and we know that
some have gone missing or a time shift has happened. Those two
possibilities are cleanly distinguished
by Ben's suggestion of comparing current time to the bucket expiration. If
current time is
i'm updating ZOOKEEPER-366 with this discussion and will try to get a patch
out. Qing (or anyone else), can you reproduce it pretty easily?
thanx
ben
On 08/19/2010 09:29 AM, Ted Dunning wrote:
Nice (modulo inverting the in your text).
Option 2 seems very simple. That always attracts me.
On
Put in a four letter command that will put the server to sleep for 15
seconds!
:-)
On Thu, Aug 19, 2010 at 3:51 PM, Benjamin Reed br...@yahoo-inc.com wrote:
i'm updating ZOOKEEPER-366 with this discussion and will try to get a patch out.
Qing (or anyone else), can you reproduce it pretty easily?
Hi,
The testcase is fairly simple. We have a client which connects to ZK,
registers an ephemeral node and watches on it. Now change the client
machine's time - session killed..
Here is the log:
*2010-08-18 04:24:57,782 INFO
com.taobao.timetunnel2.cluster.service.AgentService: Host name
If NTP is changing your time by more than a few milliseconds then you have
other problems (big ones).
On Wed, Aug 18, 2010 at 1:04 AM, Qing Yan qing...@gmail.com wrote:
I guess ZK might rely on timestamp to keep sessions alive, but we have
NTP daemon running so machine time can get changed