System.nanoTime() is not affected by clock changes. Really everyone - this is 
simply not an issue in ZooKeeper. 
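
For anyone who wants to see the idea concretely, here is a minimal sketch of 
expiring sessions from differences of a monotonic clock. It is an 
illustration only, not the code in SessionTrackerImpl.java: the class name 
MonotonicSessionTracker, its methods touch() and expire(), and the injected 
LongSupplier clock are all hypothetical names chosen for this example. In 
real use the clock would be System::nanoTime.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

// Hypothetical illustration; NOT ZooKeeper's SessionTrackerImpl.
class MonotonicSessionTracker {
    private final LongSupplier nanoClock;  // assumption: System::nanoTime in real use
    private final long timeoutNanos;
    private final Map<Long, Long> lastHeardNanos = new HashMap<>();

    MonotonicSessionTracker(LongSupplier nanoClock, long timeoutMillis) {
        this.nanoClock = nanoClock;
        this.timeoutNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    }

    // Record a heartbeat using only the local monotonic clock.
    // No timestamp supplied by the client is ever consulted.
    void touch(long sessionId) {
        lastHeardNanos.put(sessionId, nanoClock.getAsLong());
    }

    // Remove and return the sessions that have been silent longer than the timeout.
    List<Long> expire() {
        long now = nanoClock.getAsLong();
        List<Long> expired = new ArrayList<>();
        Iterator<Map.Entry<Long, Long>> it = lastHeardNanos.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, Long> entry = it.next();
            // Compare differences, never absolute values: nanoTime() has an
            // arbitrary origin and is only meaningful as elapsed time.
            if (now - entry.getValue() > timeoutNanos) {
                expired.add(entry.getKey());
                it.remove();
            }
        }
        return expired;
    }
}
```

Because only differences between two readings of the same local clock are 
used, resetting the wall clock on the server or on any client does not 
change when a session expires in this sketch.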

====================
Jordan Zimmerman

> On Dec 7, 2017, at 7:43 AM, Kathryn Hogg <[email protected]> wrote:
> 
> I'm pretty new to ZooKeeper but have a fair amount of experience with virtual 
> synchrony going back many years.  Even though time is relative, it is 
> possible, if the clock suddenly jumps forward on the server, to 
> prematurely declare timeouts as expired.  I'm not sure how ZooKeeper handles 
> that but in Isis, if 2 consecutive calls to gettimeofday had too large a 
> difference, it considered it fishy.  
> 
> Of course, this is why we use ntp with adjtime to avoid clocks going 
> backwards or making large jumps forward.
> 
> -----Original Message-----
> From: Patrick Hunt [mailto:[email protected]] 
> Sent: Wednesday, December 06, 2017 5:18 PM
> To: UserZooKeeper <[email protected]>
> Subject: Re: Zookeeper session expiration
> 
> What Jordan said + time use is only in the relative sense, not the absolute. 
> Session tracking (expiration) is relative to the start of leadership.
> 
> Patrick
> 
>> On Mon, Dec 4, 2017 at 12:21 PM, Jordan Zimmerman <[email protected]> wrote:
>> 
>> ZooKeeper, indeed, does not use wall clock time. It uses 
>> System.nanoTime() for most operations. Further, all operations go 
>> through the Leader node so only the Leader's notion of time matters. 
>> The Leader manages the session via a "SessionTracker" instance. The code is 
>> in SessionTrackerImpl.java.
>> There is a sessionExpiryQueue which is a kind of priority queue that 
>> returns expired sessions based on System.nanoTime().
>> 
>> -JZ
>> 
>>> On Dec 4, 2017, at 12:09 PM, Abraham Fine <[email protected]> wrote:
>>> 
>>> Hello Anthony and Shawn-
>>> 
>>> To the best of my knowledge ZooKeeper does not use the "wall clock" 
>>> time anywhere. So that should not be the problem.
>>> 
>>> Please consider enabling debug logging, which should allow you to 
>>> track the "pings".
>>> 
>>> Thanks,
>>> Abe
>>> 
>>>> On Mon, Dec 4, 2017, at 11:51, Anthony Shaya wrote:
>>>> Thanks Shawn, should I message the developer mailing list for a 
>>>> more definitive answer?
>>>> 
>>>> Thanks again for the reply.
>>>> 
>>>> -----Original Message-----
>>>> From: Shawn Heisey [mailto:[email protected]]
>>>> Sent: Monday, December 4, 2017 2:49 PM
>>>> To: [email protected]
>>>> Subject: Re: Zookeeper session expiration
>>>> 
>>>>> On 12/4/2017 8:22 AM, Anthony Shaya wrote:
>>>>> My question is related to how session expiration works. I noticed 
>>>>> on many of the client machines the times across these machines were 
>>>>> all off (by anywhere from 1 minute to 20 minutes - which was resolved 
>>>>> after discovery - haven't verified this completely yet). Can this 
>>>>> directly affect session expiration within the zookeeper cluster?
>>>>> 
>>>>>  *   I read the following in 
>>>>> https://wiki.apache.org/hadoop/ZooKeeper/FAQ , "Expirations happens 
>>>>> when the cluster does not hear from the client within the specified 
>>>>> session timeout period (i.e. no heartbeat).". So in some cases it 
>>>>> seems like if the times were wrong across the machines it's possible 
>>>>> one of the clients could have effectively sent a heartbeat in the 
>>>>> past (not sure about this tbh) and then the cluster expires the 
>>>>> session?
>>>> 
>>>> I make these comments without any knowledge of what ZK code 
>>>> actually does.  I am a member of this list because I'm a 
>>>> representative of the Apache Solr project, which uses the ZK client 
>>>> in order to maintain a cluster.
>>>> 
>>>> IMHO, any software which makes actual decisions based on the 
>>>> timestamps in messages from another system is badly designed.  I 
>>>> would hope that the ZK designers know this, and always make any 
>>>> decisions related to time using the clock in the local system only.
>>>> 
>>>> If ZK's designers did the right thing, then a session timeout would 
>>>> indicate that quite literally no heartbeats were received in X 
>>>> seconds, as measured by the local clock, and the local clock ONLY 
>>>> ... NOT from timestamp information received from another system.
>>>> 
>>>> Although such a lack of communication could be caused by any number 
>>>> of things, including network hardware failure, one of the most 
>>>> common reasons I have seen for problems like this is extreme java 
>>>> garbage collection pauses in the client software.
>>>> 
>>>> Situations where the heap is a little bit too small can cause a 
>>>> java program to basically be doing garbage collection constantly, 
>>>> so it doesn't have much time to do anything else, like send 
>>>> heartbeats to ZK servers.
>>>> 
>>>> Situations where the heap is HUGE and garbage collection is not 
>>>> well tuned can lead to pauses of a minute or longer while Java does 
>>>> a massive full GC.
>>>> 
>>>>>  *   I don't have the zookeeper node log for the above time to see 
>>>>> what was going on in zookeeper when the cluster determined the 
>>>>> session expired.
>>>>> 
>>>>>  *   Is there any additional logging I can turn on to troubleshoot zk 
>>>>> session expiration issues?
>>>> 
>>>> Hopefully your ZK clients also have logging.  Failing that, you 
>>>> could turn on GC logging for the software with the ZK client 
>>>> (assuming it's a Java client) and find a program or website that 
>>>> can examine the log and give you statistics or a graph of GC pauses.
>>>> 
>>>> If there is a problem in software using the client and whatever 
>>>> logging is available doesn't help you figure out what's wrong, 
>>>> you're generally going to need to talk to whoever wrote that 
>>>> software for help troubleshooting it.
>>>> 
>>>> Thanks,
>>>> Shawn
>> 
>> 
