Re: Leap sec

2015-05-18 Thread Stefan Podkowinski
This seems to be a good opportunity to dig a bit deeper into certain 
operational aspects of NTP. Some things to be aware of:

How NTP Operates [1]

It can take several minutes before ntpd updates the system time for the 
first time, by which point all other processes have already started. This can 
be especially problematic for newly provisioned systems. You can speed up the 
initial sync by using the iburst keyword with the server configuration command [4].
On my test VM it took 2-3 minutes without iburst to correct the time after 
startup. That is definitely too long, as Cassandra would already have joined the 
cluster. With iburst it took only a few seconds.

Adjustments are made by either stepping or slewing the clock. This can 
happen forward and backwards(!). Stepping sets the corrected value right 
away. Slewing makes adjustments in small increments of at most 0.5 ms/s by 
speeding the clock up or slowing it down, so it takes at least 2000 seconds to 
adjust the clock by 1 second through slewing.
* Time offsets < 128 ms (default) will be slewed.
* Offsets > 128 ms will be stepped unless the -x flag is used. The threshold value 
can be changed with a tinker option.
* Offsets > 1000 s will cause ntpd to exit and expect the administrator to fix 
the issue (potential hardware error) unless the -g flag is used.
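
The rules above can be sketched in a few lines of Python. This is only a 
simplified model for reasoning about the thresholds, not ntpd's actual logic. 
The function names and the x_flag/g_flag parameters are made up for 
illustration. It assumes the documented defaults: a 128 ms step threshold, 
which -x raises to 600 s, a 1000 s panic threshold, and a maximum slew rate 
of 500 ppm (0.5 ms/s).

```python
# Simplified model of ntpd's slew/step/exit decision (illustrative only).
STEP_THRESHOLD_S = 0.128      # default step threshold
STEP_THRESHOLD_X_S = 600.0    # step threshold when -x is given
PANIC_THRESHOLD_S = 1000.0    # panic threshold
MAX_SLEW_PPM = 500            # 0.5 ms of correction per second


def ntpd_action(offset_s, x_flag=False, g_flag=False):
    """Return 'slew', 'step' or 'exit' for a measured offset in seconds."""
    magnitude = abs(offset_s)
    if magnitude > PANIC_THRESHOLD_S and not g_flag:
        return "exit"
    step_threshold = STEP_THRESHOLD_X_S if x_flag else STEP_THRESHOLD_S
    return "step" if magnitude > step_threshold else "slew"


def min_slew_seconds(offset_s):
    """Lower bound on the time needed to slew away the given offset."""
    return abs(offset_s) * 1_000_000 / MAX_SLEW_PPM


if __name__ == "__main__":
    for offset in (0.05, 0.5, 30.0, 2000.0):
        print(offset, ntpd_action(offset), ntpd_action(offset, x_flag=True))
    print("slewing 1 s takes at least", min_slew_seconds(1.0), "seconds")
```

Note that the real -g flag only lifts the panic check for the first 
adjustment after startup, which this sketch does not model.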

I think it's fair to say that the -g option should always be set. I'm not 
fully sure about -x yet. Stepping the clock backwards is of course not a good 
idea. The best solution is probably to set -x and create alerts on 
larger clock skews that prompt ops to resolve the situation manually.


Leap second awareness

Make sure your server is leap second aware in advance. You do not want to have 
the second corrected as part of a normal discrepancy detection process. Instead 
ntpd should be aware of the leap second in advance, so it can precisely 
schedule the adjustment.

There are two ways to make your ntpd instance aware of the upcoming leap second. 
One is through the upstream NTP server, which may propagate the leap 
second one day in advance. But this doesn't have to be the case, so you need to 
find out whether the server pool is configured correctly for this.
The other way to make your ntpd leap second aware is to use a custom leap 
seconds file [2]. I had to modify the AppArmor profile to make this work [3].
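
To check that a leap seconds file actually covers the upcoming event, a small 
Python sketch like the following may help. It assumes the usual 
leap-seconds.list layout: data lines of "NTP timestamp, TAI-UTC offset", 
comments starting with "#", and an expiry line starting with "#@". The 
function names and the inline SAMPLE text are illustrative only, so point it 
at the contents of your own file.

```python
from datetime import datetime, timedelta, timezone

# NTP timestamps count seconds since 1900-01-01 00:00:00 UTC.
NTP_EPOCH = datetime(1900, 1, 1, tzinfo=timezone.utc)


def parse_leap_file(text):
    """Return (entries, expiry); entries is a list of (datetime, tai_offset)."""
    entries, expiry = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#@"):
            # Expiry line: "#@ <NTP timestamp>"
            expiry = NTP_EPOCH + timedelta(seconds=int(line[2:].split()[0]))
        elif line and not line.startswith("#"):
            fields = line.split()
            when = NTP_EPOCH + timedelta(seconds=int(fields[0]))
            entries.append((when, int(fields[1])))
    return entries, expiry


def knows_leap_at(entries, when):
    """True if the file lists a new TAI-UTC offset taking effect at `when`."""
    return any(ts == when for ts, _ in entries)


# Illustrative excerpt, not a complete file.
SAMPLE = """\
#@ 3660249600
3550089600 35 # 1 Jul 2012
3644697600 36 # 1 Jul 2015
"""

if __name__ == "__main__":
    entries, expiry = parse_leap_file(SAMPLE)
    target = datetime(2015, 7, 1, tzinfo=timezone.utc)
    print("file expires:", expiry)
    print("knows about the 2015 leap second:", knows_leap_at(entries, target))
```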

[1] http://doc.ntp.org/4.1.0/ntpd.htm
[2] http://support.ntp.org/bin/view/Support/ConfiguringNTP#Section_6.14.
[3] http://askubuntu.com/questions/571839/leapseconds-file-permission-denied
[4] http://doc.ntp.org/4.1.0/confopt.htm


From: cass savy [mailto:casss...@gmail.com]
Sent: Friday, 15 May 2015 19:25
To: user@cassandra.apache.org
Subject: Leap sec

Just curious to know how you are preparing Prod C* clusters for the leap sec.

What are the workarounds other than upgrading the kernel to 3.4+?
Are you upgrading clusters to Java 7 or higher on client and C* servers?



Re: Leap sec

2015-05-15 Thread Jim Witschey
> In addition, do I also have to upgrade to Java 7u60+ on C* servers as well.

Yes -- we observed C* nodes locking up when running under older
versions of the JDK.

Jim Witschey

Software Engineer in Test | jim.witsc...@datastax.com


Re: Leap sec

2015-05-15 Thread Tyler Hobbs
This post has some good advice for preparing for the leap second:
http://www.datastax.com/dev/blog/preparing-for-the-leap-second

-- 
Tyler Hobbs
DataStax http://datastax.com/


Re: Leap sec

2015-05-15 Thread Jim Witschey
> This post has some good advice for preparing for the leap second:
> http://www.datastax.com/dev/blog/preparing-for-the-leap-second

Post author here. Let me know if you have any questions about it.

> What are the workarounds other than upgrading the kernel to 3.4+?

A couple NTP-based workarounds are described here, in the Preventing
Issues and Workarounds section:

https://access.redhat.com/articles/199563

The goal of those workarounds is to prevent the leap second from being
applied all at once. I make no guarantees about them, just pointing
them out in case you wanted to investigate them yourself.

As noted in my post and the Red Hat article, it's probably best to
just upgrade your kernel/OS if you can.

> Are you upgrading clusters to Java 7 or higher on client and C* servers?

Just to be clear, JDK 7 had its own timer problems until 7u60, so you
should upgrade to or past that version.

Jim Witschey

Software Engineer in Test | jim.witsc...@datastax.com


Leap sec

2015-05-15 Thread cass savy
Just curious to know how you are preparing Prod C* clusters for the leap sec.

What are the workarounds other than upgrading the kernel to 3.4+?
Are you upgrading clusters to Java 7 or higher on client and C* servers?


Re: Leap sec

2015-05-15 Thread cass savy
Are you suggesting JDK 7 for the client/driver and the C* side as well? We use
Java driver 2.1.4 and plan to go to JDK 7u80 on the application end. In addition,
do I also have to upgrade to Java 7u60+ on the C* servers?
