Re: F-ckin Leap Seconds, how do they work?

2012-06-30 Thread Roy
Talk about people not testing things: leap seconds have been around since 1972. There have been nine leap seconds in the last twenty years. Any system that can't handle a leap second is seriously flawed.

Re: F-ckin Leap Seconds, how do they work?

2012-06-30 Thread Chuck Anderson
Same here with KVM guests on Scientific Linux 6 (RHEL 6 clone) hosts. No issues on SL 6 and CentOS 5 guests. We also do not run NTP on the VMs, only on the hosts. The guest VM kernels did not log any leap second clock change, but appear to have the same time as the hosts. The hosts DID have issu

Re: F-ckin Leap Seconds, how do they work?

2012-06-30 Thread Derek Ivey
We haven't had any issues with any of our VMs. We run several of our own Java/Tomcat apps, Jira, and Confluence on a mixture of Solaris and CentOS 5 and 6. We do not run NTP on our VMs though; instead, we rely on VMware Tools to sync the VMs' time with the ESXi hosts. The ESXi hosts run NTP.

RE: F-ckin Leap Seconds, how do they work?

2012-06-30 Thread George Bonser
> > > > Anything with java running seems hit. > > We just finished up a firm round of reboots... :( > > > > Recent Ubuntu boxes and RHES 6... all the same ... > > > > Bye, > > Raymond. > > > > > > Yeah, in the process of doing the same. > > http://news.ycombinator.com/item?id=4183122 > > Might t

RE: F-ckin Leap Seconds, how do they work?

2012-06-30 Thread George Bonser
> > Anything with java running seems hit. > We just finished up a firm round of reboots... :( > > Recent Ubuntu boxes and RHES 6... all the same ... > > Bye, > Raymond. > > Yeah, in the process of doing the same. http://news.ycombinator.com/item?id=4183122 Might try this for machines with

Re: F-ckin Leap Seconds, how do they work?

2012-06-30 Thread Raymond Dijkxhoorn
Hi! Drive Slow Paul Not very well if you have a modern box (RHES/CentOS 6) and Java apps running on them. RHES/CentOS 5 merrily ignored it. Worse, just bouncing the Java stack didn't fix it, it required the box to be rebooted. A sizeable number of annoyed sysadmins tweeting about it this

Re: F-ckin Leap Seconds, how do they work?

2012-06-30 Thread Paul Graydon
On 6/30/2012 3:16 PM, Paul WALL wrote: Comments? Drive Slow Paul Not very well if you have a modern box (RHES/CentOS 6) and Java apps running on them. RHES/CentOS 5 merrily ignored it. Worse, just bouncing the Java stack didn't fix it, it required the box to be rebooted. A sizeable numbe

Re: F-ckin Leap Seconds, how do they work?

2012-06-30 Thread Donald Eastlake
See International Earth Rotation Service, http://www.iers.org/, particularly http://data.iers.org/products/6/15003/orig/bulletina-xxv-026.txt Thanks, Donald Donald E. Eastlake 3rd +1-508-333-2270 (cell) 155 Beaver Street, Milford, MA 01757 USA d3e...@gmail.com O

Re: F-ckin Leap Seconds, how do they work?

2012-06-30 Thread Robert Bonomi
> From: Paul WALL > Subject: F-ckin Leap Seconds, how do they work? > > Comments? Addressing the Subject question, _as_asked_ -- "Very well". *SNORT* Mechanically, instead of rolling over from second #59 to second #0 of the next minute, it goes 59->60->0. The 'why' is to keep terrestrial c
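The 59->60->0 rollover described in that reply can be sketched in a few lines. This is only an illustration of the labelling, not how any kernel or NTP daemon implements it; the function name `utc_labels_around_midnight` is made up for this example.

```python
def utc_labels_around_midnight(leap_second):
    """Return the UTC time-of-day labels from 23:59:58 through 00:00:00.

    Hypothetical helper for illustration only: it shows the labelling,
    not how an operating system steps or slews its clock.
    """
    labels = ["23:59:58", "23:59:59"]
    if leap_second:
        # On a leap-second day the final minute has 61 seconds,
        # so second #59 is followed by second #60 instead of #0.
        labels.append("23:59:60")
    labels.append("00:00:00")
    return labels


if __name__ == "__main__":
    print(" -> ".join(utc_labels_around_midnight(leap_second=True)))
    print(" -> ".join(utc_labels_around_midnight(leap_second=False)))
```

Note that Python's own `datetime.time` rejects `second=60`, which is a small example of the kind of assumption that software trips over when the extra second shows up.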

Re: F-ckin Leap Seconds, how do they work?

2012-06-30 Thread Jimmy Hess
http://support.ntp.org/bin/view/Support/ConfiguringNTP#Section_6.14. On 6/30/12, Paul WALL wrote: > Comments? > > Drive Slow > Paul > > -- -JH

F-ckin Leap Seconds, how do they work?

2012-06-30 Thread Paul WALL
Comments? Drive Slow Paul

Re: FYI Netflix is down

2012-06-30 Thread Brett Frankenberger
On Sat, Jun 30, 2012 at 01:19:54PM -0700, Scott Howard wrote: > On Sat, Jun 30, 2012 at 12:04 PM, Todd Underwood wrote: > > > This was not a cascading failure. It was a simple power outage > > > > Cascading failures involve interdependencies among components. > > > > Not always. Cascading failu

Re: [c-nsp] NTP Servers

2012-06-30 Thread Jimmy Hess
On 6/30/12, Grant Ridder wrote: > I don't understand why anyone would use windows server for anything that > needed precision like time. Probably because they realize that in a Windows domain, their domain controllers already provide a SNTP service with the Windows NT PDC Emulator providing autho

Re: [c-nsp] NTP Servers

2012-06-30 Thread Peter Kristolaitis
You could have saved yourself a bit of typing by leaving off the last 5 words of that sentence. ;) - Pete On 6/30/2012 6:42 PM, Grant Ridder wrote: I don't understand why anyone would use windows server for anything that needed precision like time. On Sat, Jun 30, 2012 at 5:39 PM, Keith Med

Re: [c-nsp] NTP Servers

2012-06-30 Thread Grant Ridder
I don't understand why anyone would use windows server for anything that needed precision like time. On Sat, Jun 30, 2012 at 5:39 PM, Keith Medcalf wrote: > > > Or you can ask the it guys to use a windows server... Eg: > > > > http://support.microsoft.com/kb/816042 > > That is a joke Jared? You

RE: [c-nsp] NTP Servers

2012-06-30 Thread Keith Medcalf
> Or you can ask the it guys to use a windows server... Eg: > > http://support.microsoft.com/kb/816042 That is a joke Jared? You left off the smiley. Windows doesn't do NTP out-of-the-box (Microsoft assertions to the contrary notwithstanding). You can build a reasonably working standard daemo

Anyone from ATT?

2012-06-30 Thread TR Shaw
Please contact me off list. I have problems with our equipment on these two ATT netblocks communicating between one another. AT&T Services, Inc. ATT (NET-12-0-0-0-1) 12.0.0.0 - 12.255.255.255 CFWN Pool ATTCT-NMPL20 ATTW-042909163717 (NET-12-88-176-0-1) 12.88.176.0 - 12.88.191.255 and NetRang

RE: [c-nsp] NTP Servers

2012-06-30 Thread Keith Medcalf
> those. The beauty of most appliances is that they're easy to manage. If it > fails, download the latest ISO from company, burn it, boot appliance, > restore it and you're back in business in an hour or so. Keep in mind a > linux kernel running just ntpd and some management necessities like ss

Re: FYI Netflix is down

2012-06-30 Thread Mike Devlin
On Sat, Jun 30, 2012 at 5:04 PM, Bryan Horstmann-Allen < b...@mirrorshades.net> wrote: > > Have a look at Asgard, the AWS management tool they just open sourced. It > implies they rely very heavily on many AWS features, some of which are very > much region specific. > > As to their multi-region ca

Re: FYI Netflix is down

2012-06-30 Thread Bryan Horstmann-Allen
On 2012-06-30 16:55:53, Mike Devlin wrote: > But in netflix case, if they architected their environment the way they > said they did, why wouldnt they just fail over to us-west? especially at > their scale, I would

Re: FYI Netflix is down

2012-06-30 Thread Mike Devlin
On Sat, Jun 30, 2012 at 4:45 PM, Bryan Horstmann-Allen < b...@mirrorshades.net> wrote: > Explain Netflix and Heroku last night. Both of whom architect across > multiple > AZs and have for many years. > > The API and EBS across the region were also affected. ELB was _also_ > affected > across the r

Re: FYI Netflix is down

2012-06-30 Thread Bryan Horstmann-Allen
On 2012-06-30 16:08:40, Rayson Ho wrote: > If I recall correctly, availability zone (AZ) mappings are specific to > an AWS account, and in fact there is no way to know if you are running > in the same AZ as another

Re: FYI Netflix is down

2012-06-30 Thread Todd Underwood
scott, >> >> This was not a cascading failure.  It was a simple power outage >> >> Cascading failures involve interdependencies among components. > > > Not always.  Cascading failures can also occur when there is zero dependency > between components.  The simplest form of this is where one environ

Re: FYI Netflix is down

2012-06-30 Thread Scott Howard
On Sat, Jun 30, 2012 at 12:04 PM, Todd Underwood wrote: > This was not a cascading failure. It was a simple power outage > > Cascading failures involve interdependencies among components. > Not always. Cascading failures can also occur when there is zero dependency between components. The simp

Re: FYI Netflix is down

2012-06-30 Thread Jared Mauch
The interesting thing to me is the us population by time zone. If amazon has 70% of servers in the eastern time zone it makes some sense. Mountain + pacific is smaller than central, which is a bit more than half eastern. These stats are older but a good rough gauge: http://answers.google.com/a

Re: FYI Netflix is down

2012-06-30 Thread Randy Bush
> Sorry to be the monday morning quarterback, but the sites that went > down learned a valuable lesson in single point of failure analysis. as this has happened more than once before, i am less optimistic. or maybe they decided the spof risk was not worth the avoidance costs. randy

Re: FYI Netflix is down

2012-06-30 Thread Rayson Ho
If I recall correctly, availability zone (AZ) mappings are specific to an AWS account, and in fact there is no way to know if you are running in the same AZ as another AWS account: http://aws.amazon.com/ec2/faqs/#How_can_I_make_sure_that_I_am_in_the_same_Availability_Zone_as_another_developer Al

Re: FYI Netflix is down

2012-06-30 Thread Seth Mattinen
On 6/30/12 12:04 PM, Todd Underwood wrote: > This was not a cascading failure. It was a simple power outage > > Cascading failures involve interdependencies among components. > I guess I'm assuming there were UPS and generator systems involved (and failing) with powering the critical load, but

Re: FYI Netflix is down

2012-06-30 Thread Mike Devlin
The last 2 Amazon outages were power issues isolated to just their us-east Virginia data center. I read somewhere that Amazon has something like 70% of their ec2 resources in Virginia and it's also their oldest ec2 datacenter, so I am guessing they learned a lot of lessons and are stuck with an aged

Re: FYI Netflix is down

2012-06-30 Thread Jimmy Hess
On 6/30/12, Todd Underwood wrote: > This was not a cascading failure. It was a simple power outage > Cascading failures involve interdependencies among components. Actually, you can't really say that. It's true that it was a simple power outage for Amazon. Power failed, causing the AWS service

Re: FYI Netflix is down

2012-06-30 Thread Todd Underwood
This was not a cascading failure. It was a simple power outage Cascading failures involve interdependencies among components. T On Jun 30, 2012 2:21 PM, "Seth Mattinen" wrote: > On 6/30/12 9:25 AM, Todd Underwood wrote: > > > > On Jun 30, 2012 11:23 AM, "Seth Mattinen" >

Re: FYI Netflix is down

2012-06-30 Thread Seth Mattinen
On 6/30/12 9:25 AM, Todd Underwood wrote: > > On Jun 30, 2012 11:23 AM, "Seth Mattinen" > wrote: >> >> >> But haven't they all been cascading failures? > > No. They have not. That's not what that term means. > > 'Cascading failure' has a fairly specific meaning that

Re: FYI Netflix is down

2012-06-30 Thread Jimmy Hess
On 6/30/12, Todd Underwood wrote: > On Jun 30, 2012 11:23 AM, "Seth Mattinen" wrote: >> But haven't they all been cascading failures? > No. They have not. That's not what that term means. > > 'Cascading failure' has a fairly specific meaning that doesn't imply > resilience in the face of decomp

Re: FYI Netflix is down

2012-06-30 Thread Todd Underwood
On Jun 30, 2012 11:23 AM, "Seth Mattinen" wrote: > > > But haven't they all been cascading failures? No. They have not. That's not what that term means. 'Cascading failure' has a fairly specific meaning that doesn't imply resilience in the face of decomposition into smaller parts. Cascading f

Re: FYI Netflix is down

2012-06-30 Thread Roy
On 6/30/2012 12:11 AM, Tyler Haske wrote: I am not a computer science guy but been around a long time. Data centers and clouds are like software. Once they reach a certain size, it's impossible to keep the bugs out. You can test and test your heart out and something will slip by. You can say t

Re: FYI Netflix is down

2012-06-30 Thread Seth Mattinen
On 6/30/12 4:50 AM, Justin M. Streiner wrote: > On Sat, 30 Jun 2012, jamie rishaw wrote: > >> you know what's happening even more? >> >> ..Amazon not learning their lesson. > > I was not giving anyone a free pass or attempting to shrug off the > outage. I was just stating that there are many reas

Re: FYI Netflix is down

2012-06-30 Thread Jimmy Hess
On 6/30/12, Cameron Byrne wrote: > On Jun 30, 2012 12:25 AM, "joel jaeggli" wrote: >> On 6/30/12 12:11 AM, Tyler Haske wrote: > Geo-redundancy is key. In fact, i would take distributed data centers over > RAID, UPS, or any other "fancy pants" © mechanisms any day. Geo-redundancy is more expensiv

Re: FYI Netflix is down

2012-06-30 Thread Cameron Byrne
On Jun 30, 2012 12:25 AM, "joel jaeggli" wrote: > > On 6/30/12 12:11 AM, Tyler Haske wrote: >>> >>> I am not a computer science guy but been around a long time. Data centers >>> and clouds are like software. Once they reach a certain size, it's >>> impossible to keep the bugs out. You can test a

Re: FYI Netflix is down

2012-06-30 Thread Jimmy Hess
On 6/30/12, Grant Ridder wrote: > well one would think that they could at least get power redundancy right... It is very similar to suggesting redundancy within a site against building collapse. Reliable power redundancy is very hard and very expensive. Much harder and much more expensive th

Re: FYI Netflix is down

2012-06-30 Thread Justin M. Streiner
On Sat, 30 Jun 2012, jamie rishaw wrote: you know what's happening even more? ..Amazon not learning their lesson. I was not giving anyone a free pass or attempting to shrug off the outage. I was just stating that there are many reasons why things break. I haven't seen anything official on

Re: FYI Netflix is down

2012-06-30 Thread Lynda
On 6/30/2012 12:11 AM, Tyler Haske wrote: > On 6/29/2012 11:07 PM, Roy wrote: I am not a computer science guy but been around a long time. Data centers and clouds are like software. Once they reach a certain size, it's impossible to keep the bugs out. You can test and test your heart out and so

Re: FYI Netflix is down

2012-06-30 Thread joel jaeggli
On 6/30/12 12:11 AM, Tyler Haske wrote: I am not a computer science guy but been around a long time. Data centers and clouds are like software. Once they reach a certain size, it's impossible to keep the bugs out. You can test and test your heart out and something will slip by. You can say the

Re: FYI Netflix is down

2012-06-30 Thread Andrew D Kirch
On 6/30/2012 3:11 AM, Tyler Haske wrote: How to run a datacenter 101. Have more than one location, preferably far apart. It being Amazon I would expect more. :/ Based on? Clouds are nothing more than outsourced responsibility. My business has stopped while my IT department explains to me tha

Re: FYI Netflix is down

2012-06-30 Thread Tyler Haske
> I am not a computer science guy but been around a long time. Data centers > and clouds are like software. Once they reach a certain size, it's > impossible to keep the bugs out. You can test and test your heart out and > something will slip by. You can say the same thing about nuclear reactors