From what I understand, the leap second bug could've hit anytime in the 24 hours before 23:59:59. We had it start happening early afternoon Sat on a few of our boxes.

Norbert

On Mon, Jul 2, 2012 at 12:58 PM, Kevin O'dell <kevin.od...@cloudera.com> wrote:
> How recently would you say this is happening? Did this start last Sat
> around midnight?
>
> On Mon, Jul 2, 2012 at 11:50 AM, Nicolas Thiébaud
> <nico...@captaindash.com> wrote:
> > Hi,
> >
> > We have been successfully running a cdh3 HBase cluster on c1.xlarge
> > instances for over a month, but we recently started hitting what looks like
> > connectivity issues in the clusters. Zookeeper sessions are expired by the
> > zk server and the region servers throw a YouAreDeadException before
> > crashing.
> >
> > Could this be imputed to the gc ? Is there anything I can do about it ? I
> > am monitoring the Ganglia metrics but am unsure of their semantics (where
> > can I find it?).
> >
> > I know that running hbase on ec2 is advised against, but we really need to
> > get this working.
> >
> > Thanks,
> >
> > Nicolas.
> >
> > ZooKeeper log: http://pastebin.com/bVjrkRSL
> > RegionServer log: http://pastebin.com/fU81d8hr
> >
>
> --
> Kevin O'Dell
> Customer Operations Engineer, Cloudera