Now I remember: we had the same kernel panic issue in the first week of the D2 rollout. AWS then fixed it, and we haven't seen any issues since. Try Ubuntu 14.04 and see if it resolves your remaining kernel/instability issues.
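Before switching AMIs, it may be worth confirming which kernel each broker is actually running, since the AWS announcement quoted further down recommends a kernel version of 3.8 or later. Below is a minimal sketch of that check. The function names are made up for illustration, and the sample release strings in the comments are only examples of the usual `uname -r` format. Note that passing this check only tells you the version number is high enough; it does not prove Persistent Grants are in use, and it says nothing about the host-side instability.

```python
import platform
import re

# Minimum kernel version from the AWS recommendation quoted in this thread.
MIN_KERNEL = (3, 8)


def parse_kernel_release(release):
    """Return (major, minor) from a release string like '3.13.0-53-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def meets_d2_recommendation(release):
    """True if the kernel release is at least MIN_KERNEL (hypothetical helper)."""
    return parse_kernel_release(release) >= MIN_KERNEL


if __name__ == "__main__":
    release = platform.release()  # same string that `uname -r` prints on Linux
    try:
        ok = meets_d2_recommendation(release)
    except ValueError:
        ok = False
    print(release, "meets" if ok else "does not meet", "the 3.8+ recommendation")
```

Run it on each node. A stock Ubuntu 12.04 kernel (3.2 series) would fail the check, while a 14.04 kernel (3.13 series) would pass.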
On Tue, Jun 2, 2015 at 2:30 PM, Wes Chow <w...@chartbeat.com> wrote:

>
> Daniel Nelson <daniel.nel...@vungle.com>
> June 2, 2015 at 4:39 PM
>
> On Jun 2, 2015, at 1:22 PM, Steven Wu <stevenz...@gmail.com> wrote:
>
> can you elaborate what kind of instability you have encountered?
>
> We have seen the nodes become completely non-responsive. Usually they get
> rebooted automatically after 10-20 minutes, but occasionally they get stuck
> for days in a state where they cannot be rebooted via the Amazon APIs.
>
> Same here. It was worse right after d2 launch. We had 6 out of 9 servers
> die within 10 hours after spinning them up. Amazon rolled out a fix, but
> we're still seeing similar issues, though not nearly as bad. The first fix
> was for something network related, and apparently sending lots of data
> through the instances caused a kernel panic on the host. We have no
> information yet about the current issue.
>
> Wes
>
> Steven Wu <stevenz...@gmail.com>
> June 2, 2015 at 4:22 PM
> Wes/Daniel,
>
> can you elaborate what kind of instability you have encountered?
>
> we are on Ubuntu 14.04.2 and haven't encountered any issues so far. in the
> announcement, they did mention using Ubuntu 14.04 for better disk
> throughput. not sure whether 14.04 also addresses any instability issue you
> encountered or not.
>
> Thanks,
> Steven
>
> In order to ensure the best disk throughput performance from your D2 instances
> on Linux, we recommend that you use the most recent version of the Amazon
> Linux AMI, or another Linux AMI with a kernel version of 3.8 or later. The
> D2 instances provide the best disk performance when you use a Linux
> kernel that supports Persistent Grants – an extension to the Xen block ring
> protocol that significantly improves disk throughput and scalability. The
> following Linux AMIs support this feature:
>
> - Amazon Linux AMI 2015.03 (HVM)
> - Ubuntu Server 14.04 LTS (HVM)
> - Red Hat Enterprise Linux 7.1 (HVM)
> - SUSE Linux Enterprise Server 12 (HVM)
>
> Daniel Nelson <daniel.nel...@vungle.com>
> June 2, 2015 at 2:42 PM
>
> Do you have any workarounds for the d2 issues? We’ve been using them for
> our Kafkas too, and ran into the instability. We’re on Ubuntu 12.04 and
> plan to try on 14.04 with the latest HWE to see if that helps any.
>
> Thanks!
>
> Wes Chow <w...@chartbeat.com>
> June 2, 2015 at 1:39 PM
>
> We have run d2 instances with Kafka. They're currently unstable -- Amazon
> confirmed a host issue with d2 instances that gets tickled by a Kafka
> workload yesterday. Otherwise, it seems the d2 instance type is ideal as it
> gets an enormous amount of disk throughput and you'll likely be network
> bottlenecked.
>
> Wes
>
> Steven Wu <stevenz...@gmail.com>
> June 2, 2015 at 1:07 PM
> EBS (network attached storage) has got a lot better over the last few
> years. we don't quite trust it for Kafka workload.
>
> At Netflix, we were going with the new d2 instance type (HDD). our
> perf/load testing shows it satisfies our workload. SSD is better in latency
> curve but pretty comparable in terms of throughput. we can use the extra
> space from HDD for longer retention period.
>
> On Tue, Jun 2, 2015 at 9:37 AM, Henry Cai <h...@pinterest.com.invalid>
>