Hi guys,

We have built a new staging environment since this happened. It gives us the ability to test new features more thoroughly before deploying them live.
Nick

2009/7/14 Esé <[email protected]>:
>
> hey folks,
>
> would love to get an update on this as well. it's a little terrifying
> to hear about rogue dev scalr processes killing production farms. are
> there safeguards in place now to prevent this kind of thing from happening?
>
> hoping for a speedy response, thanks!
>
> E.
>
> On Jul 6, 11:39 am, rainier2 <[email protected]> wrote:
>> Hey, just looking for a little closure here.
>>
>> Was this a newly deployed production poller, or was it the dev poller
>> that broke out of the dev sandbox?
>>
>> Has Scalr.net taken any actions to prevent a similar problem in the
>> future?
>>
>> Thanks!
>>
>> On May 7, 12:08 pm, Cole <[email protected]> wrote:
>>
>> > Whoa, this is kind of a deal breaker here! Did this really happen?
>> > RightScale's seeming quite cost-effective now if this is the case!
>>
>> > On May 7, 10:30 am, Niv <[email protected]> wrote:
>>
>> > > and i have to add that the cause of the major data loss is your
>> > > no-good way of doing the snapshots. once a snapshot creation starts,
>> > > the older snapshot is immediately corrupt.
>> > > your human error caused my instances to crash mid-snapshot creation,
>> > > and when restarted, the servers failed to download the snapshot and
>> > > kept terminating.
>> > > this bug was submitted more than six months ago and you've done
>> > > absolutely nothing to fix it.
>>
>> > > On May 7, 5:22 pm, Niv <[email protected]> wrote:
>>
>> > > > ruined my day & upcoming weekend + major data loss + ~20 extra
>> > > > instances running for several hours doing nothing. yay.
>>
>> > > > On May 7, 5:11 pm, Alex Kovalyov <[email protected]> wrote:
>>
>> > > > > Martin, it was a user error on Scalr.net's side. The dev version
>> > > > > of the poller went nuts and selectively terminated instances on a
>> > > > > few farms before it was killed.
>>
>> > > > > On May 7, 11:10, Martin Sweeney <[email protected]> wrote:
>>
>> > > > > > So my farm decided to crash this morning; all backups and
>> > > > > > database bundles worked fine, and another set of instances is in
>> > > > > > its place. Hurrah!
>>
>> > > > > > What concerns me is why all four instances decided to crash
>> > > > > > within 3 minutes of each other. They're not connected by
>> > > > > > anything other than connections to databases and memcache
>> > > > > > servers etc., but they all went at once.
>>
>> > > > > > Instance 'i-46f94bxx' found in database but not found on EC2. Crashed.
>> > > > > > Instance 'i-9a9014xx' found in database but not found on EC2. Crashed.
>> > > > > > Instance 'i-29009axx' found in database but not found on EC2. Crashed.
>> > > > > > Instance 'i-27a2ccxx' found in database but not found on EC2. Crashed.
>>
>> > > > > > Is there anywhere I can find more info on this other than my
>> > > > > > Event log?
>>
>> > > > > > M.
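
For readers trying to picture what the "found in database but not found on EC2" messages above amount to, here is a minimal sketch of that kind of database-vs-EC2 reconciliation check. It uses Python with the boto3 library (a later AWS SDK, not available at the time of this thread), and the function name and instance-ID list are made up for illustration; this is a guess at the general idea, not Scalr's actual poller code.

# Minimal sketch (not Scalr's code) of a poller-style reconciliation:
# instance IDs recorded in a local database are compared against what EC2
# actually reports, and any record with no live instance is flagged as crashed.
# AWS credentials and region are assumed to come from the environment.

import boto3

def find_crashed_instances(db_instance_ids):
    """Return DB-recorded instance IDs that EC2 no longer reports as alive."""
    ec2 = boto3.client("ec2")
    alive = set()
    paginator = ec2.get_paginator("describe_instances")
    # Only count instances that are pending or running; terminated entries
    # that EC2 still returns briefly should not mask a crash.
    for page in paginator.paginate(
        Filters=[{"Name": "instance-state-name", "Values": ["pending", "running"]}]
    ):
        for reservation in page["Reservations"]:
            for instance in reservation["Instances"]:
                alive.add(instance["InstanceId"])
    return [iid for iid in db_instance_ids if iid not in alive]

if __name__ == "__main__":
    # Placeholder IDs as they might be recorded in a farm's database.
    recorded = ["i-46f94bxx", "i-9a9014xx", "i-29009axx", "i-27a2ccxx"]
    for iid in find_crashed_instances(recorded):
        print("Instance '%s' found in database but not found on EC2. Crashed." % iid)

A check like this only detects that instances vanished; whatever terminated them (in this incident, a dev poller run against production) acts upstream of it, which is why the staging-environment separation mentioned above matters.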
