So I'm not clear on where things stand now. Are there rolling snapshots or not?
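
To be concrete about what I'm asking for: by "rolling" I mean something like the sketch below, where the previous snapshot stays restorable until the new one has fully completed, so a crash mid-creation (like the one Niv describes further down) can never leave a farm with nothing bootable. This is just a hypothetical illustration in Python using boto; rolling_snapshot and the active_snapshots registry are my own names, not anything from Scalr's actual code.

import time
import boto.ec2

conn = boto.ec2.connect_to_region("us-east-1")

def rolling_snapshot(volume_id, active_snapshots):
    """Snapshot volume_id, but only retire the previous snapshot once
    the new one is fully complete. active_snapshots maps a volume id
    to the snapshot id that restores should currently use."""
    new_snap = conn.create_snapshot(volume_id, "rolling backup")
    while new_snap.status == "pending":
        time.sleep(15)
        new_snap.update()  # refresh status from EC2
    if new_snap.status != "completed":
        # The new snapshot failed: the old one was never touched, so
        # restores keep working off the previous, intact snapshot.
        raise RuntimeError("snapshot %s failed; keeping previous one"
                           % new_snap.id)
    old_snap_id = active_snapshots.get(volume_id)
    active_snapshots[volume_id] = new_snap.id  # swap the restore pointer
    if old_snap_id:
        conn.delete_snapshot(old_snap_id)  # safe only after the swap
    return new_snap.id

The point being that nothing deletes or invalidates the old snapshot until the new one is verifiably complete, which is exactly the property Niv says the current scheme lacks.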
On Jul 14, 9:04 am, Nickolas Toursky <[email protected]> wrote:
> Hi guys,
>
> We have developed a new staging environment since this happened. It
> gives us the ability to test new features more thoroughly before
> deploying them live.
>
> Nick
>
> 2009/7/14 Esé <[email protected]>:
> > hey folks,
> >
> > would love to get an update on this as well. it's a little terrifying
> > to hear about rogue dev scalr processes killing production farms. are
> > there safeguards in place now to prevent this kind of thing from
> > happening?
> >
> > hoping for a speedy response, thanks!
> >
> > E.
> >
> > On Jul 6, 11:39 am, rainier2 <[email protected]> wrote:
> > > Hey, just looking for a little closure here.
> > >
> > > Was this a newly deployed production poller, or was it the dev
> > > poller that broke out of the dev sandbox?
> > >
> > > Has Scalr.net taken any action to prevent a similar problem in the
> > > future?
> > >
> > > Thanks!
> > >
> > > On May 7, 12:08 pm, Cole <[email protected]> wrote:
> > > > Whoa, this is kind of a deal breaker here! Did this really
> > > > happen? RightScale is seeming quite cost-effective now if this
> > > > is the case!
> > > >
> > > > On May 7, 10:30 am, Niv <[email protected]> wrote:
> > > > > and i have to add that the cause of the major data loss is
> > > > > your no-good way of doing snapshots: once a snapshot creation
> > > > > starts, the older snapshot is immediately corrupt.
> > > > > your human error caused my instances to crash mid-snapshot
> > > > > creation, and when restarted, the servers failed to download
> > > > > the snapshot and kept terminating.
> > > > > this bug was submitted more than six months ago and you've
> > > > > done absolutely nothing to fix it.
> > > > >
> > > > > On May 7, 5:22 pm, Niv <[email protected]> wrote:
> > > > > > ruined my day & upcoming weekend + major data loss + ~20
> > > > > > extra instances running for several hours doing nothing. yay.
> > > > > >
> > > > > > On May 7, 5:11 pm, Alex Kovalyov <[email protected]> wrote:
> > > > > > > Martin, it was a user error on Scalr.net's side. A dev
> > > > > > > version of the poller went nuts and selectively terminated
> > > > > > > instances on a few farms before it was killed.
> > > > > > >
> > > > > > > On May 7, 11:10, Martin Sweeney <[email protected]> wrote:
> > > > > > > > So my farm decided to crash this morning. All backups and
> > > > > > > > database bundles worked fine, and another set of
> > > > > > > > instances is in its place. Hurrah!
> > > > > > > >
> > > > > > > > What concerns me is why all four instances decided to
> > > > > > > > crash within 3 minutes of each other. They're not
> > > > > > > > connected by anything other than connections to databases
> > > > > > > > and memcache servers etc., but they all went at once.
> > > > > > > >
> > > > > > > > Instance 'i-46f94bxx' found in database but not found on EC2. Crashed.
> > > > > > > > Instance 'i-9a9014xx' found in database but not found on EC2. Crashed.
> > > > > > > > Instance 'i-29009axx' found in database but not found on EC2. Crashed.
> > > > > > > > Instance 'i-27a2ccxx' found in database but not found on EC2. Crashed.
> > > > > > > >
> > > > > > > > Is there anywhere I can find more info on this other than
> > > > > > > > my Event log?
> > > > > > > >
> > > > > > > > M.
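
And on Esé's safeguards question: even a trivial guard at a single choke point would keep a dev build of the poller from ever taking destructive actions against live farms. A rough sketch of what I mean, again in Python; SCALR_ENV and the reconcile/terminate names are made up for illustration, not Scalr's real code:

import os

# Hypothetical safeguard: every destructive action goes through one
# choke point that refuses to act outside production.

ENVIRONMENT = os.environ.get("SCALR_ENV", "dev")  # "dev" or "production"

def reconcile(db_instances, ec2_instances, terminate):
    # Compare what the database thinks is running against what EC2
    # actually reports -- the check behind Martin's event log lines.
    for inst_id in db_instances:
        if inst_id not in ec2_instances:
            print("Instance '%s' found in database but not found on "
                  "EC2. Crashed." % inst_id)
            if ENVIRONMENT != "production":
                # A dev poller only logs what it would have done.
                print("DRY RUN: skipping terminate for %s" % inst_id)
                continue
            terminate(inst_id)

# A dev run can only ever log, no matter how buggy the rest of it is:
reconcile(["i-46f94bxx"], set(), terminate=lambda inst_id: None)

Whether the new staging environment enforces something like this, or just runs the same code against different credentials, is exactly the kind of detail I'd like to hear about.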
