On Fri, Sep 6, 2013 at 6:01 AM, Ryan Lane <rl...@wikimedia.org> wrote:
> On Fri, Sep 6, 2013 at 5:46 AM, Maarten Dammers <maar...@mdammers.nl> wrote:
>
>> Hi Ryan,
>>
>> On 4-9-2013 23:38, Ryan Lane wrote:
>>
>>> During Wikimania I was cleaning up some base images that were eating up
>>> a large amount of disk space and caused an issue on virt11 that requires a
>>> reboot. This will cause a reboot of about 45 instances. Here's a list of
>>> the instances that will be affected:
>>>
>>> <https://wikitech.wikimedia.org/w/index.php?title=Special:Ask&q=[[Resource+Type%3A%3Ainstance]][[Instance+Host%3A%3Avirt11]]&p=format%3Dbroadtable%2Flink%3Dall%2Fheaders%3Dshow%2Fsearchlabel%3Dinstances%2Fclass%3Dsortable-20wikitable-20smwtable&po=%3FInstance+Name%0A%3FInstance+Type%0A%3FProject%0A%3FImage+Id%0A%3FFQDN%0A%3FLaunch+Time%0A%3FPuppet+Class%0A%3FModification+date%0A%3FInstance+Host%0A%3FNumber+of+CPUs%0A%3FRAM+Size%0A%3FAmount+of+Storage%0A&limit=100&eq=no>
>>
>> How long will the downtime be and can you please announce earlier? A week
>> is a normal notice time.
>> The Wiki Loves Monuments tools and applications (like the mobile app)
>> rely on this so please keep it as short as possible.
>
> The reboot will take about 10 minutes.
>
> That said, relying on labs for something like this is legitimately insane.
> Have you talked with Wikimedia Foundation about getting production level
> support for WLM? That's what you actually need.
>
> What will you do if the node hosting your instance completely dies? Is
> your work puppetized? Can you just bring up a new instance to replace it?
> Are you doing backups?
>
> Outside of tools (and deployment-prep, which is rather ephemeral) we don't
> consider any project "semi-production" and the failure model is meant to be
> handled at the instance level. The underlying infrastructure will just fail
> and will not recover for you. You have to assume that your instances can
> simply disappear at any moment (this is the traditional cloud computing
> model, btw).

This was completed a couple hours ago and all instances were rebooted. If you have any issues with your instance, please let me know.

- Ryan
_______________________________________________
Wiki Loves Monuments mailing list
WikiLovesMonuments@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikilovesmonuments
http://www.wikilovesmonuments.org