Re: [Labs-l] [Labs-announce] IMPORTANT: Reboots (again!) on Wednesday

Andrew Bogott Sun, 03 Apr 2016 21:06:07 -0700

Security updates are not something you can opt out of. Most concretely: we have to keep our physical hardware secure and up-to-date, and rebooting a physical virtualization node has the side-effect of cycling power on the resident VMs.

Even if it weren't a natural consequence of our own maintenance, though, I would still be rolling out these updates. Foundation staff can't engage in total supervision of all Labs use or activity but we are, nonetheless, effectively responsible for any malicious activities that might originate from within our cluster. Naturally we need to take whatever general measures we can to prevent Labs from turning into a spam farm. When a staff of three is managing more than 700 VMs we are always going to have to rely on general one-size-fits-all maintenance solutions.

In almost all cases it is trivial or near-trivial to set up a service so that it recovers gracefully from a reboot. If the few minutes of downtime that result are unacceptable, we can provide resources for a fail-over instance and ensure that the two instances are spread out on different virt hosts so that they don't both go down at once. Let me know if you need help with either of the above!
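
As a rough illustration of what "recovers gracefully from a reboot" can mean for a batch-style job, here is a minimal Python sketch. Everything in it (the `Checkpoint` class, `run`, `stop_on_sigterm`, and the state file layout) is hypothetical and not part of any Labs tooling. It assumes the job works through a list of items and can safely resume from the last finished one.

```python
import json
import os
import signal
import tempfile


class Checkpoint:
    """Hypothetical helper: keep a small JSON state file, written atomically."""

    def __init__(self, path):
        self.path = path

    def load(self, default):
        # A missing file means this is the first run, so use the default.
        try:
            with open(self.path) as f:
                return json.load(f)
        except FileNotFoundError:
            return default

    def save(self, state):
        # Write to a temp file in the same directory, then rename over the
        # old file, so a reboot mid-write never leaves a truncated state file.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.path)


def stop_on_sigterm():
    """Return a callable that turns True once SIGTERM has been received."""
    flag = {"stop": False}

    def handler(signum, frame):
        flag["stop"] = True

    signal.signal(signal.SIGTERM, handler)
    return lambda: flag["stop"]


def run(items, checkpoint, handle, should_stop=lambda: False):
    """Process items in order, resuming after the last checkpointed index."""
    state = checkpoint.load({"next": 0})
    for i in range(state["next"], len(items)):
        if should_stop():
            break  # shutting down; the next start picks up at index i
        handle(items[i])
        checkpoint.save({"next": i + 1})
    return checkpoint.load({"next": 0})["next"]
```

A job would call `run(items, Checkpoint("state.json"), handle, stop_on_sigterm())` and be started again at boot, for example from an init script or an `@reboot` crontab entry. If the instance goes down between `handle` and `save`, that one item is processed twice on restart, so `handle` should be safe to repeat.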


-A


On 4/3/16 5:23 PM, Petr Bena wrote:

Is it possible to somehow opt out of these auto updates and reboots?
The majority of kernel exploits (like 99.9999% of them) can be exploited
only if the hacker already has at least some access to the remote system
(e.g. over ssh), so that they can execute something that eventually gets
them the rights of user 0.

This makes sense for tool labs, but I see no reason why, for example,
wm-bot's instance needs to get this patch, when all the people who can
access it over ssh already have root. I am absolutely sure there is no
security risk in running an obsolete kernel on that one, for example.
The reboots, on the other hand, cause much more harm.

Thanks

On Wed, Mar 23, 2016 at 11:42 PM, Andrew Bogott <abog...@wikimedia.org> wrote:

This is now done; all labs services should be back to normal.  As always,
it's a good idea to poke at your tools and make sure there aren't jobs that
need restarting.

-Andrew


On 3/23/16 8:23 AM, Andrew Bogott wrote:

Reminder: This is happening today, starting in about 40 minutes.

On 3/18/16 1:58 PM, Andrew Bogott wrote:

Yet another kernel exploit turned up this week, which means another round
of kernel updates and reboots. All labs instances will be rebooted (at
various and unpredictable times) this Wednesday, 2016-03-23, beginning
around 14:00 UTC.

    We're getting pretty good at this :(  We'll pool- and de-pool exec
nodes as needed to minimize surprise endings for ToolLabs jobs, but as usual
there will be some stragglers that get cut short or don't get restarted
properly.  Keep an eye out on Wednesday, and let us know on IRC if you run
into trouble.

-Andrew


_______________________________________________
Labs-announce mailing list
labs-annou...@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/labs-announce
_______________________________________________
Labs-l mailing list
Labs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/labs-l

