Thanks for your quick work on this. Get some rest! We can look at supercolony next week when you're back in action.
- amye

On Sat, Jan 6, 2018 at 11:48 AM, Michael Scherer <msche...@redhat.com> wrote:
> On Saturday, 06 January 2018 at 11:44 +0100, Michael Scherer wrote:
> > On Friday, 05 January 2018 at 14:24 +0100, Michael Scherer wrote:
> > > Hi,
> > >
> > > unless you are living in a place without any internet (an igloo in
> > > Antarctica, the middle of the Gobi desert, a bunker in Switzerland,
> > > or simply the Paris underground train), you may have seen the news
> > > that this week is again a security nightmare (also called "just a
> > > normal Wednesday" among practitioners), and that we have an
> > > important kernel patch to push, which requires a reboot.
> > >
> > > See https://spectreattack.com/
> > >
> > > While I suspect our infra will not be targeted, and there are more
> > > avenues of attack on local computers and browsers, which are the
> > > ones running random proprietary code in the form of JS on a regular
> > > basis, we still have to upgrade everything to be sure.
> > >
> > > Therefore, I am going to reboot all the infra tomorrow (yes, the 83
> > > servers), minus the few servers I have already rebooted (because
> > > they are in HA, or not customer facing).
> > >
> > > I will block jenkins, and wait for the jobs to finish before
> > > rebooting the various servers. I will send an email tomorrow once
> > > the reboot starts (e.g., when/if I wake up), and another one once
> > > things are good (or if stuff broke in a horrible fashion too, as
> > > happened today).
> > >
> > > If there are any precautions to take, people have around 24h
> > > to voice their concerns.
> >
> > Reboot is starting.
> > I already did various backend servers; the document
> > I used for tracking the work is at
> > https://bimestriel.framapad.org/p/gluster_infra_reboot
>
> So almost all Linux servers got rebooted, most without issues, but
> during the day I started to have the first symptoms of a cold
> (headaches, shivering, etc.), so I had to ping Nigel to finish the last
> server (which wasn't without issue).
>
>
> For people who do not want gruesome details on the reboots, you can
> stop here.
>
>
> We did get some trouble with:
>
> - a few servers on Rackspace (mostly infra) with cloud-init resetting
>   the configuration to dhcp, and the dhcp not working. I am finally
>   changing that and was in the course of fixing it for good before
>   going back to bed.
>
> - gerrit didn't start automatically at boot. I know we had a fix for
>   that, but I am not sure why it didn't work, or whether we hadn't
>   deployed it yet.
>
> - supercolony seems to be unable to boot the latest kernel. It went so
>   badly that the emergency console wasn't working. An erroneous message
>   said "disabled for your account", so I opened a Rackspace ticket and
>   waited. This occurred as I started to not feel well, so I didn't
>   really search further, or I would have:
>     - seen that the console was working for other servers (thus the
>       erroneous message)
>     - tried harder to boot another kernel
>     - searched a bit more on an internal list that said "there is some
>       issue somewhere around RHEL 6". I didn't investigate more, but
>       that's also what happened.
>
> In the end, Nigel took over the problem solving and pinged Rackspace
> harder; their support suggested booting another kernel, which he did
> (but better than I did).
>
> And thus supercolony is back, but not upgraded.
>
> The last one still puzzles me, because the current configuration is
> "default=2", so that should start the 3rd kernel in the list.
>
> The Grub doc says "The first entry (here, counting starts with number
> zero, not one!)
> will be the default choice". It was "0" when I first tried to
> boot another kernel (I switched it to 1).
>
> So since we have:
>
> [root@supercolony ~]# grep title /boot/grub/menu.lst
> title Red Hat Enterprise Linux Server (2.6.32-696.18.7.el6.x86_64)
> title Red Hat Enterprise Linux Server (2.6.32-696.16.1.el6.x86_64)
> title Red Hat Enterprise Linux Server (2.6.32-642.15.1.el6.x86_64)
>
> default=1 should have used 2.6.32-696.16.1, but it didn't boot.
>
> Nigel changed it to "default=2", so that should have used
> 2.6.32-642.15.1, but plot twist...
>
> # uname -a
> Linux supercolony.gluster.org 2.6.32-696.16.1.el6.x86_64 #1 SMP Sun Oct 8 09:45:56 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux
>
> So there is something fishy with grub, but as I am writing this from
> my bed, maybe the problem is on my side. I am sure it will be clearer
> once I hit "send".
>
> So, to recap, we have one or two servers to upgrade (cf. the pad), the
> *BSD systems are not patched yet (I quickly checked their lists, but I
> do not expect it soon), but since the more urgent issues were on the
> hypervisor side, we are ok for that.
>
> The grub on supercolony needs to be investigated, and supercolony
> should be upgraded as well.
>
> I also need to take some rest.
>
> Many thanks to Nigel for taking over when my body failed me.
>
>
> --
> Michael Scherer
> Sysadmin, Community Infrastructure and Platform, OSAS
>
>
> _______________________________________________
> Gluster-infra mailing list
> Gluster-infra@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-infra
>

--
Amye Scavarda | a...@redhat.com | Gluster Community Lead
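
For anyone picking up the grub question next week: the zero-based "default=N" lookup described in the quoted message can be sketched in a few lines of Python. This is only an illustration of the expected mapping, not a diagnosis. The function name selected_entry and the SAMPLE_MENU_LST text are hypothetical; the sample is reconstructed from the grep output and the "default=2" setting quoted above, not copied from the real /boot/grub/menu.lst, and the parser only understands a numeric default.

```python
import re

# Hypothetical sample: titles copied from the quoted `grep title` output,
# plus the "default=2" setting mentioned in the message. The real file
# has more lines (kernel, initrd, ...) that are not shown in the thread.
SAMPLE_MENU_LST = """\
default=2
title Red Hat Enterprise Linux Server (2.6.32-696.18.7.el6.x86_64)
title Red Hat Enterprise Linux Server (2.6.32-696.16.1.el6.x86_64)
title Red Hat Enterprise Linux Server (2.6.32-642.15.1.el6.x86_64)
"""


def selected_entry(menu_lst_text):
    """Return (index, title) of the entry a numeric `default` selects.

    Entries are counted from zero, as the GRUB documentation quoted in
    the message says. Index 0 is assumed when no `default` line exists.
    """
    default_index = 0
    titles = []
    for raw_line in menu_lst_text.splitlines():
        line = raw_line.strip()
        # Accept both "default=2" and "default 2"; commented lines do not match.
        default_match = re.match(r"default[\s=]+(\d+)\s*$", line)
        if default_match:
            default_index = int(default_match.group(1))
        elif re.match(r"title\s", line):
            # Keep everything after the "title" keyword.
            titles.append(line.split(None, 1)[1])
    if not 0 <= default_index < len(titles):
        raise IndexError(
            "default=%d but only %d entries found" % (default_index, len(titles))
        )
    return default_index, titles[default_index]


if __name__ == "__main__":
    index, title = selected_entry(SAMPLE_MENU_LST)
    print("default=%d -> %s" % (index, title))
```

With the sample above this selects the 2.6.32-642.15.1 entry, which matches the expectation stated in the message, so the sketch does not explain why uname reports 2.6.32-696.16.1. It may still be handy for comparing against the actual file on supercolony.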