Re: Update

Kengo Seki Tue, 10 Nov 2020 00:31:45 -0800

I intended to keep the current data in the storage and only shrink its
capacity, because I thought the slaves also had some settings on the
disk.
But are the following steps enough for slave setup? If so, I'll wipe them and
set them up from scratch.


* launch EC2 instance
* create jenkins user
* configure its authorized_keys so that the master can log in with the current key
* change the slave's IP address on the master
* (then the master automatically configures this node, e.g., by installing
the agent, right?)
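
The "create jenkins user" and "configure its authorized_keys" steps above
could be sketched roughly as follows. This is only a minimal sketch under
assumptions, not the project's actual provisioning: the helper name
build_user_data is hypothetical, the home directory is assumed to be
/home/<user>, and the generated script is assumed to run as root on a Linux
image with useradd available (for example as EC2 user data). Launching the
instance and changing the IP address on the master are not covered, and it
does not settle whether the master installs the agent on its own.

```python
import shlex


def build_user_data(public_key: str, user: str = "jenkins") -> str:
    """Return a root-run shell script that creates `user` and authorizes `public_key`.

    Hypothetical helper for illustration; `user` is assumed to be a plain
    login name and its home directory is assumed to be /home/<user>.
    """
    ssh_dir = f"/home/{user}/.ssh"
    # Quote the key so spaces in it survive the shell.
    key = shlex.quote(public_key.strip())
    lines = [
        "#!/bin/bash",
        "set -euo pipefail",
        # Create the user only if it does not exist yet.
        f"id -u {user} >/dev/null 2>&1 || useradd --create-home --shell /bin/bash {user}",
        f"mkdir -p {ssh_dir}",
        # Overwrite authorized_keys with the master's current public key.
        f"echo {key} > {ssh_dir}/authorized_keys",
        f"chmod 700 {ssh_dir}",
        f"chmod 600 {ssh_dir}/authorized_keys",
        # "user:" hands ownership to the user and the user's login group.
        f"chown -R {user}: {ssh_dir}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder key; the real one would be the master's existing public key.
    print(build_user_data("ssh-ed25519 AAAAexample master@jenkins"))
```

The script is generated as text rather than executed, so it can be reviewed
before being pasted into whatever launches the instance.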

Kengo Seki <sek...@apache.org>

On Tue, Nov 10, 2020 at 4:09 PM Evans Ye <evan...@apache.org> wrote:
>
> Yes, I think your plan is good overall.
> What's the purpose of using an EBS snapshot? Is it to back up what
> we have before migration?
> Except for the master node (which has Jenkins settings stored on disk), all
> the slaves can be wiped directly.
>
>
>
> On Tue, Nov 10, 2020 at 2:42 PM Kengo Seki <sek...@apache.org> wrote:
>
> > Thanks everyone for the information! Now I understand our circumstances.
> > So we're going to split the two 1TB volumes attached to slave06 and 07
> > into four 500GB volumes (and change their type to gp2), reattach them
> > to 02, 03, 06 and 07, and remove the two currently unused 1TB volumes,
> > right?
> >
> > > Kengo, would you like to take this, or do you need help?
> >
> > I think I can do them somehow (maybe using an EBS snapshot?), but I'll
> > ask for your help if I get stuck. :)
> >
> > Kengo Seki <sek...@apache.org>
> >
> > On Tue, Nov 10, 2020 at 1:00 AM Evans Ye <evan...@apache.org> wrote:
> > >
> > > OK. I got it now.
> > > So the newly created volumes are currently attached to slave06_2 and
> > > slave07_2, respectively.
> > > However, they're standard HDD, not GP2 SSD. I think we can take this chance
> > > to recreate those 2 slaves and do an overhaul of our infrastructure.
> > >
> > > Kengo, would you like to take this, or do you need help?
> > >
> > > Evans
> > >
> > > On Fri, Nov 6, 2020 at 2:40 AM Olaf Flebbe <o...@oflebbe.de> wrote:
> > >
> > > > Hi,
> > > >
> > > > OMG. I think I did it.
> > > >
> > > > A few years ago two of the instances had hardware problems and did not
> > > > reboot any more; the filesystem was corrupted and so on. That was at the
> > > > time of the Spectre vulnerability discovery (2018). At that time AWS had
> > > > major instabilities, since updating firmware seems to have failed for some
> > > > classes of hardware.
> > > >
> > > > I tried to recreate them as closely as possible, but I may have
> > > > accidentally left the volumes around. Please let's delete them.
> > > >
> > > > Olaf
> > > >
> > > > > On 05.11.2020 at 14:44, Konstantin Boudnik <c...@apache.org> wrote:
> > > > >
> > > > > Thanks Evans!
> > > > >
> > > > > It's great you found the details: they are definitely accurate, as I
> > > > > recall now. Kengo, do you think splitting the volumes would help us
> > > > > for a while? Or perhaps we should try to expand the resource pool (which
> > > > > might take a while)?
> > > > >
> > > > > Thanks!
> > > > >  Cos
> > > > >
> > > > > On Thu, Nov 05, 2020 at 12:32PM, Evans Ye wrote:
> > > > > >> In fact, the original deal for our resources is as follows:
> > > > >>
> > > > >>> 1 m3.2xlarge for CI
> > > > >>> 4 m3.xlarge for CI and demo
> > > > >>> 3 1TB EBS volumes
> > > > >>> 5 elastic IP addresses
> > > > >>
> > > > > >> So technically we should not use those 2 additional 1TB volumes (created
> > > > > >> in 2018).
> > > > > >> Instead, I think what we can do is split up one of the existing 1TB
> > > > > >> volumes (e.g., the one attached to slave07) into smaller volumes for
> > > > > >> slave02 and 03.
> > > > >>
> > > > >>
> > > > > >> On Wed, Nov 4, 2020 at 2:28 PM Konstantin Boudnik <c...@apache.org> wrote:
> > > > >>
> > > > >>> Kengo,
> > > > >>>
> > > > > >>> We had an agreement with the EMR folks that we use the resources
> > > > > >>> available to us and that this is included in their budget (or something
> > > > > >>> to that effect). If you see some resources available under our account,
> > > > > >>> I don't see why we can't use them.
> > > > > >>>
> > > > > >>> If for whatever reason we need to expand the pool, that would require a
> > > > > >>> separate conversation with the nice folks from that team, I imagine.
> > > > > >>> Please let me know if I can help with this going forward.
> > > > >>>
> > > > >>> Thanks!
> > > > >>>  Cos
> > > > >>>
> > > > >>> On Wed, Nov 04, 2020 at 11:11AM, Kengo Seki wrote:
> > > > > >>>> Thanks for the comment, Cos! I was able to start the docker service on
> > > > > >>>> docker-slave-02 without replacing it and am running some Jenkins jobs on
> > > > > >>>> it now, so I'll replace it in the near future.
> > > > > >>>> I have a few additional questions:
> > > > >>>>
> > > > > >>>> * docker-slave-02 and 03 have a gp2 root volume with only 8GiB of
> > > > > >>>>  capacity, and they sometimes run short and stop the CI.
> > > > > >>>>  May I increase them to 20 or 30 GiB when I replace those instances?
> > > > > >>>>  (I'm not sure what our budget is.)
> > > > > >>>>
> > > > > >>>> * They use a 30GiB instance store to hold docker images, and they
> > > > > >>>>  also sometimes run short.
> > > > > >>>>  There seem to be two unused 1TiB volumes (vol-ae71114e and
> > > > > >>>>  vol-4efa69ae) on the AWS console.
> > > > > >>>>  May I attach them to 02 and 03 instead of the instance stores, or are
> > > > > >>>>  they backups or something?
> > > > >>>>
> > > > >>>> Kengo Seki <sek...@apache.org>
> > > > >>>>
> > > > > >>>> On Mon, Nov 2, 2020 at 6:41 PM Konstantin Boudnik <c...@apache.org> wrote:
> > > > >>>>>
> > > > > >>>>> I'd say let's replace the broken one. I don't think there's any
> > > > > >>>>> sentimental value attached ;)
> > > > >>>>>
> > > > >>>>> --
> > > > >>>>> With regards,
> > > > >>>>>   Cos
> > > > >>>>>
> > > > >>>>> On 02.11.2020 08:16, Kengo Seki wrote:
> > > > > >>>>>> Thanks for updating, Olaf! I've just noticed the Jenkins UI became cool :)
> > > > > >>>>>> Regarding docker-slave-02, I'll try to replace it after waiting for a
> > > > > >>>>>> while to make sure there's no objection.
> > > > >>>>>>
> > > > >>>>>> Kengo Seki <sek...@apache.org>
> > > > >>>>>>
> > > > >>>>>> On Mon, Nov 2, 2020 at 1:39 PM Jun HE <ju...@apache.org> wrote:
> > > > >>>>>>>
> > > > >>>>>>> Thanks a lot for the update, Olaf!
> > > > >>>>>>>
> > > > > >>>>>>> On Sat, Oct 31, 2020 at 3:24 AM Olaf Flebbe <o...@oflebbe.de> wrote:
> > > > >>>>>>>
> > > > >>>>>>>> Hi,
> > > > >>>>>>>>
> > > > > >>>>>>>> All machines are patched. Jenkins and its plugins are updated.
> > > > >>>>>>>>
> > > > >>>>>>>> Things to be noted:
> > > > >>>>>>>>
> > > > > >>>>>>>> * Slave 2 seems to be in serious trouble. The disk image seems to be
> > > > > >>>>>>>> corrupt, I would say.
> > > > > >>>>>>>> One of the problems: docker does not start any more.
> > > > > >>>>>>>> Is there anything important on it? If yes, please contact me. I would
> > > > > >>>>>>>> recommend setting up slave2 from scratch again.
> > > > >>>>>>>>
> > > > > >>>>>>>> * There was a warning regarding the Copy Artifacts Plugin. It now imposes
> > > > > >>>>>>>> stricter rules. Not sure if there is a job depending on it.
> > > > >>>>>>>>
> > > > >>>>>>>> * I removed the CVS plugin.
> > > > >>>>>>>>
> > > > > >>>>>>>> Everything else seems to be working as usual.
> > > > >>>>>>>>
> > > > >>>>>>>> Best,
> > > > >>>>>>>> Olaf
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > > >>>>>>>>> On 30.10.2020 at 19:09, Olaf Flebbe <o...@oflebbe.de> wrote:
> > > > >>>>>>>>>
> > > > >>>>>>>>> Hi,
> > > > >>>>>>>>>
> > > > > >>>>>>>>> I am doing an update of the machines in CI. It seems a couple of
> > > > > >>>>>>>>> security fixes are to be applied.
> > > > >>>>>>>>>
> > > > >>>>>>>>> Olaf
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>
> > > >
> > > >
> >
