I intended to keep the current data on the storage and only shrink its capacity, because I thought the slaves also keep some settings on disk. But are the following steps enough for slave setup? If so, I'll wipe them and set them up from scratch.
* launch an EC2 instance
* create a jenkins user
* configure its authorized_keys so that the master can log in with the current key
* change the slave's IP address on the master
* (then the master automatically configures this node, e.g., installing the agent, right?)

Kengo Seki <sek...@apache.org>

On Tue, Nov 10, 2020 at 4:09 PM Evans Ye <evan...@apache.org> wrote:
>
> Yes, I think overall your plan is good.
> What's the purpose of leveraging an EBS snapshot? Is it to back up the things
> we have before migration?
> Except for the master node (which has Jenkins settings stored on disk), all
> those slaves can be wiped out directly.
>
> On Tue, Nov 10, 2020 at 2:42 PM Kengo Seki <sek...@apache.org> wrote:
>
> > Thanks everyone for the information! Now I understand our circumstances.
> > So we're going to split two 1TB volumes attached to slave06 and 07
> > into four 500GB volumes (and change their type to gp2), reattach them
> > to 02, 03, 06 and 07, and remove the two currently unused 1TB volumes,
> > right?
> >
> > > Kengo, would you like to take this, or do you need help?
> >
> > I think I can do them somehow (maybe using an EBS snapshot?), but let me
> > ask for your help if I'm stuck. :)
> >
> > Kengo Seki <sek...@apache.org>
> >
> > On Tue, Nov 10, 2020 at 1:00 AM Evans Ye <evan...@apache.org> wrote:
> > >
> > > OK. I got it now.
> > > So the newly created volumes are currently attached to slave06_2 and
> > > slave07_2, respectively.
> > > However, they're standard HDD, not GP2 SSD. I think we can take this chance
> > > to recreate those 2 slaves and do an overhaul of our infrastructure.
> > >
> > > Kengo, would you like to take this, or do you need help?
> > >
> > > Evans
> > >
> > > On Fri, Nov 6, 2020 at 2:40 AM Olaf Flebbe <o...@oflebbe.de> wrote:
> > >
> > > > Hi,
> > > >
> > > > OMG. I think I did it.
> > > >
> > > > A few years ago two of the instances had hardware problems and did not
> > > > reboot any more, the filesystem was corrupted and so on. That was at the time
> > > > of the Spectre vulnerability discovery
> > > > (2018). At that time AWS had major
> > > > instabilities since updating firmware seems to have failed for some classes
> > > > of hardware.
> > > >
> > > > I tried to recreate them as closely as possible, but I may have
> > > > accidentally left the volumes around. Please let's delete them.
> > > >
> > > > Olaf
> > > >
> > > > > On Nov 5, 2020, at 14:44, Konstantin Boudnik <c...@apache.org> wrote:
> > > > >
> > > > > Thanks Evans!
> > > > >
> > > > > It's great you found the details: they are definitely accurate as I am
> > > > > recalling now. Kengo, do you think splitting the volumes would help us for a
> > > > > while? Or perhaps we shall try to expand the resource pool (which might take a
> > > > > while)?
> > > > >
> > > > > Thanks!
> > > > > Cos
> > > > >
> > > > > On Thu, Nov 05, 2020 at 12:32PM, Evans Ye wrote:
> > > > >> In fact, the original deal for our resources is as follows:
> > > > >>
> > > > >>> 1 m3.2xlarge for CI
> > > > >>> 4 m3.xlarge for CI and demo
> > > > >>> 3 1TB EBS volumes
> > > > >>> 5 elastic IP addresses
> > > > >>
> > > > >> So technically we should not use those 2 additional 1TB volumes (created in
> > > > >> 2018).
> > > > >> Instead, I think what we can do is to split up one of the existing 1TB
> > > > >> volumes (e.g., the one attached to slave07) into smaller volumes for slave02, 03.
> > > > >>
> > > > >> On Wed, Nov 4, 2020 at 2:28 PM Konstantin Boudnik <c...@apache.org> wrote:
> > > > >>
> > > > >>> Kengo,
> > > > >>>
> > > > >>> We had an agreement with EMR folks that we are using the resources available
> > > > >>> to us and it is included in their budget (or something to this extent). If
> > > > >>> you see some of the resources available under our account - I don't see why we
> > > > >>> can't use them.
> > > > >>>
> > > > >>> If for whatever reason we need to expand the pool, that would require a
> > > > >>> separate conversation with nice folks from that team, I imagine. Please let me
> > > > >>> know if I can help with this going forward.
> > > > >>>
> > > > >>> Thanks!
> > > > >>> Cos
> > > > >>>
> > > > >>> On Wed, Nov 04, 2020 at 11:11AM, Kengo Seki wrote:
> > > > >>>> Thanks for the comment, Cos! I was able to start the docker service on
> > > > >>>> docker-slave-02 without replacing it and am running some Jenkins jobs on
> > > > >>>> it now, so I'll replace it in the near future.
> > > > >>>> I have a few additional things that I'd like to ask:
> > > > >>>>
> > > > >>>> * docker-slave-02 and 03 have gp2 storage as a root volume that has
> > > > >>>>   only 8GiB capacity, and they sometimes run short and stop the CI.
> > > > >>>>   May I increase them to 20 or 30 GiB when I replace those instances?
> > > > >>>>   (I'm not sure what our budget is)
> > > > >>>>
> > > > >>>> * They use an instance store with 30GiB to put docker images into,
> > > > >>>>   and they also sometimes run short.
> > > > >>>>   It seems there are two unused volumes with 1TiB (vol-ae71114e and
> > > > >>>>   vol-4efa69ae) on the AWS console.
> > > > >>>>   May I attach them to 02 and 03 instead of instance stores, or are
> > > > >>>>   they backups or something?
> > > > >>>>
> > > > >>>> Kengo Seki <sek...@apache.org>
> > > > >>>>
> > > > >>>> On Mon, Nov 2, 2020 at 6:41 PM Konstantin Boudnik <c...@apache.org> wrote:
> > > > >>>>>
> > > > >>>>> I'd say let's replace the broken one. I don't think there's a sentimental
> > > > >>>>> value attached ;)
> > > > >>>>>
> > > > >>>>> --
> > > > >>>>> With regards,
> > > > >>>>> Cos
> > > > >>>>>
> > > > >>>>> On 02.11.2020 08:16, Kengo Seki wrote:
> > > > >>>>>> Thanks for updating Olaf!
> > > > >>>>>> I've just noticed the Jenkins UI became cool :)
> > > > >>>>>> Regarding docker-slave-02, I'll try to replace it after waiting for a
> > > > >>>>>> while to make sure there's no objection.
> > > > >>>>>>
> > > > >>>>>> Kengo Seki <sek...@apache.org>
> > > > >>>>>>
> > > > >>>>>> On Mon, Nov 2, 2020 at 1:39 PM Jun HE <ju...@apache.org> wrote:
> > > > >>>>>>>
> > > > >>>>>>> Thanks a lot for the update, Olaf!
> > > > >>>>>>>
> > > > >>>>>>> On Sat, Oct 31, 2020 at 3:24 AM Olaf Flebbe <o...@oflebbe.de> wrote:
> > > > >>>>>>>
> > > > >>>>>>>> Hi,
> > > > >>>>>>>>
> > > > >>>>>>>> All machines are patched. Jenkins and its plugins are updated.
> > > > >>>>>>>>
> > > > >>>>>>>> Things to be noted:
> > > > >>>>>>>>
> > > > >>>>>>>> * Slave 2 seems to be in serious trouble. The disk image seems to be
> > > > >>>>>>>>   corrupt, I would say.
> > > > >>>>>>>>   One of the problems: docker does not start any more.
> > > > >>>>>>>>   Is there anything important on it? If yes, please contact me. I would
> > > > >>>>>>>>   recommend setting up slave2 from scratch again.
> > > > >>>>>>>>
> > > > >>>>>>>> * There was a warning regarding the Copy Artifacts Plugin. It now imposes
> > > > >>>>>>>>   stricter rules. Not sure if there is a job depending on it.
> > > > >>>>>>>>
> > > > >>>>>>>> * I removed the CVS plugin.
> > > > >>>>>>>>
> > > > >>>>>>>> Everything else seems to be working as usual.
> > > > >>>>>>>>
> > > > >>>>>>>> Best,
> > > > >>>>>>>> Olaf
> > > > >>>>>>>>
> > > > >>>>>>>>> On Oct 30, 2020, at 19:09, Olaf Flebbe <o...@oflebbe.de> wrote:
> > > > >>>>>>>>>
> > > > >>>>>>>>> Hi,
> > > > >>>>>>>>>
> > > > >>>>>>>>> I am doing an update of the machines in CI. Seems a couple of security
> > > > >>>>>>>>> fixes are to be applied.
> > > > >>>>>>>>>
> > > > >>>>>>>>> Olaf
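
For reference, the node-side part of the slave setup steps listed at the top of this mail (create the jenkins user, then install the master's public key) could look roughly like the sketch below. It only builds the shell commands as strings so they can be reviewed before anyone runs them. The function name, the /home/jenkins layout and the placeholder key are assumptions made for illustration, not something taken from our current nodes.

```python
import shlex


def slave_setup_commands(master_pubkey, user="jenkins"):
    """Build (but do not run) the shell commands that prepare a fresh node.

    Assumptions: a Linux node where useradd creates /home/<user>, and a
    master public key supplied by the caller. Nothing is executed here.
    """
    ssh_dir = f"/home/{user}/.ssh"
    auth_keys = f"{ssh_dir}/authorized_keys"
    return [
        # create the jenkins user with a home directory
        f"sudo useradd -m {user}",
        # let the master log in with its current key
        f"sudo mkdir -p {ssh_dir}",
        f"echo {shlex.quote(master_pubkey)} | sudo tee -a {auth_keys}",
        # sshd rejects keys when these permissions are too open
        f"sudo chmod 700 {ssh_dir}",
        f"sudo chmod 600 {auth_keys}",
        f"sudo chown -R {user}:{user} {ssh_dir}",
    ]


if __name__ == "__main__":
    # placeholder key for illustration only
    for cmd in slave_setup_commands("ssh-rsa AAAA...placeholder jenkins@master"):
        print(cmd)
```

Launching the EC2 instance and changing the node's IP address on the master are left out on purpose, since those depend on our AWS account and the Jenkins node configuration.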