Re: [openstack-dev] [tripleo] Pacemaker + containers CI
On 29.8.2017 17:12, Emilien Macchi wrote:
> On Tue, Aug 29, 2017 at 2:14 AM, Jiří Stránský wrote:
> [...]
>> the CI for containerized deployments with Pacemaker is close! In
>> fact, it works [1][2] (but there are pending changes to merge).
>
> Really good news, thanks for the update!
>
>> The way it's proposed in gerrit currently is to switch the
>> centos-7-containers-multinode job (featureset010) to deploy with
>> Pacemaker. What do you think about making this switch as a first
>> step?
> [...]
>
> I'm ok with the idea

No -1s yet, so I removed the WIP status of [4].

> as long as gate-tripleo-ci-centos-7-containers-multinode-upgrades-nv
> keeps working fine.

That's a different featureset, so we can control it independently from
the basic deployment job. It might be good to switch this one to
Pacemaker too, if we can solve the current timeout issues and perhaps
have some spare wall time.

Non-Pacemaker containers are still CI'd by the OVB job, so the upgrade
job (currently still non-Pacemaker) shouldn't be more vulnerable even
if we switch the multinode job to Pacemaker.

> Deploying Pacemaker on a single node environment is not optimal, but
> it already covers a bunch of code, which is good.
>
>> Later it would be nice to get a proper clustering test with 3
>> controllers. Should we try and switch the centos-7-ovb-ha-oooq job
>> to deploy containers on master and stable/pike? (Probably by adding
>> a new job that only runs on master + Pike, and making the old
>> ovb-ha-oooq only run up to Ocata, to keep the OVB capacity demands
>> unchanged?) I'd be +1 on that since containers are the intended way
>> of deploying Pike and beyond. WDYT?
>
> It's actually a good start to our discussion at the PTG:
> https://etherpad.openstack.org/p/tripleo-ptg-queens-ci-related-topics
> (we have a session on Wednesday morning about CI topics, please make
> sure you can join!)
>
> I think in Queens we'll run container-only jobs, even for OVB. That
> said, I think OVB coverage in Queens will be very useful to try HA
> with 3 controllers (containerized), and the baremetal services
> coverage will only run on Pike, Ocata and Newton.
>
> That way, we would have:
>
> Queens:
> - multinode jobs covering a basic HA scenario, single node but still
>   useful to test a good part of the code
> - OVB jobs covering a production environment and hopefully spotting
>   issues we wouldn't see with multinode jobs
>
> Pike, Ocata, Newton: no change on the OVB job
>
> (note it's a proposal, not a statement)

Yea, focusing the CI changes towards containerized mainly on Queens+
could be fine too. The frequency of patches going into stable/pike will
be dropping as it gains stability, so time spent on CI enhancements
might indeed be better focused on Queens+. We can always adjust if that
doesn't prove to be the case.

[...]

>> [3] https://review.openstack.org/498474
>
> approved

[...]

> Thanks,
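For illustration, the "different featureset" point above works because
each CI job pins its own featureset file in tripleo-quickstart, so the
Pacemaker toggle can be flipped per job. A rough sketch, assuming the
usual config/general_config layout; the upgrades featureset number is a
placeholder and only the two variable names are taken from this thread:

  # config/general_config/featureset010.yml -- containers multinode job
  # (hypothetical excerpt; only these two variables come from this thread)
  containerized_overcloud: true
  enable_pacemaker: true      # the proposed switch for this job

  # config/general_config/featureset0NN.yml -- containers multinode
  # upgrades job (placeholder file name; left as-is until the timeout
  # situation allows the switch)
  containerized_overcloud: true
  enable_pacemaker: false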
Re: [openstack-dev] [tripleo] Pacemaker + containers CI
On Tue, Aug 29, 2017 at 2:14 AM, Jiří Stránský wrote:
[...]
> the CI for containerized deployments with Pacemaker is close! In
> fact, it works [1][2] (but there are pending changes to merge).

Really good news, thanks for the update!

> The way it's proposed in gerrit currently is to switch the
> centos-7-containers-multinode job (featureset010) to deploy with
> Pacemaker. What do you think about making this switch as a first
> step?
[...]

I'm ok with the idea as long as
gate-tripleo-ci-centos-7-containers-multinode-upgrades-nv keeps working
fine.

Deploying Pacemaker on a single node environment is not optimal, but it
already covers a bunch of code, which is good.

> Later it would be nice to get a proper clustering test with 3
> controllers. Should we try and switch the centos-7-ovb-ha-oooq job to
> deploy containers on master and stable/pike? (Probably by adding a
> new job that only runs on master + Pike, and making the old
> ovb-ha-oooq only run up to Ocata, to keep the OVB capacity demands
> unchanged?) I'd be +1 on that since containers are the intended way
> of deploying Pike and beyond. WDYT?

It's actually a good start to our discussion at the PTG:
https://etherpad.openstack.org/p/tripleo-ptg-queens-ci-related-topics
(we have a session on Wednesday morning about CI topics, please make
sure you can join!)

I think in Queens we'll run container-only jobs, even for OVB. That
said, I think OVB coverage in Queens will be very useful to try HA with
3 controllers (containerized), and the baremetal services coverage will
only run on Pike, Ocata and Newton.

That way, we would have:

Queens:
- multinode jobs covering a basic HA scenario, single node but still
  useful to test a good part of the code
- OVB jobs covering a production environment and hopefully spotting
  issues we wouldn't see with multinode jobs

Pike, Ocata, Newton: no change on the OVB job

(note it's a proposal, not a statement)

[...]

> [3] https://review.openstack.org/498474

approved

[...]

Thanks,
--
Emilien Macchi
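For illustration of the proposed job split (a new containerized OVB HA
job on master + Pike, the legacy job only up to Ocata), the change
would roughly be a pair of branch filters in the Zuul v2 layout. The
job names and regexes below are assumptions, not the actual
project-config entries:

  # zuul/layout.yaml -- hand-wavy sketch, not the real entries
  jobs:
    # legacy baremetal-services OVB HA job, restricted to Newton..Ocata (assumed)
    - name: gate-tripleo-ci-centos-7-ovb-ha-oooq
      branch: ^stable/(newton|ocata)$
    # hypothetical containerized counterpart, only on master (Queens) and Pike
    - name: gate-tripleo-ci-centos-7-ovb-ha-oooq-containers
      branch: ^(master|stable/pike)$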
Re: [openstack-dev] [tripleo] Pacemaker + containers CI
On 29.8.2017 14:42, Giulio Fidente wrote:
> On 08/29/2017 02:33 PM, Jiří Stránský wrote:
>> A bit of context: Currently our only upgrade check job is a non-OVB
>> one (containers-multinode-upgrades-nv). As of late we started hitting
>> timeouts, and the job only does a mixed-version deploy + a 1-node AIO
>> overcloud upgrade (just the main step). It doesn't do an undercloud
>> upgrade, nor a compute upgrade, nor converge, and it still times
>> out... It's a bit difficult to find things to cut off here. :D
>>
>> We could look into speeding things up (e.g. try to reintroduce
>> selective container image upload etc.), but I think we might also be
>> approaching the "natural" deploy+upgrade limits. We might need to
>> bump up the timeouts if we want to test more things. Though it's not
>> only about HW capacity; it could also get unwieldy for devs if we
>> keep increasing the feedback time from CI, so we're kinda in a tough
>> spot with upgrade CI...
>
> agreed, which goes back to "nobody looks at the periodic jobs"
>
> but a periodic job seems the answer?

Yea, that might be the best solution :)

J.
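If the periodic route is taken, the change would essentially be adding
the heavier upgrade coverage under the periodic pipeline in
project-config, where a long runtime doesn't block anyone's review
feedback. A hand-wavy Zuul v2 sketch; the project and job names below
are assumptions:

  # zuul/layout.yaml -- illustrative only, names are assumptions
  projects:
    - name: openstack/tripleo-heat-templates
      check:
        # keep the lighter mixed-version deploy + AIO upgrade smoke job here
        - gate-tripleo-ci-centos-7-containers-multinode-upgrades-nv
      periodic:
        # hypothetical fuller job (undercloud upgrade, compute upgrade,
        # converge) with a generous timeout, run once a day instead of
        # on every patch
        - periodic-tripleo-ci-centos-7-containers-multinode-upgrades-full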
Re: [openstack-dev] [tripleo] Pacemaker + containers CI
On 08/29/2017 02:33 PM, Jiří Stránský wrote:
> A bit of context: Currently our only upgrade check job is a non-OVB
> one (containers-multinode-upgrades-nv). As of late we started hitting
> timeouts, and the job only does a mixed-version deploy + a 1-node AIO
> overcloud upgrade (just the main step). It doesn't do an undercloud
> upgrade, nor a compute upgrade, nor converge, and it still times
> out... It's a bit difficult to find things to cut off here. :D
>
> We could look into speeding things up (e.g. try to reintroduce
> selective container image upload etc.), but I think we might also be
> approaching the "natural" deploy+upgrade limits. We might need to
> bump up the timeouts if we want to test more things. Though it's not
> only about HW capacity; it could also get unwieldy for devs if we
> keep increasing the feedback time from CI, so we're kinda in a tough
> spot with upgrade CI...

agreed, which goes back to "nobody looks at the periodic jobs"

but a periodic job seems the answer?

--
Giulio Fidente
GPG KEY: 08D733BA
Re: [openstack-dev] [tripleo] Pacemaker + containers CI
On 29.8.2017 13:22, Giulio Fidente wrote:
> On 08/29/2017 11:14 AM, Jiří Stránský wrote:
>> Hi owls,
>>
>> the CI for containerized deployments with Pacemaker is close! In
>> fact, it works [1][2] (but there are pending changes to merge).
>
> cool :D
>
> I also spotted this, which we need for ceph:
> https://review.openstack.org/#/c/498356/
>
> but I am not sure if we want to enable ceph in this job as we have it
> already in a couple of scenarios, more below ...

+1 on keeping it in scenarios if that covers our needs.

>> The way it's proposed in gerrit currently is to switch the
>> centos-7-containers-multinode job (featureset010) to deploy with
>> Pacemaker. What do you think about making this switch as a first
>> step? (The OVB job is an option too, but that one is considerably
>> closer to timeouts already, so it may be better left as is.)
>
> +1 on switching the existing job
>
>> Later it would be nice to get a proper clustering test with 3
>> controllers. Should we try and switch the centos-7-ovb-ha-oooq job
>> to deploy containers on master and stable/pike? (Probably by adding
>> a new job that only runs on master + Pike, and making the old
>> ovb-ha-oooq only run up to Ocata, to keep the OVB capacity demands
>> unchanged?) I'd be +1 on that since containers are the intended way
>> of deploying Pike and beyond. WDYT?
>
> switching OVB to containers from Pike seems fine, because that's the
> intended way as you pointed out. Yet I would like to enable ceph in
> the upgrade job, and it requires multiple MON instances (multiple
> controllers).
>
> would it make any sense to deploy the pacemaker / ceph combination
> using multiple controllers in the upgrade job, and drop the standard
> ovb job (which doesn't do upgrade) or use it for other purposes?

It makes sense feature-wise to test upgrade with Ceph; I'd say it's a
pretty common and important use case. However, I'm not sure how we can
achieve it time-wise in CI. Is it possible to estimate how much time
the Ceph upgrade might add?

A bit of context: Currently our only upgrade check job is a non-OVB one
(containers-multinode-upgrades-nv). As of late we started hitting
timeouts, and the job only does a mixed-version deploy + a 1-node AIO
overcloud upgrade (just the main step). It doesn't do an undercloud
upgrade, nor a compute upgrade, nor converge, and it still times out...
It's a bit difficult to find things to cut off here. :D

We could look into speeding things up (e.g. try to reintroduce
selective container image upload etc.), but I think we might also be
approaching the "natural" deploy+upgrade limits. We might need to bump
up the timeouts if we want to test more things. Though it's not only
about HW capacity; it could also get unwieldy for devs if we keep
increasing the feedback time from CI, so we're kinda in a tough spot
with upgrade CI...

Jirka
Re: [openstack-dev] [tripleo] Pacemaker + containers CI
On 08/29/2017 11:14 AM, Jiří Stránský wrote:
> Hi owls,
>
> the CI for containerized deployments with Pacemaker is close! In
> fact, it works [1][2] (but there are pending changes to merge).

cool :D

I also spotted this, which we need for ceph:
https://review.openstack.org/#/c/498356/

but I am not sure if we want to enable ceph in this job as we have it
already in a couple of scenarios, more below ...

> The way it's proposed in gerrit currently is to switch the
> centos-7-containers-multinode job (featureset010) to deploy with
> Pacemaker. What do you think about making this switch as a first
> step? (The OVB job is an option too, but that one is considerably
> closer to timeouts already, so it may be better left as is.)

+1 on switching the existing job

> Later it would be nice to get a proper clustering test with 3
> controllers. Should we try and switch the centos-7-ovb-ha-oooq job to
> deploy containers on master and stable/pike? (Probably by adding a
> new job that only runs on master + Pike, and making the old
> ovb-ha-oooq only run up to Ocata, to keep the OVB capacity demands
> unchanged?) I'd be +1 on that since containers are the intended way
> of deploying Pike and beyond. WDYT?

switching OVB to containers from Pike seems fine, because that's the
intended way as you pointed out. Yet I would like to enable ceph in the
upgrade job, and it requires multiple MON instances (multiple
controllers).

would it make any sense to deploy the pacemaker / ceph combination
using multiple controllers in the upgrade job, and drop the standard
ovb job (which doesn't do upgrade) or use it for other purposes?

--
Giulio Fidente
GPG KEY: 08D733BA
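For context on the "multiple MON instances" requirement: with the
default roles, the CephMon service is part of the Controller role, so
getting three MONs is mostly a matter of scaling controllers to three.
A minimal sketch of the relevant overcloud parameters, assuming the
default roles_data; this is not an actual CI featureset:

  # Hypothetical environment excerpt: three controllers give three Ceph
  # MONs, since OS::TripleO::Services::CephMon sits on the Controller
  # role by default (assumption based on the default roles_data).
  parameter_defaults:
    ControllerCount: 3
    ComputeCount: 1
    CephStorageCount: 1   # OSDs on a dedicated ceph-storage node (assumed topology)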
[openstack-dev] [tripleo] Pacemaker + containers CI
Hi owls,

the CI for containerized deployments with Pacemaker is close! In fact,
it works [1][2] (but there are pending changes to merge).

The way it's proposed in gerrit currently is to switch the
centos-7-containers-multinode job (featureset010) to deploy with
Pacemaker. What do you think about making this switch as a first step?
(The OVB job is an option too, but that one is considerably closer to
timeouts already, so it may be better left as is.)

Later it would be nice to get a proper clustering test with 3
controllers. Should we try and switch the centos-7-ovb-ha-oooq job to
deploy containers on master and stable/pike? (Probably by adding a new
job that only runs on master + Pike, and making the old ovb-ha-oooq
only run up to Ocata, to keep the OVB capacity demands unchanged?) I'd
be +1 on that since containers are the intended way of deploying Pike
and beyond. WDYT?

Have a good day,

Jirka

P.S. You can deploy containerized with Pacemaker using OOOQ by setting
both `containerized_overcloud` and `enable_pacemaker` to true. Thanks
to Wes for collaboration on this.

P.P.S. The remaining patches are [3] and maybe [4], if we're ok with
switching centos-7-containers-multinode.

[1] http://logs.openstack.org/24/471724/5/check/gate-tripleo-ci-centos-7-containers-multinode/6330e5e/logs/subnode-2/var/log/pacemaker/bundles/
[2] http://logs.openstack.org/24/471724/5/check/gate-tripleo-ci-centos-7-containers-multinode/6330e5e/logs/subnode-2/var/log/extra/docker/containers/
[3] https://review.openstack.org/498474
[4] https://review.openstack.org/471724
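To make the P.S. above concrete, the two variables could be dropped
into a small config file passed to quickstart. Only the variable names
come from this thread; the file name and everything else about the run
are assumptions:

  # my-pacemaker-containers.yml -- hypothetical extra config for an
  # OOOQ run (e.g. fed to quickstart.sh via its --config option); only
  # these two variables are confirmed above, the rest is assumed.
  containerized_overcloud: true   # deploy the overcloud services in containers
  enable_pacemaker: true          # manage HA services via Pacemaker bundles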