[openstack-dev] [tripleo] Proposing Enrique Llorente Pastora as a core reviewer for TripleO
Hi,

I'd like to propose Quique (@quiquell) as a core reviewer for TripleO. Quique is actively involved in improvements and development of TripleO and TripleO CI. He also helps in other projects, including but not limited to Infrastructure. He shows a very good understanding of how TripleO and CI work, and I'd like to suggest him as a core reviewer of TripleO for CI-related code.

Please vote! My +1 is here :)

Thanks
--
Best regards
Sagi Shnaidman
Re: [openstack-dev] [tripleo] shutting down 3rd party TripleO CI for measurements
We have measured results and would like to shut down the check jobs in RDO cloud CI today. Please let us know if you have objections.

Thanks

On Thu, Nov 1, 2018 at 12:14 AM Wesley Hayutin wrote:
> Greetings,
>
> The TripleO CI team would like to consider shutting down all the third party check jobs running against TripleO projects, in order to measure results with and without load on the cloud for some amount of time. I suspect we would want to shut things down for roughly 24-48 hours.
>
> If there are any strong objections please let us know.
> Thank you
> --
> Wes Hayutin
> Associate Manager
> Red Hat <https://www.redhat.com/>
> whayu...@redhat.com  T: +1 919 754 4114  IRC: weshay <https://red.ht/sig>
>
> View my calendar and check my availability for meetings HERE
> <https://calendar.google.com/calendar/b/1/embed?src=whayu...@redhat.com&ctz=America/New_York>

--
Best regards
Sagi Shnaidman
Re: [openstack-dev] [tripleo][ci][metrics] Stucked in the middle of work because of RDO CI
Hi Martin,

I see the master OVB jobs are passing now [1], please recheck.

[1] http://cistatus.tripleo.org/

On Tue, Jul 31, 2018 at 12:24 PM, Martin Magr wrote:
> Greetings guys,
>
> it is pretty obvious that the RDO CI jobs in TripleO projects are broken [0]. Once the Zuul CI jobs pass, would it be possible to have the AMQP/collectd patches ([1],[2],[3]) merged, despite the negative results of the RDO CI jobs? Half of the patches for this feature are merged and the other half is stuck in this situation, where nobody reviews them because of the red -1. Those patches have passed the Zuul jobs several times already and were manually tested too.
>
> Thanks in advance for considering this situation,
> Martin
>
> [0] https://trello.com/c/hkvfxAdX/667-cixtripleoci-rdo-software-factory-3rd-party-jobs-failing-due-to-instance-nodefailure
> [1] https://review.openstack.org/#/c/578749
> [2] https://review.openstack.org/#/c/576057/
> [3] https://review.openstack.org/#/c/572312/
>
> --
> Martin Mágr
> Senior Software Engineer
> Red Hat Czech

--
Best regards
Sagi Shnaidman
Re: [openstack-dev] [tripleo][ci][infra] Quickstart Branching
;branching" repos but without maintenance nightmare. Thanks Thanks, > -Alex > > [0] http://git.openstack.org/cgit/openstack/ansible-role- > container-registry/ > [1] http://git.openstack.org/cgit/openstack/ansible-role-redhat- > subscription/ > [2] http://git.openstack.org/cgit/openstack/ansible-role-tripleo-keystone/ > [3] http://git.openstack.org/cgit/openstack/puppet-openstacklib/ > [4] https://review.openstack.org/#/c/565856/ > [5] https://review.openstack.org/#/c/569830 > > > Thanks > > > > > > > > On Wed, May 23, 2018 at 7:04 PM, Alex Schultz > wrote: > >> > >> On Wed, May 23, 2018 at 8:30 AM, Sagi Shnaidman > >> wrote: > >> > Hi, Sergii > >> > > >> > thanks for the question. It's not first time that this topic is raised > >> > and > >> > from first view it could seem that branching would help to that sort > of > >> > issues. > >> > > >> > Although it's not the case. Tripleo-quickstart(-extras) is part of CI > >> > code, > >> > as well as tripleo-ci repo which have never been branched. The reason > >> > for > >> > that is relative small impact on CI code from product branching. Think > >> > about > >> > backport almost *every* patch to oooq and extras to all supported > >> > branches, > >> > down to newton at least. This will be a really *huge* price and non > >> > reasonable work. Just think about active maintenance of 3-4 versions > of > >> > CI > >> > code in each of 3 repositories. It will take all time of CI team with > >> > almost > >> > zero value of this work. > >> > > >> > >> So I'm not sure I completely agree with this assessment as there is a > >> price paid for every {%if release in [...]%} that we have to carry in > >> oooq{,-extras}. These go away if we branch because we don't have to > >> worry about breaking previous releases or current release (which may > >> or may not actually have CI results). > >> > >> > What regards patch you listed, we would have backport this change to > >> > *every* > >> > branch, and it wouldn't really help to avoid the issue. The source of > >> > problem is not branchless repo here. > >> > > >> > >> No we shouldn't be backporting every change. The logic in oooq-extras > >> should be version specific and if we're changing an interface in > >> tripleo in a breaking fashion we're doing it wrong in tripleo. If > >> we're backporting things to work around tripleo issues, we're doing it > >> wrong in quickstart. > >> > >> > Regarding catching such issues and Bogdans point, that's right we > added > >> > a > >> > few jobs to catch such issues in the future and prevent breakages, > and a > >> > few > >> > running jobs is reasonable price to keep configuration working in all > >> > branches. Comparing to maintenance nightmare with branches of CI code, > >> > it's > >> > really a *zero* price. > >> > > >> > >> Nothing is free. If there's a high maintenance cost, we haven't > >> properly identified the optimal way to separate functionality between > >> tripleo/quickstart. I have repeatedly said that the provisioning > >> parts of quickstart should be separate because those aren't tied to a > >> tripleo version and this along with the scenario configs should be the > >> only unbranched repo we have. Any roles related to how to > >> configure/work with tripleo should be branched and tied to a stable > >> branch of tripleo. This would actually be beneficial for tripleo as > >> well because then we can see when we are introducing backwards > >> incompatible changes. 
> >> > >> Thanks, > >> -Alex > >> > >> > Thanks > >> > > >> > > >> > On Wed, May 23, 2018 at 3:43 PM, Sergii Golovatiuk < > sgolo...@redhat.com> > >> > wrote: > >> >> > >> >> Hi, > >> >> > >> >> Looking at [1], I am thinking about the price we paid for not > >> >> branching tripleo-quickstart. Can we discuss the options to prevent > >> >> the issues such as [1]? Thank you in advance. > >> >> > >> >> [1] https://review.openstack.org/#/c/569830/4 > >&
Re: [openstack-dev] [tripleo][ci][infra] Quickstart Branching
Alex, the problem is that you're working on and focusing mostly on release-specific code like featuresets and some scripts. But tripleo-quickstart(-extras) and tripleo-ci are much, *much* more than a set of featuresets. Only 10% of the code may be related to releases and branches, while the other 90% is completely independent of releases. So for that 90% of the code we WOULD need to backport every change. Take, for example, the latest patch to extras: https://review.openstack.org/#/c/570167/, which fixes the reproducer. If oooq-extras were branched, we would need to backport this fix to each and every branch, and the same goes for the rest of that 90% of the code, which is complete nonsense. Just to avoid the "{% if release %}" construct, should we block all the CI team's work and make the CI code absolutely unmaintainable?

Some release-related templates we recently moved from tripleo-ci to the THT repo, like scenarios, overcloud templates, etc. If we discover other things in oooq that could be moved to branched THT, I'd only be happy about that. Sometimes it can be hard to maintain one file in the extras templates with different logic per release, like we have in the tempest configuration, for example. The solution is to create a few release-related templates and use the one that matches the current branch. That doesn't affect the other 90% of the code and is still a "branch-like" approach. But I haven't seen other scripts that are so release dependent; if we find some, we can do the same. For now I see the "{% if release %}" construct working very well.

I still haven't seen any advantage to branching the CI code, apart from slightly nicer Jinja templates without "{% if release %}", but the number of disadvantages is so huge that it would literally block all current work in CI.

Thanks

On Wed, May 23, 2018 at 7:04 PM, Alex Schultz wrote:
> On Wed, May 23, 2018 at 8:30 AM, Sagi Shnaidman wrote:
> > Hi, Sergii
> >
> > thanks for the question. It's not the first time this topic has been raised, and at first glance it could seem that branching would help with this sort of issue.
> >
> > However, it's not the case. Tripleo-quickstart(-extras) is part of the CI code, as is the tripleo-ci repo, which has never been branched. The reason for that is the relatively small impact of product branching on the CI code. Think about backporting almost *every* patch to oooq and extras to all supported branches, down to newton at least. That would be a really *huge* price and unreasonable work. Just think about actively maintaining 3-4 versions of the CI code in each of 3 repositories. It would take all of the CI team's time, with almost zero value from this work.
>
> So I'm not sure I completely agree with this assessment, as there is a price paid for every {% if release in [...] %} that we have to carry in oooq{,-extras}. These go away if we branch, because we don't have to worry about breaking previous releases or the current release (which may or may not actually have CI results).
>
> > As regards the patch you listed, we would have had to backport this change to *every* branch, and it wouldn't really have helped avoid the issue. The source of the problem here is not the branchless repo.
>
> No, we shouldn't be backporting every change. The logic in oooq-extras should be version specific, and if we're changing an interface in tripleo in a breaking fashion, we're doing it wrong in tripleo. If we're backporting things to work around tripleo issues, we're doing it wrong in quickstart.
>
> > Regarding catching such issues, and Bogdan's point: that's right, we added a few jobs to catch such issues in the future and prevent breakages, and a few running jobs is a reasonable price to keep the configuration working in all branches. Compared to the maintenance nightmare of branching the CI code, it's really a *zero* price.
>
> Nothing is free. If there's a high maintenance cost, we haven't properly identified the optimal way to separate functionality between tripleo/quickstart. I have repeatedly said that the provisioning parts of quickstart should be separate, because those aren't tied to a tripleo version, and this along with the scenario configs should be the only unbranched repo we have. Any roles related to how to configure/work with tripleo should be branched and tied to a stable branch of tripleo. This would actually be beneficial for tripleo as well, because then we could see when we are introducing backwards incompatible changes.
>
> Thank
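For context, a minimal sketch of the "{% if release %}" pattern being debated, in the style of a tripleo-quickstart config template. The keys and values here are illustrative rather than taken from the repo; only the construct itself is the point:

```yaml
# One branchless file that adapts to the release under test,
# instead of one copy maintained per stable branch.
# (Key names are illustrative.)
run_tempest: >-
  {% if release in ['newton', 'ocata'] %}false{% else %}true{% endif %}
containerized_overcloud: >-
  {% if release == 'master' %}true{% else %}false{% endif %}
```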
Re: [openstack-dev] [tripleo][ci][infra] Quickstart Branching
Hi, Sergii

thanks for the question. It's not the first time this topic has been raised, and at first glance it could seem that branching would help with this sort of issue.

However, it's not the case. Tripleo-quickstart(-extras) is part of the CI code, as is the tripleo-ci repo, which has never been branched. The reason for that is the relatively small impact of product branching on the CI code. Think about backporting almost *every* patch to oooq and extras to all supported branches, down to newton at least. That would be a really *huge* price and unreasonable work. Just think about actively maintaining 3-4 versions of the CI code in each of 3 repositories. It would take all of the CI team's time, with almost zero value from this work.

As regards the patch you listed, we would have had to backport this change to *every* branch, and it wouldn't really have helped avoid the issue. The source of the problem here is not the branchless repo.

Regarding catching such issues, and Bogdan's point: that's right, we added a few jobs to catch such issues in the future and prevent breakages, and a few running jobs is a reasonable price to keep the configuration working in all branches. Compared to the maintenance nightmare of branching the CI code, it's really a *zero* price.

Thanks

On Wed, May 23, 2018 at 3:43 PM, Sergii Golovatiuk wrote:
> Hi,
>
> Looking at [1], I am thinking about the price we paid for not branching tripleo-quickstart. Can we discuss the options to prevent issues such as [1]? Thank you in advance.
>
> [1] https://review.openstack.org/#/c/569830/4
>
> --
> Best Regards,
> Sergii Golovatiuk

--
Best regards
Sagi Shnaidman
Re: [openstack-dev] [ci][infra][tripleo] Multi-staged check pipelines for Zuul v3 proposal
…than forcing an iterative workflow where they have to fix all the
>>> whitespace issues before the CI system will tell them which actual tests broke.
>>>
>>> -Jim
>>
>> I proposed a few zuul dependencies [0], [1] to tripleo CI pipelines for undercloud deployments vs upgrades testing (and some more). Given that those undercloud jobs have not-so-high fail rates though, I think Emilien is right in his comments and those would buy us nothing.
>>
>> On the other side, what do you folks think of making tripleo-ci-centos-7-3nodes-multinode depend on tripleo-ci-centos-7-containers-multinode [2]? The former seems quite flaky and long running, and is non-voting. It deploys (see the featureset configs [3]*) 3 nodes in HA fashion. And it seems to almost never pass when containers-multinode fails; see the CI stats page [4]. I've found only 2 cases there of the opposite situation, where containers-multinode fails but 3nodes-multinode passes. So cutting off those future failures via the added dependency *would* buy us something and allow other jobs to wait less to commence, at the reasonable price of a somewhat extended time for the main zuul pipeline. I think it makes sense, and that extended CI time will not push the RDO CI execution times so far as to become a problem. WDYT?
>>
>> [0] https://review.openstack.org/#/c/568275/
>> [1] https://review.openstack.org/#/c/568278/
>> [2] https://review.openstack.org/#/c/568326/
>> [3] https://docs.openstack.org/tripleo-quickstart/latest/feature-configuration.html
>> [4] http://tripleo.org/cistatus.html
>>
>> * ignore column 1, it's obsolete; all CI jobs are now using downloaded configs AFAICT...
>
> --
> Best regards,
> Bogdan Dobrelya,
> Irc #bogdando

--
Best regards
Sagi Shnaidman
Re: [openstack-dev] [ci][infra][tripleo] Multi-staged check pipelines for Zuul v3 proposal
Hi, Bogdan

I like the idea with the undercloud job. Actually, if the undercloud fails, I'd stop all other jobs, because it doesn't make sense to run them; seeing the same failure in 10 jobs doesn't add much. So adding the undercloud job as a dependency for all multinode jobs would be a great idea. It's also worth checking how much it would delay jobs: will all jobs wait while the undercloud job is running? And will they be aborted when the undercloud job fails?

However, I'm very sceptical about the multinode containers and scenario jobs; they can fail for very different reasons, like race conditions in the product or infra issues. Skipping some of them will lead to more rechecks from devs trying to discover all the problems one by one, which will delay the development process significantly.

Thanks

On Mon, May 14, 2018 at 7:15 PM, Bogdan Dobrelya wrote:
> An update for your review please, folks
>
> Bogdan Dobrelya writes:
>>> Hello.
>>> As the Zuul documentation [0] explains, the names "check", "gate", and "post" may be altered for more advanced pipelines. Is it doable to introduce, for particular openstack projects, multiple check stages/steps as check-1, check-2 and so on? And is it possible to make the consequent steps reuse the environments the previous steps finished with?
>>>
>>> Narrowing down to the tripleo CI scope, the problem I'd want us to solve with this "virtual RFE", using such multi-staged check pipelines, is reducing (ideally, de-duplicating) some of the common steps of existing CI jobs.
>>
>> What you're describing sounds more like a job graph within a pipeline. See: https://docs.openstack.org/infra/zuul/user/config.html#attr-job.dependencies for how to configure a job to run only after another job has completed. There is also a facility to pass data between such jobs.
>>
>> ... (skipped) ...
>>
>> Creating a job graph to have one job use the results of the previous job can make sense in a lot of cases. It doesn't always save *time*, however.
>>
>> It's worth noting that in OpenStack's Zuul we have made an explicit choice not to have long-running integration jobs depend on shorter pep8 or tox jobs, and that's because we value developer time more than CPU time. We would rather run all of the tests and return all of the results so a developer can fix all of the errors as quickly as possible, rather than forcing an iterative workflow where they have to fix all the whitespace issues before the CI system will tell them which actual tests broke.
>>
>> -Jim
>
> I proposed a few zuul dependencies [0], [1] to tripleo CI pipelines for undercloud deployments vs upgrades testing (and some more). Given that those undercloud jobs have not-so-high fail rates though, I think Emilien is right in his comments and those would buy us nothing.
>
> On the other side, what do you folks think of making tripleo-ci-centos-7-3nodes-multinode depend on tripleo-ci-centos-7-containers-multinode [2]? The former seems quite flaky and long running, and is non-voting. It deploys (see the featureset configs [3]*) 3 nodes in HA fashion. And it seems to almost never pass when containers-multinode fails; see the CI stats page [4]. I've found only 2 cases there of the opposite situation, where containers-multinode fails but 3nodes-multinode passes. So cutting off those future failures via the added dependency *would* buy us something and allow other jobs to wait less to commence, at the reasonable price of a somewhat extended time for the main zuul pipeline. I think it makes sense, and that extended CI time will not push the RDO CI execution times so far as to become a problem. WDYT?
>
> [0] https://review.openstack.org/#/c/568275/
> [1] https://review.openstack.org/#/c/568278/
> [2] https://review.openstack.org/#/c/568326/
> [3] https://docs.openstack.org/tripleo-quickstart/latest/feature-configuration.html
> [4] http://tripleo.org/cistatus.html
>
> * ignore column 1, it's obsolete; all CI jobs are now using downloaded configs AFAICT...
>
> --
> Best regards,
> Bogdan Dobrelya,
> Irc #bogdando

--
Best regards
Sagi Shnaidman
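A sketch of the dependency Bogdan proposes in [2], using the Zuul v3 project-pipeline syntax from the config reference quoted above; the job names are the real ones from the thread, while the exact file and project stanza are illustrative:

```yaml
# Run the flaky, long-running 3nodes job only after the
# containers-multinode job has succeeded on the same change.
- project:
    check:
      jobs:
        - tripleo-ci-centos-7-containers-multinode
        - tripleo-ci-centos-7-3nodes-multinode:
            dependencies:
              - tripleo-ci-centos-7-containers-multinode
```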
[openstack-dev] [TripleO][CI] Which network templates to use in CI (with and without net isolation)?
Hi all,

we now have network templates in the tripleo-ci repo [1], and we'd like to move them to the THT repo [2] and use them from there. We also have default templates defined in the overcloud-deploy role [3].

So the question is: which templates should we use, and how should we configure them? One option is to set the network args (incl. isolation) in the overcloud-deploy role [3], depending on other features (like docker, ipv6, etc). The other is to set them in the featureset [4] files for each job.

The question is also which network templates we want to gate in CI, and whether they should be the same as the defaults in tripleo-quickstart-extras. We have a few patches from James (@slagle) addressing this topic [5], and one from Arx for this issue [6].

Please feel free to share your thoughts on what should be tested in CI from the network templates, and where.

Thanks

[1] https://github.com/openstack-infra/tripleo-ci/tree/821d84f34c851a79495f0205ad3c8dac928c286f/test-environments
[2] https://github.com/openstack/tripleo-heat-templates/tree/master/ci/environments/network
[3] https://github.com/openstack/tripleo-quickstart-extras/blob/master/roles/overcloud-deploy/tasks/pre-deploy.yml#L21-L51
[4] https://github.com/openstack/tripleo-quickstart/blob/cf793bbb8368f89cd28214fe21adca2df48ef7f3/config/general_config/featureset001.yml#L26-L28
[5] https://review.openstack.org/#/c/531224/ https://review.openstack.org/#/c/525331 https://review.openstack.org/#/c/531221
[6] https://review.openstack.org/#/c/512225/

--
Best regards
Sagi Shnaidman
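A sketch of the featureset option: each job's featureset file would carry its own network arguments, in the style of [4]. The key names below are illustrative, not confirmed against the repo:

```yaml
# Per-job network settings owned by the featureset, instead of being
# computed inside the overcloud-deploy role (key names illustrative).
network_isolation: true
network_isolation_type: multi-nic   # selects which THT network template to gate
overcloud_ipv6: false
```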
Re: [openstack-dev] [tripleo] Reminder about bug priority
On Mon, Dec 11, 2017 at 5:42 PM, Emilien Macchi wrote:
> A lot of bugs are set to Critical or High. Just a few reminders on how we like to use the criteria in TripleO:
>
> - "Critical" should be used when a bug makes a basic deployment impossible. For example, a bug that affects all CI gate jobs is critical. Something that any deployment can hit is critical. A bug in the master promotion pipeline can be set as Critical.

I think any bug that completely, or fairly often, fails current jobs should be Critical:
1) all voting jobs in TripleO CI
2) OVB jobs
3) promotion blockers for any of the releases, both stable and master

As regards releases, there are stages in the release workflow when the master-1 release is more important than any other. I think we should be flexible there and allow Critical to be set for bugs in the master-1 release too, at least while it has priority. Another candidate for Critical might be a bug that blocks developers' work (for example, using the TripleO tools), even if CI passes.

> - "High", "Medium" and "Low" should be used for other bug reports, where High is an important bug that you can hit but that won't block a deployment. "High" can also be used for stable branch promotion pipelines (pike, ocata, newton).
>
> Please don't use Critical for all the bugs, otherwise we end up with half of our bugs set at Critical while they're rather High or Medium.
>
> If in any doubt, please ask on #tripleo.
>
> Thanks,
> --
> Emilien Macchi

--
Best regards
Sagi Shnaidman
Re: [openstack-dev] [tripleo] Proposing Wesley Hayutin core on TripleO CI
+1

On Wed, Dec 6, 2017 at 5:45 PM, Emilien Macchi wrote:
> Team,
>
> Wes has been consistently and heavily involved in TripleO CI work. He has a very good understanding of how tripleo-quickstart and tripleo-quickstart-extras work, and his number and quality of reviews have been excellent so far. His experience with testing TripleO is more than valuable. Also, he's always here to help with TripleO CI issues or just improvements (he's the guy filing bugs on a Saturday evening). I think he would be a good addition to the TripleO CI core team (tripleo-ci, t-q and t-q-e repos for now).
>
> Anyway, thanks a lot, Wes, for your hard work on CI. I think it's time to move on and get you +2 ;-)
>
> As usual, it's open for voting, feel free to bring any feedback.
> Thanks everyone,
> --
> Emilien Macchi

--
Best regards
Sagi Shnaidman
Re: [openstack-dev] [TripleO] Proposing Ronelle Landy for Tripleo-Quickstart/Extras/CI core
+1

On Wed, Nov 29, 2017 at 9:34 PM, John Trowbridge wrote:
> I would like to propose that Ronelle be given +2 for the above repos. She has been a solid contributor to tripleo-quickstart and extras almost since the beginning. She has solid review numbers but, more importantly, has always done quality reviews. She has also been working in the very intense rover role on the CI squad in the past CI sprint, and has done very well in that role.

--
Best regards
Sagi Shnaidman
[openstack-dev] [TripleO][infra][CI] Moving OVB jobs from RH1 cloud to RDO cloud, plan
Hi,

as you know, we are preparing the transition of all OVB jobs from the RH1 cloud to the RDO cloud, as well as a few long multinode upgrade jobs. We have prepared the transition workflow below, please feel free to comment.

1) We run one job (ovb-ha-oooq) on every patch in the following repos: oooq, oooq-extras, tripleo-ci. We run the rest of the OVB jobs (containers and fs024) as experimental in RDO cloud for the following repos: oooq, oooq-extras, tripleo-ci, tht, tripleo-common. This should cover most of our testing. This step is completed.

Currently it's blocked by a Newton bug in RDO cloud: https://bugs.launchpad.net/heat/+bug/1626256, whose fix (https://review.openstack.org/#/c/501592/) is not included in the cloud's release. On the other hand, the upgrade to the Ocata release (which would also solve this issue) is blocked by this bug: https://bugs.launchpad.net/tripleo/+bug/1724328. So the move is blocked right now.

Next steps:

2) We solve all issues with the every-patch job (ovb-ha-oooq) so that it passes (or fails with exactly the same results as on rh1) for 2 regular working days (not a weekend).
3) During this time we trigger the experimental jobs on various patches in tht and tripleo-common and solve all issues for them, so that all OVB jobs pass.
4) All this time we monitor the resources in the openstack-nodepool tenant (with help from RHOPS, maybe) and make sure it has the capacity to run the configured jobs.
5) We set the ovb-ha-oooq job to run on every patch in all the places where it runs on rh1 (in parallel with the existing rh1 job). We monitor the RDO cloud to verify it doesn't fail and still has resources - 1.5 working days.
6) We add the featureset024 OVB job to every patch where it runs on rh1. We continue to monitor the RDO cloud - 1.5 working days.
7) We add the last containers OVB job to all patches where it runs on rh1. We continue to monitor the RDO cloud - 2 days.
8) If everything is OK in all the previous points and the RDO cloud still performs well, we remove the OVB jobs from the rh1 configuration and make them experimental.
9) During the next few days we monitor the OVB jobs and run the rh1 OVB jobs as experimental, to check whether we get the same results (or better :) ).
10) The OVB jobs on the rh1 cloud stay in the experimental pipeline in tripleo for another month or two.

--
Best regards
Sagi Shnaidman
Re: [openstack-dev] [TripleO][Heat] using convergence_engine to deploy overcloud stack
Hi there,

On Wed, Aug 9, 2017 at 1:49 PM, Rabi Mishra wrote:
> On Wed, Aug 9, 2017 at 1:41 PM, Smigielski, Radoslaw (Nokia - IE) <radoslaw.smigiel...@nokia.com> wrote:
>> Hi there!
>>
>> I have a question about the heat "convergence_engine" option. It has been present in the heat config for quite a long time but is still not enabled.
>
> Well, convergence has been enabled by default in heat since Newton. However, TripleO does not use it yet, as the convergence engine's memory usage is higher than that of the legacy engine.

TripleO CI has a heat-convergence job running on Heat patches in the experimental pipeline [1]. It has been running there for at least the last year. No high memory usage was detected in the last few months while I watched it.

[1] https://github.com/openstack-infra/project-config/blob/master/zuul/layout.yaml#L10407

Thanks
--
Best regards
Sagi Shnaidman
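For reference, a sketch of how the engine could be flipped on the TripleO side through an extra Heat environment file, assuming the puppet-heat class exposes heat::engine::convergence_engine; this is an illustration, not what the CI job above actually does:

```yaml
# convergence.yaml (hypothetical) -- pass it with:
#   openstack overcloud deploy ... -e convergence.yaml
parameter_defaults:
  ControllerExtraConfig:
    heat::engine::convergence_engine: true
```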
Re: [openstack-dev] [tripleo] CI Squad Meeting Summary (week 26) - job renaming discussion
Every job includes a topology file too, like "1cont_1comp" for example, and in general there could be different jobs that run the same featureset024 with different topologies. So I think the topology part of the name is necessary too.

On Tue, Jul 4, 2017 at 8:45 PM, Emilien Macchi wrote:
> On Fri, Jun 30, 2017 at 11:06 AM, Jiří Stránský wrote:
> > On 30.6.2017 15:04, Attila Darazs wrote:
> >> = Renaming the CI jobs =
> >>
> >> When we started the job transition to Quickstart, we introduced the concept of featuresets[1] that define a certain combination of features for each job.
> >>
> >> This seemed to be a sensible solution, as it's not practical to mention all the individual features in the job name, and short names can be misleading (for example the ovb-ha job does so much more than test HA).
> >>
> >> We decided to keep the original names for these jobs to simplify the transition, but the plan is to rename them to something that will help to reproduce the jobs locally with Quickstart.
> >>
> >> The proposed naming scheme will be the same as the one we're now using for job type in project-config:
> >>
> >> gate-tripleo-ci-centos-7-{node-config}-{featureset-config}
> >>
> >> So for example the current "gate-tripleo-ci-centos-7-ovb-ha-oooq" job would look like "gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001"
> >
> > I'd prefer to keep the job names somewhat descriptive... If I had to pick one or the other, I'd rather stick with the current way, as at least for me it's a higher priority to see descriptive names in CI results than to save time on finding the featureset file mapping when needing to reproduce a job result. My eyes scan probably more than a hundred individual CI job results daily, but I usually only need to reproduce 0 or 1 job failures locally.
> >
> > Alternatively, could we rename "featureset001.yaml" to "featureset-ovb-ha.yaml" and then have, I guess, something like "gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-ovb-ha" for the job name? Maybe "ovb" would be there twice, in case it's needed both in the node config and featureset parts of the job name...
>
> I'm in favor of keeping job names as simple as possible. To me, we should use something like gate-tripleo-ci-centos-7-ovb-featureset001
>
> So we know:
> - it's a tripleo gate job running on centos7
> - it's OVB and not multinode
> - it's deploying featureset001
>
> Please don't mention HA or ceph or other features in the name, because that would be too rigid in case a featureset changes its coverage.
>
> Note: if we go that way, we might also want to rename the scenario jobs and use featureset in their names.
> Note2: if we rename jobs, we need to keep doing good work on documenting what the featuresets deploy, and probably make https://github.com/openstack/tripleo-quickstart/blob/master/doc/source/feature-configuration.rst more visible.
>
> My 2 cents.
>
> > Or we could pull the mapping between job name and job type from project-config in an automated way.
> >
> > (Will be on PTO for a week from now; apologies if I don't respond timely here.)
> >
> > Have a good day,
> >
> > Jirka
> >
> >> The advantage of this will be that it will be easy to reproduce a gate job on a local virthost by typing something like:
> >>
> >> ./quickstart.sh --release tripleo-ci/master \
> >>     --nodes config/nodes/3ctlr_1comp.yml \
> >>     --config config/general_config/featureset001.yml \
> >>
> >> Please let us know if this method sounds like a step forward.
>
> --
> Emilien Macchi

--
Best regards
Sagi Shnaidman
Re: [openstack-dev] [tripleo] CI Squad Meeting Summary (week 26) - job renaming discussion
Hi,

I think job names should be meaningful too. We can include something like "featureset024" or even "-f024" in the job name to keep reproduction easy, or just make another table mapping featuresets to job names, like the one we have for file names and features.

gate-tripleo-ci-centos-7-ovb-f024-ha-cont-iso-bonds-ipv6-1ctrl_1comp_1ceph

seems not too long, and gives a clue about what runs in the job without looking up the job configuration, including for people outside TripleO. Our jobs run not only in TripleO CI, but on neutron, nova, etc.

Thanks

On Fri, Jun 30, 2017 at 6:06 PM, Jiří Stránský wrote:
> On 30.6.2017 15:04, Attila Darazs wrote:
>> = Renaming the CI jobs =
>>
>> When we started the job transition to Quickstart, we introduced the concept of featuresets[1] that define a certain combination of features for each job.
>>
>> This seemed to be a sensible solution, as it's not practical to mention all the individual features in the job name, and short names can be misleading (for example the ovb-ha job does so much more than test HA).
>>
>> We decided to keep the original names for these jobs to simplify the transition, but the plan is to rename them to something that will help to reproduce the jobs locally with Quickstart.
>>
>> The proposed naming scheme will be the same as the one we're now using for job type in project-config:
>>
>> gate-tripleo-ci-centos-7-{node-config}-{featureset-config}
>>
>> So for example the current "gate-tripleo-ci-centos-7-ovb-ha-oooq" job would look like "gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001"
>
> I'd prefer to keep the job names somewhat descriptive... If I had to pick one or the other, I'd rather stick with the current way, as at least for me it's a higher priority to see descriptive names in CI results than to save time on finding the featureset file mapping when needing to reproduce a job result. My eyes scan probably more than a hundred individual CI job results daily, but I usually only need to reproduce 0 or 1 job failures locally.
>
> Alternatively, could we rename "featureset001.yaml" to "featureset-ovb-ha.yaml" and then have, I guess, something like "gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-ovb-ha" for the job name? Maybe "ovb" would be there twice, in case it's needed both in the node config and featureset parts of the job name...
>
> Or we could pull the mapping between job name and job type from project-config in an automated way.
>
> (Will be on PTO for a week from now; apologies if I don't respond timely here.)
>
> Have a good day,
>
> Jirka
>
>> The advantage of this will be that it will be easy to reproduce a gate job on a local virthost by typing something like:
>>
>> ./quickstart.sh --release tripleo-ci/master \
>>     --nodes config/nodes/3ctlr_1comp.yml \
>>     --config config/general_config/featureset001.yml \
>>
>> Please let us know if this method sounds like a step forward.

--
Best regards
Sagi Shnaidman
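For readers outside TripleO: the {node-config} part of the proposed names refers to a nodes file in tripleo-quickstart that describes topology only. A minimal sketch of what a config/nodes/3ctlr_1comp.yml could contain, with illustrative attributes:

```yaml
# Topology only -- no feature flags belong here (attributes illustrative).
overcloud_nodes:
  - name: control_0
    flavor: control
  - name: control_1
    flavor: control
  - name: control_2
    flavor: control
  - name: compute_0
    flavor: compute
```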
[openstack-dev] [TripleO][CI][containers] Broken container gate jobs block patches.
Hi,

FYI, the gates are now blocked because of [1], and the containers jobs are now part of the gate jobs. Please try to resolve it ASAP.

Thanks

[1] CI: containers jobs fail in pingtest because of a volume error: https://bugs.launchpad.net/tripleo/+bug/1700333

--
Best regards
Sagi Shnaidman
Re: [openstack-dev] [TripleO] A proposal for hackathon to reduce deploy time of TripleO
Hi all,

Thanks for your attention and proposals for this hackathon.

With the full understanding that optimizing deployment is an ongoing effort and will not start and finish in these 2 days alone, we still want to bring focus to these issues during the sprint. Even if we don't solve all the problems immediately, more people will be exposed to this field, additional tasks/bugs can be opened and scheduled, and maybe additional tests, process improvements and other insights will be introduced. Please remember: if we don't reduce CI job time to 1 hour by Thursday, it doesn't mean we failed the mission. The main goal of this sprint is to find problems and their work scope, and to find as many solutions for them as possible, using inter-team and team-member collaboration and shared knowledge. Ideally this collaboration and ongoing effort will carry that momentum further. :)

I suggest we do it on 21-22 Jun 2017 (Wednesday-Thursday). All other details are provided in the etherpad: https://etherpad.openstack.org/p/tripleo-deploy-time-hack and in the wiki as well: https://wiki.openstack.org/wiki/VirtualSprints

We have a "deployment-time" tag for bugs: https://bugs.launchpad.net/tripleo/+bugs?field.tag=deployment-time. Please use it for bugs that affect deployment time or CI job run time; it will make them easier to handle in the sprint.

Please share your comments and suggestions.

Thanks

On Tue, May 23, 2017 at 1:47 PM, Sagi Shnaidman wrote:
> Hi all,
>
> I'd like to propose holding a one- or two-day hackathon in the TripleO project, with the main goal of reducing TripleO's deployment time.
>
> - How could it be arranged?
>
> We can arrange a separate IRC channel and a BlueJeans video conference session for the hackathon days, to create a feeling of "presence".
>
> - How to participate and contribute?
>
> We'll have a few areas of responsibility, like tripleo-quickstart, containers, storage, HA, baremetal, etc. The exact list should be ready before the hackathon so that everybody can assign themselves to one of these "teams". It's good to have somebody on each team act as stakeholder, responsible for organization and tasks.
>
> - What is the goal?
>
> The goal of this hackathon is to reduce the deployment time of TripleO as much as possible.
>
> For example, part of the CI team takes on the task of reducing the time spent in quickstart tasks. That includes collecting statistics, profiling, and finding places to optimize. After that, tasks are created and patches are tested and submitted.
>
> Prizes will be presented to the teams that save the most time :)
>
> What do you think?
>
> Thanks
> --
> Best regards
> Sagi Shnaidman

--
Best regards
Sagi Shnaidman
Re: [openstack-dev] [TripleO] overcloud containers patches todo
Hi I think a "deep dive" about containers in TripleO and some helpful documentation would help a lot for valuable reviews of these container patches. The knowledge gap that's accumulated here is pretty big. Thanks On Jun 5, 2017 03:39, "Dan Prince" wrote: > Hi, > > Any help reviewing the following patches for the overcloud > containerization effort in TripleO would be appreciated: > > https://etherpad.openstack.org/p/tripleo-containers-todo > > If you've got new services related to the containerization efforts feel > free to add them here too. > > Thanks, > > Dan > > __ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [TripleO] A proposal for hackathon to reduce deploy time of TripleO
Hi all,

I'd like to propose holding a one- or two-day hackathon in the TripleO project, with the main goal of reducing TripleO's deployment time.

- How could it be arranged?

We can arrange a separate IRC channel and a BlueJeans video conference session for the hackathon days, to create a feeling of "presence".

- How to participate and contribute?

We'll have a few areas of responsibility, like tripleo-quickstart, containers, storage, HA, baremetal, etc. The exact list should be ready before the hackathon so that everybody can assign themselves to one of these "teams". It's good to have somebody on each team act as stakeholder, responsible for organization and tasks.

- What is the goal?

The goal of this hackathon is to reduce the deployment time of TripleO as much as possible.

For example, part of the CI team takes on the task of reducing the time spent in quickstart tasks. That includes collecting statistics, profiling, and finding places to optimize. After that, tasks are created and patches are tested and submitted.

Prizes will be presented to the teams that save the most time :)

What do you think?

Thanks
--
Best regards
Sagi Shnaidman
[openstack-dev] [tripleo] [CI] HA and non-HA OVB jobs are now running with Quickstart
Hi all,

In addition to the multinode jobs, today we migrated part of the OVB jobs to use Quickstart. Previously we had OVB HA and OVB non-HA jobs; as part of migrating them to Quickstart, we merged them into one job. It's now called:

- gate-tripleo-ci-centos-7-ovb-ha-oooq

and will be a voting job, replacing:

- gate-tripleo-ci-centos-7-ovb-ha
- gate-tripleo-ci-centos-7-ovb-nonha

The updates job "gate-tripleo-ci-centos-7-ovb-updates" stays the same; nothing was changed about it. The same goes for the periodic jobs: they stay the same, and an additional update will be sent when we migrate them too.

In addition, for the tripleo-ci repository there are two branch jobs:

- gate-tripleo-ci-centos-7-ovb-ha-oooq-newton
- gate-tripleo-ci-centos-7-ovb-ha-oooq-ocata

which replace, respectively:

- gate-tripleo-ci-centos-7-ovb-ha-ocata
- gate-tripleo-ci-centos-7-ovb-nonha-ocata
- gate-tripleo-ci-centos-7-ovb-ha-newton
- gate-tripleo-ci-centos-7-ovb-nonha-newton

A little about the "gate-tripleo-ci-centos-7-ovb-ha-oooq" job: its featureset file is located at https://github.com/openstack/tripleo-quickstart/blob/master/config/general_config/featureset001.yml and it's pretty similar to the previous HA job, but in addition it has overcloud SSL and node introspection enabled (which were tested in the previous non-HA job).

The old HA and non-HA jobs have been moved into the experimental queue and can be run on a patch with "check experimental". This is for regression checking; please use it if you suspect a problem with the migration.

As usual, you are welcome to ask any questions about the new jobs and features in #tripleo. The TripleO CI squad folks will be happy to answer.

Thanks

---------- Forwarded message ----------
From: Attila Darazs
Date: Wed, Mar 15, 2017 at 12:04 PM
Subject: [openstack-dev] [tripleo] Gating jobs are now running with Quickstart
To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev@lists.openstack.org>

As discussed previously in the CI Squad meeting summaries [1] and at the TripleO weekly meeting, the multinode gate jobs are now running with tripleo-quickstart. To signify the change, we added the -oooq suffix to them. The following jobs migrated yesterday evening, with more to come:

- gate-tripleo-ci-centos-7-undercloud-oooq
- gate-tripleo-ci-centos-7-nonha-multinode-oooq
- gate-tripleo-ci-centos-7-scenario001-multinode-oooq
- gate-tripleo-ci-centos-7-scenario002-multinode-oooq
- gate-tripleo-ci-centos-7-scenario003-multinode-oooq
- gate-tripleo-ci-centos-7-scenario004-multinode-oooq

For those who are already familiar with Quickstart, we introduced two new concepts:

- featureset config files, which are numbered collections of settings without node configuration [2]
- the '--nodes' option for quickstart.sh and the config/nodes files, which deal only with the number and type of nodes the deployment will have [3]

If you would like to debug these jobs, it might be useful to read Quickstart's documentation [4]. We hope the transition will be smooth, but if you have problems, ping members of the TripleO CI Squad on #tripleo.

Best regards,

[1] http://lists.openstack.org/pipermail/openstack-dev/2017-March/113724.html
[2] https://docs.openstack.org/developer/tripleo-quickstart/feature-configuration.html
[3] https://docs.openstack.org/developer/tripleo-quickstart/node-configuration.html
[4] https://docs.openstack.org/developer/tripleo-quickstart/

--
Best regards
Sagi Shnaidman
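A sketch of the kind of flags a featureset file like featureset001 carries, matching the features described above (HA plus the SSL and introspection coverage inherited from the old non-HA job); the exact key names in the real file may differ:

```yaml
# config/general_config/featureset001.yml (abridged, illustrative)
enable_pacemaker: true   # HA overcloud
ssl_overcloud: true      # overcloud SSL, from the old non-HA job
step_introspect: true    # node introspection, from the old non-HA job
```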
[openstack-dev] [TripleO][CI] OVB combined job and periodic
Hi all,

just another point to think about for the transition of the periodic jobs: firstly, we need featureset files for them; secondly, since we combined the HA and non-HA jobs, the periodic jobs (which currently consist of HA, non-HA and updates) should be combined into one job as well. Because of all the above, I think we will still need 2 jobs: we check overcloud deletion in one of them and undercloud idempotency in the second, and it's impossible to test everything in one job because of time restrictions.

So it seems it would look like:
1) combined OVB job + overcloud deletion (+ other specific features?)
2) combined OVB job + idempotent undercloud install (+ other specific features?)
3) OVB updates job

Thoughts?

--
Best regards
Sagi Shnaidman
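One way to express jobs 1) and 2) would be small periodic featureset files layered on top of the combined one; the flag names below are hypothetical, purely to illustrate the split:

```yaml
# featureset-periodic-delete.yml (hypothetical): job 1
step_delete_overcloud: true          # combined OVB job + overcloud deletion

# featureset-periodic-idempotency.yml (hypothetical): job 2
step_install_undercloud_twice: true  # combined OVB job + idempotent undercloud
```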
Re: [openstack-dev] [tripleo] pingtest vs tempest
Hi,

I think Rally or Browbeat and other performance-oriented solutions won't serve our needs, because we run TripleO CI on a virtualized environment with very limited resources. Actually, we are pretty close to fully utilizing these resources when deploying OpenStack, so very little is left for tests.

It's not a problem to run Tempest API tests, because they are cheap - they take little time and few resources, but they also give little coverage. Scenario tests are more interesting and give us more coverage, but they also take a lot of resources (which we sometimes don't have). It may be useful to run a "limited edition" of API tests that maximizes coverage without duplication - for example, just checking that each service basically works, without covering all of its functionality. That would take very little time (say, 5 tests per service) and give a general picture of deployment success, while also covering areas that the pingtest doesn't. Another option could be to develop special Tempest scenario tests for TripleO that fit our needs.

Thanks

On Wed, Apr 5, 2017 at 11:49 PM, Emilien Macchi wrote:
> Greetings dear owls,
>
> I would like to bring back an old topic: running tempest in the gate.
>
> == Context
>
> Right now, the TripleO gate runs something called pingtest to validate that the OpenStack cloud is working. It's a Heat stack that deploys a Nova server, some volumes, a Glance image, a Neutron network, and sometimes a little bit more. To deploy the pingtest, you obviously need Heat deployed in your overcloud.
>
> == Problems
>
> Although pingtest has been very helpful over the last years:
> - easy to understand: it's a Heat template, like an OpenStack user would write to deploy their apps.
> - fast: the stack takes a few minutes to be created and validated.
>
> It has some limitations:
> - Limited to what Heat resources support (example: some OpenStack resources can't be managed from Heat).
> - Impossible to run a dynamic workflow (test a live migration, for example).
>
> == Solutions
>
> 1) Switch pingtest to a Tempest run of some specific tests, with feature parity with what we had with pingtest. For example, we could imagine running the scenarios that deploy a VM and boot from volume. It would test the same thing as pingtest (details can be discussed here). Each scenario would run more tests depending on the services it runs (scenario001 is telemetry, so it would run some Tempest tests for Ceilometer, Aodh, Gnocchi, etc). We should work on making the Tempest run as short as possible, and as close as possible to what we have with pingtest.
>
> 2) Run custom scripts in the TripleO CI tooling, called from the pingtest (Heat template), that would run some validation commands (API calls, etc). This has been investigated in the past but never implemented AFAIK.
>
> 3) ?
>
> I tried to make this text short and go straight to the point; please bring feedback now. I hope we can make progress on $topic during Pike, so we can increase our testing coverage and detect deployment issues sooner.
>
> Thanks,
> --
> Emilien Macchi

--
Best regards
Sagi Shnaidman
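For readers unfamiliar with the pingtest: a minimal sketch of the kind of Heat template it is, reduced here to one server and one attached volume; the real template in tripleo-ci carries more resources:

```yaml
heat_template_version: ocata

description: >
  Pingtest-style sanity stack (illustrative): boots one server and
  attaches one volume, exercising Nova, Glance, Neutron and Cinder.

parameters:
  image:
    type: string
    default: pingtest_image
  network:
    type: string
    default: private

resources:
  volume:
    type: OS::Cinder::Volume
    properties:
      size: 1

  server:
    type: OS::Nova::Server
    properties:
      image: {get_param: image}
      flavor: m1.tiny
      networks:
        - network: {get_param: network}

  attachment:
    type: OS::Cinder::VolumeAttachment
    properties:
      instance_uuid: {get_resource: server}
      volume_id: {get_resource: volume}
```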
[openstack-dev] [TripleO][tripleo-quickstart][tripleo-ci] Review of critical bugfix
Hi all,

we have a pretty critical bug in the quickstart jobs [1]: the status codes of commands are ignored. Please review its fix [2]. If you have a more elegant solution than setting pipefail or exiting with PIPESTATUS, please suggest it in the comments.

Thanks

[1] https://bugs.launchpad.net/tripleo/+bug/1676156
[2] https://review.openstack.org/450023

--
Best regards
Sagi Shnaidman
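To illustrate the failure mode in [1]: when a quickstart-style Ansible task pipes a deploy script through tee, the task's status becomes tee's status, so a failed deploy looks green. A sketch of the fix in the spirit of [2], not the literal patch:

```yaml
# Broken: the pipeline's exit status is tee's (always 0), so a
# failing overcloud-deploy.sh never fails the job.
- name: Deploy the overcloud (exit status swallowed by the pipe)
  shell: bash overcloud-deploy.sh 2>&1 | tee overcloud_deploy.log

# Fixed: with pipefail the pipeline returns the deploy script's status.
- name: Deploy the overcloud (exit status preserved)
  shell: |
    set -o pipefail
    bash overcloud-deploy.sh 2>&1 | tee overcloud_deploy.log
  args:
    executable: /bin/bash   # pipefail requires bash, not sh
```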
Re: [openstack-dev] [infra][tripleo] initial discussion for a new periodic pipeline
Paul, if we run 750 ovb jobs per day, than adding 12 more will be less than 2% increase. I don't believe it will be a serious issue. Thanks On Tue, Mar 21, 2017 at 7:34 PM, Paul Belanger wrote: > On Tue, Mar 21, 2017 at 12:40:39PM -0400, Wesley Hayutin wrote: > > On Tue, Mar 21, 2017 at 12:03 PM, Emilien Macchi > wrote: > > > > > On Mon, Mar 20, 2017 at 3:29 PM, Paul Belanger > > > wrote: > > > > On Sun, Mar 19, 2017 at 06:54:27PM +0200, Sagi Shnaidman wrote: > > > >> Hi, Paul > > > >> I would say that real worthwhile try starts from "normal" priority, > > > because > > > >> we want to run promotion jobs more *often*, not more *rarely* which > > > happens > > > >> with low priority. > > > >> In addition the initial idea in the first mail was running them each > > > after > > > >> other almost, not once a day like it happens now or with "low" > priority. > > > >> > > > > As I've said, my main reluctance is is how the gate will react if we > > > create a > > > > new pipeline with the same priority as our check pipeline. I would > much > > > rather > > > > since on caution, default to 'low', see how things react for a day / > > > week / > > > > month, then see what it would like like a normal. I want us to be > > > caution about > > > > adding a new pipeline, as it dynamically changes how our existing > > > pipelines > > > > function. > > > > > > > > Further more, this is actually a capacity issue for > > > tripleo-test-cloud-rh1, > > > > there currently too many jobs running for the amount of hardware. If > > > these jobs > > > > were running on our donated clouds, we could get away with a low > priority > > > > periodic pipeline. > > > > > > multinode jobs are running under donated clouds but as you know ovb > not. > > > We want to keep ovb jobs in our promotion pipeline because they bring > > > high value to the tests (ironic, ipv6, ssl, probably more). > > > > > > Another alternative would be to reduce it to one ovb job (ironic with > > > introspection + ipv6 + ssl at minimum) and use the 4 multinode jobs > > > into the promotion pipeline -instead of the 3 ovb. > > > > > > > I'm +1 on using one ovb jobs + 4 multinode jobs. > > > > > > > > > > current: 3 ovb jobs running every night > > > proposal: 18 ovb jobs per day > > > > > > The addition will cost us 15 jobs into rh1 load. Would it be > acceptable? > > > > > > > Now, allow me to propose another solution. > > > > > > > > RDO project has their own version of zuul, which has the ability to > do > > > periodic > > > > pipelines. Since tripleo-test-cloud-rh2 is still around, and has OVB > > > ability, I > > > > would suggest configuring this promoting pipeline within RDO, as to > not > > > affect > > > > the capacity of tripleo-test-cloud-rh1. This now means, you can > > > continuously > > > > enqueue jobs at a rate of 4 hours, priority shouldn't matter as you > are > > > the only > > > > jobs running on tripleo-test-cloud-rh2, resulting in faster > promotions. > > > > > > Using RDO would also be an option. I'm just not sure about our > > > available resources, maybe other can reply on this one. > > > > > > > The purpose of the periodic jobs are two fold. > > 1. ensure the latest built packages work > > 2. ensure the tripleo check gates continue to work with out error > > > > Running the promotion in review.rdoproject would not cover #2. The > > rdoproject jobs > > would be configured in slightly different ways from upstream tripleo. > > Running the promotion > > in ci.centos has the same issue. 
> Right, there is some legwork to use the images produced by openstack-infra in RDO, but that is straightforward. It would be the same build process that a 3rd-party CI system does. It would be a matter of copying nodepool.yaml from openstack-infra/project-config and (this is harder) using nodepool-builder to build the images. Today RDO does snapshot images.
>
> > Using tripleo-test-cloud-rh2 I think is fine.
>
> > > This also makes sense, as packaging is done in RDO, and you are &
[openstack-dev] [TripleO] A lot of instack-haproxy zombie processes
Hi, all

While investigating periodic job failures, I noticed a lot of Z (zombie) processes on the undercloud:

bash-4.2# ps aux | grep " Z " | less
root 28481 0.0 0.0 0 0 ? Z 17:41 0:00 [instack-haproxy]
root 28494 0.0 0.0 0 0 ? Z 17:41 0:00 [instack-haproxy]
root 28509 0.0 0.0 0 0 ? Z 17:41 0:00 [instack-haproxy]
root 28522 0.0 0.0 0 0 ? Z 17:41 0:00 [instack-haproxy]
...
bash-4.2# ps aux | grep " Z " | wc -l
979

About a thousand identical zombie processes. I don't think this is expected behavior, although I'm not sure it's the reason the jobs fail. Any ideas why it could happen? A quick way to check which parent is leaving them unreaped is sketched below.
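Since zombies persist only because their parent never calls wait() on them, the parent process is the place to look. A minimal diagnostic sketch (the PID is taken from the listing above):

# list zombie processes together with their parent PID and command
ps -eo stat,pid,ppid,comm | awk '$1 ~ /^Z/'
# then inspect the parent of one of them, e.g. PID 28481:
ps -o pid,comm -p "$(ps -o ppid= -p 28481)"

Thanks
--
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev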
Re: [openstack-dev] [infra][tripleo] initial discussion for a new periodic pipeline
Hi, Paul

I would say that a really worthwhile try starts from "normal" priority, because we want to run promotion jobs more *often*, not more *rarely*, which is what happens with low priority. In addition, the initial idea in the first mail was running them almost back to back, not once a day as happens now or with "low" priority.

Thanks

On Wed, Mar 15, 2017 at 11:16 PM, Paul Belanger wrote:
> On Wed, Mar 15, 2017 at 03:42:32PM -0500, Ben Nemec wrote:
> >
> > On 03/13/2017 02:29 PM, Sagi Shnaidman wrote:
> > > Hi, all
> > >
> > > I submitted a change: https://review.openstack.org/#/c/443964/ but it seems to have reached a point which requires additional discussion.
> > >
> > > I had a few proposals: increasing the period to 12 hours instead of 4 for a start, and leaving it at the regular periodic *low* precedence. I think we can start from a 12-hour period to see how it goes, although I don't think that only 4 jobs will increase the load on the OVB cloud; it's completely negligible compared to the current OVB capacity and load. But making its precedence "low" IMHO makes this pipeline completely pointless, because we already run the experimental-tripleo pipeline with this priority and it can hit waits of 7-14 hours. So let's assume we ran a periodic job: it's now queued to run in 12 hours + the "low queue length" - about 20 hours or more. That's even worse than the usual periodic job and definitely makes this change useless. I'd like to note as well that these periodic jobs, unlike the "usual" periodic ones, are used for repository promotion and their value is equal to or higher than that of check jobs, so they need to run with "normal" or even "high" precedence.
> >
> > Yeah, it makes no sense from an OVB perspective to add these as low-priority jobs. Once in a while we've managed to chew through the entire experimental queue during the day, but with the containers job added it's very unlikely that's going to happen anymore. Right now we have a 4.5 hour wait time just for the check queue, then there's two hours of experimental jobs queued up behind that. All of which means if we started a low-priority periodic job right now it probably wouldn't run until about midnight my time, which I think is when the regular periodic jobs run now.
>
> Let's just give it a try? A 12-hour periodic job with low priority. There is nothing saying we cannot iterate on this after a few days / weeks / months.
>
> > > Thanks
> > >
> > > On Thu, Mar 9, 2017 at 10:06 PM, Wesley Hayutin wrote:
> > >
> > > On Wed, Mar 8, 2017 at 1:29 PM, Jeremy Stanley wrote:
> > >
> > > On 2017-03-07 10:12:58 -0500 (-0500), Wesley Hayutin wrote:
> > > > The TripleO team would like to initiate a conversation about the
> > > > possibility of creating a new pipeline in OpenStack Infra to allow
> > > > a set of jobs to run periodically every four hours
> > > [...]
> > >
> > > The request doesn't strike me as contentious/controversial. Why not just propose your addition to the zuul/layout.yaml file in the openstack-infra/project-config repo and hash out any resulting concerns via code review?
> > > --
> > > Jeremy Stanley
> > >
> > > Sounds good to me.
> > > We thought it would be nice to walk through it in an email first :)
> > >
> > > Thanks
Re: [openstack-dev] [TripleO] Propose Attila Darazs and Gabriele Cerami for tripleo-ci core
+1 +1 !

On Wed, Mar 15, 2017 at 5:44 PM, John Trowbridge wrote:
> Both Attila and Gabriele have been rockstars with the work to transition tripleo-ci to run via quickstart, and both have become extremely knowledgeable about how tripleo-ci works during that process. They are both very capable of providing thorough and thoughtful reviews of tripleo-ci patches.
>
> On top of this, Attila has greatly increased the communication from the tripleo-ci squad as the liaison, with weekly summary emails of our meetings to this list.
>
> - trown
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

--
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [infra][tripleo] initial discussion for a new periodic pipeline
Hi, all

I submitted a change: https://review.openstack.org/#/c/443964/ but it seems to have reached a point which requires additional discussion.

I had a few proposals: increasing the period to 12 hours instead of 4 for a start, and leaving it at the regular periodic *low* precedence. I think we can start from a 12-hour period to see how it goes, although I don't think that only 4 jobs will increase the load on the OVB cloud; it's completely negligible compared to the current OVB capacity and load. But making its precedence "low" IMHO makes this pipeline completely pointless, because we already run the experimental-tripleo pipeline with this priority and it can hit waits of 7-14 hours. So let's assume we ran a periodic job: it's now queued to run in 12 hours + the "low queue length" - about 20 hours or more. That's even worse than the usual periodic job and definitely makes this change useless. I'd like to note as well that these periodic jobs, unlike the "usual" periodic ones, are used for repository promotion and their value is equal to or higher than that of check jobs, so they need to run with "normal" or even "high" precedence.

Thanks

On Thu, Mar 9, 2017 at 10:06 PM, Wesley Hayutin wrote:
>
> On Wed, Mar 8, 2017 at 1:29 PM, Jeremy Stanley wrote:
>
>> On 2017-03-07 10:12:58 -0500 (-0500), Wesley Hayutin wrote:
>> > The TripleO team would like to initiate a conversation about the
>> > possibility of creating a new pipeline in OpenStack Infra to allow
>> > a set of jobs to run periodically every four hours
>> [...]
>>
>> The request doesn't strike me as contentious/controversial. Why not just propose your addition to the zuul/layout.yaml file in the openstack-infra/project-config repo and hash out any resulting concerns via code review?
>> --
>> Jeremy Stanley
>>
> Sounds good to me.
> We thought it would be nice to walk through it in an email first :)
>
> Thanks
>
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

--
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [TripleO][CI] Running experimental OVB and not OVB jobs separately.
Hi, all

I'd like to propose a slightly different approach to running experimental jobs in TripleO CI. As you know, we have OVB jobs and non-OVB jobs, and different pipelines for running these two types.

The current flow: if you need to run experimental jobs, you write a comment with "check experimental" and all types of jobs will run - both OVB and non-OVB.

The proposal: to run OVB jobs only, you'll need to leave the comment "check experimental-tripleo"; to run non-OVB jobs only, you'll still write "check experimental". To run all experimental jobs, OVB and non-OVB, just leave two comments:

check experimental-tripleo
check experimental

From what I observed, people usually want to run one or two experimental jobs, and usually of one type. So this more explicit triggering can save us expensive OVB resources. If this is not the case and you prefer to run all the experimental jobs we have at once, please provide feedback and I'll take it back.

Patch about the topic: https://review.openstack.org/#/c/425184/

Thanks
--
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [TripleO][CI]
Hi, all

FYI, the periodic TripleO nonha jobs fail because of an introspection failure; there is an open bug in mistral:

Ironic introspection fails because of unexpected keyword "insecure"
https://bugs.launchpad.net/tripleo/+bug/1656692

It is marked as a promotion blocker.

Thanks
--
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [TripleO][CI] Dev CI environment for OVB
Hi, all

With Derek's help we set up an OVB dev environment on the rh1/rh2 clouds, which allows developers to run their patches in a real CI environment and debug their issues there. In case you have a problem with your patch in CI and it works locally, you can reproduce and debug it in this environment.

Please note, this is an OVB environment only. For regular patch tests please use the tripleo-quickstart project[1], which fits that purpose better; this dev env is for CI issues only.

In short, we have special tenants on rh1/rh2, on which you can create your undercloud VM from an infra image and then create your OVB environment there. Finally, you're ready to run your patch: clone your repo, inject the changes and run the main CI script, toci_gate_test.sh - a rough sketch of that last step is below. The whole process is described in the etherpad https://etherpad.openstack.org/p/tripleo-ci-devenvs where you'll find a script that does everything for you (as all scripts usually should).

In case you need to test your patch, just send me your *public* keys *offline*, I'll add them to the tenant defaults and you'll be able to run it.
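A minimal sketch of the "inject and run" step on the undercloud VM (the gerrit ref and patchset number below are illustrative - substitute the ref of your own change from its gerrit download box):

git clone https://git.openstack.org/openstack-infra/tripleo-ci
cd tripleo-ci
# fetch and apply the change under test from gerrit
git fetch https://review.openstack.org/openstack-infra/tripleo-ci refs/changes/84/425184/1
git cherry-pick FETCH_HEAD
# run the main CI script, the same entry point the gate uses
./toci_gate_test.sh

Thanks

[1] https://github.com/openstack/tripleo-quickstart
--
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev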
[openstack-dev] [TripleO][CI] Failed jobs because bad image on mirror server
Hi, all

FYI, jobs failed after the last image promotion because of a corrupted image; it seems the last promotion job failed to upload it correctly, and it didn't match its md5 checksum. I've replaced it on the mirror server with the image from the previous delorean hash run. This should be fine because we update the images anyway, and it will be replaced on the next promotion job run.
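For anyone who wants to verify an image before using it, a minimal check against the published checksum (the mirror URL and file names here are illustrative):

MIRROR=http://example-mirror/builds/current-tripleo  # hypothetical mirror path
curl -sO "$MIRROR/overcloud-full.tar"
curl -sO "$MIRROR/overcloud-full.tar.md5"
# md5sum -c reads "<hash>  <filename>" pairs and re-checks the local file
md5sum -c overcloud-full.tar.md5

Thanks
--
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev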
Re: [openstack-dev] [TripleO][CI] Memory shortage in HA jobs, please increase it
Hi, Derek

I suspect Sahara may be causing it; it started to run on the overcloud since my patch was merged: https://review.openstack.org/#/c/352598/
I don't think it ever ran in the jobs before, because it was either improperly configured or disabled. And according to reports it's the most memory-consuming service on the overcloud controllers (see the quick check in the P.S. below).

On Fri, Aug 19, 2016 at 12:41 PM, Derek Higgins wrote:
> On 19 August 2016 at 00:07, Sagi Shnaidman wrote:
> > Hi,
> >
> > we have a problem again with not enough memory in HA jobs; all of them constantly fail in CI: http://status-tripleoci.rhcloud.com/
>
> Have we any idea why we need more memory all of a sudden? For months the overcloud nodes have had 5G of RAM, then last week[1] we bumped it to 5.5G, and now we need it bumped to 6G.
>
> If a new service has been added that is needed on the overcloud, then bumping to 6G is expected and probably the correct answer, but I'd like to see us avoid blindly increasing the resources each time we see out-of-memory errors without investigating whether there was a regression causing something to start hogging memory.
>
> Sorry if it seems like I'm being picky about this (I seem to resist these bumps every time they come up), but there are two good reasons to avoid this if possible:
> o at peak we are currently configured to run 75 simultaneous jobs (although we probably don't reach that at the moment), and each HA job has 5 baremetal nodes, so bumping from 5G to 6G increases the amount of RAM CI can use at peak by 375G
> o when we bump the RAM of baremetal nodes from 5G to 6G, what we're actually doing is increasing the minimum requirements for developers from 28G (or whatever the number is now) to 32G
>
> So before we bump the number can we just check first if it's justified, as I've watched this number increase from 2G since we started running tripleo-ci.
>
> thanks,
> Derek.
>
> [1] - https://review.openstack.org/#/c/353655/
>
> > I've created a patch that will increase it[1], but we need to increase it right now on rh1.
> > I can't do it now, because unfortunately I won't be able to watch whether it works and no problems appear.
> > TripleO CI cloud admins, please increase the memory for the baremetal flavor on rh1 tomorrow (to 6144?).
> >
> > Thanks
> >
> > [1] https://review.openstack.org/#/c/357532/
> > --
> > Best regards
> > Sagi Shnaidman
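P.S. For a quick, rough view of what is actually eating the memory on a controller, something like this is enough (a diagnostic sketch; RSS is the resident set size in KB):

# top 15 memory consumers on an overcloud controller, sorted by RSS
ps --sort=-rss -eo rss,pid,comm | head -n 15

--
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev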
[openstack-dev] [TripleO][CI] Memory shortage in HA jobs, please increase it
Hi,

we have a problem again with not enough memory in HA jobs; all of them constantly fail in CI: http://status-tripleoci.rhcloud.com/
I've created a patch that will increase it[1], but we need to increase it right now on rh1.
I can't do it now, because unfortunately I won't be able to watch whether it works and no problems appear.
TripleO CI cloud admins, please increase the memory for the baremetal flavor on rh1 tomorrow (to 6144?); a sketch of the commands is below.
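Since nova flavors are immutable, "increasing the memory" means recreating the flavor. A minimal sketch (the disk/vcpus values are placeholders - reuse the current flavor's values as shown by "openstack flavor show"):

openstack flavor show baremetal          # note the current disk/vcpus first
openstack flavor delete baremetal
openstack flavor create --ram 6144 --disk 40 --vcpus 2 baremetal

Thanks

[1] https://review.openstack.org/#/c/357532/
--
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev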
[openstack-dev] [TripleO][CI][Delorean]
Hi, all

We have the current-tripleo repo[1] pointing to an old repository[2] which contains a broken cinder[3] with the volume types bug[4]. That breaks all our CI jobs, which cannot create the pingtest stack, because current-tripleo is the main repo used in the jobs[5].

Could we please move the link to a newer repo that contains the cinder fix? A quick way to check what a candidate repo ships is sketched below.
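A minimal sketch for checking which cinder build a candidate repo provides before moving the link (the "current" path is just an example of a newer delorean repo):

REPO=https://trunk.rdoproject.org/centos7/current/
repoquery --repofrompath=candidate,"$REPO" --repoid=candidate openstack-cinder

Thanks

[1] http://buildlogs.centos.org/centos/7/cloud/x86_64/rdo-trunk-master-tripleo/
[2] http://trunk.rdoproject.org/centos7/c6/bd/c6bd3cb95b9819c03345f50bf2812227e81314ab_4e6dfa3c
[3] openstack-cinder-9.0.0-0.20160810043123.b53621a.el7.centos.noarch
[4] https://bugs.launchpad.net/cinder/+bug/1610073
[5] https://github.com/openstack-infra/tripleo-ci/blob/master/scripts/tripleo.sh#L100
--
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev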
[openstack-dev] [infra] [TripleO] ntp-wait issue breaks TripleO jobs
Hi infra and TripleO cores,

I'd like to ask you to review and merge a bugfix which limits ntp-wait tries to 100 instead of the current 1000. These long retries cause a timeout of 100 minutes and break TripleO jobs.
More details are in the bug: https://bugs.launchpad.net/tripleo/+bug/1608226
The patch: https://review.openstack.org/#/c/349261
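For context, ntp-wait sleeps between attempts, so assuming the default 6-second sleep, 1000 tries is roughly 100 minutes while 100 tries is roughly 10 minutes. A sketch of what the bounded call looks like (the fallback message is illustrative):

# 100 tries, 6 seconds apart: give up after ~10 minutes instead of ~100
ntp-wait -n 100 -s 6 || echo "NTP still not synchronized, not blocking the job any longer"

Thanks
--
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev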
Re: [openstack-dev] [TripleO] Delorean fail blocks CI for stable branches
On Thu, Jul 21, 2016 at 3:11 AM, Alan Pevec wrote:
> On Wed, Jul 20, 2016 at 7:49 PM, Sagi Shnaidman wrote:
> > How did it work before, then? Can you show me the patch that broke this functionality in delorean? It was around 15 Jul that jobs started to fail.
>
> commented in lp
>
> > How does the master branch work, then? It also runs on a patched repo and succeeds.
>
> I explained that, but it looks like we're talking past each other.
>
> > I don't think we can use this workaround; each time this source file changes, all our jobs will fail again? It's not even a workaround.
> > Please let's stop discussing and let's solve it finally; it blocks our CI for stable patches.
>
> Sure, I've assigned https://bugs.launchpad.net/tripleo/+bug/1604039 to myself and proposed a patch.

It's a workaround for the short term, but NOT a solution: if you change something in this one file, it'll be broken again. And it does NOT solve the main issue - after the recent changes in dlrn and the specs we can't build a repo with delorean on stable branches. I think it should be solved on the DLRN side, and an appropriate interface should be provided to use it for CI purposes. I opened an issue there: https://github.com/openstack-packages/DLRN/issues/22
But you closed it, so I suppose we will not get any solution or help for it from your side? Should we move to another packaging tool?

> Alan

--
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [TripleO] Delorean fail blocks CI for stable branches
How did it work before, then? Can you show me the patch that broke this functionality in delorean? It was around 15 Jul that jobs started to fail.
How does the master branch work, then? It also runs on a patched repo and succeeds.

I don't think we can use this workaround; each time this source file changes, all our jobs will fail again? It's not even a workaround.
Please let's stop discussing and let's solve it finally; it blocks our CI for stable patches.

I'd expect a little more involvement in this issue, and suggest that you, or anybody who understands the delorean code and specs well, try to solve it - I'll provide a complete TripleO CI dev environment, walk through every CI step, and supply any other info I can. Let's just sit down and solve it, otherwise we'll never get it working.

Thanks

On Wed, Jul 20, 2016 at 7:50 PM, Alan Pevec wrote:
> > as a quickfix in tripleo.sh you could patch dlrn and set local=True in
>
> correction, patch local=False there while running the dlrn command with --local to keep the source checkout as-is

--
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [TripleO] Delorean fail blocks CI for stable branches
On Wed, Jul 20, 2016 at 2:29 PM, Alan Pevec wrote:
> > git clone https://git.openstack.org/openstack/tripleo-heat-templates
> > cd tripleo-heat-templates/
> > git checkout -b stable/mitaka origin/stable/mitaka
>
> ^ this is manually switching to the stable source branch
>
> > sed -i -e "s%distro=.*%distro=rpm-mitaka%" projects.ini
> > sed -i -e "s%source=.*%source=stable/mitaka%" projects.ini
>
> ^ this configures dlrn to the correct combination of distro and source branches, but ...
>
> > ./venv/bin/dlrn --config-file projects.ini --head-only --package-name openstack-tripleo-heat-templates --local
>
> ^ ... --local here keeps the local checkout untouched, so you end up with the default rpm-master in the distro git checkout.
> If you remove --local, it will reset the local checkouts to the branches specified in projects.ini.

Alan, I don't want to reset local checkouts and reset branches - I need to build with these checkouts; that's the whole point of CI.

> Alan

--
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [TripleO] Delorean fail blocks CI for stable branches
Hi,

we have a problem with the delorean build of stable branches in TripleO CI[1], and it seems like an rpm specs problem. It can be reproduced easily [2].
Please help with a solution to this problem; all the info is in the bug[1].
Alan, if you think we use dlrn wrong, please point out which line in the reproduction is incorrect.

[1] https://bugs.launchpad.net/tripleo/+bug/1604039
[2] Reproducing:

sudo yum install -y createrepo git mock rpm-build yum-plugin-priorities yum-utils gcc python-virtualenv libffi-devel openssl-devel
sudo usermod -G mock -a $(id -nu)
cd /tmp/
sudo rm -rf /tmp/test
mkdir /tmp/test && cd /tmp/test
git clone https://git.openstack.org/openstack/tripleo-heat-templates
cd tripleo-heat-templates/
git checkout -b stable/mitaka origin/stable/mitaka
cd ..
git clone https://github.com/openstack-packages/delorean.git
cd delorean
mkdir -p data
sed -i -e 's%--postinstall%%' scripts/build_rpm.sh
virtualenv venv
./venv/bin/pip install -U setuptools
./venv/bin/pip install pytz
./venv/bin/pip install .
sed -i -e "s%baseurl=.*%baseurl=https://trunk.rdoproject.org/centos7-mitaka%" projects.ini
sed -i -e "s%distro=.*%distro=rpm-mitaka%" projects.ini
sed -i -e "s%source=.*%source=stable/mitaka%" projects.ini
cp -r ../tripleo-heat-templates data/openstack-tripleo-heat-templates
cd data/openstack-tripleo-heat-templates/
GITHASH=$(git rev-parse HEAD)
for BRANCH in master origin/master stable/liberty origin/stable/liberty stable/mitaka origin/stable/mitaka; do
    git checkout -b $BRANCH || git checkout $BRANCH
    git reset --hard $GITHASH
done
cd /tmp/test/delorean
./venv/bin/dlrn --config-file projects.ini --head-only --package-name openstack-tripleo-heat-templates --local

The projects.ini:

[DEFAULT]
datadir=./data
scriptsdir=./scripts
baseurl=https://trunk.rdoproject.org/centos7-mitaka
distro=rpm-mitaka
source=stable/mitaka
target=centos
smtpserver=
reponame=delorean
templatedir=./dlrn/templates
maxretries=3
pkginfo_driver=dlrn.drivers.rdoinfo.RdoInfoDriver
tags=
#tags=mitaka
rsyncdest=
rsyncport=22

[gitrepo_driver]

--
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [TripleO][CI] Tempest on periodic jobs
Hi,

raising again the question of running tempest in TripleO CI, as discussed in the last TripleO meeting.

I'd like to draw your attention to the fact that these tests - which I ran just to ensure they work - discovered real bugs, and these weren't corner cases but genuine failures of the TripleO installation. Like this one for Sahara: https://review.openstack.org/#/c/309042/
I'm sorry, I should have prepared these bugs for the meeting as proof of the testing's value.

The second issue that was a blocker before is wall time, and now, as we can see from job lengths after the HW upgrade of CI, it is not an issue anymore. We can run tempest without any fear of hitting the timeout - for the "nonha" job for sure, as it is the shortest of all.

So I'd insist on running tempest exactly on the promotion job, in order not to promote images with bugs, especially critical ones like a whole service not being available at all. The pingtest is not enough for this purpose, as we can see from the bugs above; it checks very basic things and not all services are covered. I don't think we're interested in just seeing the jobs green; we should insist on basic working functionality and the quality of what we promote. Maybe it's the influence of my previous QA roles, but I don't see any value in promoting something with bugs.

The point about CI stability: the latest issues that CI faces are not really connected to the tempest tests or CI code at all; they are bugs in the underlying projects, and whether tempest runs or not doesn't really matter in that case. These issues fail everything before any testing even starts. Indicating such issues before they leak into TripleO is a different topic and approach.

So my main point for running tempest tests on the "nonha" periodic jobs is: quality and guaranteed basic functionality of the installed overcloud services - at least that all of them are up and can accept connections - and early discovery of critical bugs that are not seen in the pingtest.

I'd like to remind everyone that we are going to run only the smoke tests, which take little time and check just the basic functionality; a sketch of such a run is below.
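For illustration, a minimal smoke run in the style we already use in CI (a sketch, assuming an already-configured tempest working directory with its virtualenv):

/tmp/tempest/tools/with_venv.sh testr init
# run only the smoke-tagged subset in parallel; "smoke" acts as a test-id filter
/tmp/tempest/tools/with_venv.sh testr run --parallel smoke

P.S. If there is interest, we can run the whole tempest set or specific sets in experimental or third-party jobs just for indication. And I mean not only tempest tests, but project scenario tests as well, for example the Heat integration tests - both for the undercloud and the overcloud.

P.P.S. Just ping me if you have any unclear points or would like to discuss it in a separate meeting; I'll give you all the required info.

Thanks
--
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev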
Re: [openstack-dev] [TripleO][CI] Elastic recheck bugs
Sorry, I missed the mentioned patches:

[1] Refresh log files for the tripleo project: https://review.openstack.org/#/c/312985/
[2] Add bug for TripleO timeouts: https://review.openstack.org/#/c/313038/

On Mon, May 9, 2016 at 8:22 PM, Sagi Shnaidman wrote:
> Hi, all
>
> I'd like to enable elastic recheck in TripleO CI and have submitted patches for refreshing the tracked logs [1] (please review) and for the timeout case [2].
> But according to Derek's comment, behind the timeout issue there could be multiple issues and bugs, so I'd like to clarify - what are the criteria for elastic recheck bugs?
>
> I thought about these markers:
>
> Nova:
> 1) "No valid host was found. There are not enough hosts"
> Network issues:
> 2) "Failed to connect to trunk.rdoproject.org" OR "fatal: The remote end hung up unexpectedly" OR "Could not resolve host:"
> Ironic:
> 3) "Error contacting Ironic server:"
> 4) "Introspection completed with errors:"
> 5) ": Introspection timeout"
> 6) "Timed out waiting for node "
> Glance:
> 7) "500 Internal Server Error: Failed to upload image"
> crm_resource:
> 8) "crm_resource for openstack "
>
> and various puppet errors.
>
> However, almost all of these messages could have different root causes, except for the network failures. An easy-to-fix bug doesn't make sense to submit there, because it will be fixed before the recheck patch is merged.
> So, could you please think about the right criteria for elastic recheck bugs?
>
> Thanks
>
> --
> Best regards
> Sagi Shnaidman

--
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [TripleO][CI] Elastic recheck bugs
Hi, all

I'd like to enable elastic recheck in TripleO CI and have submitted patches for refreshing the tracked logs [1] (please review) and for the timeout case [2].
But according to Derek's comment, behind the timeout issue there could be multiple issues and bugs, so I'd like to clarify - what are the criteria for elastic recheck bugs?

I thought about these markers:

Nova:
1) "No valid host was found. There are not enough hosts"
Network issues:
2) "Failed to connect to trunk.rdoproject.org" OR "fatal: The remote end hung up unexpectedly" OR "Could not resolve host:"
Ironic:
3) "Error contacting Ironic server:"
4) "Introspection completed with errors:"
5) ": Introspection timeout"
6) "Timed out waiting for node "
Glance:
7) "500 Internal Server Error: Failed to upload image"
crm_resource:
8) "crm_resource for openstack "

and various puppet errors.

However, almost all of these messages could have different root causes, except for the network failures. An easy-to-fix bug doesn't make sense to submit there, because it will be fixed before the recheck patch is merged.
So, could you please think about the right criteria for elastic recheck bugs? To make the discussion concrete, a sketch of what one of these markers would look like as a query is below.
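A minimal sketch of an elastic-recheck query file for marker 1), in the queries/ directory of openstack-infra/elastic-recheck (the bug number 1234567 is hypothetical - each query file is named after the launchpad bug it tracks):

cat > queries/1234567.yaml <<'EOF'
query: >-
  message:"No valid host was found. There are not enough hosts"
EOF

Thanks

--
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev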
Re: [openstack-dev] [TripleO][CI] Tempest sources for testing tripleo in CI environment
To make all the advantages and disadvantages clear, I've created a doc:
https://docs.google.com/document/d/1HmY-I8OzoJt0SzLzs79hCa1smKGltb-byrJOkKKGXII/edit?usp=sharing
Please comment.

On Sun, Apr 17, 2016 at 12:14 PM, Sagi Shnaidman wrote:
>
> Hi,
>
> John raised the issue of where we should take tempest sources from.
> I'm not sure where to take them from, so I'm bringing it to a wider discussion.
>
> Right now I use tempest from the delorean packages. In comparison with the original tempest I don't see any difference in the tests, only additional configuration scripts:
> https://github.com/openstack/tempest/compare/master...redhat-openstack:master
> It's worth mentioning that with the delorean tempest the configuration scripts match the tempest test configuration, while with the original tempest repo we would need to change and maintain them to track a very dynamic configuration.
>
> So, do we need to use pure upstream tempest from the current source and maintain the configuration scripts ourselves, or can we use the packaged one from delorean and not duplicate the effort of the test teams?
>
> Thanks
> --
> Best regards
> Sagi Shnaidman

--
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [TripleO][CI] Tempest sources for testing tripleo in CI environment
Hi,

John raised the issue of where we should take tempest sources from. I'm not sure where to take them from, so I'm bringing it to a wider discussion.

Right now I use tempest from the delorean packages. In comparison with the original tempest I don't see any difference in the tests, only additional configuration scripts:
https://github.com/openstack/tempest/compare/master...redhat-openstack:master
It's worth mentioning that with the delorean tempest the configuration scripts match the tempest test configuration, while with the original tempest repo we would need to change and maintain them to track a very dynamic configuration. Anyone can verify the delta between the two trees with the sketch below.

So, do we need to use pure upstream tempest from the current source and maintain the configuration scripts ourselves, or can we use the packaged one from delorean and not duplicate the effort of the test teams?
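A quick way to confirm that only the configuration tooling differs (a sketch; the "upstream" remote name is arbitrary):

git clone https://github.com/redhat-openstack/tempest
cd tempest
git remote add upstream https://git.openstack.org/openstack/tempest
git fetch upstream
# files changed relative to the common ancestor with upstream master
git diff --stat upstream/master...master

Thanks
--
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev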
Re: [openstack-dev] [TripleO] [CI] Tempest configuration in Tripleo CI jobs
Hi, Andrey

I've checked this option - using rally to configure and run the tempest tests. Although it looks like a great choice, unfortunately a few issues and bugs make it unusable for us right now. For example, it cannot work with the current public networks and cannot create new ones, so everything related to networking will fail. As I understand it, this bug has remained unsolved for a long time: https://bugs.launchpad.net/rally/+bug/1550848
Also, it offers no way to customize configuration options when generating the tempest configuration, like configure_tempest.py does, where you just list them on the command line. With rally you need to generate the tempest config file and then manually edit it to customize it (for example the tempest log path in the DEFAULT section). Adding an "interface" for tempest configuration would be a great feature for rally, IMHO.
I think it's a cool approach and we should definitely take it into account, but right now it looks pretty raw and not stable enough to use in gate jobs. Anyway, thank you for pointing out this great tool.

Thanks

On Fri, Apr 8, 2016 at 2:33 PM, Andrey Kurilin wrote:
> Hi Sagi,
>
> On Thu, Apr 7, 2016 at 5:56 PM, Sagi Shnaidman wrote:
>> Hi, all
>>
>> I'd like to discuss the topic of how we configure tempest in CI jobs for TripleO.
>> I currently have two patches:
>> support for tempest: https://review.openstack.org/#/c/295844/
>> actual run of tests: https://review.openstack.org/#/c/297038/
>>
>> Right now there is no upstream tool to configure tempest, so everybody uses their own tools.
>
> You are wrong. There is Rally in upstream:)
> The basic and most widely used Rally component is Task, which provides a benchmarking and testing tool.
> But Rally also has a Verification component (here <https://www.mirantis.com/blog/rally-openstack-tempest-testing-made-simpler/> you can find a somewhat outdated blog post that introduces the Verification component).
> It can:
>
> 1. Configure Tempest based on the public OpenStack API.
> An example of a config from our gates: http://logs.openstack.org/58/285758/5/check/gate-rally-dsvm-verify-full/eabe2ff/rally-verify/5_verify_showconfig.txt.gz (empty options mean that rally will check these resources while running tempest and create them if necessary)
>
> 2. Launch a set of tests, tests matching a regexp, or a list of tests. Also, it supports an x-fail mechanism out of the box.
> An example of a full run based on the config posted above: http://logs.openstack.org/58/285758/5/check/gate-rally-dsvm-verify-full/eabe2ff/rally-verify/7_verify_results.html.gz
>
> 3. Compare results.
> http://logs.openstack.org/58/285758/5/check/gate-rally-dsvm-verify-light/d806b91/rally-verify/17_verify_compare_--uuid-1_9fe72ea8-bd5c-45eb-9a37-5e674ea5e5d4_--uuid-2_315843d4-40b8-46f2-aa69-fb3d5d463379.html.gz
> It is not as good-looking as other rally reports, but we will fix that someday:)
>
> To summarize:
> - Rally is an upstream tool, which was accepted into the Big Tent.
> - One instance of Rally can manage and run tempest for any number of clouds.
> - The Rally Verification component is tested in gates for every new patch. Also, it supports the different APIs of the services.
> - You can install, configure, launch, store results, and display results in different formats.
>
> Btw, we are planning to refactor the verification component (there is a spec on review with several +2s), so you will be able to launch whatever subunit-based tools you want via Rally, which will simplify its usage.
>> However, it's planned, and David Mellado is working on it AFAIK.
>> Until then, everybody uses their own tools for tempest configuration.
>> I'd review two of them:
>> 1) The puppet configuration that is used in the puppet modules CI
>> 2) Using the config_tempest.py script from https://github.com/redhat-openstack/tempest/blob/master/tools/config_tempest.py
>>
>> Unfortunately there is no ready-made puppet module or script that configures tempest; you need to create your own.
>>
>> On the other hand, the config_tempest.py script provides full configuration, support for tempest-deployer-input.conf, and the possibility to add any config options on the command line when running it:
>>
>> python config_tempest.py \
>>   --out etc/tempest.conf \
>>   --debug \
>>   --create \
>>   --deployer-input ~/tempest-deployer-input.conf \
>>   identity.uri $OS_AUTH_URL \
>>   compute.allow_tenant_isolation true \
>>   identity.admin_password $OS_PASSWORD \
>>   compute.build_timeout 500 \
>>   compute.image_ssh_user cirros
[openstack-dev] [TripleO] [CI] Tempest configuration in Tripleo CI jobs
=> '/tmp/openstack/tempest',
}

But that's not enough; you also need to make some workarounds and additional configuration, for example:

tempest_config { 'object-storage/operator_role':
  value => 'SwiftOperator',
  path  => "${tempest_clone_path}/etc/tempest.conf",
}
}

After this, run puppet on the controller node:

sudo puppet apply --verbose --debug --detailed-exitcodes -e "include ::testt" | tee ~/puppet_run.log

After everything is finished, you need to copy the tempest folder to your node:

scp -r heat-admin@${CONTROLLER}:/tmp/openstack /tmp/

After this, run testr init within this directory and run the tests:

/tmp/tempest/tools/with_venv.sh testr init
/tmp/tempest/tools/with_venv.sh testr run

There are still holes in this configuration, and you would most likely fix them with more workarounds and tempest_config runs - there are still a few skipped tests - so the configuration is not as complete as it would be with config_tempest.py. You also have no way to add custom configuration when running the manifest; for each config change you need to change the manifest itself, which makes maintenance harder and more complex.

I would say the conclusion is quite obvious to me: it's easier even to write tempest.conf manually from scratch, or from a simple template with 5 lines of bash (sketched below), than to use puppet for things it is completely unsuited to.
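Roughly, the "simple template plus 5 lines of bash" alternative could look like this (a sketch; the template file name, the @TOKEN@ placeholders and the exact set of substituted options are illustrative):

cp tempest.conf.template etc/tempest.conf
# substitute the deployment-specific values into the template
sed -i -e "s|@AUTH_URL@|${OS_AUTH_URL}|" \
       -e "s|@ADMIN_PASSWORD@|${OS_PASSWORD}|" \
       -e "s|@IMAGE_SSH_USER@|cirros|" \
       etc/tempest.conf

P.S. In this script I used ideas from the puppet-openstack-integration and packstack projects.

[1] https://review.openstack.org/#/c/295844/
[2] https://git.openstack.org/openstack-infra/tripleo-ci

--
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev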