[openstack-dev] [tripleo] Proposing Enrique Llorente Pastora as a core reviewer for TripleO

2018-11-15 Thread Sagi Shnaidman
Hi,
I'd like to propose Quique (@quiquell) as a core reviewer for TripleO.
Quique is actively involved in improvements and development of TripleO and
TripleO CI. He also helps in other projects including but not limited to
Infrastructure.
He shows a very good understanding of how TripleO and CI work, and I'd like
to suggest him as a core reviewer of TripleO for CI-related code.

Please vote!
My +1 is here :)

Thanks
-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo] shutting down 3rd party TripleO CI for measurements

2018-11-06 Thread Sagi Shnaidman
We have measured the results and would like to shut down the check jobs in
RDO cloud CI today. Please let us know if you have any objections.

Thanks

On Thu, Nov 1, 2018 at 12:14 AM Wesley Hayutin  wrote:

> Greetings,
>
> The TripleO-CI team would like to consider shutting down all the third
> party check jobs running against TripleO projects in order to measure
> results with and without load on the cloud for some amount of time.  I
> suspect we would want to shut things down for roughly 24-48 hours.
>
> If there are any strong objections please let us know.
> Thank you
> --
>
> Wes Hayutin
>
> Associate Manager
>
> Red Hat
>
> <https://www.redhat.com/>
>
> whayu...@redhat.com  T: +1919 <+19197544114> 4232509  IRC: weshay
> <https://red.ht/sig>
>
> View my calendar and check my availability for meetings HERE
> <https://calendar.google.com/calendar/b/1/embed?src=whayu...@redhat.com&ctz=America/New_York>
>


-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo][ci][metrics] Stucked in the middle of work because of RDO CI

2018-07-31 Thread Sagi Shnaidman
Hi, Martin

I see the master OVB jobs are passing now [1]; please recheck.

[1] http://cistatus.tripleo.org/

On Tue, Jul 31, 2018 at 12:24 PM, Martin Magr  wrote:

> Greetings guys,
>
>   it is pretty obvious that the RDO CI jobs in the TripleO projects are broken
> [0]. Once the Zuul CI jobs pass, would it be possible to have the AMQP/collectd
> patches ([1],[2],[3]) merged, please, even though the RDO CI jobs show a
> negative result? Half of the patches for this feature are merged and the other
> half is stuck in this situation, where nobody reviews these patches because
> there is a red -1. Those patches have passed the Zuul jobs several times
> already and were manually tested too.
>
> Thanks in advance for consideration of this situation,
> Martin
>
> [0] https://trello.com/c/hkvfxAdX/667-cixtripleoci-rdo-software-
> factory-3rd-party-jobs-failing-due-to-instance-nodefailure
> [1] https://review.openstack.org/#/c/578749
> [2] https://review.openstack.org/#/c/576057/
> [3] https://review.openstack.org/#/c/572312/
>
> --
> Martin Mágr
> Senior Software Engineer
> Red Hat Czech
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>


-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo][ci][infra] Quickstart Branching

2018-05-23 Thread Sagi Shnaidman
;branching" repos but without
maintenance nightmare.

Thanks

> Thanks,
> -Alex
>
> [0] http://git.openstack.org/cgit/openstack/ansible-role-
> container-registry/
> [1] http://git.openstack.org/cgit/openstack/ansible-role-redhat-
> subscription/
> [2] http://git.openstack.org/cgit/openstack/ansible-role-tripleo-keystone/
> [3] http://git.openstack.org/cgit/openstack/puppet-openstacklib/
> [4] https://review.openstack.org/#/c/565856/
> [5] https://review.openstack.org/#/c/569830
>
> > Thanks
> >
> >
> >
> > On Wed, May 23, 2018 at 7:04 PM, Alex Schultz 
> wrote:
> >>
> >> On Wed, May 23, 2018 at 8:30 AM, Sagi Shnaidman 
> >> wrote:
> >> > Hi, Sergii
> >> >
> >> > thanks for the question. It's not first time that this topic is raised
> >> > and
> >> > from first view it could seem that branching would help to that sort
> of
> >> > issues.
> >> >
> >> > Although it's not the case. Tripleo-quickstart(-extras) is part of CI
> >> > code,
> >> > as well as tripleo-ci repo which have never been branched. The reason
> >> > for
> >> > that is relative small impact on CI code from product branching. Think
> >> > about
> >> > backport almost *every* patch to oooq and extras to all supported
> >> > branches,
> >> > down to newton at least. This will be a really *huge* price and non
> >> > reasonable work. Just think about active maintenance of 3-4 versions
> of
> >> > CI
> >> > code in each of 3 repositories. It will take all time of CI team with
> >> > almost
> >> > zero value of this work.
> >> >
> >>
> >> So I'm not sure I completely agree with this assessment as there is a
> >> price paid for every {%if release in [...]%} that we have to carry in
> >> oooq{,-extras}.  These go away if we branch because we don't have to
> >> worry about breaking previous releases or current release (which may
> >> or may not actually have CI results).
> >>
> >> > What regards patch you listed, we would have backport this change to
> >> > *every*
> >> > branch, and it wouldn't really help to avoid the issue. The source of
> >> > problem is not branchless repo here.
> >> >
> >>
> >> No we shouldn't be backporting every change.  The logic in oooq-extras
> >> should be version specific and if we're changing an interface in
> >> tripleo in a breaking fashion we're doing it wrong in tripleo. If
> >> we're backporting things to work around tripleo issues, we're doing it
> >> wrong in quickstart.
> >>
> >> > Regarding catching such issues and Bogdans point, that's right we
> added
> >> > a
> >> > few jobs to catch such issues in the future and prevent breakages,
> and a
> >> > few
> >> > running jobs is reasonable price to keep configuration working in all
> >> > branches. Comparing to maintenance nightmare with branches of CI code,
> >> > it's
> >> > really a *zero* price.
> >> >
> >>
> >> Nothing is free. If there's a high maintenance cost, we haven't
> >> properly identified the optimal way to separate functionality between
> >> tripleo/quickstart.  I have repeatedly said that the provisioning
> >> parts of quickstart should be separate because those aren't tied to a
> >> tripleo version and this along with the scenario configs should be the
> >> only unbranched repo we have. Any roles related to how to
> >> configure/work with tripleo should be branched and tied to a stable
> >> branch of tripleo. This would actually be beneficial for tripleo as
> >> well because then we can see when we are introducing backwards
> >> incompatible changes.
> >>
> >> Thanks,
> >> -Alex
> >>
> >> > Thanks
> >> >
> >> >
> >> > On Wed, May 23, 2018 at 3:43 PM, Sergii Golovatiuk <
> sgolo...@redhat.com>
> >> > wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> Looking at [1], I am thinking about the price we paid for not
> >> >> branching tripleo-quickstart. Can we discuss the options to prevent
> >> >> the issues such as [1]? Thank you in advance.
> >> >>
> >> >> [1] https://review.openstack.org/#/c/569830/4

Re: [openstack-dev] [tripleo][ci][infra] Quickstart Branching

2018-05-23 Thread Sagi Shnaidman
Alex,

the problem is that you're focusing mostly on release-specific code like
featuresets and some scripts. But tripleo-quickstart(-extras) and tripleo-ci
are much, *much* more than a set of featuresets. Only 10% of the code may be
related to releases and branches, while the other 90% is completely
independent and not related to releases.

So for that 90% of the code we really WOULD need to backport every change.
Take for example the latest patch to extras:
https://review.openstack.org/#/c/570167/, which fixes the reproducer. If
oooq-extras were branched, we would need to backport this fix to each and
every branch, and the same goes for the rest of that 90% of the code, which
is complete nonsense. Just to avoid the "{% if release %}" construct, should
we block the whole work of the CI team and make the CI code absolutely
unmaintainable?

Some release-related templates we recently moved from tripleo-ci to the THT
repo, like the scenarios, overcloud templates, etc. If we discover other
things in oooq that could be moved to the branched THT repo, I'd be only
happy with that.

Sometimes it can be hard to maintain one file in the extras templates with
different logic per release, as we have in the tempest configuration for
example. The solution is to create a few release-specific templates and use
the one that matches the current branch. That doesn't affect the other 90% of
the code and is still a "branch-like" approach. But I haven't seen other
scripts that are so release dependent; if we find some, we can do the same.
For now I see the "{% if release %}" construct working very well.

I still don't see any advantage in branching the CI code, except slightly
nicer Jinja templates without "{% if release %}", but the amount of
disadvantages is so huge that it would literally block all current work in CI.

Thanks



On Wed, May 23, 2018 at 7:04 PM, Alex Schultz  wrote:

> On Wed, May 23, 2018 at 8:30 AM, Sagi Shnaidman 
> wrote:
> > Hi, Sergii
> >
> > thanks for the question. It's not first time that this topic is raised
> and
> > from first view it could seem that branching would help to that sort of
> > issues.
> >
> > Although it's not the case. Tripleo-quickstart(-extras) is part of CI
> code,
> > as well as tripleo-ci repo which have never been branched. The reason for
> > that is relative small impact on CI code from product branching. Think
> about
> > backport almost *every* patch to oooq and extras to all supported
> branches,
> > down to newton at least. This will be a really *huge* price and non
> > reasonable work. Just think about active maintenance of 3-4 versions of
> CI
> > code in each of 3 repositories. It will take all time of CI team with
> almost
> > zero value of this work.
> >
>
> So I'm not sure I completely agree with this assessment as there is a
> price paid for every {%if release in [...]%} that we have to carry in
> oooq{,-extras}.  These go away if we branch because we don't have to
> worry about breaking previous releases or current release (which may
> or may not actually have CI results).
>
> > What regards patch you listed, we would have backport this change to
> *every*
> > branch, and it wouldn't really help to avoid the issue. The source of
> > problem is not branchless repo here.
> >
>
> No we shouldn't be backporting every change.  The logic in oooq-extras
> should be version specific and if we're changing an interface in
> tripleo in a breaking fashion we're doing it wrong in tripleo. If
> we're backporting things to work around tripleo issues, we're doing it
> wrong in quickstart.
>
> > Regarding catching such issues and Bogdans point, that's right we added a
> > few jobs to catch such issues in the future and prevent breakages, and a
> few
> > running jobs is reasonable price to keep configuration working in all
> > branches. Comparing to maintenance nightmare with branches of CI code,
> it's
> > really a *zero* price.
> >
>
> Nothing is free. If there's a high maintenance cost, we haven't
> properly identified the optimal way to separate functionality between
> tripleo/quickstart.  I have repeatedly said that the provisioning
> parts of quickstart should be separate because those aren't tied to a
> tripleo version and this along with the scenario configs should be the
> only unbranched repo we have. Any roles related to how to
> configure/work with tripleo should be branched and tied to a stable
> branch of tripleo. This would actually be beneficial for tripleo as
> well because then we can see when we are introducing backwards
> incompatible changes.
>
> Thanks,

Re: [openstack-dev] [tripleo][ci][infra] Quickstart Branching

2018-05-23 Thread Sagi Shnaidman
Hi, Sergii

thanks for the question. It's not the first time this topic has been raised,
and at first glance it could seem that branching would help with that sort of
issue.

However, that's not the case. Tripleo-quickstart(-extras) is part of the CI
code, as is the tripleo-ci repo, which has never been branched. The reason
for that is the relatively small impact of product branching on the CI code.
Think about backporting almost *every* patch to oooq and extras to all
supported branches, down to at least Newton. That would be a really *huge*
price and unreasonable work. Just think about actively maintaining 3-4
versions of the CI code in each of 3 repositories. It would take all of the
CI team's time, with almost zero value from this work.

As for the patch you listed, we would have had to backport this change to
*every* branch, and it wouldn't really have helped avoid the issue. The
source of the problem here is not the branchless repo.

Regarding catching such issues, and Bogdan's point: that's right, we added a
few jobs to catch such issues in the future and prevent breakages, and a few
running jobs is a reasonable price to keep the configuration working on all
branches. Compared to the maintenance nightmare of branching the CI code,
it's really a *zero* price.

Thanks


On Wed, May 23, 2018 at 3:43 PM, Sergii Golovatiuk 
wrote:

> Hi,
>
> Looking at [1], I am thinking about the price we paid for not
> branching tripleo-quickstart. Can we discuss the options to prevent
> the issues such as [1]? Thank you in advance.
>
> [1] https://review.openstack.org/#/c/569830/4
>
> --
> Best Regards,
> Sergii Golovatiuk
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>



-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [ci][infra][tripleo] Multi-staged check pipelines for Zuul v3 proposal

2018-05-15 Thread Sagi Shnaidman
>>> than forcing an iterative workflow where they have to fix all the
>>> whitespace issues before the CI system will tell them which actual tests
>>> broke.
>>>
>>> -Jim
>>>
>>
>> I proposed a few zuul dependencies [0], [1] to tripleo CI pipelines for
>> undercloud deployments vs upgrades testing (and some more). Given that
>> those undercloud jobs have not so high fail rates though, I think Emilien
>> is right in his comments and those would buy us nothing.
>>
>>  From the other side, what do you think folks of making the
>> tripleo-ci-centos-7-3nodes-multinode depend on
>> tripleo-ci-centos-7-containers-multinode [2]? The former seems quite
>> faily and long running, and is non-voting. It deploys (see featuresets
>> configs [3]*) a 3 nodes in HA fashion. And it seems almost never passing,
>> when the containers-multinode fails - see the CI stats page [4]. I've found
>> only a 2 cases there for the otherwise situation, when containers-multinode
>> fails, but 3nodes-multinode passes. So cutting off those future failures
>> via the dependency added, *would* buy us something and allow other jobs to
>> wait less to commence, by a reasonable price of somewhat extended time of
>> the main zuul pipeline. I think it makes sense and that extended CI time
>> will not overhead the RDO CI execution times so much to become a problem.
>> WDYT?
>>
>> [0] https://review.openstack.org/#/c/568275/
>> [1] https://review.openstack.org/#/c/568278/
>> [2] https://review.openstack.org/#/c/568326/
>> [3] https://docs.openstack.org/tripleo-quickstart/latest/feature
>> -configuration.html
>> [4] http://tripleo.org/cistatus.html
>>
>> * ignore the column 1, it's obsolete, all CI jobs now using configs
>> download AFAICT...
>>
>>
>
> --
> Best regards,
> Bogdan Dobrelya,
> Irc #bogdando
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>



-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [ci][infra][tripleo] Multi-staged check pipelines for Zuul v3 proposal

2018-05-14 Thread Sagi Shnaidman
Hi, Bogdan

I like the idea with the undercloud job. Actually, if the undercloud fails,
I'd stop all other jobs, because it doesn't make sense to run them; seeing
the same failure in 10 jobs doesn't add much. So maybe adding the undercloud
job as a dependency for all multinode jobs would be a great idea. I think
it's also worth checking how long it would delay jobs. Will all jobs wait
until the undercloud job has finished? Or will they be aborted when the
undercloud job fails?
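
For illustration only, with Zuul v3 such a dependency could be expressed
roughly like this in the project's pipeline definition (the job names below
are just examples, not the exact layout change we would merge):

  - project:
      check:
        jobs:
          - tripleo-ci-centos-7-undercloud-containers
          - tripleo-ci-centos-7-containers-multinode:
              dependencies:
                - tripleo-ci-centos-7-undercloud-containers
          - tripleo-ci-centos-7-scenario001-multinode-oooq-container:
              dependencies:
                - tripleo-ci-centos-7-undercloud-containers

As far as I understand, with a setup like this Zuul would simply not start
the dependent jobs when the dependency fails, rather than aborting jobs that
are already running.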

However, I'm very sceptical about the multinode containers and scenario jobs;
they can fail for very different reasons, like race conditions in the product
or infra issues. Skipping some of them would lead to more rechecks from devs
trying to discover all the problems one by one, which would delay the
development process significantly.

Thanks


On Mon, May 14, 2018 at 7:15 PM, Bogdan Dobrelya 
wrote:

> An update for your review please folks
>
> Bogdan Dobrelya  writes:
>>
>> Hello.
>>> As Zuul documentation [0] explains, the names "check", "gate", and
>>> "post"  may be altered for more advanced pipelines. Is it doable to
>>> introduce, for particular openstack projects, multiple check
>>> stages/steps as check-1, check-2 and so on? And is it possible to make
>>> the consequent steps reusing environments from the previous steps
>>> finished with?
>>>
>>> Narrowing down to tripleo CI scope, the problem I'd want we to solve
>>> with this "virtual RFE", and using such multi-staged check pipelines,
>>> is reducing (ideally, de-duplicating) some of the common steps for
>>> existing CI jobs.
>>>
>>
>> What you're describing sounds more like a job graph within a pipeline.
>> See: https://docs.openstack.org/infra/zuul/user/config.html#attr-
>> job.dependencies
>> for how to configure a job to run only after another job has completed.
>> There is also a facility to pass data between such jobs.
>>
>> ... (skipped) ...
>>
>> Creating a job graph to have one job use the results of the previous job
>> can make sense in a lot of cases.  It doesn't always save *time*
>> however.
>>
>> It's worth noting that in OpenStack's Zuul, we have made an explicit
>> choice not to have long-running integration jobs depend on shorter pep8
>> or tox jobs, and that's because we value developer time more than CPU
>> time.  We would rather run all of the tests and return all of the
>> results so a developer can fix all of the errors as quickly as possible,
>> rather than forcing an iterative workflow where they have to fix all the
>> whitespace issues before the CI system will tell them which actual tests
>> broke.
>>
>> -Jim
>>
>
> I proposed a few zuul dependencies [0], [1] to tripleo CI pipelines for
> undercloud deployments vs upgrades testing (and some more). Given that
> those undercloud jobs have not so high fail rates though, I think Emilien
> is right in his comments and those would buy us nothing.
>
> From the other side, what do you think folks of making the
> tripleo-ci-centos-7-3nodes-multinode depend on
> tripleo-ci-centos-7-containers-multinode [2]? The former seems quite
> faily and long running, and is non-voting. It deploys (see featuresets
> configs [3]*) a 3 nodes in HA fashion. And it seems almost never passing,
> when the containers-multinode fails - see the CI stats page [4]. I've found
> only a 2 cases there for the otherwise situation, when containers-multinode
> fails, but 3nodes-multinode passes. So cutting off those future failures
> via the dependency added, *would* buy us something and allow other jobs to
> wait less to commence, by a reasonable price of somewhat extended time of
> the main zuul pipeline. I think it makes sense and that extended CI time
> will not overhead the RDO CI execution times so much to become a problem.
> WDYT?
>
> [0] https://review.openstack.org/#/c/568275/
> [1] https://review.openstack.org/#/c/568278/
> [2] https://review.openstack.org/#/c/568326/
> [3] https://docs.openstack.org/tripleo-quickstart/latest/feature
> -configuration.html
> [4] http://tripleo.org/cistatus.html
>
> * ignore the column 1, it's obsolete, all CI jobs now using configs
> download AFAICT...
>
> --
> Best regards,
> Bogdan Dobrelya,
> Irc #bogdando
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>



-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO][CI] Which network templates to use in CI (with and without net isolation)?

2018-01-04 Thread Sagi Shnaidman
Hi, all

we now have network templates in the tripleo-ci repo[1], and we'd like to
move them to the tht repo[2] and use them from there. We also have default
templates defined in the overcloud-deploy role[3].
So the question is: which templates should we use and how should we configure
them?
One option for configuration is to set the network args (incl. isolation) in
the overcloud-deploy role[3] depending on other features (like docker, ipv6,
etc).
The other is to set them in the featureset[4] files for each job; a sketch of
that option is below.
Another question is which network templates we want to gate in CI, and
whether they should be the same ones we have by default in
tripleo-quickstart-extras.
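
As a rough sketch of the featureset option (the keys follow the style of the
featureset001 settings referenced in [4], but treat the exact variable names
and values as illustrative rather than a final proposal):

  # config/general_config/featureset0XX.yml (hypothetical excerpt)
  # enable network isolation for the overcloud deployed by this job
  network_isolation: true
  # which isolation layout the job should exercise
  network_isolation_type: single-nic-vlans

The overcloud-deploy role could then pick the matching environment files
based on these flags instead of hard-coding them per feature.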

We have a few patches from James (@slagle) to address this topic[5] and
from Arx for this issue[6].

Please feel free to share your thoughts on which network templates should be
tested in CI, and where.

Thanks

[1]
https://github.com/openstack-infra/tripleo-ci/tree/821d84f34c851a79495f0205ad3c8dac928c286f/test-environments

[2]
https://github.com/openstack/tripleo-heat-templates/tree/master/ci/environments/network

[3]
https://github.com/openstack/tripleo-quickstart-extras/blob/master/roles/overcloud-deploy/tasks/pre-deploy.yml#L21-L51

[4]
https://github.com/openstack/tripleo-quickstart/blob/cf793bbb8368f89cd28214fe21adca2df48ef7f3/config/general_config/featureset001.yml#L26-L28

[5] https://review.openstack.org/#/c/531224/
https://review.openstack.org/#/c/525331
https://review.openstack.org/#/c/531221

[6] https://review.openstack.org/#/c/512225/

-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo] Reminder about bug priority

2017-12-11 Thread Sagi Shnaidman
On Mon, Dec 11, 2017 at 5:42 PM, Emilien Macchi  wrote:

> A lot of bugs are set Critical or High.
> Just a few reminder on how we like to use the criteria in TripleO:
>
> - "Critical" should be used when a bug makes a basic deployment
> impossible. For example, a bug that affects all CI gate jobs is
> critical. Something that any deployment can hit is critical. A bug in
> the master promotion pipeline can be set as Critical.
>

I think any bug that makes the current jobs fail completely, or fail fairly
often, should be Critical:
1) all voting jobs in TripleO CI
2) OVB jobs
3) promotion blockers for any of the releases, both stable and master

As regards releases, there are some stages in the release workflow when the
master-1 release is more important than any other. I think we should be
flexible there and also allow setting Critical for a bug in the master-1
release, at least when it has that priority.

Another candidate for Critical might be a bug that blocks developers' work
(for example when using the tripleo tools), even if CI passes.


> - "High", "Medium" and "Low" should be used for other bug reports,
> where High is an important bug that you can hit but it won't block a
> deployment. "High" can also be used for stable branch promotion
> pipelines (pike, ocata, newton).
>
> Please don't use Critical for all the bugs, otherwise we end up with
> half of our bugs set at Critical, while they're rather High or Medium.
>
> If any doubt, please ask on #tripleo.
>
> Thanks,
> --
> Emilien Macchi
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>



-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo] Proposing Wesley Hayutin core on TripleO CI

2017-12-11 Thread Sagi Shnaidman
+1

On Wed, Dec 6, 2017 at 5:45 PM, Emilien Macchi  wrote:

> Team,
>
> Wes has been consistently and heavily involved in TripleO CI work.
> He has a very well understanding on how tripleo-quickstart and
> tripleo-quickstart-extras work, his number and quality of reviews are
> excellent so far. His experience with testing TripleO is more than
> valuable.
> Also, he's always here to help on TripleO CI issues or just
> improvements (he's the guy filling bugs on a Saturday evening).
> I think he would be a good addition to the TripleO CI core team
> (tripleo-ci, t-q and t-q-e repos for now).
> Anyway, thanks a lot Wes for your hard work on CI, I think it's time
> to move on and get you +2 ;-)
>
> As usual, it's open for voting, feel free to bring any feedback.
> Thanks everyone,
> --
> Emilien Macchi
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>



-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] Proposing Ronelle Landy for Tripleo-Quickstart/Extras/CI core

2017-12-02 Thread Sagi Shnaidman
+1

On Wed, Nov 29, 2017 at 9:34 PM, John Trowbridge  wrote:

> I would like to propose Ronelle be given +2 for the above repos. She has
> been a solid contributor to tripleo-quickstart and extras almost since the
> beginning. She has solid review numbers, but more importantly has always
> done quality reviews. She also has been working in the very intense rover
> role on the CI squad in the past CI sprint, and has done very well in that
> role.
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>


-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO][infra][CI] Moving OVB jobs from RH1 cloud to RDO cloud, plan

2017-10-23 Thread Sagi Shnaidman
Hi,

as you know, we are preparing the transition of all OVB jobs from the RH1
cloud to the RDO cloud, as well as a few of the long multinode upgrade jobs.
We have prepared the transition workflow below; please feel free to comment.


1) We run one job (ovb-ha-oooq) on every patch in the following repos: oooq,
oooq-extras, tripleo-ci. We run the rest of the OVB jobs (containers and
fs024) as experimental in the RDO cloud for the following repos: oooq,
oooq-extras, tripleo-ci, tht, tripleo-common. This should cover most of our
testing. This step is completed.

Currently it's blocked by a Newton bug in the RDO cloud:
https://bugs.launchpad.net/heat/+bug/1626256 , where the cloud release
doesn't contain its fix: https://review.openstack.org/#/c/501592/ . On the
other hand, the upgrade to the Ocata release (which would also solve this
issue) is blocked by the bug: https://bugs.launchpad.net/tripleo/+bug/1724328
So we are in a blocked state with the move right now.

Next steps:

2) We solve all issues with the job that runs on every patch (ovb-ha-oooq) so
that it passes (or fails with exactly the same results as on RH1) for 2
regular working days (not the weekend).
3) During this time we trigger the experimental jobs on various patches in
tht and tripleo-common and solve all issues for the experimental jobs, so
that all OVB jobs pass.
4) All this time we need to monitor the resources in the openstack-nodepool
tenant (maybe with help from RHOPS) and make sure it has the capacity to run
the configured jobs.
5) We set the ovb-ha-oooq job to run on every patch in all places where it
runs on RH1 (in parallel with the existing RH1 job). We monitor the RDO cloud
to make sure it doesn't fail and still has resources - 1.5 working days.
6) We add the featureset024 OVB job to every patch where it runs on RH1. We
continue to monitor the RDO cloud - 1.5 working days.
7) We add the last containers OVB job to all patches where it runs on RH1. We
continue to monitor the RDO cloud - 2 days.
8) If everything is OK in all the previous points and the RDO cloud still
performs well, we remove the OVB jobs from the RH1 configuration and make
them experimental.
9) During the next few days we monitor the OVB jobs and run the RH1 OVB jobs
as experimental, to check whether we get the same results (or better :) ).
10) The OVB jobs on the RH1 cloud stay in the TripleO experimental pipeline
for the next month or two.

-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Heat] using convergence_engine to deploy overcloud stack

2017-08-09 Thread Sagi Shnaidman
Hi there

On Wed, Aug 9, 2017 at 1:49 PM, Rabi Mishra  wrote:

> On Wed, Aug 9, 2017 at 1:41 PM, Smigielski, Radoslaw (Nokia - IE) <
> radoslaw.smigiel...@nokia.com> wrote:
>
>> Hi there!
>>
>>I have a question about heat "convergence_engine" option, it's
>> present in heat config since quite a long time but still not enabled.
>>
> Well, convergence is enabled by default in heat since newton. However,
> Tripleo does not use it yet, as convergence engine memory usage is higher
> than that of legacy engine.
>
>
TripleO CI has a heat-convergence job running on heat patches in the
experimental pipeline [1]. It has been running there for at least the last
year, and no high memory usage was detected in the last few months while I
watched it.

[1]
https://github.com/openstack-infra/project-config/blob/master/zuul/layout.yaml#L10407

Thanks

-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo] CI Squad Meeting Summary (week 26) - job renaming discussion

2017-07-04 Thread Sagi Shnaidman
Every job also references a topology (nodes) file, like "1cont_1comp" for
example, and in general there could be different jobs that run the same
featureset024 but with different topologies. So I think the topology part of
the name is necessary too.



On Tue, Jul 4, 2017 at 8:45 PM, Emilien Macchi  wrote:

> On Fri, Jun 30, 2017 at 11:06 AM, Jiří Stránský  wrote:
> > On 30.6.2017 15:04, Attila Darazs wrote:
> >>
> >> = Renaming the CI jobs =
> >>
> >> When we started the job transition to Quickstart, we introduced the
> >> concept of featuresets[1] that define a certain combination of features
> >> for each job.
> >>
> >> This seemed to be a sensible solution, as it's not practical to mention
> >> all the individual features in the job name, and short names can be
> >> misleading (for example ovb-ha job does so much more than tests HA).
> >>
> >> We decided to keep the original names for these jobs to simplify the
> >> transition, but the plan is to rename them to something that will help
> >> to reproduce the jobs locally with Quickstart.
> >>
> >> The proposed naming scheme will be the same as the one we're now using
> >> for job type in project-config:
> >>
> >> gate-tripleo-ci-centos-7-{node-config}-{featureset-config}
> >>
> >> So for example the current "gate-tripleo-ci-centos-7-ovb-ha-oooq" job
> >> would look like "gate-tripleo-ci-centos-7-ovb-
> 3ctlr_1comp-featureset001"
> >
> >
> > I'd prefer to keep the job names somewhat descriptive... If i had to pick
> > one or the other, i'd rather stick with the current way, as at least for
> me
> > it's higher priority to see descriptive names in CI results than saving
> time
> > on finding featureset file mapping when needing to reproduce a job
> result.
> > My eyes scan probably more than a hundred of individual CI job results
> > daily, but i only need to reproduce 0 or 1 job failures locally usually.
> >
> > Alternatively, could we rename "featureset001.yaml" into
> > "featureset-ovb-ha.yaml" and then have i guess something like
> > "gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-ovb-ha" for the job name?
> Maybe
> > "ovb" would be there twice, in case it's needed both in node config and
> > featureset parts of the job name...
>
> I'm in favor of keeping jobnames as simple as possible.
> To me, we should use something like gate-tripleo-ci-centos-7-ovb-
> featureset001
>
> So we know:
>
> - it's a tripleo gate job running on centos7
> - it's OVB and not multinode
> - it's deploying featureset001
>
> Please don't mention HA or ceph or other features in the name because
> it would be too rigid in case of featureset would change the coverage.
>
> Note: if we go that way, we also might want to rename scenario jobs
> and use featureset in the job name.
> Note2: if we rename jobs, we need to keep doing good work on
> documenting what featureset deploy and make
> https://github.com/openstack/tripleo-quickstart/blob/
> master/doc/source/feature-configuration.rst
> more visible probably.
>
> My 2 cents.
>
> > Or we could pull the mapping between job name and job type in an
> automated
> > way from project-config.
> >
> > (Will be on PTO for a week from now, apologies if i don't respond timely
> > here.)
> >
> >
> > Have a good day,
> >
> > Jirka
> >
> >>
> >> The advantage of this will be that it will be easy to reproduce a gate
> >> job on a local virthost by typing something like:
> >>
> >> ./quickstart.sh --release tripleo-ci/master \
> >>   --nodes config/nodes/3ctlr_1comp.yml \
> >>   --config config/general_config/featureset001.yml \
> >>   
> >>
> >> Please let us know if this method sounds like a step forward.
> >
> >
> > 
> __
> > OpenStack Development Mailing List (not for usage questions)
> > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:
> unsubscribe
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
>
> --
> Emilien Macchi
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>



-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo] CI Squad Meeting Summary (week 26) - job renaming discussion

2017-07-03 Thread Sagi Shnaidman
Hi,
I think job names should be meaningful too. We can include something like
"featureset024" or even "-f024" in the job name to make reproducing easy, or
just make another table mapping featuresets to job names, like the one we
have for file names and features.
gate-tripleo-ci-centos-7-ovb-f024-ha-cont-iso-bonds-ipv6-1ctrl_1comp_1ceph
seems not too long and gives a clue about what runs in the job without
looking at the job configuration, also for people outside TripleO. Our jobs
run not only in TripleO CI, but also on neutron, nova, etc.

Thanks



On Fri, Jun 30, 2017 at 6:06 PM, Jiří Stránský  wrote:

> On 30.6.2017 15:04, Attila Darazs wrote:
>
>> = Renaming the CI jobs =
>>
>> When we started the job transition to Quickstart, we introduced the
>> concept of featuresets[1] that define a certain combination of features
>> for each job.
>>
>> This seemed to be a sensible solution, as it's not practical to mention
>> all the individual features in the job name, and short names can be
>> misleading (for example ovb-ha job does so much more than tests HA).
>>
>> We decided to keep the original names for these jobs to simplify the
>> transition, but the plan is to rename them to something that will help
>> to reproduce the jobs locally with Quickstart.
>>
>> The proposed naming scheme will be the same as the one we're now using
>> for job type in project-config:
>>
>> gate-tripleo-ci-centos-7-{node-config}-{featureset-config}
>>
>> So for example the current "gate-tripleo-ci-centos-7-ovb-ha-oooq" job
>> would look like "gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001"
>>
>
> I'd prefer to keep the job names somewhat descriptive... If i had to pick
> one or the other, i'd rather stick with the current way, as at least for me
> it's higher priority to see descriptive names in CI results than saving
> time on finding featureset file mapping when needing to reproduce a job
> result. My eyes scan probably more than a hundred of individual CI job
> results daily, but i only need to reproduce 0 or 1 job failures locally
> usually.
>
> Alternatively, could we rename "featureset001.yaml" into
> "featureset-ovb-ha.yaml" and then have i guess something like
> "gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-ovb-ha" for the job name? Maybe
> "ovb" would be there twice, in case it's needed both in node config and
> featureset parts of the job name...
>
> Or we could pull the mapping between job name and job type in an automated
> way from project-config.
>
> (Will be on PTO for a week from now, apologies if i don't respond timely
> here.)
>
>
> Have a good day,
>
> Jirka
>
>
>> The advantage of this will be that it will be easy to reproduce a gate
>> job on a local virthost by typing something like:
>>
>> ./quickstart.sh --release tripleo-ci/master \
>>   --nodes config/nodes/3ctlr_1comp.yml \
>>   --config config/general_config/featureset001.yml \
>>   
>>
>> Please let us know if this method sounds like a step forward.
>>
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>



-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO][CI][containers] Broken container gate jobs block patches.

2017-06-25 Thread Sagi Shnaidman
Hi,
FYI, the gates are now blocked because of [1], and the containers jobs are
now part of the gate jobs. Please try to resolve it ASAP.

Thanks

[1] CI: containers jobs fail in pingtest because volume error:
https://bugs.launchpad.net/tripleo/+bug/1700333

-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] A proposal for hackathon to reduce deploy time of TripleO

2017-06-08 Thread Sagi Shnaidman
Hi, all

Thanks for your attention and the proposals for this hackathon. With the full
understanding that optimizing deployment is an ongoing effort and should not
be started and finished in these 2 days only, we still want to focus on these
issues during the sprint. Even if we don't solve all the problems
immediately, more people will be exposed to this field, additional tasks/bugs
can be opened and scheduled, and maybe additional tests, process improvements
and other insights will be introduced.
Please remember: if we don't reduce the CI job time to 1 hour by Thursday, it
doesn't mean we failed the mission.
The main goal of this sprint is to find problems and their scope of work, and
to find as many solutions for them as possible, using collaboration and
knowledge sharing within and between teams. Ideally this collaboration and
ongoing effort will continue with the same momentum. :)

I suggest doing it on 21-22 June 2017 (Wednesday-Thursday). All other details
are provided in the etherpad:
https://etherpad.openstack.org/p/tripleo-deploy-time-hack and in the wiki as
well: https://wiki.openstack.org/wiki/VirtualSprints
We have a "deployment-time" tag for bugs:
https://bugs.launchpad.net/tripleo/+bugs?field.tag=deployment-time Please
use it for bugs that affect deployment time or CI job run time; it will make
them easier to handle during the sprint.

Please provide your comments and suggestions.

Thanks



On Tue, May 23, 2017 at 1:47 PM, Sagi Shnaidman  wrote:

> Hi, all
>
> I'd like to propose an idea to make one or two days hackathon in TripleO
> project with main goal - to reduce deployment time of TripleO.
>
> - How could it be arranged?
>
> We can arrange a separate IRC channel and Bluejeans video conference
> session for hackathon in these days to create a "presence" feeling.
>
> - How to participate and contribute?
>
> We'll have a few responsibility fields like tripleo-quickstart,
> containers, storage, HA, baremetal, etc - the exact list should be ready
> before the hackathon so that everybody could assign to one of these
> "teams". It's good to have somebody in team to be stakeholder and
> responsible for organization and tasks.
>
> - What is the goal?
>
> The goal of this hackathon to reduce deployment time of TripleO as much as
> possible.
>
> For example part of CI team takes a task to reduce quickstart tasks time.
> It includes statistics collection, profiling and detection of places to
> optimize. After this tasks are created, patches are tested and submitted.
>
> The prizes will be presented to teams which saved most of time :)
>
> What do you think?
>
> Thanks
> --
> Best regards
> Sagi Shnaidman
>



-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] overcloud containers patches todo

2017-06-05 Thread Sagi Shnaidman
Hi,
I think a "deep dive" about containers in TripleO and some helpful
documentation would help a lot with giving valuable reviews of these
container patches. The knowledge gap that has accumulated here is pretty big.

Thanks

On Jun 5, 2017 03:39, "Dan Prince"  wrote:

> Hi,
>
> Any help reviewing the following patches for the overcloud
> containerization effort in TripleO would be appreciated:
>
> https://etherpad.openstack.org/p/tripleo-containers-todo
>
> If you've got new services related to the containerization efforts feel
> free to add them here too.
>
> Thanks,
>
> Dan
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO] A proposal for hackathon to reduce deploy time of TripleO

2017-05-23 Thread Sagi Shnaidman
Hi, all

I'd like to propose an idea: a one- or two-day hackathon in the TripleO
project with the main goal of reducing the deployment time of TripleO.

- How could it be arranged?

We can arrange a separate IRC channel and a Bluejeans video conference
session for the hackathon on these days to create a feeling of "presence".

- How to participate and contribute?

We'll have a few areas of responsibility like tripleo-quickstart, containers,
storage, HA, baremetal, etc.; the exact list should be ready before the
hackathon so that everybody can sign up for one of these "teams". It's good
to have somebody on each team be the stakeholder, responsible for
organization and tasks.

- What is the goal?

The goal of this hackathon is to reduce the deployment time of TripleO as
much as possible.

For example, part of the CI team takes on the task of reducing the time spent
in the quickstart tasks. That includes statistics collection, profiling and
finding places to optimize. After that, tasks are created and patches are
tested and submitted.

Prizes will be presented to the teams which save the most time :)

What do you think?

Thanks
-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [tripleo] [CI] HA and non-HA OVB jobs are now running with Quickstart

2017-05-10 Thread Sagi Shnaidman
Hi, all
In addition to the multinode jobs, today we migrated part of the OVB jobs to
use Quickstart.

Before, we had an OVB HA and an OVB non-HA job; together with migrating them
to Quickstart we merged them into one job. It is now called:
 - gate-tripleo-ci-centos-7-ovb-ha-oooq

and will be a voting job instead of
 - gate-tripleo-ci-centos-7-ovb-ha
 - gate-tripleo-ci-centos-7-ovb-nonha

The updates job "gate-tripleo-ci-centos-7-ovb-updates" stays the same and
nothing was changed about it. The same goes for the periodic jobs: they stay
the same, and an additional update will be sent when we migrate them too.

In addition, for the tripleo-ci repository there are two branch jobs:
- gate-tripleo-ci-centos-7-ovb-ha-oooq-newton
- gate-tripleo-ci-centos-7-ovb-ha-oooq-ocata
which replace, respectively:
 - gate-tripleo-ci-centos-7-ovb-ha-ocata
 - gate-tripleo-ci-centos-7-ovb-nonha-ocata
 - gate-tripleo-ci-centos-7-ovb-ha-newton
 - gate-tripleo-ci-centos-7-ovb-nonha-newton

A little about the "gate-tripleo-ci-centos-7-ovb-ha-oooq" job:
its featureset file is located at https://github.com/openstack/tripleo-quickstart/blob/master/config/general_config/featureset001.yml
and it's pretty similar to the previous HA job, but in addition it has
overcloud SSL and node introspection enabled (which were tested in the
previous non-HA job).

The old HA and non-HA jobs have been moved into the experimental queue and
can be run on a patch with "check experimental". This is kept as a regression
check; please use it if you suspect there is a problem with the migration.

As usual, you are welcome to ask any questions about the new jobs and
features in #tripleo. The TripleO CI squad folks will be happy to answer you.

Thanks

-- Forwarded message --
From: Attila Darazs 
Date: Wed, Mar 15, 2017 at 12:04 PM
Subject: [openstack-dev] [tripleo] Gating jobs are now running with
Quickstart
To: "OpenStack Development Mailing List (not for usage questions)" <
openstack-dev@lists.openstack.org>


As discussed previously in the CI Squad meeting summaries[1] and on the
TripleO weekly meeting, the multinode gate jobs are now running with
tripleo-quickstart. To signify the change, we added the -oooq suffix to
them.

The following jobs migrated yesterday evening, with more to come:

- gate-tripleo-ci-centos-7-undercloud-oooq
- gate-tripleo-ci-centos-7-nonha-multinode-oooq
- gate-tripleo-ci-centos-7-scenario001-multinode-oooq
- gate-tripleo-ci-centos-7-scenario002-multinode-oooq
- gate-tripleo-ci-centos-7-scenario003-multinode-oooq
- gate-tripleo-ci-centos-7-scenario004-multinode-oooq

For those who are already familiar with Quickstart, we introduced two new
concepts:

- featureset config files that are numbered collection of settings, without
node configuration[2]
- the '--nodes' option for quickstart.sh and the config/nodes files that
deal with only the number and type of nodes the deployment will have[3]

If you would like to debug these jobs, it might be useful to read
Quickstart's documentation[4]. We hope the transition will be smooth, but
if you have problems ping members of the TripleO CI Squad on #tripleo.

Best regards,

[1] http://lists.openstack.org/pipermail/openstack-dev/2017-Marc
h/113724.html
[2] https://docs.openstack.org/developer/tripleo-quickstart/feat
ure-configuration.html
[3] https://docs.openstack.org/developer/tripleo-quickstart/node
-configuration.html
[4] https://docs.openstack.org/developer/tripleo-quickstart/

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO][CI] OVB combined job and periodic

2017-04-11 Thread Sagi Shnaidman
Hi, all

just another point to think about for the transition of the periodic jobs:
firstly, we need featureset files for them;
secondly, since we combined the HA and non-HA jobs, it should also become one
job in the periodic pipeline, which currently contains HA, non-HA and updates
jobs.
Because of all the above, I think we will still need 2 jobs: we check
overcloud deletion in one of them and undercloud idempotency in the second,
and it's impossible to test everything in one job because of time
restrictions.
So it would look something like:
1) combined OVB job + overcloud deletion (+ other specific features?)
2) combined OVB job + undercloud idempotent install (+ other specific
features?)
3) OVB updates job

Thoughts?

-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo] pingtest vs tempest

2017-04-06 Thread Sagi Shnaidman
HI,

I think Rally, Browbeat and other performance-oriented solutions won't serve
our needs, because we run TripleO CI in a virtualized environment with very
limited resources. Actually, we are pretty close to fully utilizing these
resources when deploying OpenStack, so very little is left for tests.
It's not a problem to run the tempest API tests because they are cheap: they
take little time and few resources, but they also give little coverage.
Scenario tests are more interesting and give us more coverage, but they also
take a lot of resources (which we sometimes don't have).

It may be useful to run a "limited edition" of API tests that maximizes
coverage and avoids duplication, for example just checking that each service
basically works, without covering all of its functionality. It would take
very little time (e.g. 5 tests per service) and would give a general picture
of deployment success. It would also cover areas that are not covered by the
pingtest; a rough sketch of what I mean is below.
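
For example, such a minimal per-service selection could be kept as a small
whitelist file and passed to tempest (the test names here are only
illustrative examples, not a vetted list):

  # smoke-whitelist.txt - one test regex per line,
  # e.g. for tempest run's whitelist-file option
  tempest.api.compute.servers.test_create_server
  tempest.api.volume.test_volumes_get
  tempest.api.network.test_networks
  tempest.api.image.v2.test_images
  tempest.api.identity.v3.test_tokens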

I think another option could be to develop special tempest scenario tests for
TripleO which would fit our needs.

Thanks


On Wed, Apr 5, 2017 at 11:49 PM, Emilien Macchi  wrote:

> Greetings dear owls,
>
> I would like to bring back an old topic: running tempest in the gate.
>
> == Context
>
> Right now, TripleO gate is running something called pingtest to
> validate that the OpenStack cloud is working. It's an Heat stack, that
> deploys a Nova server, some volumes, a glance image, a neutron network
> and sometimes a little bit more.
> To deploy the pingtest, you obviously need Heat deployed in your overcloud.
>
> == Problems:
>
> Although pingtest has been very helpful over the last years:
> - easy to understand, it's an Heat template, like an OpenStack user
> would do to deploy their apps.
> - fast: the stack takes a few minutes to be created and validated
>
> It has some limitations:
> - Limitation to what Heat resources support (example: some OpenStack
> resources can't be managed from Heat)
> - Impossible to run a dynamic workflow (test a live migration for example)
>
> == Solutions
>
> 1) Switch pingtest to Tempest run on some specific tests, with feature
> parity of what we had with pingtest.
> For example, we could imagine to run the scenarios that deploys VM and
> boot from volume. It would test the same thing as pingtest (details
> can be discussed here).
> Each scenario would run more tests depending on the service that they
> run (scenario001 is telemetry, so it would run some tempest tests for
> Ceilometer, Aodh, Gnocchi, etc).
> We should work at making the tempest run as short as possible, and the
> close as possible from what we have with a pingtest.
>
> 2) Run custom scripts in TripleO CI tooling, called from the pingtest
> (heat template), that would run some validations commands (API calls,
> etc).
> It has been investigated in the past but never implemented AFIK.
>
> 3) ?
>
> I tried to make this text short and go straight to the point, please
> bring feedback now. I hope we can make progress on $topic during Pike,
> so we can increase our testing coverage and detect deployment issues
> sooner.
>
> Thanks,
> --
> Emilien Macchi
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>



-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO][tripleo-quickstart][tripleo-ci] Review of critical bugfix

2017-03-26 Thread Sagi Shnaidman
Hi, all

we have a pretty critical bug in the quickstart jobs[1] where the exit status
of commands is ignored; please review its fix[2]. If you have a more elegant
solution than setting pipefail or exiting with PIPESTATUS, please suggest it
in the comments.
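
For context, this is the usual bash pipeline behavior: a pipeline's exit
status is that of its last command, so piping output through tee hides the
real failure. A minimal illustration of the two workarounds mentioned above
(the script name is made up):

  # without pipefail this exits 0 because 'tee' succeeds,
  # even if the deploy script failed
  ./deploy-overcloud.sh 2>&1 | tee deploy.log

  # option 1: make the pipeline return the first non-zero status
  set -o pipefail
  ./deploy-overcloud.sh 2>&1 | tee deploy.log

  # option 2: keep the default behavior but propagate the real status
  ./deploy-overcloud.sh 2>&1 | tee deploy.log
  exit ${PIPESTATUS[0]}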

Thanks

[1] https://bugs.launchpad.net/tripleo/+bug/1676156
[2] https://review.openstack.org/450023

-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra][tripleo] initial discussion for a new periodic pipeline

2017-03-21 Thread Sagi Shnaidman
Paul,
if we run 750 OVB jobs per day, then adding 12 more will be less than a 2%
increase. I don't believe it will be a serious issue.

Thanks

On Tue, Mar 21, 2017 at 7:34 PM, Paul Belanger 
wrote:

> On Tue, Mar 21, 2017 at 12:40:39PM -0400, Wesley Hayutin wrote:
> > On Tue, Mar 21, 2017 at 12:03 PM, Emilien Macchi 
> wrote:
> >
> > > On Mon, Mar 20, 2017 at 3:29 PM, Paul Belanger 
> > > wrote:
> > > > On Sun, Mar 19, 2017 at 06:54:27PM +0200, Sagi Shnaidman wrote:
> > > >> Hi, Paul
> > > >> I would say that real worthwhile try starts from "normal" priority,
> > > because
> > > >> we want to run promotion jobs more *often*, not more *rarely* which
> > > happens
> > > >> with low priority.
> > > >> In addition the initial idea in the first mail was running them each
> > > after
> > > >> other almost, not once a day like it happens now or with "low"
> priority.
> > > >>
> > > > As I've said, my main reluctance is is how the gate will react if we
> > > create a
> > > > new pipeline with the same priority as our check pipeline.  I would
> much
> > > rather
> > > > since on caution, default to 'low', see how things react for a day /
> > > week /
> > > > month, then see what it would like like a normal.  I want us to be
> > > caution about
> > > > adding a new pipeline, as it dynamically changes how our existing
> > > pipelines
> > > > function.
> > > >
> > > > Further more, this is actually a capacity issue for
> > > tripleo-test-cloud-rh1,
> > > > there currently too many jobs running for the amount of hardware. If
> > > these jobs
> > > > were running on our donated clouds, we could get away with a low
> priority
> > > > periodic pipeline.
> > >
> > > multinode jobs are running under donated clouds but as you know ovb
> not.
> > > We want to keep ovb jobs in our promotion pipeline because they bring
> > > high value to the tests (ironic, ipv6, ssl, probably more).
> > >
> > > Another alternative would be to reduce it to one ovb job (ironic with
> > > introspection + ipv6 + ssl at minimum) and use the 4 multinode jobs
> > > into the promotion pipeline -instead of the 3 ovb.
> > >
> >
> > I'm +1 on using one ovb jobs + 4 multinode jobs.
> >
> >
> > >
> > > current: 3 ovb jobs running every night
> > > proposal: 18 ovb jobs per day
> > >
> > > The addition will cost us 15 jobs into rh1 load. Would it be
> acceptable?
> > >
> > > > Now, allow me to propose another solution.
> > > >
> > > > RDO project has their own version of zuul, which has the ability to
> do
> > > periodic
> > > > pipelines.  Since tripleo-test-cloud-rh2 is still around, and has OVB
> > > ability, I
> > > > would suggest configuring this promoting pipeline within RDO, as to
> not
> > > affect
> > > > the capacity of tripleo-test-cloud-rh1.  This now means, you can
> > > continuously
> > > > enqueue jobs at a rate of 4 hours, priority shouldn't matter as you
> are
> > > the only
> > > > jobs running on tripleo-test-cloud-rh2, resulting in faster
> promotions.
> > >
> > > Using RDO would also be an option. I'm just not sure about our
> > > available resources, maybe other can reply on this one.
> > >
> >
> > The purpose of the periodic jobs are two fold.
> > 1. ensure the latest built packages work
> > 2. ensure the tripleo check gates continue to work with out error
> >
> > Running the promotion in review.rdoproject would not cover #2.  The
> > rdoproject jobs
> > would be configured in slightly different ways from upstream tripleo.
> > Running the promotion
> > in ci.centos has the same issue.
> >
> Right, there is some leg work to use the images produced by opentack-infra
> in
> RDO, but that is straightforward. It would be the same build process that
> a 3rd
> party CI system does.  It would be a matter of copying nodepool.yaml from
> openstack-infra/project-config, and (this is harder) using
> nodepool-builder to
> build the images.  Today RDO does snapshot images.
>
> > Using tripleo-testcloud-rh2 I think is fine.
> >
> >
> > >
> > > > This also make sense, as packaging is done in RDO, and you are
&

[openstack-dev] [TripleO] A lot of instack-haproxy zombie processes

2017-03-19 Thread Sagi Shnaidman
Hi, all

while investigating a periodic job failure, I noticed a lot of Z (zombie)
processes on the undercloud:

bash-4.2# ps aux | grep " Z " | less
root 28481  0.0  0.0  0 0 ?Z17:41   0:00
[instack-haproxy] 
root 28494  0.0  0.0  0 0 ?Z17:41   0:00
[instack-haproxy] 
root 28509  0.0  0.0  0 0 ?Z17:41   0:00
[instack-haproxy] 
root 28522  0.0  0.0  0 0 ?Z17:41   0:00
[instack-haproxy] 
...
bash-4.2# ps aux | grep " Z " | wc -l
979

About a thousand identical zombie processes. I don't think this is expected
behavior, although I'm not sure it's what is failing the jobs. Any ideas why
this could happen?
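
In case someone wants to dig in, a rough way to find the parent that is not
reaping these children (commands are purely illustrative):

# count the parent PIDs of all zombie processes
for pid in $(ps -eo pid,stat | awk '$2 ~ /^Z/ {print $1}'); do
    ps -o ppid= -p "$pid"
done | sort | uniq -c | sort -rn | head
# then inspect the most frequent parent, e.g. ps -fp <ppid>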

Thanks

-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra][tripleo] initial discussion for a new periodic pipeline

2017-03-19 Thread Sagi Shnaidman
Hi, Paul
I would say that a really worthwhile try starts from "normal" priority,
because we want to run the promotion jobs more *often*, not more *rarely*,
which is what happens with low priority.
In addition, the initial idea in the first mail was to run them almost back
to back, not once a day as happens now or as would happen with "low"
priority.

Thanks

On Wed, Mar 15, 2017 at 11:16 PM, Paul Belanger 
wrote:

> On Wed, Mar 15, 2017 at 03:42:32PM -0500, Ben Nemec wrote:
> >
> >
> > On 03/13/2017 02:29 PM, Sagi Shnaidman wrote:
> > > Hi, all
> > >
> > > I submitted a change: https://review.openstack.org/#/c/443964/
> > > but seems like it reached a point which requires an additional
> discussion.
> > >
> > > I had a few proposals, it's increasing period to 12 hours instead of 4
> > > for start, and to leave it in regular periodic *low* precedence.
> > > I think we can start from 12 hours period to see how it goes, although
> I
> > > don't think that 4 only jobs will increase load on OVB cloud, it's
> > > completely negligible comparing to current OVB capacity and load.
> > > But making its precedence as "low" IMHO completely removes any sense
> > > from this pipeline to be, because we already run experimental-tripleo
> > > pipeline which this priority and it could reach timeouts like 7-14
> > > hours. So let's assume we ran periodic job, it's queued to run now 12 +
> > > "low queue length" - about 20 and more hours. It's even worse than
> usual
> > > periodic job and definitely makes this change useless.
> > > I'd like to notice as well that those periodic jobs unlike "usual"
> > > periodic are used for repository promotion and their value are equal or
> > > higher than check jobs, so it needs to run with "normal" or even "high"
> > > precedence.
> >
> > Yeah, it makes no sense from an OVB perspective to add these as low
> priority
> > jobs.  Once in a while we've managed to chew through the entire
> experimental
> > queue during the day, but with the containers job added it's very
> unlikely
> > that's going to happen anymore.  Right now we have a 4.5 hour wait time
> just
> > for the check queue, then there's two hours of experimental jobs queued
> up
> > behind that.  All of which means if we started a low priority periodic
> job
> > right now it probably wouldn't run until about midnight my time, which I
> > think is when the regular periodic jobs run now.
> >
> Lets just give it a try? A 12 hour periodic job with low priority. There is
> nothing saying we cannot iterate on this after a few days / weeks / months.
>
> > >
> > > Thanks
> > >
> > >
> > > On Thu, Mar 9, 2017 at 10:06 PM, Wesley Hayutin  > > <mailto:whayu...@redhat.com>> wrote:
> > >
> > >
> > >
> > > On Wed, Mar 8, 2017 at 1:29 PM, Jeremy Stanley  > > <mailto:fu...@yuggoth.org>> wrote:
> > >
> > > On 2017-03-07 10:12:58 -0500 (-0500), Wesley Hayutin wrote:
> > > > The TripleO team would like to initiate a conversation about
> the
> > > > possibility of creating a new pipeline in Openstack Infra to
> allow
> > > > a set of jobs to run periodically every four hours
> > > [...]
> > >
> > > The request doesn't strike me as contentious/controversial.
> Why not
> > > just propose your addition to the zuul/layout.yaml file in the
> > > openstack-infra/project-config repo and hash out any resulting
> > > concerns via code review?
> > > --
> > > Jeremy Stanley
> > >
> > >
> > > Sounds good to me.
> > > We thought it would be nice to walk through it in an email first :)
> > >
> > > Thanks
> > >
> > >
> > > 
> __
> > > OpenStack Development Mailing List (not for usage questions)
> > > Unsubscribe:
> > > openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> > > <http://openstack-dev-requ...@lists.openstack.org?subject:
> unsubscribe>
> > > http://lists.openstack.org/cgi-bin/mailman/listinfo/
> openstack-dev <http://lists.openstack.org/cgi-bin/mailman/listinfo/
> openstack-dev>
> > >
> > >
> > >

Re: [openstack-dev] [TripleO] Propose Attila Darazs and Gabriele Cerami for tripleo-ci core

2017-03-15 Thread Sagi Shnaidman
+1 +1 !

On Wed, Mar 15, 2017 at 5:44 PM, John Trowbridge  wrote:

> Both Attila and Gabriele have been rockstars with the work to transition
> tripleo-ci to run via quickstart, and both have become extremely
> knowledgeable about how tripleo-ci works during that process. They are
> both very capable of providing thorough and thoughtful reviews of
> tripleo-ci patches.
>
> On top of this Attila has greatly increased the communication from the
> tripleo-ci squad as the liason, with weekly summary emails of our
> meetings to this list.
>
> - trown
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>



-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra][tripleo] initial discussion for a new periodic pipeline

2017-03-13 Thread Sagi Shnaidman
Hi, all

I submitted a change: https://review.openstack.org/#/c/443964/
but it seems to have reached a point that requires additional discussion.

There were a few proposals: increasing the period to 12 hours instead of 4
to start with, and leaving it at the regular periodic *low* precedence.
I think we can start with a 12-hour period to see how it goes, although I
don't think that only 4 jobs will increase the load on the OVB cloud; it's
completely negligible compared to the current OVB capacity and load.
But setting its precedence to "low" IMHO removes any point in this pipeline
existing, because we already run the experimental-tripleo pipeline with this
priority and it can reach wait times of 7-14 hours. So assume we start a
periodic job: it is queued to run in 12 hours + the "low queue length" -
about 20 hours or more. That is even worse than the usual periodic job and
definitely makes this change useless.
I'd also like to note that these periodic jobs, unlike the "usual" periodic
ones, are used for repository promotion and their value is equal to or
higher than that of check jobs, so they need to run with "normal" or even
"high" precedence.

Thanks


On Thu, Mar 9, 2017 at 10:06 PM, Wesley Hayutin  wrote:

>
>
> On Wed, Mar 8, 2017 at 1:29 PM, Jeremy Stanley  wrote:
>
>> On 2017-03-07 10:12:58 -0500 (-0500), Wesley Hayutin wrote:
>> > The TripleO team would like to initiate a conversation about the
>> > possibility of creating a new pipeline in Openstack Infra to allow
>> > a set of jobs to run periodically every four hours
>> [...]
>>
>> The request doesn't strike me as contentious/controversial. Why not
>> just propose your addition to the zuul/layout.yaml file in the
>> openstack-infra/project-config repo and hash out any resulting
>> concerns via code review?
>> --
>> Jeremy Stanley
>>
>>
> Sounds good to me.
> We thought it would be nice to walk through it in an email first :)
>
> Thanks
>
>
>> 
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>
>
> ______
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>


-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO][CI] Running experimental OVB and not OVB jobs separately.

2017-01-25 Thread Sagi Shnaidman
Hi, all

I'd like to propose a slightly different approach to running experimental
jobs in TripleO CI.
As you know we have OVB jobs and non-OVB jobs, and different pipelines for
running these two types of jobs.

The current flow:
if you need to run experimental jobs, you leave a comment with "check
experimental" and all types of jobs run - both OVB and non-OVB.

The proposal:
to run OVB jobs only, you'll need to leave the comment "check
experimental-tripleo"; to run non-OVB jobs only, you'll still write
"check experimental".
To run all experimental jobs, both OVB and non-OVB, just leave two comments:
check experimental-tripleo
check experimental

From what I have observed, people usually want to run one or two
experimental jobs, and usually only one type of them. So this more explicit
triggering can save us expensive OVB resources.
If that is not the case and you prefer to run all the experimental jobs we
have at once, please give feedback and I'll take this back.

Patch about the topic: https://review.openstack.org/#/c/425184/

Thanks
-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO][CI]

2017-01-15 Thread Sagi Shnaidman
Hi, all

FYI, the periodic TripleO nonha jobs fail because of an introspection
failure; there is an open bug in mistral:

Ironic introspection fails because unexpected keyword "insecure"
https://bugs.launchpad.net/tripleo/+bug/1656692

and it is marked as a promotion blocker.

Thanks
-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO][CI] Dev CI environment for OVB

2016-09-27 Thread Sagi Shnaidman
Hi, all

with Derek's help we set up an OVB dev environment on the rh1/rh2 clouds,
which allows developers to run their patches in a real CI environment and
debug their issues there. If your patch has a problem in CI but works
locally, you can reproduce and debug it in this environment.
Please note, this is an OVB environment only.
For regular patch testing please use the tripleo-quickstart project[1],
which is a better fit for that purpose; this dev env is for CI issues only.

In short, we have special tenants on rh1/rh2 where you can create your
undercloud VM from the infra image and then create your OVB environment
there. After that you're ready to test your patch: clone your repo, inject
the changes and run the main CI script, toci_gate_test.sh.
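
Roughly, that last step looks like the following (the repo, change ref and
script path here are only an example - the etherpad has the exact commands):

git clone https://git.openstack.org/openstack-infra/tripleo-ci
cd tripleo-ci
# fetch the change under test from gerrit (the ref is a placeholder)
git fetch https://review.openstack.org/openstack-infra/tripleo-ci refs/changes/NN/NNNNNN/P
git checkout FETCH_HEAD
./toci_gate_test.sh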

The whole process is described in the etherpad
https://etherpad.openstack.org/p/tripleo-ci-devenvs
where, at the bottom, you'll find a script that does everything for you (as
all scripts usually should).

If you need to test your patch, just send me your *public* keys *offline*;
I'll add them to the tenant defaults and you'll be able to run it.

Thanks

[1] https://github.com/openstack/tripleo-quickstart
-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO][CI] Failed jobs because bad image on mirror server

2016-09-25 Thread Sagi Shnaidman
Hi, all

FYI, jobs failed after the last image promotion because of a corrupted
image; it seems the last promotion job failed to upload it correctly, and it
didn't match its md5 sum. I've replaced it on the mirror server with the
image from the previous delorean hash run. That should be OK because we
update the images anyway, and it will be replaced on the next promotion job
run.
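
If you want to check whether a cached image is the broken one, comparing
checksums is enough (the file name and mirror path below are placeholders):

md5sum overcloud-full.tar
curl -s http://<mirror-server>/<images-path>/overcloud-full.tar.md5
# the two sums should match; if they don't, re-download the image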

Thanks
-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][CI] Memory shortage in HA jobs, please increase it

2016-08-19 Thread Sagi Shnaidman
Hi, Derek

I suspect Sahara may be the cause; it has started running on the overcloud
since my patch was merged: https://review.openstack.org/#/c/352598/
I don't think it ever ran in the jobs before, because it was either
improperly configured or disabled. And according to the reports it's the
most memory-consuming service on the overcloud controllers.


On Fri, Aug 19, 2016 at 12:41 PM, Derek Higgins  wrote:

> On 19 August 2016 at 00:07, Sagi Shnaidman  wrote:
> > Hi,
> >
> > we have a problem again with not enough memory in HA jobs, all of them
> > constantly fails in CI: http://status-tripleoci.rhcloud.com/
>
> Have we any idea why we need more memory all of a sudden? For months
> the overcloud nodes have had 5G of RAM, then last week[1] we bumped it
> too 5.5G now we need it bumped too 6G.
>
> If a new service has been added that is needed on the overcloud then
> bumping to 6G is expected and probably the correct answer but I'd like
> to see us avoiding blindly increasing the resources each time we see
> out of memory errors without investigating if there was a regression
> causing something to start hogging memory.
>
> Sorry if it seems like I'm being picky about this (I seem to resist
> these bumps every time they come up) but there are two good reasons to
> avoid this if possible
> o at peak we are currently configured to run 75 simultaneous jobs
> (although we probably don't reach that at the moment), and each HA job
> has 5 baremetal nodes so bumping from 5G too 6G increases the amount
> of RAM ci can use at peak by 375G
> o When we bump the RAM usage of baremetal nodes from 5G too 6G what
> we're actually doing is increasing the minimum requirements for
> developers from 28G(or whatever the number is now) too 32G
>
> So before we bump the number can we just check first if its justified,
> as I've watched this number increase from 2G since we started running
> tripleo-ci
>
> thanks,
> Derek.
>
> [1] - https://review.openstack.org/#/c/353655/
>
> > I've created a patch that will increase it[1], but we need to increase it
> > right now on rh1.
> > I can't do it now, because unfortunately I'll not be able to watch this
> if
> > it works and no problems appear.
> > TripleO CI cloud admins, please increase the memory for baremetal flavor
> on
> > rh1 tomorrow (to 6144?).
> >
> > Thanks
> >
> > [1] https://review.openstack.org/#/c/357532/
> > --
> > Best regards
> > Sagi Shnaidman
>



-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO][CI] Memory shortage in HA jobs, please increase it

2016-08-18 Thread Sagi Shnaidman
Hi,

we have a problem again with not enough memory in the HA jobs; all of them
constantly fail in CI: http://status-tripleoci.rhcloud.com/
I've created a patch that will increase it[1], but we need to increase it on
rh1 right now.
I can't do it myself now, because unfortunately I won't be able to watch
whether it works and whether any problems appear.
TripleO CI cloud admins, please increase the memory of the baremetal flavor
on rh1 tomorrow (to 6144 MB?).
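
Since nova flavors can't be resized in place, this effectively means
recreating the flavor with more RAM, roughly like this (the flavor name and
the disk/vcpus values are placeholders - please use the real ones from rh1):

openstack flavor show baremetal
openstack flavor delete baremetal
openstack flavor create --ram 6144 --disk <disk> --vcpus <vcpus> baremetal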

Thanks

[1] https://review.openstack.org/#/c/357532/
-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO][CI][Delorean]

2016-08-16 Thread Sagi Shnaidman
Hi, all

the current-tripleo repo[1] points to an old repository[2] which contains a
broken cinder[3] with the volume types bug[4]. That breaks all our CI jobs,
which cannot create the pingtest stack, because current-tripleo is the main
repo used in the jobs[5].

Could we please move the link to a newer repo that contains the cinder fix?

Thanks


[1]
http://buildlogs.centos.org/centos/7/cloud/x86_64/rdo-trunk-master-tripleo/
[2]
http://trunk.rdoproject.org/centos7/c6/bd/c6bd3cb95b9819c03345f50bf2812227e81314ab_4e6dfa3c
[3] openstack-cinder-9.0.0-0.20160810043123.b53621a.el7.centos.noarch
[4] https://bugs.launchpad.net/cinder/+bug/1610073
[5]
https://github.com/openstack-infra/tripleo-ci/blob/master/scripts/tripleo.sh#L100

-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [infra] [TripleO] ntp-wait issue breaks TripleO jobs

2016-07-31 Thread Sagi Shnaidman
Hi infra and TripleO cores,

I'd like to ask you to review and merge a bugfix which limits the ntp-wait
tries to 100 instead of the current 1000. The long retry loop causes a
timeout of about 100 minutes and breaks the TripleO jobs.
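
For context, ntp-wait sleeps between attempts, so the number of tries
translates directly into wall time; the fix moves us to something like
(the exact flags used are in the review):

# 100 tries, 6 seconds apart => at most ~10 minutes instead of ~100
ntp-wait -n 100 -s 6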

More details are in the bug: https://bugs.launchpad.net/tripleo/+bug/1608226
The patch: https://review.openstack.org/#/c/349261

Thanks

-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] Delorean fail blocks CI for stable branches

2016-07-20 Thread Sagi Shnaidman
On Thu, Jul 21, 2016 at 3:11 AM, Alan Pevec  wrote:

> On Wed, Jul 20, 2016 at 7:49 PM, Sagi Shnaidman 
> wrote:
> > How then it worked before? Can you show me the patch that broke this
> > functionality in delorean? It should be about 15 Jul when jobs started to
> > fail.
>
> commented in lp
>
> > How then master branch works? It also runs on patched repo and succeeds.
>
> I explained that but looks like we're talking past each other.
>
> > I don't think we can use this workaround, each time this source file will
> > change - all our jobs will fail again? It's not even a workaround.
> > Please let's stop discussing and let's solve it finally, it blocks our CI
> > for stable patches.
>
> Sure, I've assigned https://bugs.launchpad.net/tripleo/+bug/1604039 to
> myself and proposed a patch.
>
>
It's a workaround for a short time range, but NOT a solution; if you change
something in this one file, it will be broken again. And it does NOT solve
the main issue - after the recent changes in dlrn and the specs we can't
build a repo with delorean on stable branches.
I think this should be solved on the DLRN side, and an appropriate interface
should be provided for CI purposes.
I opened an issue there:
https://github.com/openstack-packages/DLRN/issues/22
But you closed it, so I suppose we will not get any solution or help for it
from your side?

Should we move to another packaging tool?


> Alan
>



-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] Delorean fail blocks CI for stable branches

2016-07-20 Thread Sagi Shnaidman
How did it work before, then? Can you show me the patch that broke this
functionality in delorean? It should be around 15 Jul, when the jobs started
to fail.
And how does the master branch work? It also runs on a patched repo and
succeeds.

I don't think we can use this workaround - each time this source file
changes, all our jobs will fail again? It's not even a workaround.
Please, let's stop discussing and finally solve it; it blocks our CI for
stable patches.
I'd expect a bit more involvement in this issue, and I suggest that you, or
anybody who understands the delorean code and specs well, try to solve it;
I will provide a complete TripleO CI dev environment, walk through every CI
step, and give any other info I can. Let's just sit down and solve it,
otherwise we'll never get it working.

Thanks


On Wed, Jul 20, 2016 at 7:50 PM, Alan Pevec  wrote:

> > as a quickfix in tripleo.sh you could patch dlrn and set local=True in
>
> correction, patch local=False there while running dlrn command with
> --local to keep source checkout as-is
>



-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] Delorean fail blocks CI for stable branches

2016-07-20 Thread Sagi Shnaidman
On Wed, Jul 20, 2016 at 2:29 PM, Alan Pevec  wrote:

> > git clone https://git.openstack.org/openstack/tripleo-heat-templates
> > cd tripleo-heat-templates/
> > git checkout -b stable/mitaka origin/stable/mitaka
>
> ^ this is manually switching to the stable source branch
>
> > sed -i -e "s%distro=.*%distro=rpm-mitaka%" projects.ini
> > sed -i -e "s%source=.*%source=stable/mitaka%" projects.ini
>
> ^ this configures dlrn to the correct combination of distro and source
> branches, but ...
>
> > ./venv/bin/dlrn --config-file projects.ini --head-only --package-name
> > openstack-tripleo-heat-templates --local
>
> ^ ... --local here keeps local checkout untouched, so you end up with
> default rpm-master in distro git checkout.
> If you remove --local it will reset local checkouts to the branches
> specified in projects.ini
>
> Alan,
I don't want to reset the local checkouts and branches - I need to build
with these checkouts; that's the whole point of the CI.


> Alan
>



-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO] Delorean fail blocks CI for stable branches

2016-07-20 Thread Sagi Shnaidman
HI,

we have a problem with the delorean build of stable branches in TripleO
CI[1], and it seems to be an rpm specs problem. It can be reproduced easily
[2]. Please help with a solution to this problem; all the info is in the
bug[1].

Alan,
if you think we are using dlrn incorrectly, please point out to me which
line of the reproduction below is wrong.

[1] https://bugs.launchpad.net/tripleo/+bug/1604039
[2] Reproducing:

sudo yum install -y createrepo git mock rpm-build yum-plugin-priorities
yum-utils gcc python-virtualenv libffi-devel openssl-devel
sudo usermod -G mock -a $(id -nu)

cd /tmp/
sudo rm -rf /tmp/test
mkdir /tmp/test && cd /tmp/test

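# clone the project to build and switch it to the stable branch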
git clone https://git.openstack.org/openstack/tripleo-heat-templates
cd tripleo-heat-templates/
git checkout -b stable/mitaka origin/stable/mitaka

cd ..
git clone https://github.com/openstack-packages/delorean.git
cd delorean
mkdir -p data
sed -i -e 's%--postinstall%%' scripts/build_rpm.sh

virtualenv venv
./venv/bin/pip install -U setuptools
./venv/bin/pip install pytz
./venv/bin/pip install .

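# point dlrn at the mitaka base repo, distro branch and source branch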
sed -i -e "s%baseurl=.*%baseurl=https://trunk.rdoproject.org/centos7-mitaka%";
projects.ini
sed -i -e "s%distro=.*%distro=rpm-mitaka%" projects.ini
sed -i -e "s%source=.*%source=stable/mitaka%" projects.ini

cp -r ../tripleo-heat-templates data/openstack-tripleo-heat-templates

cd data/openstack-tripleo-heat-templates/
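# make every branch dlrn may check out point at the patched commit (as CI does)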
GITHASH=$(git rev-parse HEAD)
for BRANCH in master origin/master stable/liberty origin/stable/liberty
stable/mitaka origin/stable/mitaka; do
git checkout -b $BRANCH || git checkout $BRANCH
git reset --hard $GITHASH
done
cd /tmp/test/delorean

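# build the package from the local checkout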
./venv/bin/dlrn --config-file projects.ini --head-only --package-name
openstack-tripleo-heat-templates --local


The projects.ini:

[DEFAULT]
datadir=./data
scriptsdir=./scripts
baseurl=https://trunk.rdoproject.org/centos7-mitaka
distro=rpm-mitaka
source=stable/mitaka
target=centos
smtpserver=
reponame=delorean
templatedir=./dlrn/templates
maxretries=3
pkginfo_driver=dlrn.drivers.rdoinfo.RdoInfoDriver
tags=
#tags=mitaka
rsyncdest=
rsyncport=22

[gitrepo_driver]


-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO][CI] Tempest on periodic jobs

2016-05-17 Thread Sagi Shnaidman
Hi,
raising again the question of running tempest in TripleO CI, as it was
discussed in the last TripleO meeting.

I'd like to draw your attention to the fact that these tests, which I ran
just to make sure they work, discovered bugs, and these weren't corner cases
but real failures of the TripleO installation. Like this one for Sahara:
https://review.openstack.org/#/c/309042/
I'm sorry, I should have prepared these bugs for the meeting as proof of the
value of the testing.

The second issue that was a blocker before is wall time, and now, as we can
see from the job lengths after the HW upgrade of CI, it is not an issue
anymore. We can run tempest without any fear of hitting the timeout problem,
certainly in the "nonha" job, which is the shortest of all.

So I'd insist on running tempest exactly in the promotion job in order not
to promote images with bugs, especially critical ones like a whole service
not being available at all. The pingtest is not enough for this purpose, as
we can see from the bugs above; it checks very basic things and not all
services are covered. I think we aren't interested in just seeing the jobs
green, but in sticking to basic working functionality and the quality of
what we promote. Maybe it's the influence of my previous QA roles, but I
don't see any value in promoting something with bugs.

On the point about CI stability - the latest issues that CI is facing are
not really connected to the tempest tests or the CI code at all; they are
bugs in the underlying projects, and whether tempest runs or not doesn't
really matter in this case. These issues fail everything before any testing
even starts. Detecting such issues before they leak into TripleO is a
different topic and a different approach.

So my main point for running tempest tests in the "nonha" periodic job is:
quality and guaranteed basic functionality of the installed overcloud
services - at least that all of them are up and can accept connections - and
avoiding, and discovering early, the critical bugs that are not seen in the
pingtest. I remind you that we are going to run only the smoke tests, which
take little time and check only the basic functionality.
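
Just to make the scope concrete, the idea is only the smoke-tagged subset,
i.e. something along these lines (the exact invocation in the job is still
to be decided):

# run from the configured tempest directory, inside its venv
testr run --parallel smoke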

P.S. If there is interest, we can run the whole tempest set or specific
sets in experimental or third-party jobs just for indication. And I mean not
only tempest tests, but project scenario tests as well, for example the Heat
integration tests - both for the undercloud and the overcloud.

P.P.S. Just ping me if anything is unclear or if you'd like to discuss it in
a separate meeting; I'll give you all the required info.

Thanks
-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][CI] Elastic recheck bugs

2016-05-09 Thread Sagi Shnaidman
Sorry, I missed including the patches I mentioned:

[1] Refresh log files for tripleo project:
https://review.openstack.org/#/c/312985/
[2] Add bug for TripleO timeouts: https://review.openstack.org/#/c/313038/

On Mon, May 9, 2016 at 8:22 PM, Sagi Shnaidman  wrote:

> Hi, all
>
> I'd like to enable elastic recheck on TripleO CI and have submitted
> patches for refreshing the tracked logs [1] (please review) and for timeout
> case [2].
> But according to Derek's comment behind the timeout issue could be
> multiple issues and bugs, so I'd like to clarify - what are criteria for
> elastic recheck bugs?
>
> I thought about those markers:
>
> Nova:
> 1) "No valid host was found. There are not enough hosts"
> Network issues:
> 2) "Failed to connect to trunk.rdoproject.org" OR "fatal: The remote end
> hung up unexpectedly"  OR "Could not resolve host:"
> Ironic:
> 3) "Error contacting Ironic server:"
> 4) "Introspection completed with errors:"
> 5) ": Introspection timeout"
> 6) "Timed out waiting for node "
> Glance:
> 7) "500 Internal Server Error: Failed to upload image"
> crm_resource:
> 8) "crm_resource for openstack "
>
> and various puppet errors.
>
> However almost all of these messages could have different root causes,
> except of network failures. Easy to fix bug doesn't make to submit there,
> because they will be fixed yet before recheck patch will be merged.
> So, could you please think about right criteria of bugs for elastic
> recheck?
>
> Thanks
>
> --
> Best regards
> Sagi Shnaidman
>



-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO][CI] Elastic recheck bugs

2016-05-09 Thread Sagi Shnaidman
Hi, all

I'd like to enable elastic-recheck on TripleO CI and have submitted patches
for refreshing the tracked logs [1] (please review) and for the timeout case
[2].
But according to Derek's comment, there could be multiple issues and bugs
behind the timeout failures, so I'd like to clarify - what are the criteria
for elastic-recheck bugs?

I thought about those markers:

Nova:
1) "No valid host was found. There are not enough hosts"
Network issues:
2) "Failed to connect to trunk.rdoproject.org" OR "fatal: The remote end
hung up unexpectedly"  OR "Could not resolve host:"
Ironic:
3) "Error contacting Ironic server:"
4) "Introspection completed with errors:"
5) ": Introspection timeout"
6) "Timed out waiting for node "
Glance:
7) "500 Internal Server Error: Failed to upload image"
crm_resource:
8) "crm_resource for openstack "

and various puppet errors.

However, almost all of these messages can have different root causes,
except for the network failures. It doesn't make sense to submit an
easy-to-fix bug there, because it will be fixed before the recheck patch is
even merged.
So, could you please think about the right criteria for elastic-recheck
bugs?
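
For illustration, a marker can be sanity-checked by grepping it in the
console log of a failed job before writing the elastic-recheck query (the
URL below is a placeholder):

curl -s http://logs.openstack.org/<path-to-failed-job>/console.html |
    grep -c "No valid host was found. There are not enough hosts"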

Thanks

-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][CI] Tempest sources for testing tripleo in CI environment

2016-04-18 Thread Sagi Shnaidman
To make all the advantages and disadvantages clear, I've created a doc:
https://docs.google.com/document/d/1HmY-I8OzoJt0SzLzs79hCa1smKGltb-byrJOkKKGXII/edit?usp=sharing

Please comment.

On Sun, Apr 17, 2016 at 12:14 PM, Sagi Shnaidman 
wrote:

>
> Hi,
>
> John raised up the issue - where should we take tempest sources from.
> I'm not sure where to take them from, so I bring it to wider discussion.
>
> Right now I use tempest from delorean packages. In comparison with
> original tempest I don't see any difference in tests, only additional
> configuration scripts:
> https://github.com/openstack/tempest/compare/master...redhat-openstack:master
> It's worth to mention that in case of delorean tempest the configuration
> scripts fit tempest tests configuration, however in case of original
> tempest repo it's required to change them and maintain according to very
> dynamical configuration.
>
> So, do we need to use pure upstream tempest from current source and to
> maintain configuration scripts or we can use packaged from delorean and not
> duplicate effort of test teams?
>
> Thanks
> --
> Best regards
> Sagi Shnaidman
>



-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO][CI] Tempest sources for testing tripleo in CI environment

2016-04-17 Thread Sagi Shnaidman
Hi,

John raised the issue of where we should take the tempest sources from.
I'm not sure where to take them from, so I'm bringing it to a wider
discussion.

Right now I use tempest from the delorean packages. Compared with the
original tempest I don't see any difference in the tests, only additional
configuration scripts:
https://github.com/openstack/tempest/compare/master...redhat-openstack:master
It's worth mentioning that with the delorean tempest the configuration
scripts match the tempest test configuration, whereas with the original
tempest repo we would have to change and maintain them to follow a very
dynamic configuration.

So, do we need to use pure upstream tempest from the current sources and
maintain the configuration scripts ourselves, or can we use the tempest
packaged by delorean and avoid duplicating the test teams' effort?

Thanks
-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] [CI] Tempest configuration in Tripleo CI jobs

2016-04-11 Thread Sagi Shnaidman
Hi, Andrey

I've checked this option - using Rally to configure and run the tempest
tests.
Although it looks like a great choice, unfortunately a few issues and bugs
make it not usable right now. For example, it cannot work with the current
public networks and cannot create new ones, so everything related to
networking will fail. As I understand it, this bug has remained unsolved for
a long time: https://bugs.launchpad.net/rally/+bug/1550848
Also, it has no way to customize configuration options while generating the
tempest configuration, as config_tempest.py does - simply listing them on
the command line. In Rally you need to generate the tempest config file and
then edit it manually to customize it (for example the tempest log path in
the DEFAULT section). Adding such an "interface" for tempest configuration
would be a great feature for Rally IMHO.
I think it's a cool approach and we should definitely take it into account,
but right now it looks pretty raw and not stable enough to use in gate jobs.
Anyway, thank you for pointing out this great tool.

Thanks

On Fri, Apr 8, 2016 at 2:33 PM, Andrey Kurilin 
wrote:

> Hi Sagi,
>
>
> On Thu, Apr 7, 2016 at 5:56 PM, Sagi Shnaidman 
> wrote:
>
>> Hi, all
>>
>> I'd like to discuss the topic about how do we configure tempest in CI
>> jobs for TripleO.
>> I have currently two patches:
>> support for tempest: https://review.openstack.org/#/c/295844/
>> actually run of tests: https://review.openstack.org/#/c/297038/
>>
>> Right now there is no upstream tool to configure tempest, so everybody
>> use their own tools.
>>
>
> You are wrong. There is Rally in upstream:)
> Basic and the most widely used Rally component is Task, which provides
> benchmarking and testing tool.
> But, also, Rally has Verification component(here
> <https://www.mirantis.com/blog/rally-openstack-tempest-testing-made-simpler/>
> you can find is a bit outdated blog-post, but it can introduce Verification
> component for you).
> It can:
>
> 1. Configure Tempest based on public OpenStack API.
> An example of config from our gates:
> http://logs.openstack.org/58/285758/5/check/gate-rally-dsvm-verify-full/eabe2ff/rally-verify/5_verify_showconfig.txt.gz
> . Empty options mean that rally will check these resources while running
> tempest and create it if necessary)
>
> 2. Launch set of tests, tests which match regexp, list of tests. Also, it
> supports x-fail mechanism from out of box.
> An example of full run based on config file posted above -
> http://logs.openstack.org/58/285758/5/check/gate-rally-dsvm-verify-full/eabe2ff/rally-verify/7_verify_results.html.gz
>
> 3. Compare results.
>
> http://logs.openstack.org/58/285758/5/check/gate-rally-dsvm-verify-light/d806b91/rally-verify/17_verify_compare_--uuid-1_9fe72ea8-bd5c-45eb-9a37-5e674ea5e5d4_--uuid-2_315843d4-40b8-46f2-aa69-fb3d5d463379.html.gz
> It is not so good-looking as other rally reports, but we will fix it
> someday:)
>
> Summarize:
> - Rally is an upstream tool, which was accepted to BigTent.
> - One instance of Rally can manage and run tempest for different number of
> clouds
> - Rally Verification component is tested in gates for every new patch.
> Also it supports different APIs of services.
> - You can install, configure, launch, store results, display results in
> different formats.
>
> Btw, we are planning to refactor verification component(there is an spec
> on review with several +2), so you will be able to launch whatever you want
> subunit-based tools via Rally and simplify usage of it.
>
> However it's planned and David Mellado is working on it AFAIK.
>>
> Till then everybody use their own tools for tempest configuration.
>> I'd review two of them:
>> 1) Puppet configurations that is used in puppet modules CI
>> 2) Using configure_tempest.py script from
>> https://github.com/redhat-openstack/tempest/blob/master/tools/config_tempest.py
>>
>> Unfortunately there is no ready puppet module or script, that configures
>> tempest, you need to create your own.
>>
>> On other hand the config_tempest.py script provides full configuration,
>> support for tempest-deployer-input.conf and possibility to add any config
>> options in the command line when running it:
>>
>> python config_tempest.py \
>> --out etc/tempest.conf \
>> --debug \
>> --create \
>> --deployer-input ~/tempest-deployer-input.conf \
>> identity.uri $OS_AUTH_URL \
>> compute.allow_tenant_isolation true \
>> identity.admin_password $OS_PASSWORD \
>> compute.build_timeout 500 \
>> compute.image_ssh_user cirros
>>

[openstack-dev] [TripleO] [CI] Tempest configuration in Tripleo CI jobs

2016-04-07 Thread Sagi Shnaidman
 => '/tmp/openstack/tempest',
}

But it's not enough; you also need to add some workarounds and additional
configuration, for example:

tempest_config { 'object-storage/operator_role':
  value => 'SwiftOperator',
  path  => "${tempest_clone_path}/etc/tempest.conf",
}
}

After this, run puppet on the controller node:

sudo puppet apply --verbose --debug --detailed-exitcodes -e "include ::testt" | tee ~/puppet_run.log

After everything is finished, you need to copy the tempest folder back to
your node:
scp -r heat-admin@${CONTROLLER}:/tmp/openstack /tmp/

After this, run testr init within that directory and then run the tests:
/tmp/tempest/tools/with_venv.sh testr init
/tmp/tempest/tools/with_venv.sh testr run

There are still holes in this configuration, and most likely you'd fix them
with more workarounds and tempest_config runs, because there are still a few
skipped tests, so the configuration is not as complete as what
config_tempest.py produces.
You also have no way to add custom configuration when running the manifest;
for each config change you need to change the manifest itself, which makes
maintenance harder and more complex.

I would say the conclusion is quite obvious to me: it's easier even to
write tempest.conf manually from scratch or from a simple template and use
5 bash lines than to use puppet for something it's completely not suited to.

P.S. In this script I used ideas from puppet-openstack-integration and
packstack projects.

[1] https://review.openstack.org/#/c/295844/
[2] https://git.openstack.org/openstack-infra/tripleo-ci

-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev