[openstack-dev] OpenDev, the future of OpenStack Infra

2018-11-08 Thread Clark Boylan
Hello everyone,

Sorry for another cross post so soon.

In the land before time we had Stackforge. 

Stackforge gave non-OpenStack projects a place to live with their own clearly 
defined "not OpenStack" namespacing. As the wheel of time spun we realized that 
many Stackforge projects were becoming OpenStack projects and we would have to 
migrate them. This involved Gerrit downtimes to rename things safely. To ease 
the pain of this, the TC decided that all projects developed in the OpenStack 
Infrastructure could live under the OpenStack git namespace to simplify 
migrations.

Unfortunately this had the effect of creating confusion over which projects 
were officially a part of OpenStack, and whether or not projects that were not 
OpenStack could use our project hosting. Stackforge lived on under a different 
name, "unofficial project hosting", but many potential infrastructure users 
either didn't understand this or didn't want that strong association to 
OpenStack for their project hosting [0].

Turns out that we want to be able to host OpenStack and non-OpenStack projects 
together without confusion in a way that makes all of the projects involved 
happy. In an effort to make this a reality the OpenStack Infra team has been 
working through a process to rename itself to make it clear that our awesome 
project infrastructure and open collaboration tooling is community run, not 
just for OpenStack, but for others that want to be involved. To this end we've 
acquired the opendev.org domain which will allow us to host services under a 
neutral name as the OpenDev Infrastructure team.

The OpenStack community will continue to be the largest and a primary user for 
the OpenDev Infrastructure team, but our hope in making our infrastructure 
services more inclusive is that we'll also attract new contributors, which will 
ultimately benefit OpenStack and other open infrastructure projects.

Our goals for OpenDev are to:
  * Encourage additional active infrastructure contributors to help us scale. 
Make it clear that this is community-run tooling & infrastructure and everyone 
can get involved.
  * Make open source collaboration tools and project infrastructure more 
accessible to those that want it.
  * Have exposure to and dogfooding of OpenStack clouds as viable open source 
cloud providers.
  * Enable more projects to take advantage of the OpenStack-pioneered model of 
development and collaboration, including recommended practices like code review 
and gating.
  * Help build relationships with new and adjacent open source projects and 
create an inclusive space for collaboration and open source development.

Much of this is still in the early planning stages. This is the infrastructure 
team's current thinking on the subject, but we understand we have an existing 
community from which we'd like to see buy-in and involvement. To that end we 
have tried to compile a list of expected questions and answers below, but feel 
free to follow up either on this thread or with me directly about anything we 
haven't considered already.

Any transition will be slow and considered so don't expect everything to change 
overnight. But don't be surprised if you run into some new HTTP redirects as we 
reorganize the names under which services run. We'll also be sure to keep you 
informed on any major (and probably minor) transition steps so that they won't 
surprise you.

Thank you,
Clark

[0] It should be noted that some projects did not mind this and hosted with 
OpenStack Infra anyway. ARA is an excellent example of this.

FAQ

* What is OpenDev?
OpenDev is community-run tools and infrastructure services for collaboratively 
developing open source software. 

The OpenDev infrastructure team is the community of people who operate the 
infrastructure under the umbrella of the OpenStack Foundation.

* What services are you offering? What is the expected timeline?
In the near term we expect to transition simple services like etherpad hosting 
to the OpenDev domain. It will take us months, and potentially up to a year, to 
transition key infrastructure pieces like Git and Gerrit.

Example services managed by the team today include etherpad, the wiki, the Zuul 
and Nodepool CI system, Git and Gerrit, and other minor systems like PBX 
conferencing and survey tools.

* Where will these services live?
We've acquired opendev.org and are planning to set up DNS hosting very soon. We 
will post a simple information page and FAQ on the website and build it out as 
necessary over time. 

* Why are you changing from the OpenStack infrastructure team to the OpenDev 
infrastructure team?
In the same way we want to signal that our services are not strictly for 
OpenStack projects, and that not every project using our services is an 
official part of OpenStack, we want to make it clear that our team also serves 
this larger community.

* Who should use OpenDev services? Does it have to be projects related to 
OpenStack, or any open source projects?
In short, open 

Re: [openstack-dev] [python3] Enabling py37 unit tests

2018-11-07 Thread Clark Boylan
On Wed, Nov 7, 2018, at 4:47 AM, Mohammed Naser wrote:
> On Wed, Nov 7, 2018 at 1:37 PM Doug Hellmann  wrote:
> >
> > Corey Bryant  writes:
> >
> > > On Wed, Oct 10, 2018 at 8:45 AM Corey Bryant 
> > > wrote:
> > >
> > > I'd like to start moving forward with enabling py37 unit tests for a 
> > > subset
> > > of projects. Rather than putting too much load on infra by enabling 3 x 
> > > py3
> > > unit tests for every project, this would just focus on enablement of py37
> > > unit tests for a subset of projects in the Stein cycle. And just to be
> > > clear, I would not be disabling any unit tests (such as py35). I'd just be
> > > enabling py37 unit tests.
> > >
> > > As some background, this ML thread originally led to updating the
> > > python3-first governance goal (https://review.openstack.org/#/c/610708/)
> > > but has now led back to this ML thread for a +1 rather than updating the
> > > governance goal.
> > >
> > > I'd like to get an official +1 here on the ML from parties such as the TC
> > > and infra in particular but anyone else's input would be welcomed too.
> > > Obviously individual projects would have the right to reject proposed
> > > changes that enable py37 unit tests. Hopefully they wouldn't, of course,
> > > but they could individually vote that way.
> > >
> > > Thanks,
> > > Corey
> >
> > This seems like a good way to start. It lets us make incremental
> > progress while we take the time to think about the python version
> > management question more broadly. We can come back to the other projects
> > to add 3.7 jobs and remove 3.5 jobs when we have that plan worked out.
> 
> What's the impact on the number of consumption in upstream CI node usage?
> 

For the period from 2018-10-25 15:16:32,079 to 2018-11-07 15:59:04,994, 
openstack-tox-py35 jobs in aggregate represented 0.73% of our total capacity 
usage.

I don't expect py37 to deviate significantly from that. Again, the major 
resource consumption is dominated by a small number of projects/repos/jobs. 
Generally, testing outside of that bubble doesn't represent a significant 
resource cost.

I see no problem with adding python 3.7 unit testing from an infrastructure 
perspective.

Clark



Re: [openstack-dev] [all][qa] Migrating devstack jobs to Bionic (Ubuntu LTS 18.04)

2018-11-06 Thread Clark Boylan
On Tue, Nov 6, 2018, at 2:02 PM, Ghanshyam Mann wrote:
> Thanks Jens.
> 
> As most of the base jobs are in QA repo, QA team will coordinate this 
> migration based on either of the approach mentioned below. 
> 
> Another point to note - This migration will only target the zuulv3 jobs 
> not the legacy jobs. legacy jobs owner should migrate them to bionic 
> when they will be moved to zuulv3 native. Any risk of keeping the legacy 
> on xenial till zullv3 ?
> 
> Tempest testing patch found that stable queens/pike jobs failing for 
> bionic due to not supported distro in devstack[1]. Fixing in  
> https://review.openstack.org/#/c/616017/ and will backport to pike too.

The existing stable branches should continue to test on xenial as that is what 
they were built on. We aren't asking that everything be ported forward to 
bionic. Instead the idea is that current development (aka master) switches to 
bionic and rolls forward from that point.

This applies to tempest jobs, functional jobs, unit tests, etc. Xenial isn't 
going away; it is there for the stable branches.

> 
> [1]  https://review.openstack.org/#/c/611572/
> 
> http://logs.openstack.org/72/611572/1/check/tempest-full-queens/7cd3f21/job-output.txt.gz#_2018-11-01_09_57_07_551538
>  
> 
> 
> -gmann



[openstack-dev] Community Infrastructure Berlin Summit Onboarding Session

2018-11-05 Thread Clark Boylan
Hello everyone,

My apologies for cross posting, but I wanted to make sure the various developer 
groups saw this.

Rather than use the Infrastructure Onboarding session in Berlin [0] for 
infrastructure sysadmin/developer onboarding, I thought we could use the time 
for user onboarding. We've got quite a few new groups interacting with us 
recently, and it would probably be useful to have a session on what we do, how 
people can take advantage of this, and so on.

I've been brainstorming ideas on this etherpad [1]. If you think you'll attend 
the session and find any of these subjects to be useful please +1 them. Also 
feel free to add additional topics.

I expect this will be an informal session that directly targets the interests 
of those attending. Please do drop by if you have any interest in using this 
infrastructure at all. This is your chance to better understand Zuul job 
configuration, the test environments themselves, the metrics and data we 
collect, and basically anything else related to the community developer 
infrastructure.

[0] 
https://www.openstack.org/summit/berlin-2018/summit-schedule/events/22950/infrastructure-project-onboarding
[1] https://etherpad.openstack.org/p/openstack-infra-berlin-onboarding

Hope to see you there,
Clark



Re: [openstack-dev] [tripleo] Zuul Queue backlogs and resource usage

2018-10-30 Thread Clark Boylan
On Tue, Oct 30, 2018, at 1:01 PM, Ben Nemec wrote:
> 
> 
> On 10/30/18 1:25 PM, Clark Boylan wrote:
> > On Tue, Oct 30, 2018, at 10:42 AM, Alex Schultz wrote:
> >> On Tue, Oct 30, 2018 at 11:36 AM Ben Nemec  wrote:
> >>>
> >>> Tagging with tripleo since my suggestion below is specific to that 
> >>> project.
> >>>
> >>> On 10/30/18 11:03 AM, Clark Boylan wrote:
> >>>> Hello everyone,
> >>>>
> >>>> A little while back I sent email explaining how the gate queues work and 
> >>>> how fixing bugs helps us test and merge more code. All of this still is 
> >>>> still true and we should keep pushing to improve our testing to avoid 
> >>>> gate resets.
> >>>>
> >>>> Last week we migrated Zuul and Nodepool to a new Zookeeper cluster. In 
> >>>> the process of doing this we had to restart Zuul which brought in a new 
> >>>> logging feature that exposes node resource usage by jobs. Using this 
> >>>> data I've been able to generate some report information on where our 
> >>>> node demand is going. This change [0] produces this report [1].
> >>>>
> >>>> As with optimizing software we want to identify which changes will have 
> >>>> the biggest impact and to be able to measure whether or not changes have 
> >>>> had an impact once we have made them. Hopefully this information is a 
> >>>> start at doing that. Currently we can only look back to the point Zuul 
> >>>> was restarted, but we have a thirty day log rotation for this service 
> >>>> and should be able to look at a month's worth of data going forward.
> >>>>
> >>>> Looking at the data you might notice that Tripleo is using many more 
> >>>> node resources than our other projects. They are aware of this and have 
> >>>> a plan [2] to reduce their resource consumption. We'll likely be using 
> >>>> this report generator to check progress of this plan over time.
> >>>
> >>> I know at one point we had discussed reducing the concurrency of the
> >>> tripleo gate to help with this. Since tripleo is still using >50% of the
> >>> resources it seems like maybe we should revisit that, at least for the
> >>> short-term until the more major changes can be made? Looking through the
> >>> merge history for tripleo projects I don't see a lot of cases (any, in
> >>> fact) where more than a dozen patches made it through anyway*, so I
> >>> suspect it wouldn't have a significant impact on gate throughput, but it
> >>> would free up quite a few nodes for other uses.
> >>>
> >>
> >> It's the failures in gate and resets.  At this point I think it would
> >> be a good idea to turn down the concurrency of the tripleo queue in
> >> the gate if possible. As of late it's been timeouts but we've been
> >> unable to track down why it's timing out specifically.  I personally
> >> have a feeling it's the container download times since we do not have
> >> a local registry available and are only able to leverage the mirrors
> >> for some levels of caching. Unfortunately we don't get the best
> >> information about this out of docker (or the mirrors) and it's really
> >> hard to determine what exactly makes things run a bit slower.
> > 
> > We actually tried this not too long ago 
> > https://git.openstack.org/cgit/openstack-infra/project-config/commit/?id=22d98f7aab0fb23849f715a8796384cffa84600b
> >  but decided to revert it because it didn't decrease the check queue 
> > backlog significantly. We were still running at several hours behind most 
> > of the time.
> 
> I'm surprised to hear that. Counting the tripleo jobs in the gate at 
> positions 11-20 right now, I see around 84 nodes tied up in long-running 
> jobs and another 32 for shorter unit test jobs. The latter probably 
> don't have much impact, but the former is a non-trivial amount. It may 
> not erase the entire 2300+ job queue that we have right now, but it 
> seems like it should help.
> 
> > 
> > If we want to set up better monitoring and measuring and try it again we 
> > can do that. But we probably want to measure queue sizes with and without 
> > the change like that to better understand if it helps.
> 
> This seems like good information to start capturing, otherwise we are 
> kind of just guessing. Is there something in infra alread

Re: [openstack-dev] [tripleo] Zuul Queue backlogs and resource usage

2018-10-30 Thread Clark Boylan
On Tue, Oct 30, 2018, at 10:42 AM, Alex Schultz wrote:
> On Tue, Oct 30, 2018 at 11:36 AM Ben Nemec  wrote:
> >
> > Tagging with tripleo since my suggestion below is specific to that project.
> >
> > On 10/30/18 11:03 AM, Clark Boylan wrote:
> > > Hello everyone,
> > >
> > > A little while back I sent email explaining how the gate queues work and 
> > > how fixing bugs helps us test and merge more code. All of this still is 
> > > still true and we should keep pushing to improve our testing to avoid 
> > > gate resets.
> > >
> > > Last week we migrated Zuul and Nodepool to a new Zookeeper cluster. In 
> > > the process of doing this we had to restart Zuul which brought in a new 
> > > logging feature that exposes node resource usage by jobs. Using this data 
> > > I've been able to generate some report information on where our node 
> > > demand is going. This change [0] produces this report [1].
> > >
> > > As with optimizing software we want to identify which changes will have 
> > > the biggest impact and to be able to measure whether or not changes have 
> > > had an impact once we have made them. Hopefully this information is a 
> > > start at doing that. Currently we can only look back to the point Zuul 
> > > was restarted, but we have a thirty day log rotation for this service and 
> > > should be able to look at a month's worth of data going forward.
> > >
> > > Looking at the data you might notice that Tripleo is using many more node 
> > > resources than our other projects. They are aware of this and have a plan 
> > > [2] to reduce their resource consumption. We'll likely be using this 
> > > report generator to check progress of this plan over time.
> >
> > I know at one point we had discussed reducing the concurrency of the
> > tripleo gate to help with this. Since tripleo is still using >50% of the
> > resources it seems like maybe we should revisit that, at least for the
> > short-term until the more major changes can be made? Looking through the
> > merge history for tripleo projects I don't see a lot of cases (any, in
> > fact) where more than a dozen patches made it through anyway*, so I
> > suspect it wouldn't have a significant impact on gate throughput, but it
> > would free up quite a few nodes for other uses.
> >
> 
> It's the failures in gate and resets.  At this point I think it would
> be a good idea to turn down the concurrency of the tripleo queue in
> the gate if possible. As of late it's been timeouts but we've been
> unable to track down why it's timing out specifically.  I personally
> have a feeling it's the container download times since we do not have
> a local registry available and are only able to leverage the mirrors
> for some levels of caching. Unfortunately we don't get the best
> information about this out of docker (or the mirrors) and it's really
> hard to determine what exactly makes things run a bit slower.

We actually tried this not too long ago 
https://git.openstack.org/cgit/openstack-infra/project-config/commit/?id=22d98f7aab0fb23849f715a8796384cffa84600b
 but decided to revert it because it didn't decrease the check queue backlog 
significantly. We were still running at several hours behind most of the time.

If we want to set up better monitoring and measuring and try it again we can do 
that. But we probably want to measure queue sizes with and without the change 
like that to better understand if it helps.

As for container image download times, can we quantify that via docker logs? 
Basically, sum up the amount of time a job spends downloading images so that 
we can see what the impact is, and also measure whether changes improve it. As 
for other ideas for improving things, it seems like many of the images that 
tripleo uses are quite large. I recall seeing a > 600MB image just for rsyslog. 
Wouldn't it be advantageous for both the gate and tripleo in the real world to 
trim the size of those images (which should improve download times)? In any 
case, quantifying the size of the downloads and trimming them where possible is 
likely also worthwhile.
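
To make that concrete, here's a minimal Python sketch of the kind of 
measurement I mean. The log location and line format below are pure 
assumptions and would need adapting to whatever the tripleo jobs actually 
record:

```python
#!/usr/bin/env python3
"""Sum the time a CI job spent pulling container images.

Minimal sketch only. It assumes (hypothetically) that the job log contains
timestamped lines marking the start and end of each pull, for example:

    2018-10-30 18:01:02.123 | Pulling image: docker.io/tripleomaster/centos-binary-rsyslog
    2018-10-30 18:03:45.456 | Pulled image: docker.io/tripleomaster/centos-binary-rsyslog

Real tripleo/docker logs will need their own markers and regexes.
"""
import datetime
import re
import sys

LINE = re.compile(
    r'^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \| '
    r'(?P<event>Pulling|Pulled) image: (?P<image>\S+)')


def main(path):
    starts = {}
    total = datetime.timedelta()
    with open(path) as log:
        for line in log:
            match = LINE.match(line)
            if not match:
                continue
            ts = datetime.datetime.strptime(
                match.group('ts'), '%Y-%m-%d %H:%M:%S.%f')
            image = match.group('image')
            if match.group('event') == 'Pulling':
                # Remember when this image's pull started.
                starts[image] = ts
            elif image in starts:
                elapsed = ts - starts.pop(image)
                total += elapsed
                print('%s took %s' % (image, elapsed))
    print('Total time spent pulling images: %s' % total)


if __name__ == '__main__':
    main(sys.argv[1])
```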

Clark



[openstack-dev] Zuul Queue backlogs and resource usage

2018-10-30 Thread Clark Boylan
Hello everyone,

A little while back I sent email explaining how the gate queues work and how 
fixing bugs helps us test and merge more code. All of this still is still true 
and we should keep pushing to improve our testing to avoid gate resets.

Last week we migrated Zuul and Nodepool to a new Zookeeper cluster. In the 
process of doing this we had to restart Zuul which brought in a new logging 
feature that exposes node resource usage by jobs. Using this data I've been 
able to generate some report information on where our node demand is going. 
This change [0] produces this report [1].

As with optimizing software we want to identify which changes will have the 
biggest impact and to be able to measure whether or not changes have had an 
impact once we have made them. Hopefully this information is a start at doing 
that. Currently we can only look back to the point Zuul was restarted, but we 
have a thirty day log rotation for this service and should be able to look at a 
month's worth of data going forward.

Looking at the data you might notice that Tripleo is using many more node 
resources than our other projects. They are aware of this and have a plan [2] 
to reduce their resource consumption. We'll likely be using this report 
generator to check progress of this plan over time.

Also related to the long queue backlogs is this proposal [3] to change how Zuul 
prioritizes resource allocations to try to be more fair.

[0] https://review.openstack.org/#/c/613674/
[1] http://paste.openstack.org/show/733644/
[2] http://lists.openstack.org/pipermail/openstack-dev/2018-October/135396.html
[3] http://lists.zuul-ci.org/pipermail/zuul-discuss/2018-October/000575.html

If you find any of this interesting and would like to help, feel free to reach 
out to me or the infra team.

Thank you,
Clark



Re: [openstack-dev] Proposal for a process to keep up with Python releases

2018-10-22 Thread Clark Boylan
On Mon, Oct 22, 2018, at 7:33 AM, Thomas Goirand wrote:
> On 10/19/18 5:17 PM, Zane Bitter wrote:

snip

> > Integration Tests
> > -
> > 
> > Integration tests do test, amongst other things, integration with
> > non-openstack-supplied things in the distro, so it's important that we
> > test on the actual distros we have identified as popular.[2] It's also
> > important that every project be testing on the same distro at the end of
> > a release, so we can be sure they all work together for users.
> 
> I find very disturbing to see the project only leaning toward these only
> 2 distributions. Why not SuSE & Debian?

It has to do with previous statements about distro support from the TC: 
https://governance.openstack.org/tc/reference/project-testing-interface.html#linux-distributions
 is the [2] above. Changing this would be an orthogonal piece of work, even 
though there is a relationship between the two topics. Zane's proposal can 
accommodate changes in the distro support assertion, but it is focused on 
figuring out which python versions to test, with that assertion as one of the 
inputs.

Clark



Re: [openstack-dev] Proposal for a process to keep up with Python releases

2018-10-19 Thread Clark Boylan
On Fri, Oct 19, 2018, at 8:17 AM, Zane Bitter wrote:
> There hasn't been a Python 2 release in 8 years, and during that time 
> we've gotten used to the idea that that's the way things go. However, 
> with the switch to Python 3 looming (we will drop support for Python 2 
> in the U release[1]), history is no longer a good guide: Python 3 
> releases drop as often as every year. We are already feeling the pain 
> from this, as Linux distros have largely already completed the shift to 
> Python 3, and those that have are on versions newer than the py35 we 
> currently have in gate jobs.
> 
> We have traditionally held to the principle that we want each release to 
> support the latest release of CentOS and the latest LTS release of 
> Ubuntu, as they existed at the beginning of the release cycle.[2] 
> Currently this means in practice one version of py2 and one of py3, but 
> in the future it will mean two, usually different, versions of py3.
> 
> There are two separate issues that we need to address: unit tests (we'll 
> define this as code tested in isolation, within or spawned from within 
> the testing process), and integration tests (we'll define this as code 
> running in its own process, tested from the outside). I have two 
> separate but related proposal for how to handle those.
> 
> I'd like to avoid discussion which versions of things we think should be 
> supported in Stein in this thread. Let's come up with a process that we 
> think is a good one to take into T and beyond, and then retroactively 
> apply it to Stein. Competing proposals are of course welcome, in 
> addition to feedback on this one.
> 
> Unit Tests
> --
> 
> For unit tests, the most important thing is to test on the versions of 
> Python we target. It's less important to be using the exact distro that 
> we want to target, because unit tests generally won't interact with 
> stuff outside of Python.
> 
> I'd like to propose that we handle this by setting up a unit test 
> template in openstack-zuul-jobs for each release. So for Stein we'd have 
> openstack-python3-stein-jobs. This template would contain:

Because zuul config is branch specific we could set up every project to use an 
`openstack-python3-jobs` template and then define that template differently on 
each branch. This would mean you only have to update the location where the 
template is defined rather than needing to update every other project after 
cutting a stable branch. I would suggest we take advantage of that to reduce 
churn.

> 
> * A voting gate job for the highest minor version of py3 we want to 
> support in that release.
> * A voting gate job for the lowest minor version of py3 we want to 
> support in that release.
> * A periodic job for any interim minor releases.
> * (Starting late in the cycle) a non-voting check job for the highest 
> minor version of py3 we want to support in the *next* release (if 
> different), on the master branch only.
> 
> So, for example, (and this is still under active debate) for Stein we 
> might have gating jobs for py35 and py37, with a periodic job for py36. 
> The T jobs might only have voting py36 and py37 jobs, but late in the T 
> cycle we might add a non-voting py38 job on master so that people who 
> haven't switched to the U template yet can see what, if anything, 
> they'll need to fix.
> 
> We'll run the unit tests on any distro we can find that supports the 
> version of Python we want. It could be a non-LTS Ubuntu, Fedora, Debian 
> unstable, whatever it takes. We won't wait for an LTS Ubuntu to have a 
> particular Python version before trying to test it.
> 
> Before the start of each cycle, the TC would determine which range of 
> versions we want to support, on the basis of the latest one we can find 
> in any distro and the earliest one we're likely to need in one of the 
> supported Linux distros. There will be a project-wide goal to switch the 
> testing template from e.g. openstack-python3-stein-jobs to 
> openstack-python3-treasure-jobs for every repo before the end of the 
> cycle. We'll have goal champions as usual following up and helping teams 
> with the process. We'll know where the problem areas are because we'll 
> have added non-voting jobs for any new Python versions to the previous 
> release's template.

I don't know that this needs to be a project-wide goal if you can just update 
the template on the master branch where the template is defined. Do that, and 
every project is then running with the up-to-date version of the template. We 
should probably advertise when this is happening with some links to python 
version x.y breakages/features, but the process itself should be quick.

As for python version range selection, I worry that the criteria above rely on 
too much guesswork. I do think we should do our best to test future 
incoming versions of python even while not officially supporting them. We will 
have to support them at some point, either directly or via some later version 

Re: [openstack-dev] [tripleo][ci] Having more that one queue for gate pipeline at tripleo

2018-10-11 Thread Clark Boylan
On Thu, Oct 11, 2018, at 7:17 AM, Ben Nemec wrote:
> 
> 
> On 10/11/18 8:53 AM, Felix Enrique Llorente Pastora wrote:
> > So for example, I don't see why changes at tripleo-quickstart can be 
> > reset if tripleo-ui fails, this is the kind of thing that maybe can be 
> > optimize.
> 
> Because if two incompatible changes are proposed to tripleo-quickstart 
> and tripleo-ui and both end up in parallel gate queues at the same time, 
> it's possible both queues could get wedged. Quickstart and the UI are 
> not completely independent projects. Quickstart has roles for deploying 
> the UI, which means there is a connection there.
> 
> I think the only way you could have independent gate queues is if you 
> had two disjoint sets of projects that could be gated without any use of 
> projects from the other set. I don't think it's possible to divide 
> TripleO in that way, but if I'm wrong then maybe you could do multiple 
> queues.

To follow up on this, the gate pipeline queue that your projects belong to is 
how you indicate to Zuul that there is coupling between those projects. Having 
things set up in this way allows you to ensure (through the gate and Zuul's 
speculative future states) that a change to one project in the queue can't 
break another, because they are tested together.

If your concern is "time to merge" splitting queues won't help all that much 
unless you put all of the unreliable broken code with broken tests in one queue 
and have the reliable code in another queue. Zuul tests everything in parallel 
within a queue. This means that if your code base and its tests are reliable 
you can merge 20 changes all at once and the time to merge for all 20 changes 
is the same as a single change. Problems arise when tests fail and these future 
states have to be updated and retested. This will affect one or many queues.

The fix here is to work on making reliable test jobs so that you can merge all 
20 changes in the span of time it takes to merge a single change.  This isn't 
necessarily easy, but helps you merge more code and be confident it works too.

Clark



Re: [openstack-dev] [oslo][taskflow] Thoughts on moving taskflow out of openstack/oslo

2018-10-10 Thread Clark Boylan
On Wed, Oct 10, 2018, at 11:35 AM, Greg Hill wrote:
> > I'm not sure how using pull requests instead of Gerrit changesets would
> > help "core reviewers being pulled on to other projects"?
> >
> 
> The 2 +2 requirement works for larger projects with a lot of contributors.
> When you have only 3 regular contributors and 1 of them gets pulled on to a
> project and can no longer actively contribute, you have 2 developers who
> can +2 each other but nothing can get merged without that 3rd dev finding
> time to add another +2. This is what happened with Taskflow a few years
> back. Eventually the other 2 gave up and moved on also.
> 

To be clear, this isn't enforced by anything but your reviewer practices. What 
is enforced is that you have a +2 Verified vote, a +2 code review, and a +1 
Workflow (this is a Gerrit submit requirements function that is also 
configurable per project). OpenStack requiring multiple +2 code reviews is 
enforced by humans, so maybe the discussion should be "should taskflow and 
related tools allow single code review approval (and possibly self-approval by 
any remaining cores)?"

It might be a worthwhile discussion to reevaluate whether or not the humans 
should continue to enforce this rule on all code bases independent of what 
happens with taskflow.

> 
> > Is this just about preferring not having a non-human gatekeeper like
> > Gerrit+Zuul and being able to just have a couple people merge whatever
> > they want to the master HEAD without needing to talk about +2/+W rights?
> >
> 
> We plan to still have a CI gatekeeper, probably Travis CI, to make sure PRs
> past muster before being merged, so it's not like we're wanting to
> circumvent good contribution practices by committing whatever to HEAD. But
> the +2/+W rights thing was a huge PITA to deal with with so few
> contributors, for sure.
> 
> If it's just about preferring the pull request workflow versus the
> > Gerrit rebase workflow, just say so. Same for just preferring the Github
> > UI versus Gerrit's UI (which I agree is awful).
> >
> 
> I mean, yes, I personally prefer the Github UI and workflow, but that was
> not a primary consideration. I got used to using gerrit well enough. It was
> mostly the  There's also a sense that if a project is in the Openstack
> umbrella, it's not useful outside Openstack, and Taskflow is designed to be
> a general purpose library. The hope is that just making it a regular open
> source project might attract more users and contributors. This may or may
> not bear out, but as it is, there's no real benefit to staying an openstack
> project on this front since nobody is actively working on it within the
> community.
> 
> Greg




Re: [openstack-dev] [manila] [infra] remove driverfixes/ocata branch [was: Re: [cinder][infra] Remove driverfixes/ocata branch]

2018-10-05 Thread Clark Boylan
On Fri, Oct 5, 2018, at 12:44 PM, Tom Barron wrote:
> Clark, would you be so kind, at your convenience, as to remove the 
> manila driverfixes/ocata branch?
> 
> There are no open changes on the branch and `git log 
> origin/driverfixes/ocata ^origin/stable/ocata --no-merges --oneline` 
> reveals no commits that we need to preserve.
> 
> Thanks much!
> 

Done. The old head of that branch was d9c0f8fa4b15a595ed46950b6e5b5d1b4514a7e4.

Clark



Re: [openstack-dev] [all] Zuul job backlog

2018-10-04 Thread Clark Boylan
On Thu, Oct 4, 2018, at 12:16 AM, Abhishek Kekane wrote:
> Hi,
> Could you please point out some of the glance functional tests which are
> failing and causing this resets?
> I will like to put some efforts towards fixing those.

http://status.openstack.org/elastic-recheck/data/integrated_gate.html is a good 
place to start. That shows you a list of tests that failed in the OpenStack 
integrated gate for which elastic-recheck could not identify the failure, 
including those for several functional jobs.

If you'd like to start looking at identified bugs first then 
http://status.openstack.org/elastic-recheck/gate.html shows identified failures 
that happened in the gate.

For glance functional jobs the first link points to:
http://logs.openstack.org/99/595299/1/gate/openstack-tox-functional/fc13eca/
http://logs.openstack.org/44/569644/3/gate/openstack-tox-functional/b7c487c/
http://logs.openstack.org/99/595299/1/gate/openstack-tox-functional-py35/b166313/
http://logs.openstack.org/44/569644/3/gate/openstack-tox-functional-py35/ce262ab/

Clark



Re: [openstack-dev] [all] Zuul job backlog

2018-09-28 Thread Clark Boylan
On Wed, Sep 19, 2018, at 12:11 PM, Clark Boylan wrote:
> Hello everyone,
> 
> You may have noticed there is a large Zuul job backlog and changes are 
> not getting CI reports as quickly as you might expect. There are several 
> factors interacting with each other to make this the case. The short 
> version is that one of our clouds is performing upgrades and has been 
> removed from service, and we have a large number of gate failures which 
> cause things to reset and start over. We have fewer resources than 
> normal and are using them inefficiently. Zuul is operating as expected.
> 
> Continue reading if you'd like to understand the technical details and 
> find out how you can help make this better.
> 
> Zuul gates related projects in shared queues. Changes enter these queues 
> and are ordered in a speculative future state that Zuul assumes will 
> pass because multiple humans have reviewed the changes and said they are 
> good (also they had to pass check testing first). Problems arise when 
> tests fail forcing Zuul to evict changes from the speculative future 
> state, build a new state, then start jobs over again for this new 
> future.
> 
> Typically this doesn't happen often and we merge many changes at a time, 
> quickly pushing code into our repos. Unfortunately, the results are 
> painful when we fail often as we end up rebuilding future states and 
> restarting jobs often. Currently we have the gate and release jobs set 
> to the highest priority as well so they run jobs before other queues. 
> This means the gate can starve other work if it is flaky. We've 
> configured things this way because the gate is not supposed to be flaky 
> since we've reviewed things and already passed check testing. One of the 
> tools we have in place to make this less painful is each gate queue 
> operates on a window that grows and shrinks similar to how TCP 
> slowstart. As changes merge we increase the size of the window and when 
> they fail to merge we decrease it. This reduces the size of the future 
> state that must be rebuilt and retested on failure when things are 
> persistently flaky.
> 
> The best way to make this better is to fix the bugs in our software, 
> whether that is in the CI system itself or the software being tested. 
> The first step in doing that is to identify and track the bugs that we 
> are dealing with. We have a tool called elastic-recheck that does this 
> using indexed logs from the jobs. The idea there is to go through the 
> list of unclassified failures [0] and fingerprint them so that we can 
> track them [1]. With that data available we can then prioritize fixing 
> the bugs that have the biggest impact.
> 
> Unfortunately, right now our classification rate is very poor (only 
> 15%), which makes it difficult to know what exactly is causing these 
> failures. Mriedem and I have quickly scanned the unclassified list, and 
> it appears there is a db migration testing issue causing these tests to 
> timeout across several projects. Mriedem is working to get this 
> classified and tracked which should help, but we will also need to fix 
> the bug. On top of that it appears that Glance has flaky functional 
> tests (both python2 and python3) which are causing resets and should be 
> looked into.
> 
> If you'd like to help, let mriedem or myself know and we'll gladly work 
> with you to get elasticsearch queries added to elastic-recheck. We are 
> likely less help when it comes to fixing functional tests in Glance, but 
> I'm happy to point people in the right direction for that as much as I 
> can. If you can take a few minutes to do this before/after you issue a 
> recheck it does help quite a bit.
> 
> One general thing I've found would be helpful is if projects can clean 
> up the deprecation warnings in their log outputs. The persistent 
> "WARNING you used the old name for a thing" messages make the logs large 
> and much harder to read to find the actual failures.
> 
> As a final note this is largely targeted at the OpenStack Integrated 
> gate (Nova, Glance, Cinder, Keystone, Swift, Neutron) since that appears 
> to be particularly flaky at the moment. The Zuul behavior applies to 
> other gate pipelines (OSA, Tripleo, Airship, etc) as does elastic-
> recheck and related tooling. If you find your particular pipeline is 
> flaky I'm more than happy to help in that context as well.
> 
> [0] http://status.openstack.org/elastic-recheck/data/integrated_gate.html
> [1] http://status.openstack.org/elastic-recheck/gate.html

I was asked to write a follow-up to this as the long Zuul queues have persisted 
through this week, largely because the situation from last week hasn't changed 
much. We were down the upgraded cloud region while we

[openstack-dev] [all] Zuul job backlog

2018-09-19 Thread Clark Boylan
Hello everyone,

You may have noticed there is a large Zuul job backlog and changes are not 
getting CI reports as quickly as you might expect. There are several factors 
interacting with each other to make this the case. The short version is that 
one of our clouds is performing upgrades and has been removed from service, and 
we have a large number of gate failures which cause things to reset and start 
over. We have fewer resources than normal and are using them inefficiently. 
Zuul is operating as expected.

Continue reading if you'd like to understand the technical details and find out 
how you can help make this better.

Zuul gates related projects in shared queues. Changes enter these queues and 
are ordered in a speculative future state that Zuul assumes will pass because 
multiple humans have reviewed the changes and said they are good (also they had 
to pass check testing first). Problems arise when tests fail forcing Zuul to 
evict changes from the speculative future state, build a new state, then start 
jobs over again for this new future.

Typically this doesn't happen often and we merge many changes at a time, 
quickly pushing code into our repos. Unfortunately, the results are painful 
when we fail often as we end up rebuilding future states and restarting jobs 
often. Currently we have the gate and release jobs set to the highest priority 
as well so they run jobs before other queues. This means the gate can starve 
other work if it is flaky. We've configured things this way because the gate is 
not supposed to be flaky since we've reviewed things and already passed check 
testing. One of the tools we have in place to make this less painful is that 
each gate queue operates on a window that grows and shrinks, similar to TCP 
slow start. As changes merge we increase the size of the window, and when they 
fail to merge we decrease it. This reduces the size of the future state that 
must be rebuilt and retested on failure when things are persistently flaky.
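
To make the windowing analogy concrete, here is a toy sketch of the general 
idea. It is an illustration only, not Zuul's actual code; the growth and 
shrink rules and the numbers are made up:

```python
# Toy illustration of a TCP-slowstart-style gate window (not Zuul's real
# implementation; the rules and numbers here are invented for illustration).

def simulate(results, start=20, floor=3):
    """results is a sequence of booleans: True means the head change merged."""
    window = start
    for merged in results:
        if merged:
            window += 1                       # success: test more changes at once
        else:
            window = max(floor, window // 2)  # failure: back off sharply
        print('merged=%s window=%d' % (merged, window))


if __name__ == '__main__':
    # A healthy stretch followed by a flaky stretch.
    simulate([True] * 5 + [False] * 3 + [True] * 2)
```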

The best way to make this better is to fix the bugs in our software, whether 
that is in the CI system itself or the software being tested. The first step in 
doing that is to identify and track the bugs that we are dealing with. We have 
a tool called elastic-recheck that does this using indexed logs from the jobs. 
The idea there is to go through the list of unclassified failures [0] and 
fingerprint them so that we can track them [1]. With that data available we can 
then prioritize fixing the bugs that have the biggest impact.

Unfortunately, right now our classification rate is very poor (only 15%), which 
makes it difficult to know what exactly is causing these failures. Mriedem and 
I have quickly scanned the unclassified list, and it appears there is a db 
migration testing issue causing these tests to timeout across several projects. 
Mriedem is working to get this classified and tracked which should help, but we 
will also need to fix the bug. On top of that it appears that Glance has flaky 
functional tests (both python2 and python3) which are causing resets and should 
be looked into.

If you'd like to help, let mriedem or myself know and we'll gladly work with 
you to get elasticsearch queries added to elastic-recheck. We are likely less 
help when it comes to fixing functional tests in Glance, but I'm happy to point 
people in the right direction for that as much as I can. If you can take a few 
minutes to do this before/after you issue a recheck it does help quite a bit.

One general thing I've found would be helpful is if projects can clean up the 
deprecation warnings in their log outputs. The persistent "WARNING you used the 
old name for a thing" messages make the logs large and much harder to read to 
find the actual failures.

As a final note this is largely targeted at the OpenStack Integrated gate 
(Nova, Glance, Cinder, Keystone, Swift, Neutron) since that appears to be 
particularly flaky at the moment. The Zuul behavior applies to other gate 
pipelines (OSA, Tripleo, Airship, etc) as does elastic-recheck and related 
tooling. If you find your particular pipeline is flaky I'm more than happy to 
help in that context as well.

[0] http://status.openstack.org/elastic-recheck/data/integrated_gate.html
[1] http://status.openstack.org/elastic-recheck/gate.html

Thank you,
Clark



Re: [openstack-dev] [python3] tempest and grenade conversion to python 3.6

2018-09-18 Thread Clark Boylan
On Tue, Sep 18, 2018, at 9:46 AM, Nate Johnston wrote:
> Hello python 3.6 champions,
> 
> I have looked around a little, and I don't see a method for me to
> specifically select the version of python that the tempest and grenade
> jobs for my project (neutron) are using.  I assume one of four things
> is at play here:
> 
> A. These projects already shifted to python 3 and I don't have to worry
> about it
> 
> B. There is a toggle for the python version I just have not seen yet
> 
> C. These projects are still on python 2 and need help to do a conversion
> to python 3, which would affect all customers
> 
> D. Something else that I have failed to imagine
> 
> Could you elaborate which of these options properly reflects the state
> of affairs?  If the answer is "C" then perhaps we can start a discussion
> on that migration.

For our devstack and grenade jobs tempest is installed using tox [0]. And since 
the full testenv in tempest's tox.ini doesn't specify a python version [1] I 
expect that it will attempt a python2 virtualenv on every platform (Arch linux 
may be an exception but we don't test that).

I think that means C is the situation here. To change that you can set 
basepython to python3 (see [2] for an example) which will run tempest under 
whichever python3 is present on the system. The one gotcha for this is that it 
will break tempest on centos which does not have python3. Maybe the thing to do 
there is add a full-python2 testenv that centos can run?

[0] https://git.openstack.org/cgit/openstack-dev/devstack/tree/lib/tempest#n653
[1] https://git.openstack.org/cgit/openstack/tempest/tree/tox.ini#n74
[2] https://git.openstack.org/cgit/openstack-infra/zuul/tree/tox.ini#n7

Hope this helps,
Clark



Re: [openstack-dev] [cinder][infra] Remove driverfixes/ocata branch

2018-09-17 Thread Clark Boylan
On Mon, Sep 17, 2018, at 8:53 AM, Jay S Bryant wrote:
> 
> 
> On 9/17/2018 10:46 AM, Sean McGinnis wrote:
> >>> Plan
> >>> 
> >>> We would now like to have the driverfixes/ocata branch deleted so there 
> >>> is no
> >>> confusion about where backports should go and we don't accidentally get 
> >>> these
> >>> out of sync again.
> >>>
> >>> Infra team, please delete this branch or let me know if there is a process
> >>> somewhere I should follow to have this removed.
> >> The first step is to make sure that all changes on the branch are in a non 
> >> open state (merged or abandoned). 
> >> https://review.openstack.org/#/q/project:openstack/cinder+branch:driverfixes/ocata+status:open
> >>  shows that there are no open changes.
> >>
> >> Next you will want to make sure that the commits on this branch are 
> >> preserved somehow. Git garbage collection will delete and cleanup commits 
> >> if they are not discoverable when working backward from some ref. This is 
> >> why our old stable branch deletion process required we tag the stable 
> >> branch as $release-eol first. Looking at `git log origin/driverfixes/ocata 
> >> ^origin/stable/ocata --no-merges --oneline` there are quite a few commits 
> >> on the driverfixes branch that are not on the stable branch, but that 
> >> appears to be due to cherry pick writing new commits. You have indicated 
> >> above that you believe the two branches are in sync at this point. A quick 
> >> sampling of commits seems to confirm this as well.
> >>
> >> If you can go ahead and confirm that you are ready to delete the 
> >> driverfixes/ocata branch I will go ahead and remove it.
> >>
> >> Clark
> >>
> > I did another spot check too to make sure I hadn't missed anything, but it 
> > does
> > appear to be as you stated that the cherry pick resulted in new commits and
> > they actually are in sync for our purposes.
> >
> > I believe we are ready to proceed.
> Sean,
> 
> Thank you for following up on this.  I agee it is a good idea to remove 
> the old driverfixes/ocata branch to avoid possible confusion in the future.
> 
> Clark,
> 
> Sean, myself and the team worked to carefully cherry-pick everything 
> that was needed in stable/ocata so I am confident that we are ready to 
> remove driverfixes/ocata.
> 

I have removed openstack/cinder driverfixes/ocata branch with HEAD 
a37cc259f197e1a515cf82deb342739a125b65c6.

Clark



Re: [openstack-dev] [cinder][infra] Remove driverfixes/ocata branch

2018-09-17 Thread Clark Boylan
On Mon, Sep 17, 2018, at 8:00 AM, Sean McGinnis wrote:
> Hello Cinder and Infra teams. Cinder needs some help from infra or some
> pointers on how to proceed.
> 
> tl;dr - The openstack/cinder repo had a driverfixes/ocata branch created for
> fixes that no longer met the more restrictive phase II stable policy criteria.
> Extended maintenance has changed that and we want to delete driverfixes/ocata
> to make sure patches are going to the right place.
> 
> Background
> --
> Before the extended maintenance changes, the Cinder team found a lot of 
> vendors
> were maintaining their own forks to keep backported driver fixes that we were
> not allowing upstream due to the stable policy being more restrictive for 
> older
> (or deleted) branches. We created the driverfixes/* branches as a central 
> place
> for these to go so distros would have one place to grab these fixes, if they
> chose to do so.
> 
> This has worked great IMO, and we do occasionally still have things that need
> to go to driverfixes/mitaka and driverfixes/newton. We had also pushed a lot 
> of
> fixes to driverfixes/ocata, but with the changes to stable policy with 
> extended
> maintenance, that is no longer needed.
> 
> Extended Maintenance Changes
> 
> With things being somewhat relaxed with the extended maintenance changes, we
> are now able to backport bug fixes to stable/ocata that we couldn't before and
> we don't have to worry as much about that branch being deleted.
> 
> I had gone through and identified all patches backported to driverfixes/ocata
> but not stable/ocata and cherry-picked them over to get the two branches in
> sync. The stable/ocata should now be identical or ahead of driverfixes/ocata
> and we want to make sure nothing more gets accidentally merged to
> driverfixes/ocata instead of the official stable branch.
> 
> Plan
> 
> We would now like to have the driverfixes/ocata branch deleted so there is no
> confusion about where backports should go and we don't accidentally get these
> out of sync again.
> 
> Infra team, please delete this branch or let me know if there is a process
> somewhere I should follow to have this removed.

The first step is to make sure that all changes on the branch are in a non open 
state (merged or abandoned). 
https://review.openstack.org/#/q/project:openstack/cinder+branch:driverfixes/ocata+status:open
 shows that there are no open changes.

Next you will want to make sure that the commits on this branch are preserved 
somehow. Git garbage collection will delete and clean up commits if they are 
not discoverable when working backward from some ref. This is why our old 
stable branch deletion process required we tag the stable branch as 
$release-eol first. Looking at `git log origin/driverfixes/ocata 
^origin/stable/ocata --no-merges --oneline` there are quite a few commits on 
the driverfixes branch that are not on the stable branch, but that appears to 
be due to cherry picks writing new commits. You have indicated above that you 
believe the two branches are in sync at this point. A quick sampling of commits 
seems to confirm this as well.
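
For anyone repeating that check on another repository, here is a small sketch 
wrapping the same `git log` invocation. It lists commits reachable from the 
branch to be deleted but not from the branch expected to contain them; any 
output still needs a human eye, since cherry-picked commits get new hashes:

```python
#!/usr/bin/env python3
"""List commits on a to-be-deleted branch that are not on the keeper branch.

Thin wrapper around:

    git log origin/<doomed> ^origin/<keep> --no-merges --oneline

Run from within a clone with up-to-date origin refs.
"""
import subprocess
import sys


def unique_commits(doomed, keep):
    out = subprocess.check_output(
        ['git', 'log', 'origin/%s' % doomed, '^origin/%s' % keep,
         '--no-merges', '--oneline'])
    return out.decode('utf-8').splitlines()


if __name__ == '__main__':
    doomed, keep = sys.argv[1:3]  # e.g. driverfixes/ocata stable/ocata
    commits = unique_commits(doomed, keep)
    if commits:
        print('%d commits only on %s (may just be cherry-pick rewrites):'
              % (len(commits), doomed))
        print('\n'.join(commits))
    else:
        print('No commits unique to %s.' % doomed)
```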

If you can go ahead and confirm that you are ready to delete the 
driverfixes/ocata branch I will go ahead and remove it.

Clark



Re: [openstack-dev] [goals][python3] mixed versions?

2018-09-12 Thread Clark Boylan
On Wed, Sep 12, 2018, at 10:23 AM, Jim Rollenhagen wrote:
> The process of operators upgrading Python versions across their fleet came
> up this morning. It's fairly obvious that operators will want to do this in
> a rolling fashion.
> 
> Has anyone considered doing this in CI? For example, running multinode
> grenade with python 2 on one node and python 3 on the other node.
> 
> Should we (openstack) test this situation, or even care?
> 

This came up in a Vancouver summit session (the python3 one I think). The 
general consensus there seemed to be that we should have grenade jobs that run 
python2 on the old side and python3 on the new side, testing the upgrade from 
one to the other across a release that way. Additionally there was a thought 
that the nova partial job (and similar grenade jobs) could hold the 
non-upgraded node on python2, which would then talk to a python3 control plane.

I haven't seen or heard of anyone working on this yet though.

Clark



Re: [openstack-dev] [nova] CI job running functional against a mysql DB

2018-08-13 Thread Clark Boylan
On Mon, Aug 13, 2018, at 1:50 AM, Matthew Booth wrote:
> I was reviewing https://review.openstack.org/#/c/504885/ . The change
> looks good to me and I believe the test included exercises the root
> cause of the problem. However, I'd like to be certain that the test
> has been executed against MySQL rather than, eg, SQLite.
> 
> Zuul has voted +1 on the change. Can anybody tell me if any of those
> jobs ran the included functional test against a MySQL DB?,

Both functional jobs configured a MySQL and a PostgreSQL database for use by 
the test suite [0][1]. Looking at Nova's tests, the migration tests 
(nova/tests/functional/db/api/test_migrations.py and 
nova/tests/unit/db/test_migrations.py) use the oslo.db ModelsMigrationsSync 
class, which should use these real databases. I'm not finding evidence that any 
other test classes will use the real databases.

[0] 
http://logs.openstack.org/85/504885/9/check/nova-tox-functional/fa3327b/job-output.txt.gz#_2018-08-13_10_32_09_943951
[1] 
http://logs.openstack.org/85/504885/9/check/nova-tox-functional-py35/1f04657/job-output.txt.gz#_2018-08-13_10_31_00_289802
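
For reference, below is a heavily simplified sketch of how a test plugs into 
oslo.db's ModelsMigrationsSync. The real Nova tests use oslo.db's opportunistic 
database fixtures to reach the MySQL/PostgreSQL servers the job configures; the 
environment-variable engine and the `my_project` imports here are placeholders 
rather than Nova's actual code:

```python
# Rough sketch of a models-vs-migrations sync test built on oslo.db's
# ModelsMigrationsSync mixin. The engine handling and the my_project imports
# are placeholders; real Nova tests use oslo.db's opportunistic DB fixtures.
import os
import unittest

import sqlalchemy
from oslo_db.sqlalchemy import test_migrations

from my_project.db import migration  # hypothetical project migration API
from my_project.db import models     # hypothetical project models


class TestModelsSyncMySQL(test_migrations.ModelsMigrationsSync,
                          unittest.TestCase):
    """Check that the models and the migrations produce the same schema."""

    def get_engine(self):
        # In the gate this would be the MySQL database the functional job
        # configured; here we just read a URL from the environment.
        return sqlalchemy.create_engine(os.environ['DB_TEST_URL'])

    def get_metadata(self):
        return models.BASE.metadata

    def db_sync(self, engine):
        # Run the project's migrations against the engine under test.
        migration.db_sync(engine=engine)
```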

Hope this helps,
Clark



[openstack-dev] [all] Gerrit project renaming Friday August 3 at 16:00UTC

2018-08-01 Thread Clark Boylan
Hello everyone,

The infra team will be renaming a couple of projects in Gerrit on Friday 
starting at 16:00UTC. This requires us to restart Gerrit on 
review.openstack.org. The total noticeable downtime should be no more than 
about 10 minutes.

If you would like to follow along, the current process is laid out at 
https://etherpad.openstack.org/p/project-renames-2018-08-03.

Let us know if you have questions or concerns,
Clark



Re: [openstack-dev] [infra][nova] Running NFV tests in CI

2018-07-24 Thread Clark Boylan
On Tue, Jul 24, 2018, at 10:21 AM, Artom Lifshitz wrote:
> On Tue, Jul 24, 2018 at 12:30 PM, Clark Boylan  wrote:
> > On Tue, Jul 24, 2018, at 9:23 AM, Artom Lifshitz wrote:
> >> Hey all,
> >>
> >> tl;dr Humbly requesting a handful of nodes to run NFV tests in CI
> >>
> >> Intel has their NFV tests tempest plugin [1] and manages a third party
> >> CI for Nova. Two of the cores on that project (Stephen Finucane and
> >> Sean Mooney) have now moved to Red Hat, but the point still stands
> >> that there's a need and a use case for testing things like NUMA
> >> topologies, CPU pinning and hugepages.
> >>
> >> At Red Hat, we also have a similar tempest plugin project [2] that we
> >> use for downstream whitebox testing. The scope is a bit bigger than
> >> just NFV, but the main use case is still testing NFV code in an
> >> automated way.
> >>
> >> Given that there's a clear need for this sort of whitebox testing, I
> >> would like to humbly request a handful of nodes (in the 3 to 5 range)
> >> from infra to run an "official" Nova NFV CI. The code doing the
> >> testing would initially be the current Intel plugin, but we could have
> >> a separate discussion about keeping "Intel" in the name or forking
> >> and/or renaming it to something more vendor-neutral.
> >
> > The way you request nodes from Infra is through your Zuul configuration. 
> > Add jobs to a project to run tests on the node labels that you want.
> 
> Aha, thanks, I'll look into that. I was coming from a place of
> complete ignorance about infra.
> >
> > I'm guessing this process doesn't work for NFV tests because you have 
> > specific hardware requirements that are not met by our current VM resources?
> > If that is the case it would probably be best to start by documenting what 
> > is required and where the existing VM resources fall
> > short.
> 
> Well, it should be possible to do most of what we'd like with nested
> virt and virtual NUMA topologies, though things like hugepages will
> need host configuration, specifically the kernel boot command [1]. Is
> that possible with the nodes we have?

https://docs.openstack.org/infra/manual/testing.html attempts to give you an
idea of what is currently available in the test environments.

Nested virt has historically been painful because not all clouds support it,
and those that did support it did not do so reliably (VMs and possibly
hypervisors would crash). This has gotten better recently as more people have
an interest in getting nested virt working, but it is still hit and miss,
particularly as you use newer kernels in guests. I think if we continue to work
together with our clouds (thank you limestone, OVH, and vexxhost!) we may be
able to work out nested virt that is redundant across multiple clouds. We will
likely need individuals willing to keep caring for that and to debug problems
when the next release of your favorite distro shows up, though. Can you get by
with qemu, or is nested virt required?

As for hugepages, I've done a quick survey of cpuinfo across our clouds; all
seem to have pse available, but not all have pdpe1gb. Are you using 1GB
hugepages? Keep in mind that the test VMs only have 8GB of memory total. As for
booting with special kernel parameters, you can have your job make those
modifications to the test environment and then reboot it within the job. There
is some Zuul-specific housekeeping that needs to be done post reboot; we can
figure that out if we decide to go down this route. Would your
setup work with 2M hugepages?
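
A rough sketch of that pre-run step as an Ansible play (Ubuntu/Debian-flavored
grub handling and illustrative values; the Zuul post-reboot housekeeping
mentioned above is omitted):

  - hosts: all
    become: true
    tasks:
      - name: Reserve 2M hugepages on the kernel command line
        lineinfile:
          path: /etc/default/grub
          regexp: '^GRUB_CMDLINE_LINUX='
          line: 'GRUB_CMDLINE_LINUX="hugepagesz=2M hugepages=512"'
      - name: Regenerate the grub config
        command: update-grub
      - name: Reboot the test node
        shell: sleep 2 && reboot
        async: 1
        poll: 0
      - name: Wait for the node to come back
        wait_for_connection:
          delay: 15
          timeout: 300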

> 
> > In general though we operate on top of donated cloud resources, and if 
> > those do not work we will have to identify a source of resources that would 
> > work.
> 
> Right, as always it comes down to resources and money. I believe
> historically Red Hat has been opposed to running an upstream third
> party CI (this is by no means an official Red Hat position, just
> remembering what I think I heard), but I can always see what I can do.
> 
> [1] 
> https://docs.openstack.org/nova/latest/admin/huge-pages.html#enabling-huge-pages-on-the-host


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra][nova] Running NFV tests in CI

2018-07-24 Thread Clark Boylan
On Tue, Jul 24, 2018, at 9:23 AM, Artom Lifshitz wrote:
> Hey all,
> 
> tl;dr Humbly requesting a handful of nodes to run NFV tests in CI
> 
> Intel has their NFV tests tempest plugin [1] and manages a third party
> CI for Nova. Two of the cores on that project (Stephen Finucane and
> Sean Mooney) have now moved to Red Hat, but the point still stands
> that there's a need and a use case for testing things like NUMA
> topologies, CPU pinning and hugepages.
> 
> At Red Hat, we also have a similar tempest plugin project [2] that we
> use for downstream whitebox testing. The scope is a bit bigger than
> just NFV, but the main use case is still testing NFV code in an
> automated way.
> 
> Given that there's a clear need for this sort of whitebox testing, I
> would like to humbly request a handful of nodes (in the 3 to 5 range)
> from infra to run an "official" Nova NFV CI. The code doing the
> testing would initially be the current Intel plugin, but we could have
> a separate discussion about keeping "Intel" in the name or forking
> and/or renaming it to something more vendor-neutral.

The way you request nodes from Infra is through your Zuul configuration. Add 
jobs to a project to run tests on the node labels that you want.

I'm guessing this process doesn't work for NFV tests because you have specific 
hardware requirements that are not met by our current VM resources? If that is 
the case it would probably be best to start by documenting what is required and 
where the existing VM resources fall short. In general though we operate on top 
of donated cloud resources, and if those do not work we will have to identify a 
source of resources that would work.

> 
> I won't be at PTG (conflict with personal travel), so I'm kindly
> asking Stephen and Sean to represent this idea in Denver.
> 
> Cheers!
> 
> [1] https://github.com/openstack/intel-nfv-ci-tests
> [2] 
> https://review.rdoproject.org/r/#/admin/projects/openstack/whitebox-tempest-plugin

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] OpenStack lagging behind 2 major python versions: we need a Python 3.7 gate

2018-07-18 Thread Clark Boylan
On Thu, Jul 12, 2018, at 1:38 PM, Thomas Goirand wrote:
> Hi everyone!
> 
> It's yet another of these emails where I'm going to complain out of
> frustration because of OpenStack having bugs when running with the
> newest stuff... Sorry in advance ! :)
> 
> tl;dr: It's urgent, we need Python 3.7 uwsgi + SSL gate jobs.
> 
> Longer version:
> 
> When Python 3.6 reached Debian, I already forwarded a few patches. It
> went quite OK, but still... When switching services to Python 3 for
> Newton, I discovered that many services still had issues with uwsgi /
> mod_wsgi, and I spent a large amount of time trying to figure out ways
> to fix the situation. Some patches are still not merged, even though
> it was a community goal to have this support for Newton:
> 
> Neutron:
> https://review.openstack.org/#/c/555608/
> https://review.openstack.org/#/c/580049/
> 
> Neutron FWaaS:
> https://review.openstack.org/#/c/580327/
> https://review.openstack.org/#/c/579433/
> 
> Horizon tempest plugin:
> https://review.openstack.org/#/c/575714/
> 
> oslotest (clearly, the -1 is from someone considering only Devstack /
> venv, not understanding the packaging environment):
> https://review.openstack.org/#/c/571962/
> 
> Designate:
> As much as I know, it still doesn't support uwsgi / mod_wsgi (please let
> me know if this changed recently).
> 
> There may be more, I didn't have much time investigating some projects
> which are less important to me.
> 
> Now, both Debian and Ubuntu have Python 3.7. Every package which I
> upload in Sid needs to support that. Yet, OpenStack's CI is still lagging
> behind with Python 3.5. And there are lots of things currently broken. We've
> fixed most of the "async" stuff, though we are failing to rebuild
> oslo.messaging (from Queens) with Python 3.7: unit tests are just
> hanging, doing nothing.
> 
> I'm very happy to make small contributions to each and every component
> here and there whenever possible, but this time it's becoming a
> little bit frustrating. I even got replies like "hum ...
> OpenStack only supports Python 3.5" a few times. That's not really
> acceptable, unfortunately.
> 
> So moving forward, what I think needs to happen is:
> 
> - Get each and every project to actually gate using uwsgi for the API,
> using both Python 3 and SSL (any other test environment is *NOT* a real
> production environment).
> 
> - The gating has to happen with whatever is the latest Python 3 version
> available. Best would even be if we could have that *BEFORE* it reaches
> distributions like Debian and Ubuntu. I'm aware that there's been some
> attempts in the OpenStack infra to have Debian Sid (which is probably
> the distribution getting the updates the faster). This effort needs to
> be restarted, and some (non-voting ?) gate jobs needs to be setup using
> whatever the latest thing is. If it cannot happen with Sid, then I don't
> know, choose another platform, and do the Python 3-latest gating...

When you asked about this last month I suggested Tumbleweed as an option. You
get rolling-release packages that are almost always up to date. I'd still
suggest that now as a place to start.

http://lists.openstack.org/pipermail/openstack-dev/2018-June/131302.html

> 
> The current situation with the gate still doing Python 3.5 only jobs is
> just not sustainable anymore. Moving forward, Python 2.7 will die. When
> this happens, moving faster with Python 3 versions will be mandatory for
> everyone, not only for fools like me who made the switch early.
> 
>  :)
> 
> Cheers,
> 
> Thomas Goirand (zigo)
> 
> P.S: A big thanks to everyone who was helpful in making the switch to
> Python 3 in Debian, especially Annp and the rest of the Neutron team.


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [requirements][infra] Maintaining constraints for several python versions

2018-07-12 Thread Clark Boylan
On Wed, Jul 11, 2018, at 9:34 PM, Tony Breeds wrote:
> Hi Folks,
> We have a bit of a problem in openstack/requirements and I'd like to
> chat about it.
> 
> Currently when we generate constraints we create a venv for each
> (system) python supplied on the command line, install all of
> global-requirements into that venv and capture the pip freeze.
> 
> Where this falls down is if we want to generate a freeze for python 3.4
> and 3.5 we need an image that has both of those.  We cheated and just
> 'clone' them so if python3 is 3.4 we copy the results to 3.5 and vice
> versa.  This kinda worked for a while but it has drawbacks.
> 
> I can see a few of options:
> 
> 1. Build pythons from source and use that to construct the venv
>[please no]

Fungi mentions that 3.3 and 3.4 don't build easily on modern Linux distros.
However, 3.3 and 3.4 are also unsupported by Python at this point, so maybe we
can ignore them and focus on 3.5 and forward? We don't build new freeze lists
for the stable branches, so this is just a concern for master, right?

> 
> 2. Generate the constraints in an F28 image.  My F28 has ample python
>versions:
>  - /usr/bin/python2.6
>  - /usr/bin/python2.7
>  - /usr/bin/python3.3
>  - /usr/bin/python3.4
>  - /usr/bin/python3.5
>  - /usr/bin/python3.6
>  - /usr/bin/python3.7
>I don't know how valid this still is but in the past fedora images
>have been seen as unstable and hard to keep current.  If that isn't
>    still the feeling then we could go down this path. Currently there are a
>    few minor problems with bindep.txt on fedora and generate-constraints
>    doesn't work with py3, but these are pretty minor really.

I think most of the problems with Fedora stability are around bringing up a
new Fedora every 6 months or so. They tend to change sufficiently within that
time period to make this a fairly involved exercise. But once working, they
work for the ~13 months of support they offer. I know Paul Belanger would like
to iterate more quickly and just keep the most recent Fedora available (rather
than ~2).

> 
> 3. Use docker images for python and generate the constraints with
>them.  I've hacked up something we could use as a base for that in:
>   https://review.openstack.org/581948
> 
>There are lots of open questions:
>  - How do we make this nodepool/cloud provider friendly ?
>* Currently the containers just talk to the main debian mirrors.
>  Do we have debian packages? If so we could just do sed magic.

http://$MIRROR/debian (http://mirror.dfw.rax.openstack.org/debian for example) 
should be a working amd64 debian package mirror.
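
So something roughly like this inside the image or job should work, assuming
the base image's apt sources point at deb.debian.org (adjust the pattern to
whatever the image actually uses):

  sed -i "s|http://deb.debian.org/debian|http://$MIRROR/debian|g" /etc/apt/sources.list
  apt-get update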

>  - Do/Can we run a registry per provider?

We do not, but we do have a caching dockerhub registry proxy in each
region/provider: http://$MIRROR:8081/registry-1.docker if using older docker,
and http://$MIRROR:8082 for current docker. This was a compromise between
caching the Internet and reliability.

>  - Can we generate and caches these images and only run pip install -U
>g-r to speed up the build

Between cached upstream python docker images and prebuilt wheels mirrored in
every cloud provider region, I wonder whether this would actually save a
significant amount of time. It may be worth starting without this and working
from there if it remains slow.

>  - Are we okay with using docker this way?

Should be fine, particularly if we are consuming the official Python images.

> 
> I like #2 the most but I wanted to seek wider feedback.

I think each proposed option should work as long as we understand the 
limitations each presents. #2 should work fine if we have individuals 
interested and able to spin up new Fedora images and migrate jobs to that image 
after releases happen.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [all][ci][infra] Small network interface MTUs on test nodes

2018-06-26 Thread Clark Boylan
Hello everyone,

We now have more than one cloud provider giving us test node resources where
we can expect network interfaces to have MTUs less than 1500. This is a side
effect of running Neutron with overlay networking in the clouds providing the
test resources. Considering we've largely made this "problem" for ourselves, we
should try to accommodate it.

I have pushed a documentation update to explain this [0] as well as job
updates for the infra-managed overlays used in multinode testing [1][2]. If
your jobs manage interfaces or bridges themselves, you may need to make similar
updates as well. (I believe that devstack + neutron already handle this for you
if you are using them.)

Let the infra team know if you have any questions about this.

[0] https://review.openstack.org/#/c/578159/1/doc/source/testing.rst
[1] https://review.openstack.org/578146
[2] https://review.openstack.org/578153

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Release-job-failures][cyborg][release] Pre-release of openstack/cyborg failed

2018-06-12 Thread Clark Boylan
On Tue, Jun 12, 2018, at 4:59 PM, Sean McGinnis wrote:
> On 06/12/2018 06:01 PM, Zhipeng Huang wrote:
> > Hi Doug,
> >
> > Thanks for raising this, we will check it out
> >
> 
> Clark was sharp-eyed enough to point out that setuptools does not
> accept regexes for
> data files. So it would appear this:
> 
> https://git.openstack.org/cgit/openstack/cyborg/tree/setup.cfg#n31
> 
> Needs to be changed to explicitly list each file that is needed.
> 
> Hope that helps.
> 
> Sean

I thought data_files was a setuptools thing, but on further reading it appears
to be PBR. And reading the PBR docs, trailing globs should be supported.
Reading the code, I think PBR expects the entire data_files spec on one line so
that you end up with 'foo = bar/*' instead of 'foo =\n bar/*'. This is likely a
bug that should be fixed.

To work around this you can use 'etc/cyborg/rootwrap.d = 
etc/cyborg/rootwrap.d/*' all on one line in your config.
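
In other words, roughly (the exact original layout in cyborg's setup.cfg may
differ):

  [files]
  # Broken: PBR stops at the '=' when the glob wraps onto the next line.
  # data_files =
  #     etc/cyborg/rootwrap.d =
  #         etc/cyborg/rootwrap.d/*
  #
  # Workaround: keep each "target = glob" spec on a single line.
  data_files =
      etc/cyborg/rootwrap.d = etc/cyborg/rootwrap.d/*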

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TC] Stein Goal Selection

2018-06-07 Thread Clark Boylan
On Thu, Jun 7, 2018, at 1:53 PM, Thomas Goirand wrote:
> On 06/04/2018 08:59 PM, Ivan Kolodyazhny wrote:
> > I hope we'll have Ubuntu 18.04 LTS on our gates for this activity soon.
> > It becomes
> > important not only for developers but for operators and vendors too.
> 
> By the time the project will be gating on Python 3.6, most likely
> there's going to be 3.7 or even 3.8 in Debian Sid, and I'll get all the
> broken stuff alone again... Can't we try to get Sid in the gate, at
> least in non-voting mode, so we get to see problems early rather than
> late? As developers, we should always aim for the future, and Bionic
> should already be considered the past release to maintain, rather than
> the one to focus on. If we can't get Sid, then at least should we
> consider the non-LTS (always latest) Ubuntu releases?

We stopped following the latest Ubuntu when they dropped non-LTS support to 9
months. What we do have are SUSE Tumbleweed images, which should get us brand
new everything in a rolling fashion. If people are interested in this type of
work I'd probably start there.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [all] Zuul updating Ansible in use to Ansible 2.5

2018-06-06 Thread Clark Boylan
Zuul will be updating the version of Ansible it uses to run jobs from version
2.3 to 2.5 tomorrow, June 7, 2018. The Infra team will follow up shortly after
and get that update deployed.

Other users have apparently checked that this works in general, and we have
tests that exercise some basic integration with Ansible, so we don't expect
major breakages. However, should you notice anything new/different/broken, feel
free to reach out to the Infra team.

You may notice new deprecation warnings from Ansible, particularly around our
use of the include directive. Version 2.3 doesn't have the non-deprecated
directives available to it, so we will have to transition after the upgrade.

Thank you for your patience,
Clark (and the Infra team)

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tc][all] A culture change (nitpicking)

2018-05-30 Thread Clark Boylan


On Wed, May 30, 2018, at 8:13 AM, Zane Bitter wrote:
> On 30/05/18 00:52, Cédric Jeanneret wrote:
> >> Another issue is that if the original author needs to rev the patch
> >> again for any reason, they then need to figure out how to check out the
> >> modified patch. This requires a fairly sophisticated knowledge of both
> >> git and gerrit, which isn't a problem for those of us who have been
> >> using them for years but is potentially a nightmarish introduction for a
> >> relatively new contributor. Sometimes it's the right choice though
> >> (especially if the patch owner hasn't been seen for a while).
> > hm, "Download" -> copy/paste, and Voilà. Gerrit interface is pretty nice
> > with the user (I an "old new contributor", never really struggled with
> > Gerrit itself.. On the other hand, heat, ansible, that's another story :) ).
> 
> OK, so I am sitting here with a branch containing a patch I have sent 
> for review, and that I need to revise, but somebody else has pushed a 
> revision upstream. Which of the 4 'Download' commands do I use to 
> replace my commit with the latest one from upstream?
> 
> (Hint: it's a trick question)
> 
> Now imagine it wasn't the last patch in the series.
> 
> - ZB
> 
> (P.S. without testing, I believe the correct answers are `git reset 
> --hard FETCH_HEAD` and `git rebase HEAD~ --onto FETCH_HEAD` 
> respectively.)

We do have tools for this, and it is the same tool you use to push code to
Gerrit. `git review -d changenumber` will grab the latest patchset from that
change and check it out locally. You can use `git review -d
changenumber,patchsetnumber` to pick a different, older patchset. If you have a
series of changes, things become more complicated. I personally like to always
operate against the leaf-most change, make local updates, then squash "back" onto
the appropriate changes in the series to keep existing changes happy.
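
For the simple single-change case the round trip looks roughly like this (the
change number is made up):

  git review -d 567890      # fetch and check out the latest patchset
  # ...edit files to address the review comments...
  git commit -a --amend     # keep the existing Change-Id footer intact
  git review                # upload the new patchset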

Yes, this is complicated, especially for new users. In general, though, I
think git review addresses the simpler case of "I need the latest version of my
change". If we use change series as proposed in this thread, I think keeping
the parents of the child changes up to date is going to be up to the more
experienced nitpicker who is addressing the minor problems, not the original
change author.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo] tripleo upstream gate outtage, was: -> gate jobs impacted RAX yum mirror

2018-05-14 Thread Clark Boylan
On Mon, May 14, 2018, at 10:11 AM, Wesley Hayutin wrote:
> On Mon, May 14, 2018 at 12:08 PM Clark Boylan <cboy...@sapwetik.org> wrote:
> 
> > On Mon, May 14, 2018, at 8:57 AM, Wesley Hayutin wrote:
> > > On Mon, May 14, 2018 at 10:36 AM Jeremy Stanley <fu...@yuggoth.org>
> > wrote:
> > >
> > > > On 2018-05-14 07:07:03 -0600 (-0600), Wesley Hayutin wrote:

snip

> > > > Our automation doesn't know that there's a difference between
> > > > packages which were part of CentOS 7.4 and 7.5 any more than it
> > > > knows that there's a difference between Ubuntu 16.04.2 and 16.04.3.
> > > > Even if we somehow managed to pause our CentOS image updates
> > > > immediately prior to 7.5, jobs would still try to upgrade those
> > > > 7.4-based images to the 7.5 packages in our mirror, right?
> > > >
> > >
> > > Understood, I suspect this will become a more widespread issue as
> > > more projects start to use containers ( not sure ).  It's my
> > understanding
> > > that
> > > there are some mechanisms in place to pin packages in the centos nodepool
> > > image so
> > > there has been some thoughts generally in the area of this issue.
> >
> > Again, I think we need to understand why containers would make this worse
> > not better. Seems like the big feature everyone talks about when it comes
> > to containers is isolating packaging whether that be python packages so
> > that nova and glance can use a different version of oslo or cohabitating
> > software that would otherwise conflict. Why do the packages on the host
> > platform so strongly impact your container package lists?
> >
> 
> I'll let others comment on that, however my thought is you don't move from
> A -> Z in one step and containers do not make everything easier
> immediately.  Like most things, it takes a little time.
> 

If the main issue is being caught in a transition period at the same time a
minor update happens, can we treat this as a temporary state? Rather than
attempting to solve for this particular case happening again in the future, we
might be better served testing that upcoming CentOS releases won't break
TripleO due to packaging changes, using the centos-release-cr repo as Tristan
suggests. That should tell you if something like pacemaker were to stop
working. Note this wouldn't require any infra-side updates; you would just have
these jobs configure the additional repo and go from there.
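
Roughly, the job would just add something like the following before running its
tests (assuming the centos-release-cr package is how the CR repo gets enabled):

  yum install -y centos-release-cr
  yum -y update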

Then, on top of that, get through the transition period so that the containers
isolate you from these changes the way they should. When 7.6 happens you will
hopefully have identified all the broken packaging ahead of time and worked
with upstream to address those problems (which should be important for a stable
long-term-support distro), and your containers can update at whatever pace they
choose.

I don't think it would be appropriate for Infra to stage CentOS minor versions,
for a couple of reasons. The first is that we don't support specific minor
versions of CentOS/RHEL; we support the major version, and if it updates and
OpenStack stops working, that is CI doing its job and providing that
information. The other major concern is that CentOS specifically says "We are
trying to make sure people understand they can NOT use older minor versions and
still be secure." Similarly to how we won't support Ubuntu 12.04 because it is
no longer supported, we shouldn't support CentOS 7.4 at this point. These are
no longer secure platforms.

However, I think testing with the pre-release repo as proposed above should
allow you to catch issues before updates happen just as well as a staged minor
version update would. The added benefit of this process is that you should know
as soon as possible rather than after the release has been made (helping other
users of CentOS, because broken packages can be fixed before they are released
in the first place).

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo] tripleo upstream gate outtage, was: -> gate jobs impacted RAX yum mirror

2018-05-14 Thread Clark Boylan
On Mon, May 14, 2018, at 8:57 AM, Wesley Hayutin wrote:
> On Mon, May 14, 2018 at 10:36 AM Jeremy Stanley  wrote:
> 
> > On 2018-05-14 07:07:03 -0600 (-0600), Wesley Hayutin wrote:
> > [...]

snip

> >
> > This _doesn't_ sound to me like a problem with how we've designed
> > our infrastructure, unless there are additional details you're
> > omitting.
> 
> 
> So the only thing out of our control is the package set on the base
> nodepool image.
> If that suddenly gets updated with too many packages, then we have to
> scramble to ensure the images and containers are also updated.
> If there is a breaking change in the nodepool image for example [a], we
> have to react to and fix that as well.

Aren't the container images independent of the hosting platform (e.g. what
infra hosts)? I'm not sure I understand why the host platform updating implies
all the container images must also be updated.

> 
> 
> > It sounds like a problem with how the jobs are designed
> > and expectations around distros slowly trickling package updates
> > into the series without occasional larger bursts of package deltas.
> > I'd like to understand more about why you upgrade packages inside
> > your externally-produced container images at job runtime at all,
> > rather than relying on the package versions baked into them.
> 
> 
> We do that to ensure the gerrit review itself and it's dependencies are
> built via rpm and injected into the build.
> If we did not do this the job would not be testing the change at all.
>  This is a result of being a package based deployment for better or worse.

You'd only need to do that for the change in review, not the entire system,
right?

> 

snip

> > Our automation doesn't know that there's a difference between
> > packages which were part of CentOS 7.4 and 7.5 any more than it
> > knows that there's a difference between Ubuntu 16.04.2 and 16.04.3.
> > Even if we somehow managed to pause our CentOS image updates
> > immediately prior to 7.5, jobs would still try to upgrade those
> > 7.4-based images to the 7.5 packages in our mirror, right?
> >
> 
> Understood, I suspect this will become a more widespread issue as
> more projects start to use containers ( not sure ).  It's my understanding
> that
> there are some mechanisms in place to pin packages in the centos nodepool
> image so
> there has been some thoughts generally in the area of this issue.

Again, I think we need to understand why containers would make this worse, not
better. The big feature everyone talks about when it comes to containers is
isolating packaging, whether that be Python packages (so that nova and glance
can use different versions of oslo) or cohabiting software that would otherwise
conflict. Why do the packages on the host platform so strongly impact your
container package lists?

> 
> TripleO may be the exception to the rule here and that is fine, I'm more
> interested in exploring
> the possibilities of delivering updates in a staged fashion than anything.
> I don't have insight into
> what the possibilities are, or if other projects have similiar issues or
> requests.  Perhaps the TripleO
> project could share the details of our job workflow with the community and
> this would make more sense.
> 
> I appreciate your time, effort and thoughts you have shared in the thread.
> 
> 
> > --
> > Jeremy Stanley
> >
> 
> [a] https://bugs.launchpad.net/tripleo/+bug/1770298

I think answering the questions above may be the most important part of
understanding what the underlying issue is here and how we might address it.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Does the openstack ci vms start each time clear up enough?

2018-05-04 Thread Clark Boylan


On Fri, May 4, 2018, at 2:37 AM, Sean McGinnis wrote:
> On Fri, May 04, 2018 at 04:13:41PM +0800, linghucongsong wrote:
> > 
> > Hi all!
> > 
> > Recently we met a strange problem in our CI. Look at this link: 
> > https://review.openstack.org/#/c/532097/
> > 
> > We can pass the CI the first time, but once we start the gate 
> > job, it always fails the second time.
> > 
> > We have rebased several times; it always passes the CI the first time and 
> > fails the second time.
> > 
> > This has not happened before and makes me wonder: do we really start the 
> > CI from fresh new VMs each time?
> 
> A new VM is spun up for each test run, so I don't believe this is an issue 
> with
> stale artifacts on the host. I would guess this is more likely some sort of
> race condition, and you just happen to be hitting it 50% of the time.

Additionally, you can check the job logs to see that while these two jobs did
run against the same cloud provider, they did so in different regions on hosts
with completely different IP addresses. The inventory files [0][1] are where I
would start if you suspect oddness of this sort. Reading them, I don't see
anything to indicate the nodes were reused.

[0] 
http://logs.openstack.org/97/532097/16/check/legacy-tricircle-dsvm-multiregion/c9b3d29/zuul-info/inventory.yaml
[1] 
http://logs.openstack.org/97/532097/16/gate/legacy-tricircle-dsvm-multiregion/ad547d5/zuul-info/inventory.yaml

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [mistral] Help with test run

2018-05-01 Thread Clark Boylan
On Fri, Apr 27, 2018, at 2:22 AM, András Kövi wrote:
> Hi,
> 
> Can someone please help me with why this build ended with TIMED_OUT?
> http://logs.openstack.org/85/527085/8/check/mistral-tox-unit-mysql/3ffae9f/

Reading the job log, the job setup only took a few minutes. Then the unit
tests start and run continuously until the timeout hits at 30 minutes. Chances
are that the default 30 minute timeout is not sufficient for this job. Runtime
may vary based on cloud region and the presence of noisy neighbors.

As for making this more reliable, you can increase the timeout in the job
configuration for that job. Another approach would be to make the unit tests
run more quickly. I notice the job is hard-coded to use concurrency=1 when
invoking the test runner, so you are only using ~1/8 of the available CPUs. You
might try increasing this value, though you will likely need to make sure the
tests don't
conflict with each other.
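
A rough sketch of the first option in the project's Zuul config (the job name
is taken from the log above; the parent and timeout value are illustrative):

  - job:
      name: mistral-tox-unit-mysql
      parent: openstack-tox
      timeout: 3600    # seconds; pick a value with headroom over observed runtimes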

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all][tc] final stages of python 3 transition

2018-04-26 Thread Clark Boylan
On Thu, Apr 26, 2018, at 7:27 AM, Sean McGinnis wrote:
> On Wed, Apr 25, 2018 at 04:54:46PM -0400, Doug Hellmann wrote:
> > It's time to talk about the next steps in our migration from python
> > 2 to python 3.
> > 
> > [...]
> > 
> > 2. Change (or duplicate) all functional test jobs to run under
> >python 3.
> 
> As a test I ran Cinder functional and unit test jobs on bionic using 3.6. All
> went well.
> 
> That made me realize something though - right now we have jobs that explicitly
> say py35, both for unit tests and functional tests. But I realized setting up
> these test jobs that it works to just specify "basepython = python3" or run
> unit tests with "tox -e py3". Then with that, it just depends on whether the
> job runs on xenial or bionic as to whether the job is run with py35 or py36.
> 
> It is less explicit, so I see some downside to that, but would it make sense 
> to
> change jobs to drop the minor version to make it more flexible and easy to 
> make
> these transitions?

One reason to use it would be local user simplicity. Rather than needing to
explicitly add new python3 releases to the default env list every year or two
so that it does what we want, we can just list py3,py2,linters in the default
list and get most of the way there for local users. Then we can continue to be
more specific in the CI jobs if that is desirable.
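
A rough sketch of what that looks like in tox.ini (env names and commands are
illustrative):

  [tox]
  envlist = py3,py2,linters

  [testenv]
  deps = -r{toxinidir}/test-requirements.txt
  commands = stestr run {posargs}

  [testenv:linters]
  commands = flake8 {posargs}

  # CI jobs can keep targeting an explicit interpreter:
  [testenv:py35]
  basepython = python3.5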

I do think we likely want to be explicit about the python versions we are
using in CI testing. This makes it clear to developers, who may need to
reproduce failures or just understand why they happen, what platform is used.
It also makes it explicit that "openstack runs on $pythonversion".

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all][tc] final stages of python 3 transition

2018-04-25 Thread Clark Boylan
On Wed, Apr 25, 2018, at 3:25 PM, Doug Hellmann wrote:
> Excerpts from Jeremy Stanley's message of 2018-04-25 21:40:37 +:
> > On 2018-04-25 16:54:46 -0400 (-0400), Doug Hellmann wrote:
> > [...]
> > > Still, we need to press on to the next phase of the migration, which
> > > I have been calling "Python 3 first". This is where we use python
> > > 3 as the default, for everything, and set up the exceptions we need
> > > for anything that still requires python 2.
> > [...]
> > 
> > It may be worth considering how this interacts with the switch of
> > our default test platform from Ubuntu 16.04 (which provides Python
> > 3.5) to 18.04 (which provides Python 3.6). If we switch from 3.5 to
> > 3.6 before we change most remaining jobs over to Python 3.x versions
> > then it gives us a chance to spot differences between 3.5 and 3.6 at
> > that point. Given that the 14.04 to 16.04 migration, where we
> > attempted to allow projects to switch at their own pace, didn't go
> > so well we're hoping to do a "big bang" migration instead for 18.04
> > and expect teams who haven't set up experimental jobs ahead of time
> > to work out remaining blockers after the flag day before they can go
> > back to business as usual. Since the 18.04 release is happening so
> > far into the Rocky cycle, we're likely to want to do that at the
> > start of Stein instead when it will be less disruptive.
> > 
> > So I guess that raises the question: switch to Python 3.5 by default
> > for most jobs in Rocky and then have a potentially more disruptive
> > default platform switch with Python 3.5->3.6 at the beginning of
> > Stein, or wait until the default platform switch to move from Python
> > 2.7 to 3.6 as the job default? I can see some value in each option.
> 
> Does 18.04 include a python 2 option?

It does, https://packages.ubuntu.com/bionic/python2.7.

> 
> What is the target for completing the changeover? The first or
> second milestone for Stein, or the end of the cycle?

Previously we've tried to do the transition in the OpenStack release that is
under development when the LTS releases. However, we've offset things a bit
now, so that may not be as feasible. I would expect that if we waited for the
next cycle we would do it very early in that cycle.

For the transition from python 3.5 on Xenial to 3.6 on Bionic, we may want to
keep the python 3.5 jobs on Xenial but add non-voting python 3.6 jobs to every
project running Xenial python 3.5 jobs. Then those projects can toggle them to
voting 3.6 jobs if/when they start working. Then we can decide at a later time
whether continuing to support python 3.5 (and testing it) is worthwhile.

> 
> It would be useful to have some input from the project teams who
> have no unit or functional test jobs running for 3.5, since they
> will have the most work to do to cope with the upgrade overall.
> 
> Who is coordinating Ubuntu upgrade work and setting up the experimental
> jobs?

Paul Belanger has been doing much of the work to get the images up and running 
and helping some projects start to run early jobs on the beta images. I expect 
Paul would want to continue to carry the transition through to the end.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [all] Zuul job definitions and the branches attribute

2018-04-13 Thread Clark Boylan
Hello everyone,

Nova recently discovered that if you use the job.branches attribute in job 
definitions the results may not be as expected, particularly if you are porting 
jobs from openstack-zuul-jobs into your project repos.

The problem is that the openstack-zuul-jobs project is "branchless"; it only
has a master branch. This means that for jobs defined in that repo, restricting
which branches a job runs against had to be done with the job.branches
attribute. When ported to "branched" repos like Nova, this job.branches
attribute behaves slightly differently: it applies the config on the current
branch to all branches matching job.branches.

In the Nova case this meant the stable/queens job definition was being applied
to the master definition of the job with the same name. Instead, the
job.branches attribute should be dropped and you should use the per-branch job
definition to control branch-specific attributes. If you want to stop running a
job on a branch delete the job's definition from that branch.
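
A rough sketch of the difference (the job name and attributes are
illustrative):

  # Branchless repo (openstack-zuul-jobs) style; this does not port cleanly:
  # - job:
  #     name: example-dsvm-job
  #     branches: ^stable/queens$
  #     timeout: 5400
  #
  # Branched repo style: drop "branches"; the variant in each branch's own
  # .zuul.yaml applies only to that branch.
  - job:
      name: example-dsvm-job
      timeout: 5400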

TL;DR: if you have job definitions that have a branches attribute like Nova
did [0], you should consider removing it and using the per-branch definitions
to control where and when jobs run.

[0] 
https://git.openstack.org/cgit/openstack/nova/tree/.zuul.yaml?id=cb6c8ca1a7a5abc4d0079e285f877c18c49acaf2#n99

If you have any questions feel free to reach out to the infra team either here 
or on IRC.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova][Deployers] Optional, platform specific, dependancies in requirements.txt

2018-04-12 Thread Clark Boylan
On Wed, Apr 11, 2018, at 5:45 PM, Matt Riedemann wrote:
> I also seem to remember that [extras] was less than user-friendly for 
> some reason, but maybe that was just because of how our CI jobs are 
> setup? Or I'm just making that up. I know it's pretty simple to install 
> the stuff from extras for tox runs, it's just an extra set of 
> dependencies to list in the tox.ini.

One concern I have as a user is that extras are not very discoverable without 
reading the source setup.cfg file. This can be addressed by improving 
installation docs to explain what the extras options are and why you might want 
to use them.

Another idea was to add an 'all' extra that installs all of the more
fine-grained extras. That way a user can just say "give me all the features";
even if they can't use them all, they know the ones they can use will be
properly installed.
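
A rough sketch of what that could look like in a pbr-style setup.cfg (the extra
names and packages are illustrative; since pbr's [extras] sections just list
packages, 'all' repeats the union of the fine-grained groups):

  [extras]
  zvm =
    zVMCloudConnector
  powervm =
    pypowervm
  all =
    zVMCloudConnector
    pypowervm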

As for the CI jobs, it's just a matter of listing the extras in the
appropriate requirements files or explicitly installing them.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova][Deployers] Optional, platform specific, dependancies in requirements.txt

2018-04-11 Thread Clark Boylan
On Wed, Apr 11, 2018, at 3:09 PM, Michael Still wrote:
> Hi,
> 
> https://review.openstack.org/#/c/523387 proposes adding a z/VM specific
> dependency to nova's requirements.txt. When I objected, the counter argument
> is that we have examples of Windows-specific dependencies (os-win) and
> powervm-specific dependencies in that file already.
> 
> I think perhaps all three are a mistake and should be removed.
> 
> My recollection is that for drivers like ironic which may not be deployed
> by everyone, we have the dependency documented, and then loaded at runtime
> by the driver itself instead of adding it to requirements.txt. This is to
> stop pip from auto-installing the dependency for anyone who wants to run
> nova. I had assumed this was at the request of the deployer community.
> 
> So what do we do with z/VM? Do we clean this up? Or do we now allow
> dependencies that are only useful to a very small number of deployments
> into requirements.txt?
> 
> Michael

I think there are two somewhat related issues here. The first is being able to
have platform-specific dependencies so that nova can run on, say, python2 and
python3, or Linux and Windows, using the same requirements list. To address
this you should use environment markers [0] to specify when a specific
environment needs additional or different packages to function, and those
should probably all just go into requirements.txt.

The second issue is enabling optional functionality that a default install
shouldn't reasonably have to worry about (and which is platform independent).
For this you can use setuptools extras [1]. For an example of how this is used
along with setup.cfg and PBR you can look at swiftclient [2]. Then users who
know they want the extra features will execute something like `pip install
python-swiftclient[keystone]`.

[0] https://www.python.org/dev/peps/pep-0496/
[1] 
http://setuptools.readthedocs.io/en/latest/setuptools.html#declaring-extras-optional-features-with-their-own-dependencies
[2] 
https://git.openstack.org/cgit/openstack/python-swiftclient/tree/setup.cfg#n35
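
For example, platform-specific lines in requirements.txt would look roughly
like this (the packages and versions are illustrative, not actual Nova
requirements):

  os-win>=3.0.0; sys_platform == 'win32'
  futures>=3.0; python_version == '2.7'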

Hope this helps,
Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [openstack-infra][openstack-zuul-jobs]Questions about playbook copy module

2018-04-06 Thread Clark Boylan
On Fri, Apr 6, 2018, at 2:32 AM, Andreas Jaeger wrote:
> On 2018-04-06 11:20, Xinni Ge wrote:
> > Sorry, forgot to reply to the mail list.
> > 
> > On Fri, Apr 6, 2018 at 6:18 PM, Xinni Ge  > > wrote:
> > 
> > Hi, Andreas.
> > 
> > Thanks for reply. This is the link of log I am seeing.
> > 
> > http://logs.openstack.org/39/39067dbc1dee99d227f8001595633b5cc98cfc53/release/xstatic-check-version/9172297/ara-report/
> > 
> > 
> > 
> 
> thanks, your analysis is correct, seem we seldom release xstatic packages ;(
> 
> fix is at https://review.openstack.org/559300
> 
> Once that is merged, an infra-root can rerun the release job - please
> ask on #openstack-infra IRC channel,

I've re-enqueued the tag ref and we now have a new failure: 
http://logs.openstack.org/39/39067dbc1dee99d227f8001595633b5cc98cfc53/release/xstatic-check-version/c5baf7e/ara-report/result/09433617-44dd-4ffd-9c57-d62e04dfd75e/.

Reading into that, we appear to be running the script from the wrong local
directory, so relative paths don't work as expected. I have proposed
https://review.openstack.org/559373 to fix this.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all][requirements] uncapping eventlet

2018-04-06 Thread Clark Boylan
On Fri, Apr 6, 2018, at 9:34 AM, Matthew Thode wrote:
> On 18-04-06 09:02:29, Jens Harbott wrote:
> > 2018-04-05 19:26 GMT+00:00 Matthew Thode :
> > > On 18-04-05 20:11:04, Graham Hayes wrote:
> > >> On 05/04/18 16:47, Matthew Thode wrote:
> > >> > eventlet-0.22.1 has been out for a while now, we should try and use it.
> > >> > Going to be fun times.
> > >> >
> > >> > I have a review projects can depend upon if they wish to test.
> > >> > https://review.openstack.org/533021
> > >>
> > >> It looks like we may have an issue with oslo.service -
> > >> https://review.openstack.org/#/c/559144/ is failing gates.
> > >>
> > >> Also - what is the dance for this to get merged? It doesn't look like we
> > >> can merge this while oslo.service has the old requirement restrictions.
> > >>
> > >
> > > The dance is as follows.
> > >
> > > 0. provide review for projects to test new eventlet version
> > >projects using eventlet should make backwards compat code changes at
> > >this time.
> > 
> > But this step is currently failing. Keystone doesn't even start when
> > eventlet-0.22.1 is installed, because loading oslo.service fails with
> > its pkg definition still requiring the capped eventlet:
> > 
> > http://logs.openstack.org/21/533021/4/check/legacy-requirements-integration-dsvm/7f7c3a8/logs/screen-keystone.txt.gz#_Apr_05_16_11_27_748482
> > 
> > So it looks like we need to have an uncapped release of oslo.service
> > before we can proceed here.
> > 
> 
> Ya, we may have to uncap and rely on upper-constraints to keep openstack
> gate from falling over.  The new steps would be the following:

My understanding of our use of upper constraints was that this should (almost)
always be the case for (almost) all dependencies. We should rely on constraints
instead of requirements caps. Capping libs like pbr or eventlet, or any other
that is in use globally, is incredibly difficult to work with when you want to
uncap, because you have to coordinate globally. If you instead use constraints,
you just bump the constraint and are done.

It is probably worthwhile examining whether we have any other deps in this
situation and proactively addressing them, rather than waiting until we really
need to
fix them.
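
For example, the difference looks roughly like this (the specifiers are
illustrative):

  # requirements.txt: uncapped, only exclusions for known-bad releases
  eventlet!=0.18.3,!=0.20.1,>=0.18.2

  # upper-constraints.txt: the single place the tested version is pinned
  eventlet===0.22.1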

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra][qa][requirements] Pip 10 is on the way

2018-04-05 Thread Clark Boylan
On Thu, Apr 5, 2018, at 1:27 PM, Clark Boylan wrote:
> The other major issue we've run into is that nova file injection (which 
> is tested by tempest) seems to require either libguestfs or nbd. 
> libguestfs bindings for python aren't available on pypi and instead we 
> get them from system packaging. This means if we want libguestfs support 
> we have to enable system site packages when using virtualenvs. The 
> alternative is to use nbd which apparently isn't preferred by nova and 
> doesn't work under current devstack anyways.
> 
> Why is this a problem? Well the new pip10 behavior that breaks devstack 
> is pip10's refusable to remove distutils installed packages. Distro 
> packages by and large are distutils packaged which means if you mix 
> system packages and pip installed packages there is a good chance 
> something will break (and it does break for current devstack). I'm not 
> sure that using a virtualenv with system site packages enabled will 
> sufficiently protect us from this case (but we should test it further). 
> Also it feels wrong to enable system packages in a virtualenv if the 
> entire point is avoiding system python packages.

Good news everyone:
http://logs.openstack.org/74/559174/1/check/tempest-full-py3/4c5548f/job-output.txt.gz#_2018-04-05_21_26_36_669943
shows that pip10 appears to do the right thing with a virtualenv using the
system-site-packages option when attempting to install a newer version of a
package that would need to be deleted if done on the system python proper.

It determines there is an existing package, notes that it is outside the env
and cannot be uninstalled, then installs a newer version of the package anyway.

If you look later in the job run you'll see it fails in the system python 
context on this same package, 
http://logs.openstack.org/74/559174/1/check/tempest-full-py3/4c5548f/job-output.txt.gz#_2018-04-05_21_29_31_399895.

I think that means this is a viable workaround for us even if it isn't ideal.
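
To poke at this locally, a rough reproduction looks something like this (the
package name is a placeholder for any distro-provided, distutils-installed
Python package):

  virtualenv --system-site-packages /tmp/venv
  /tmp/venv/bin/pip install -U <distro-installed-package>
  # pip 10 notes the system copy is outside the environment, leaves it
  # alone, and installs the newer version inside the virtualenv.

  sudo pip install -U <distro-installed-package>
  # Against the system python, pip 10 instead refuses to uninstall the
  # distutils-installed project and the install fails.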

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra][qa][requirements] Pip 10 is on the way

2018-04-05 Thread Clark Boylan
On Mon, Apr 2, 2018, at 9:13 AM, Clark Boylan wrote:
> On Mon, Apr 2, 2018, at 8:06 AM, Matthew Thode wrote:
> > On 18-03-31 15:00:27, Jeremy Stanley wrote:
> > > According to a notice[1] posted to the pypa-announce and
> > > distutils-sig mailing lists, pip 10.0.0.b1 is on PyPI now and 10.0.0
> > > is expected to be released in two weeks (over the April 14/15
> > > weekend). We know it's at least going to start breaking[2] DevStack
> > > and we need to come up with a plan for addressing that, but we don't
> > > know how much more widespread the problem might end up being so
> > > encourage everyone to try it out now where they can.
> > > 
> > 
> > I'd like to suggest locking down pip/setuptools/wheel like openstack
> > ansible is doing in 
> > https://github.com/openstack/openstack-ansible/blob/master/global-requirement-pins.txt
> > 
> > We could maintain it as a separate constraints file (or infra could
> > maintian it, doesn't mater).  The file would only be used for the
> > initial get-pip install.
> 
> In the past we've done our best to avoid pinning these tools because 1) 
> we've told people they should use latest for openstack to work and 2) it 
> is really difficult to actually control what versions of these tools end 
> up on your systems if not latest.
> 
> I would strongly push towards addressing the distutils package deletion 
> problem that we've run into with pip10 instead. One of the approaches 
> thrown out that pabelanger is working on is to use a common virtualenv 
> for devstack and avoid the system package conflict entirely.

I was mistaken; pabelanger was working to get devstack's USE_VENV option
working, which installs each service (if the service supports it) into its own
virtualenv. There are two big drawbacks to this. The first is that we would
lose coinstallation of all the openstack services, which is one way we ensure
they all work together at the end of the day. The second is that not all
services in "base" devstack support USE_VENV, and I doubt many plugins do
either (neutron apparently doesn't?).

I've since worked out a change that passes tempest using a devstack installed
into a global virtualenv at https://review.openstack.org/#/c/558930/. This
needs to be cleaned up so that we only check for and install the virtualenv(s)
once, and we need to handle mixed python2 and python3 environments better (so
that you can run a python2 swift and python3 everything else).

The other major issue we've run into is that nova file injection (which is
tested by tempest) seems to require either libguestfs or nbd. The libguestfs
bindings for python aren't available on pypi; instead we get them from system
packaging. This means if we want libguestfs support we have to enable system
site packages when using virtualenvs. The alternative is to use nbd, which
apparently isn't preferred by nova and doesn't work under current devstack
anyway.

Why is this a problem? Well, the new pip10 behavior that breaks devstack is
pip10's refusal to remove distutils-installed packages. Distro packages by and
large are distutils packaged, which means if you mix system packages and
pip-installed packages there is a good chance something will break (and it does
break for current devstack). I'm not sure that using a virtualenv with system
site packages enabled will sufficiently protect us from this case (but we
should test it further). Also, it feels wrong to enable system packages in a
virtualenv if the entire point is avoiding system python packages.

I'm not sure what the best option is here but if we can show that system site 
packages with virtualenvs is viable with pip10 and people want to move forward 
with devstack using a global virtualenv we can work to clean up this change and 
make it mergeable.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [all] A quick note on recent IRC trolling/vandalism

2018-04-03 Thread Clark Boylan
Hello everyone,

During the recent holiday weekend some of our channels experienced some IRC
trolling/vandalism. In particular, the meetbot was used to start meetings
titled 'maintenance', which updated the channel topic to 'maintenance'. The
individual or bot doing this then used that as the pretext for claiming the
channel was about to undergo maintenance and everyone should leave. This is one
of the risks of using public communication channels: anyone can show up and
abuse them.

To make it clearer what is trolling and what isn't, here
are the bots we currently operate:
  - Meetbot ("openstack") to handle IRC meetings and log channels on 
eavesdrop.openstack.org
  - Statusbot ("openstackstatus") to notify channels about service outages and 
update topic accordingly
  - Gerritbot ("openstackgerrit") to notify channels about code review updates

Should the Infra team need to notify you of pending maintenance work, that
notification will come via the statusbot and not the meetbot. The set of
individuals who can set topics via statusbot is limited to a small number of
IRC operators.

If you have any questions you can reach out either in the #openstack-infra
channel or to any channel operator directly and ask them. To get a list of
channel operators run `/msg chanserv access #channel-name list`. Finally, any
user can end a meeting that meetbot started, once it has run for an hour, by
issuing a #endmeeting command. So feel free to clean those up yourself if you
are able.

If the Freenode staff needs to perform maintenance or otherwise make
announcements, they tend to send special messages directly to clients, so you
will see messages from them in your IRC client's status channel. Should you
have any questions for Freenode you can find Freenode operators in the
#freenode channel.

As a final note, the infra team has an approved spec for improving our IRC bot 
tooling, http://specs.openstack.org/openstack-infra/infra-specs/specs/irc.html. 
Implementing this spec is going to be a prerequisite for smarter automated 
responses to problems like this, and it needs volunteers. If you think this 
might be interesting to you, definitely reach out.

Thank you for your patience,
Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra][qa][requirements] Pip 10 is on the way

2018-04-02 Thread Clark Boylan
On Mon, Apr 2, 2018, at 8:06 AM, Matthew Thode wrote:
> On 18-03-31 15:00:27, Jeremy Stanley wrote:
> > According to a notice[1] posted to the pypa-announce and
> > distutils-sig mailing lists, pip 10.0.0.b1 is on PyPI now and 10.0.0
> > is expected to be released in two weeks (over the April 14/15
> > weekend). We know it's at least going to start breaking[2] DevStack
> > and we need to come up with a plan for addressing that, but we don't
> > know how much more widespread the problem might end up being so
> > encourage everyone to try it out now where they can.
> > 
> 
> I'd like to suggest locking down pip/setuptools/wheel like openstack
> ansible is doing in 
> https://github.com/openstack/openstack-ansible/blob/master/global-requirement-pins.txt
> 
> We could maintain it as a separate constraints file (or infra could
> maintain it, doesn't matter).  The file would only be used for the
> initial get-pip install.

In the past we've done our best to avoid pinning these tools because 1) we've 
told people they should use the latest versions for openstack to work and 2) it 
is really difficult to actually control what versions of these tools end up on 
your systems if you aren't using the latest.

I would strongly push towards addressing the distutils package deletion problem 
that we've run into with pip10 instead. One of the approaches that has been 
floated, and that pabelanger is working on, is to use a common virtualenv for 
devstack and avoid the system package conflict entirely.
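
For completeness, the pinned bootstrap Matthew describes would look roughly
like this at get-pip time (versions below are placeholders, not a
recommendation):

    curl -sSO https://bootstrap.pypa.io/get-pip.py
    # get-pip passes the specifiers through to pip install
    python get-pip.py 'pip==9.0.3' 'setuptools==38.5.1' 'wheel==0.30.0'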

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tap-as-a-service] publish on pypi

2018-03-29 Thread Clark Boylan
On Wed, Mar 28, 2018, at 7:59 AM, Takashi Yamamoto wrote:
> hi,
> 
> i'm thinking about publishing the latest release of tap-as-a-service on pypi.
> background: https://review.openstack.org/#/c/555788/
> iirc, the naming (tap-as-a-service vs neutron-taas) was one of concerns
> when we talked about this topic last time. (long time ago. my memory is dim.)
> do you have any ideas or suggestions?
> probably i'll just use "tap-as-a-service" unless anyone has strong opinions.
> because:
> - it's the name we use the most frequently
> - we are not neutron (yet?)

http://git.openstack.org/cgit/openstack/tap-as-a-service/tree/setup.cfg#n2 
shows that tap-as-a-service is the existing package name, so it is probably a 
good one to go with; anyone who already has it installed from source should 
then have pip do the right thing when talking to pypi.
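
If anyone wants to double check locally, running this from the repo root should
confirm the sdist name that pip and pypi will see:

    python setup.py --name    # expected to print: tap-as-a-service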

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [openstack-infra] Where did the ARA logs go?

2018-03-29 Thread Clark Boylan
On Wed, Mar 28, 2018, at 8:13 AM, Jeremy Stanley wrote:
> On 2018-03-28 09:26:49 -0500 (-0500), Sean McGinnis wrote:
> [...]
> > I believe the ARA logs are only captured on failing jobs.
> 
> Correct. This was a stop-gap some months ago when we noticed we were
> overrunning our inode capacity on the logserver. ARA was only
> one of the various contributors to that increased consumption but
> due to its original model based on numerous tiny files, limiting it
> to job failures (where it was most useful) was one of the ways we
> temporarily curtailed inode utilization. ARA has very recently grown
> the ability to stuff all that data into a single sqlite file and
> then handle it browser-side, so I expect we'll be able to switch
> back to collecting it for all job runs again fairly soon.

The switch has been flipped and you should start to see ara reports on all job 
logs again. Thank you dmsimard for making this happen. More details at 
http://lists.openstack.org/pipermail/openstack-dev/2018-March/128902.html

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [keystone] [infra] Post PTG performance testing needs

2018-03-06 Thread Clark Boylan
On Tue, Mar 6, 2018, at 1:28 PM, Lance Bragstad wrote:
> Hey all,
> 
> Last week during the PTG the keystone team sat down with a few of the
> infra folks to discuss performance testing. The major hurdle here has
> always been having dedicated hosts to use for performance testing,
> regardless of that being rally, tempest, or a home-grown script.
> Otherwise results vary wildly from run to run in the gate due to
> differences from providers or noisy neighbor problems.
> 
> Opening up the discussion here because it sounded like some providers
> (mnaser, mtreinish) had some thoughts on how we can reserve specific
> hardware for these cases.
> 
> Thoughts?

Currently the Infra team has access to a variety of clouds, but due to how 
scheduling works we can't rule out noisy neighbors (or even being our own noisy 
neighbor). mtreinish also has data showing that runtimes are too noisy to do 
statistical analysis on, even within a single cloud region. So this is indeed 
an issue in the current setup.

One approach that has been talked about in the past is to measure 
performance-impacting operations using metrics other than execution time, for 
example the number of SQL queries or rabbit requests. I think this would also 
be valuable but it won't give you proper performance measurements.

That brought us back to the idea of possibly working with some cloud providers 
like mnaser and/or mtreinish to have a small number of dedicated instances to 
run performance tests on. We could then avoid the noisy neighbor problem as 
well.

For the infra team we would likely need to have at least two providers 
supplying these resources so that we could handle the loss of one without 
backing up job queues. I don't think the hardware needs to have any other 
special properties, as we don't care about performance on specific hardware so 
much as comparing the performance of the project over time on known hardware.

Curious to hear what others may have to say.

Thanks,
Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra] Please delete branch "notif" of project tatu

2018-02-26 Thread Clark Boylan
On Mon, Feb 26, 2018, at 1:59 AM, Pino de Candia wrote:
> Hi OpenStack-Infra Team,
> 
> Please delete branch "notif" of openstack/tatu.
> 
> The project was recently created/imported from my private repo and only the
> master branch is needed for the community project.

Done. Just for historical purposes the sha1 of the HEAD of the branch was 
9ecbb46b8e645fbf2450d4bca09c8f4040341a85.
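
If the branch ever needs to be resurrected from a clone that still has that
commit, something like this should do it (assuming the project ACL allows
direct branch creation and 'gerrit' is the name of your Gerrit remote):

    git branch notif 9ecbb46b8e645fbf2450d4bca09c8f4040341a85
    git push gerrit notif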

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [all][infra] PTG Infra Helproom Info and Signup

2018-02-13 Thread Clark Boylan
Hello everyone,

Last PTG the infra helproom seemed to work out for projects that knew about it. 
The biggest problem seemed to be that other projects either just weren't aware 
that there is/was an Infra helproom or didn't know when an appropriate time to 
show up would be. We are going to try a couple things this time around to try 
and address those issues.

First of all, the Infra team is hosting a helproom at the Dublin PTG. Now you 
should all know :) The idea is that if projects or individuals have questions 
for the infra team, or problems that we can help you with, there is time set 
aside specifically for this. I'm not sure what room we will be in (you will 
have to look at the map), but we have the entirety of Monday and Tuesday set 
aside for this.

To address the second issue of not knowing when a good time would be, I have put 
together a sign-up sheet at https://ethercalc.openstack.org/cvro305izog2 that 
projects or individuals can use to claim specific times that we can dedicate to 
them. Currently there are 12 one-hour slots over the course of the two days. 
If we need more slots we can probably add an extra hour at the end of each day. 
If there are conflicts I'm sure we will be able to share the help room between 
two different projects during one slot (but if you schedule one of these let us 
know so we are prepared for it).

The ethercalc mentions this too, but the schedule isn't supposed to be set in 
stone; we can be flexible. I just wanted to make sure there was enough 
structure this time around to make it easier for people to find the infra team 
and get the help they need.

So now you get to go and sign up and we'll see you at the PTG.

Thank you,
Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [All] Gerrit User Study

2018-02-02 Thread Clark Boylan
Google is scheduling 60 minute Gerrit user research sessions to help shape the 
future of Gerrit. If you are interested in providing feedback to Gerrit as 
users this is a good opportunity to do so. More info can be found at this 
google group thread 
https://groups.google.com/forum/#!topic/repo-discuss/F_Qv1R_JtOI.

Thank you (and sorry for the spam but Gerrit is an important tool for us, your 
input is valuable),
Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [requirements] requirements-tox-validate-projects FAILURE

2018-01-18 Thread Clark Boylan
On Thu, Jan 18, 2018, at 1:54 PM, Kwan, Louie wrote:
> Would like to add the following module to openstack.masakari project
> 
> https://github.com/pytransitions/transitions
> 
> https://review.openstack.org/#/c/534990/
> 
> requirements-tox-validate-projects failed:
> 
> http://logs.openstack.org/90/534990/6/check/requirements-tox-validate-projects/ed69273/ara/result/4ee4f7a1-456c-4b89-933a-fe282cf534a3/
> 
> What else needs to be done?

Reading the log [0] the job failed because python-cratonclient removed its 
check-requirements job. This was done in 
https://review.openstack.org/#/c/535344/ as part of the craton retirement and 
should be fixed on the requirements side by 
https://review.openstack.org/#/c/535351/. I think a recheck at this point will 
come back green (so I have done that for you).

[0] 
http://logs.openstack.org/90/534990/6/check/requirements-tox-validate-projects/ed69273/job-output.txt.gz#_2018-01-18_20_07_54_531014

Hope this helps,
Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [storyboard] need help figuring out how to use auth with storyboard client

2018-01-12 Thread Clark Boylan
On Fri, Jan 12, 2018, at 12:57 PM, Doug Hellmann wrote:
> The storyboard client docs mention an "access token" [1] as something
> a client needs in order to create stories and make other sorts of
> changes.  They don't explain what that token is or how to get one,
> though.
> 
> Where do I get a token? How long does the token work? Can I safely
> put a token in a configuration file, or do I need to get a new one
> each time I want to do something with the client?
> 
> Doug
> 
> [1] https://docs.openstack.org/infra/python-storyboardclient/usage.html
> 

The storyboard api docs [2] point to this location under your user profile [3], 
though it seems not to be directly linked from the storyboard UI. And there 
are docs for managing subsequent user tokens further down in the api docs [4].

I've not used any of this so I'm unsure how accurate it is, but I hope this is 
enough to get you going with storyboardclient.

[2] https://docs.openstack.org/infra/storyboard/webapi/v1.html#api
[3] https://storyboard.openstack.org/#!/profile/tokens
[4] https://docs.openstack.org/infra/storyboard/webapi/v1.html#user-tokens
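
As an untested guess at usage, I believe the token is then sent as a bearer
token on API requests, something like:

    curl -H "Authorization: Bearer $TOKEN" \
        https://storyboard.openstack.org/api/v1/stories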

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] How to use libxml2 with tox

2018-01-11 Thread Clark Boylan
On Thu, Jan 11, 2018, at 1:52 PM, Kwan, Louie wrote:
> Would like to use libxml2 and am having issues with tox.
> 
> What needs to be included in the requirements.txt file etc.
> 
> Any tip is much appreciated.

You likely need to make sure that the libxml2 header packages are installed so 
that the python package can link against libxml2. On Debian and Ubuntu I think 
the package is libxml2-dev, and it is libxml2-devel on SUSE. This isn't 
something that you would add to your requirements.txt, as it is a system 
dependency. To get it installed on our test nodes you can add it to the 
project's bindep.txt file.
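
For example, installing the headers by hand looks roughly like this (the
package names are my best guess for each distro, and they are also what you
would list in bindep.txt):

    sudo apt-get install libxml2-dev     # Debian/Ubuntu
    sudo zypper install libxml2-devel    # SUSE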

Hope this helps,
Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [All][Infra] Meltdown Patching

2018-01-10 Thread Clark Boylan
Hello everyone,

As a general heads up, Ubuntu has published new kernels which enable kernel page 
table isolation to address the Meltdown vulnerability that made the news last 
week. The infra team is currently working through patching our Ubuntu servers 
to pick up these fixes. Unfortunately patching does require reboots, so you may 
notice some service outages as we roll through and update things.

As a side note, all of our CentOS servers were patched last week when CentOS 
published new kernels. We managed to do those with no service outages, but we 
won't be so lucky with one-off services running on Ubuntu.

Thank you for your patience and feel free to ask if you have any questions 
related to this or anything else really.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] zuulv3 log structure and format grumblings

2018-01-05 Thread Clark Boylan
On Fri, Jan 5, 2018, at 10:26 AM, Andrea Frittoli wrote:
> On Fri, 5 Jan 2018, 7:14 pm melanie witt,  wrote:
> 
> > On Thu, 4 Jan 2018 08:46:38 -0600, Matt Riedemann wrote:
> > > The main issue is for newer jobs like tempest-full, the logs are under
> > > controller/logs/ and we lose the log analyze formatting for color, being
> > > able to filter on log level, and being able to link directly to a line
> > > in the logs.
> >
> > I also noticed we're missing testr_results.html.gz under
> > controller/logs/, which was handy for seeing a summary of the tempest
> > test results.
> >
> 
> Uhm I'm pretty sure that used to be there, so something must have changed
> since.
> I cannot troubleshoot this on my mobile, but if you want to have a look,
> the process test results role in zuul-jobs is what is supposed to produce
> that.

To expand a bit more on that, what we are attempting to do is port the log 
handling code in devstack-gate [0] to zuul v3 jobs living in tempest [1]. The 
new job in tempest itself relies on the ansible process-test-results role 
which can be found here [2]. Chances are something in [1] and/or [2] will have 
to be updated to match the behavior in [0].

[0] 
https://git.openstack.org/cgit/openstack-infra/devstack-gate/tree/functions.sh#n524
[1] 
https://git.openstack.org/cgit/openstack/tempest/tree/playbooks/post-tempest.yaml#n8
[2] 
http://git.openstack.org/cgit/openstack-infra/zuul-jobs/tree/roles/process-test-results

Hope this helps,
Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] zuulv3 log structure and format grumblings

2018-01-04 Thread Clark Boylan
On Thu, Jan 4, 2018, at 6:46 AM, Matt Riedemann wrote:
> I've talked to a few people on the infra team about this but I'm not 
> sure what is temporary and transitional and what is permanent and needs 
> to be fixed, and how to fix it.
> 
> The main issue is for newer jobs like tempest-full, the logs are under 
> controller/logs/ and we lose the log analyze formatting for color, being 
> able to filter on log level, and being able to link directly to a line 
> in the logs.
> 
> Should things be like logs/controller/* instead? If not, can someone 
> point me to where the log analyze stuff runs so I can see if we need to 
> adjust a path regex for the new structure?

I don't think that is necessary; instead, the next item you noticed is related 
to the issue.

> 
> The other thing is zipped up files further down the directory structure 
> now have to be downloaded, like the config files:
> 
> http://logs.openstack.org/69/530969/1/check/tempest-full/223c175/controller/logs/etc/nova/

The issue is that the wsgi os-loganalyze application is only applied to .txt 
log files if they are also gzipped:
https://git.openstack.org/cgit/openstack-infra/puppet-openstackci/tree/templates/logs.vhost.erb#n95

As you've noticed, the new job processes log files differently. The etc 
contents are now zipped when they weren't before, and the service logs 
themselves are no longer gzipped when they were before.

So if we want os-loganalyze to annotate these log files they should be gzipped 
by the job before getting copied to the log server (this also helps quite a bit 
with disk usage on the log server itself so is a good idea regardless).
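
A rough sketch of what the job could do before the upload step (the path is
just illustrative):

    # gzip any plain text logs so os-loganalyze will annotate them
    find /path/to/collected/logs -type f -name '*.txt' -exec gzip -9 {} +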

> 
> I think that's part of devstack-gate's post-test host cleanup routine 
> where it modifies gz files so they can be viewed in the browser.

It was, but this new job does not use devstack-gate at all; there is only 
devstack + job config. Fixes for this will need to be applied to the new job 
itself rather than to devstack-gate.

I've pushed up https://review.openstack.org/531208 as a quick check that this 
is indeed the general problem, but for a longer term fix I think we want to 
update our log publishing ansible roles to compress everything that isn't 
already compressed.

> 
> Please let me know if there is something I can help with here because I 
> really want to get the formatting back to help with debugging CI issues 
> and I've taken for granted how nice things were for oh these many years.

Please check that the above change results in the os-loganalyze behavior that 
you expect, and if you are feeling adventurous you can help us update the 
generic publishing role.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [pbr] Git submodules support

2017-12-19 Thread Clark Boylan


On Tue, Dec 19, 2017, at 10:18 AM, Gaetan wrote:
> Hello
> 
> I recently submitted the following pull request in order to allow using git
> submodules with PBR modules: https://review.openstack.org/#/c/524421/
> 
> To give a little bit of context, in my team, all our python modules are
> following almost the same pattern:
> - python 3 only
> - setup.py/setuptools/distutil stuff completely handled by PBR
> - dependencies handled by Pipfile (see my first draft to support it with
> PBR here: https://review.openstack.org/#/c/524436/)
> - "active dependencies", basically shared modules we may want to update
> quickly are injected in the project using Git Submodules, and installed
> with `pipenv install ./deps/mydependency` (equivalent in pip: `pip install
> -e deps/mydependency`).
> - all on gitlab (so using the github/gitlab "pull request"/"mergerequest"
> mechanism).
> - Python applications (that used PBR) are deployed in docker
> 
> This is a very, very cool architecture, allowing us to make changes very
> quickly in dependencies and reference them directly even if they haven't been
> merged into the dependency yet. A whole review cycle might take a few days,
> a change may be rejected, unit tests might be requested before merge,... So
> during all this time, in the "normal" way of releasing a change on a
> dependency (mergerequest + git tag (thanks PBR) + upload to our internal
> pypi), the change might not be available in the parent module that needs it.
> So injecting dependencies with git submodules is very handy, even if all
> dependencies, at the end, once stabilized, will end up in a clean pypi repo.
> 
> Now, why Pipfile? Because it is also much more convenient to use than a
> requirements.txt (and if you want to accurately handle the versions, you
> need a requirements.txt.in + pip-tools to generate requirements.txt). Same
> for the development requirements. So, Pipfile handles all of that for us,
> and Pipfile will eventually become the de facto standard for dependency
> declaration, replacing requirements*.txt.
> Pipfile recently moved to Pypa, becoming a little bit more official:
> https://github.com/pypa/pipfile.
> 
> So, back to my first patch (https://review.openstack.org/#/c/524421/). It
> fixes an issue when trying to use everything all together. In our Gitlab
> CI, PBR is happy and almost does the right thing, even it does not handle
> Pipfile directly (see https://review.openstack.org/#/c/524436/ for the
> limitations).
> 
> But once in the docker build, PBR complains about not getting access to
> upstream GIT. And it is true: the git tree (with dependencies) has been
> checked out by gitlab and the docker build (running in a gitlab runner) does
> not give access to the repository, because it is not required.

Is the issue here that submodules somehow cause PBR to need access to upstream 
git repositories or is the installation process not including the .git 
directories for repositories and only including their checked out state? PBR 
should work with local git repository checkouts without needing to talk to any 
remotes/upstreams.
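
For example, something like this against a clone of a purely local mirror (no
network access at all) should still produce a version, if I'm remembering pbr's
behavior correctly (the path below is hypothetical):

    git clone /srv/mirrors/mydependency mydependency
    cd mydependency
    python setup.py --version    # pbr derives this from the local git tags/sha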

> 
> Why PBR does that? Because:
> - git submodule dependencies do not provide pkg metadata, because they
> are not distribution packages (they are installed from source)
> - the method that finds the version from the git tree fails with this
> upstream access error (probably because it expects to be able to fetch
> something from the remote, and the environment does not give this access)
> 
> So my simple and dirty solution is to use the PBR_VERSION environment
> variable to skip this check, that is not required on dependencies coming
> from git submodules since the "version matching" is actually handled by git
> sumodules. I first only moved PBR_VERSION handling a few lines to avoid
> affecting other packages that uses PBR:
> https://review.openstack.org/#/c/524421/1/pbr/packaging.py
> 
> There is a bit of discussion on my patch; thanks to the reviewers, by the
> way. I see basically 3 solutions for making PBR happy with git submodules
> and docker, all requiring developer to set an envvar in their Dockerfile:
> 
>- developers set the environment variable PBR_VERSION to freeze the
>version of dependencies injected by git submodules. This only requires my
>first patchset
>- same but with another variable (PBR_FALLBACK_VERSION ?).
>- same but with an environment variable with the name of the package in
>it. If the name can contain non-alphanumeric characters, I can strip them out
>with a regex for example (see documentation proposal here:
>https://review.openstack.org/#/c/524421/3/doc/source/user/packagers.rst)
> 
> The other solution might be to allow pbr to NOT have git upstream access
> in _get_version_from_git.

At least without submodules this shouldn't be required. Without submodules you 
only need access to the local git repository and not any remotes. I think this 
behavior is desirable and if we can have that behavior with 

[openstack-dev] The Infra Team is back to normal "office hours"

2017-11-27 Thread Clark Boylan
Hello everyone,

Just wanted to quickly let everyone know that with the majority of Zuul
v3 transition firefighting, Summit Travel, and the big US Thanksgiving
Holiday Weekend behind us, the Infra team is resuming its normal "office
hours". That means you shouldn't feel like it is a bad time to ping us.
Please do talk to us if we can help with whatever problems related to
the infrastructure you are having.

Thank you for your patience,
Clark (for the rest of the Infra Team)

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Release-job-failures][release][trove][infra] Pre-release of openstack/trove failed

2017-11-14 Thread Clark Boylan


On Tue, Nov 14, 2017, at 12:15 PM, Doug Hellmann wrote:
> Excerpts from Clark Boylan's message of 2017-11-14 09:04:07 -0800:
> > 
> > On Tue, Nov 14, 2017, at 09:01 AM, Doug Hellmann wrote:
> > > Excerpts from zuul's message of 2017-11-14 16:24:33 +:
> > > > Build failed.
> > > > 
> > > > - release-openstack-python-without-pypi 
> > > > http://logs.openstack.org/17/175bd53e7b18a9dc4d42e60fe9225a5748eded34/pre-release/release-openstack-python-without-pypi/7a9474c/
> > > >  : POST_FAILURE in 14m 47s
> > > > - announce-release announce-release : SKIPPED
> > > > 
> > > 
> > > I can't find the error message in the job output log for this job. Maybe
> > > someone more familiar with the new log format can help?
> > 
> > If you look in ara you get http://paste.openstack.org/show/626285/
> > (pasted here because ara deeplinking not working with firefox, fix for
> > that should be out soon I think). Ansible is failing to update
> > permissions on a file. I think ansible does this to ensure that rsync
> > can read the files when it copies them.
> > 
> > Clark
> > 
> 
> Thanks. The ara page wasn't loading for me, but maybe I just wasn't
> patient enough.

Responses were slow; something was pulling significant amounts of data
out of the server earlier today to the detriment of other users (things
would eventually load though, so it wasn't a complete outage/DoS).

> 
> Is there any way to tell if this is a custom role or playbook? Or
> if it's part of the standard job?

Ara is again the magic here. I've pulled out chrome so that I can deep
link,
http://logs.openstack.org/17/175bd53e7b18a9dc4d42e60fe9225a5748eded34/pre-release/release-openstack-python-without-pypi/7a9474c/ara/file/18fc1bcc-3d6f-45e6-833e-c9fccefc7e72/#line-1
but that shows you exactly where the task comes from
(git.openstack.org/openstack-infra/zuul-jobs/roles/publish-artifacts-to-fileserver/tasks/main.yaml).

> 
> There is no "images/README" file in the trove repository. Can someone
> on the trove team tell us if that file comes from building trove?
> 
> Doug


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Release-job-failures][release][trove][infra] Pre-release of openstack/trove failed

2017-11-14 Thread Clark Boylan


On Tue, Nov 14, 2017, at 09:01 AM, Doug Hellmann wrote:
> Excerpts from zuul's message of 2017-11-14 16:24:33 +:
> > Build failed.
> > 
> > - release-openstack-python-without-pypi 
> > http://logs.openstack.org/17/175bd53e7b18a9dc4d42e60fe9225a5748eded34/pre-release/release-openstack-python-without-pypi/7a9474c/
> >  : POST_FAILURE in 14m 47s
> > - announce-release announce-release : SKIPPED
> > 
> 
> I can't find the error message in the job output log for this job. Maybe
> someone more familiar with the new log format can help?

If you look in ara you get http://paste.openstack.org/show/626285/
(pasted here because ara deeplinking not working with firefox, fix for
that should be out soon I think). Ansible is failing to update
permissions on a file. I think ansible does this to ensure that rsync
can read the files when it copies them.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Release-job-failures][zuul][infra][trove] Tag of openstack/trove-dashboard failed

2017-11-14 Thread Clark Boylan
On Tue, Nov 14, 2017, at 08:09 AM, Doug Hellmann wrote:
> Excerpts from zuul's message of 2017-11-14 16:01:20 +:
> > Unable to freeze job graph: Unable to modify final job <Job publish-openstack-releasenotes branches: None source: 
> > openstack-infra/project-config/zuul.d/jobs.yaml@master#26> attribute 
> > required_projects={'openstack/horizon': <... at 0x7ff8f8b0a710>} with variant <Job publish-openstack-releasenotes branches: None source: 
> > openstack-infra/openstack-zuul-jobs/zuul.d/project-templates.yaml@master#291>
> > 
> 
> Is there some way to detect this type of error before we approve a
> release?

My understanding is Zuul won't do complete pre merge testing of these
jobs because they run in a post merge context and have access to secrets
for stuff like AFS publishing. Zuul does do syntax checking on these
jobs pre merge though so if we could get Zuul to check "final" state
without building an entire job graph that may solve the problem. I'm not
familiar enough with Zuul's config compiler to know if that is
reasonable though.

It is possible that Zuul would notice the failure post merge when
attempting to run any jobs against the repo when it is in this state.
Though my hunch is we didn't notice until the release jobs ran because
hitting the error currently requires you to attempt to queue up the
specific broken job.

Likely any fix for this will have to happen in Zuul's config handler(s)
to notice this error early rather than late on job execution.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Fwd: [Distutils][pbr] Announcement: Pip 10 is coming, and will move all internal APIs

2017-10-20 Thread Clark Boylan
On Fri, Oct 20, 2017, at 11:17 AM, Clark Boylan wrote:
> On Fri, Oct 20, 2017, at 07:23 AM, Doug Hellmann wrote:
> > It sounds like the PyPI/PyPA folks are planning some major changes to
> > pip internals, soon.
> > 
> > I know pbr uses setuptools, and I don't think it uses pip, but if
> > someone has time to verify that it would be helpful.
> > 
> > We'll also want to watch out for breakage in normal use of pip 10. If
> > they're making changes this big, they may miss something in their own
> > test coverage that affects our jobs.
> > 
> 
> After a quick skim of PBR I don't think we use pip internals anywhere.
> It's all executed via the command itself. That said, we should test this
> so I've put up https://review.openstack.org/513825 (others should feel
> free to iterate on it if it doesn't work) to install latest pip master
> in a devstack run.

The current issue this change is facing can be seen at
http://logs.openstack.org/25/513825/4/check/legacy-tempest-dsvm-py35/c31deb2/logs/devstacklog.txt.gz#_2017-10-20_20_07_54_838.
The tl;dr is that for distutils installed packages (basically all the
distro installed python packages) pip refuses to uninstall them in order
to perform upgrades because it can't reliably determine where all the
files are. I think this is a new pip 10 behavior.

In the general case I think this means we can not rely on global pip
installs anymore. This may be a good thing to bring up with upstream
PyPA as I expect it will break a lot of people in a lot of places (it
will break infra for example too).
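
For anyone who hits this, the failure and a blunt workaround look roughly like
the following (the package name is just an example, and --ignore-installed has
its own risks because the old files are left behind):

    # pip 10 fails with an error along the lines of:
    #   Cannot uninstall 'PyYAML'. It is a distutils installed project ...
    pip install --ignore-installed PyYAML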

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Fwd: [Distutils][pbr] Announcement: Pip 10 is coming, and will move all internal APIs

2017-10-20 Thread Clark Boylan


On Fri, Oct 20, 2017, at 11:17 AM, Clark Boylan wrote:
> On Fri, Oct 20, 2017, at 07:23 AM, Doug Hellmann wrote:
> > It sounds like the PyPI/PyPA folks are planning some major changes to
> > pip internals, soon.
> > 
> > I know pbr uses setuptools, and I don't think it uses pip, but if
> > someone has time to verify that it would be helpful.
> > 
> > We'll also want to watch out for breakage in normal use of pip 10. If
> > they're making changes this big, they may miss something in their own
> > test coverage that affects our jobs.
> > 
> 
> After a quick skim of PBR I don't think we use pip internals anywhere.
> It's all executed via the command itself. That said, we should test this
> so I've put up https://review.openstack.org/513825 (others should feel
> free to iterate on it if it doesn't work) to install latest pip master
> in a devstack run.

And it has already caught its first bug. Fix and details at
https://review.openstack.org/#/c/513832/.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Fwd: [Distutils][pbr] Announcement: Pip 10 is coming, and will move all internal APIs

2017-10-20 Thread Clark Boylan
On Fri, Oct 20, 2017, at 07:23 AM, Doug Hellmann wrote:
> It sounds like the PyPI/PyPA folks are planning some major changes to
> pip internals, soon.
> 
> I know pbr uses setuptools, and I don't think it uses pip, but if
> someone has time to verify that it would be helpful.
> 
> We'll also want to watch out for breakage in normal use of pip 10. If
> they're making changes this big, they may miss something in their own
> test coverage that affects our jobs.
> 

After a quick skim of PBR I don't think we use pip internals anywhere.
It's all executed via the command itself. That said, we should test this
so I've put up https://review.openstack.org/513825 (others should feel
free to iterate on it if it doesn't work) to install latest pip master
in a devstack run.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Do you hate the new gerrit thing where the cursor jumps all over?

2017-10-20 Thread Clark Boylan
On Fri, Oct 20, 2017, at 10:16 AM, Jeremy Freudberg wrote:
> I don't wanna turn this into a complaint session about new Gerrit. But
> three little notes since the upgrade:
> 
> - I can't push the "e" key to go into editing mode anymore
> - Cherry-picks change the gerrit topic (that's probably a feature)
> - The patch I'm already looking at shows up under "Same Topic" (that's
> probably a feature)
> 
> 
> The only one that really kills me is the first one. Anyone know if the
> keyboard shortcut changed, or if it's disabled but configurable, etc...?

The help overlay on the diff screen (brought up by hitting '?') says it
is now "<...> + <...> + e : Open Inline Editor".

This seems to work for me in firefox on linux.

Hope this helps,
Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [doc] [glance] backport patch doesn't seem to be applied to doc site

2017-10-03 Thread Clark Boylan
On Tue, Oct 3, 2017, at 04:30 PM, Ken'ichi Ohmichi wrote:
> Hi
> 
> I tried to install glance manually according to docs site[1] for Pike
> release, and current doc site doesn't show how to create glance
> database.
> The bug has been already fixed and the backport patch[2] also has been
> merged into Pike branch.
> So how to affect these backport patches to actual doc site?
> 
> Thanks
> Kenichi Omichi
> 
> ---
> [1]: https://docs.openstack.org/glance/pike/install/install-ubuntu.html
> [2]: https://review.openstack.org/#/c/508279
> 

This likely failed to publish due to bugs in the Zuulv3 jobs. Merging
new changes to the branches should retrigger doc publishing. Otherwise
an infra root will need to manually retrigger the jobs, but we are fairly
swamped right now with various migration-related items, so it would be
great if merging a subsequent change could be used instead.

Hope this helps,
Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] Zuul v3 Status - and Rollback Information

2017-10-03 Thread Clark Boylan
On Tue, Oct 3, 2017, at 10:00 AM, Matt Riedemann wrote:
> On 10/3/2017 11:40 AM, Monty Taylor wrote:
> > Our nodepool quota will be allocated 80% to Zuul v2, and 20% to ZUul 
> > v3.  This will slow v2 down slightly, but allow us to continue to 
> > exercise v3 enough to find problems.
> > 
> > Zuul v2 and v3 can not both gate a project or set of projects.  In 
> > general, Zuul v2 will be gating all projects, except the few projects 
> > that are specifically v3-only: zuul-jobs, openstack-zuul-jobs, 
> > project-config, and zuul itself.
> 
> So if v3 is in check and periodic, and will be restarted and offline at 
> times, doesn't that mean we could have patches waiting for an extended 
> period of time on v3 results when the v2 jobs are long done? Or do the 
> v3 jobs just timeout and being non-voting shouldn't impact the overall 
> score?

v2 will do all of the gating and will for the most part completely
ignore what v3 is doing. The exception to this is that any -2's zuul has
left will need to be cleared out (fungi is working on this as I write
this email).

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra] Zull retry limit

2017-10-01 Thread Clark Boylan
On Sun, Oct 1, 2017, at 10:19 AM, Gary Kotton wrote:
> Hi,
> Is this an example of the solution that you are proposing -
> https://review.openstack.org/#/c/508775/. I also read that we should
> create a .zuul.yaml file. Not sure what the correct thing to do is.
> Please advise.
> Thanks
> Gary
> 

Yes, 508775 is an example of what you should do. Adding a .zuul.yaml file
is something you can do, but it is not necessary to correct this problem. You
would use a repo-local .zuul.yaml file to add new jobs to your project.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra] Zull retry limit

2017-10-01 Thread Clark Boylan
On Sat, Sep 30, 2017, at 11:23 PM, Gary Kotton wrote:
> Hi,
> It seems like all patches are hitting this issue – “RETRY_LIMIT in 5m
> 35s”. Any idea? Is this specific to vmware-nsx or all projects?
> Did we miss doing an update for zuulv3?
>

This is specific to certain jobs. You can see general live status to
compare against global state at http://status.openstack.org/zuul. In
this particular case the job has attempted to provide a helpful error
message [0]. You should be able to create an openstack-tox-pep8 job
variant for vmware-nsx that includes neutron as a required project.
Documentation for how to go about that can be found at [1].

This does raise the question of why vmware-nsx needs to install neutron
to run pep8 though.

[0]
http://logs.openstack.org/30/508130/3/check/openstack-tox-pep8/7cb51cb/job-output.txt.gz#_2017-10-01_07_24_29_980352
[1] https://docs.openstack.org/infra/manual/zuulv3.html

Hope this helps,
Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra] pypi publishing

2017-10-01 Thread Clark Boylan
On Sat, Sep 30, 2017, at 10:28 PM, Gary Kotton wrote:
> Hi,
> Any idea why latest packages are not being published to pypi.
> Examples are:
> vmware-nsxlib 10.0.2 (latest stable/ocata)
> vmware-nsxlib 11.0.1 (latest stable/pike)
> vmware-nsxlib 11.1.0 (latest queens)
> Did we miss a configuration that we needed to do in the infra projects?
> Thanks
> Gary

Looks like these are all new tags pushed within the last day. Looking at
logs for 11.1.1 we see the tarball artifact creation failed [0] due to
what is likely a bug in the new zuulv3 jobs.

[0]
http://logs.openstack.org/e5/e5a2189276396201ad88a6c47360c90447c91589/release/publish-openstack-python-tarball/2bdd521/ara/result/6ec8ae45-7266-40a9-8fd5-3fb4abcde677/

We'll need to get the jobs debugged.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all][infra] Zuul v3 migration update

2017-09-28 Thread Clark Boylan
On Wed, Sep 27, 2017, at 03:24 PM, Monty Taylor wrote:
> Hey everybody,
> 
> We're there. It's ready.
> 
> We've worked through all of the migration script issues and are happy 
> with the results. The cutover trigger is primed and ready to go.
> 
> But as it's 21:51 UTC / 16:52 US Central it's a short day to be 
> available to respond to the questions folks may have... so we're going 
> to postpone one more day.
> 
> Since it's all ready to go we'll be looking at flipping the switch first 
> thing in the morning. (basically as soon as the West Coast wakes up and 
> is ready to go)
> 
> The project-config repo should still be considered frozen except for 
> migration-related changes. Hopefully we'll be able to flip the final 
> switch early tomorrow.
> 
> If you haven't yet, please see [1] for information about the transition.
> 
> [1] https://docs.openstack.org/infra/manual/zuulv3.html
> 

It's done! Except for all the work to make jobs run properly. Early today
(PDT) we converted everything over to our auto-generated Zuulv3 config.
Since then we've been working to address problems in job configs.

These problems include:
  - Missing inclusion of the requirements repo for constraints in some jobs
  - Configuration of python35 unittest jobs in some cases
  - Use of sudo checking not working properly
  - Multinode jobs not having multinode nodesets

Known issues we will continue to work on:
  - Multinode devstack and grenade jobs are not working quite right
  - Releasenote jobs not working due to use of origin/ refs in git
  - It looks like we may not have job branch exclusions in place for all cases
  - The zuul-cloner shim may not work in all cases. We are tracking down
    and fixing the broken corner cases.

Keep in mind that with things in flux, there is a good chance that
changes enqueued to the gate will fail. It is a good idea to check
recent check queue results before approving changes.

I don't think we've found any deal breaker problems at this point. I am
sure there are many more than I have listed above. Please feel free to
ask us about any errors. For the adventurous, fixing problems is likely
a great way to get familiar with the new system. You'll want to start by
fixing errors in openstack-infra/openstack-zuul-jobs/playbooks/legacy.
Once that stabilizes the next step is writing native job configs within
your project tree. Documentation can be found at
https://docs.openstack.org/infra/manual/zuulv3.html. I expect we'll
spend the next few days ironing out the transition.

Thank you for your patience,
Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra][mogan] Need help for replacing the current master

2017-09-27 Thread Clark Boylan
On Tue, Sep 26, 2017, at 05:57 PM, Zhenguo Niu wrote:
> Thanks Clark Boylan,
> 
> We have frozen the Mogan repo since this mail sent out, and there's no
> need
> to update the replacement master. So please help out when you got time.

I mentioned this to dims on IRC today, but should write it here as well
for broader reach. It looks like https://github.com/dims/mogan is a fast
forwardable change on top of 7744129c83839ab36801856f283fb165d71af32e.
Also its less than ten commits ahead of current mogan master (7744129).
For this reason I think you can just push those commits up to Gerrit and
review them normally.

The only gotcha with this is you may need to update the Gerrit ACLs to
allow merge commit pushes.
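
Roughly, that could look like the following (assuming the usual git-review
setup in the repo and using 'dims' as the remote name for the github copy):

    git remote add dims https://github.com/dims/mogan
    git fetch dims
    git checkout -b replacement-master dims/master
    git review master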

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra][mogan] Need help for replacing the current master

2017-09-26 Thread Clark Boylan
On Tue, Sep 26, 2017, at 02:18 AM, Zhenguo Niu wrote:
> It's very appreciated if you shed some light on what the next steps would
> be to move this along.

We should schedule a period of time to freeze the Mogan repo and update the
replacement master (if necessary); then we can either force push that over the
existing branch, or push it into a new branch and have you propose and then
merge a merge commit. Considering that the purpose of this is to end up with a
better history on the master branch, the force push is likely the most
appropriate option. Using a merge commit will result in potentially complicated
history, which won't help with the objective here.

What is a good time to freeze, update and push? In total you probably want to
allocate a day to this; we can likely get by with less, but it is easy to block
off a day and then we don't have to rush.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [devstack] pike time growth in August

2017-09-22 Thread Clark Boylan
On Fri, Sep 22, 2017, at 01:18 PM, Attila Fazekas wrote:
> The main offenders reported by devstack do not seem to explain the
> growth visible on OpenstackHealth [1].
> The logs also started to disappear, which does not make it easy to figure
> out.
> 
> 
> Which code/infra changes can be related ?
> 
> 
> http://status.openstack.org/openstack-health/#/test/devstack?resolutionKey=day=P6M

A big factor is likely the loss of OSIC. That cloud performed really
well and now we don't have it anymore so averages will increase.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [devstack] Why do we apt-get install NEW files/debs/general at job time ?

2017-09-22 Thread Clark Boylan
On Fri, Sep 22, 2017, at 08:58 AM, Michał Jastrzębski wrote:
> Another, more revolutionary (for good or ill) alternative would be to
> move gates to run Kolla instead of DevStack. We're working towards
> registry of images, and we support most of openstack services now. If
> we enable mixed installation (your service in devstack-ish way, others
> via Kolla), that should lower the amount of downloads quite
> dramatically (lots of it will be downloads from registry which will be
> mirrored/cached in every nodepool). Then all we really need is to
> support barebone image with docker and ansible installed and that's
> it.

Except that it very likely isn't going to use less bandwidth. We already
mirror most of these package repos so all transfers are local to the
nodepool cloud region. In total we seem to grab about 139MB of packages
for a neutron dvr multinode scenario job (146676348 bytes) on Ubuntu
Xenial. This is based off the package list compiled at
http://paste.openstack.org/raw/621753/ then asking apt-cache for the
package size for the latest version.
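
(For the curious, the per-package numbers came from something roughly like
this, run over that package list; treat it as an approximation:)

    apt-cache show PACKAGENAME | awk '/^Size:/ {print $2; exit}'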

Kolla images on the other hand are in the multigigabyte range
http://tarballs.openstack.org/kolla/images/.

Clark


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] review.openstack.org downtime and Gerrit upgrade TODAY 15:00 UTC - 23:59 UTC

2017-09-20 Thread Clark Boylan
On Mon, Sep 18, 2017, at 04:58 PM, Clark Boylan wrote:
> On Mon, Sep 18, 2017, at 06:43 AM, Andreas Jaeger wrote:
> > Just a friendly reminder that the upgrade will happen TODAY, Monday
> > 18th, starting at 15:00 UTC. The infra team expects that it takes 8
> > hours, so until 2359 UTC.
> 
> This work was functionally completed at 23:43 UTC. We are now running
> Gerrit 2.13.9. There are some cleanup steps that need to be performed in
> Infra land, mostly to get puppet running properly again.
> 
> You will also notice that newer Gerrit behaves in some new and exciting
> ways. Most of these should be improvements like not needing to reapprove
> changes that already have a +1 Workflow but also have a +1 Verified;
> recheck should now work for these cases. If you find a new behavior that
> looks like a bug please let us know, but we should also work to file
> them upstream so that newer Gerrit can address them.
> 
> Feel free to ask us questions if anything else comes up.
> 
> Thank you to everyone that helped with the upgrade. Seems like these get
> more and more difficult with each Gerrit release so all the help is
> greatly appreciated.

As a followup we have been tracking new fun issues/behaviors in Gerrit
and fixing them over the last couple days. Here is an update on where we
are currently at.

Gerrit emails are slow. You may have noticed that you aren't getting
quite as much Gerrit email as before. This is because Gerrit is only
sending about one email a minute. Upstream bug is at
https://bugs.chromium.org/p/gerrit/issues/detail?id=7261 and we have
just got https://review.openstack.org/#/c/505677 merged based on the
info in that upstream bug. This won't be applied until we get puppet
running on review.openstack.org again (more on that later) and will
require another Gerrit service restart.

The Gerrit web UI's file editor behaves oddly resulting in what appear
to be API timeouts. This also seems to affect gertty. I don't think
anyone has dug in far enough to understand what is going on yet.

Now for known issues that should be fixed.

The Gerrit dashboard creator was using queries that didn't work with new
Gerrit query behavior. Sdague got this sorted out quick.

The Gerrit event stream changed its ref-updated data and now includes
refs/heads/$branchname instead of just $branchname under refName when
changes merge. This confused Zuul and meant no post jobs were running.
Zuul has been updated to handle this new behavior and post jobs are
running.

There were no gitweb links. This wasn't caught in testing because we
used a test cgit setup on review-dev. Fix here was just to switch to
using cgit on review.openstack.org (though the link is still called
"gitweb" in the Gerrit UI for reasons).

Memory consumption has gone up which initially led to frequent garbage
collection which led to 500 errors. We bumped heap memory available to
Gerrit up to 48GB (from 30GB) and that seems to have stabilized things.
Thankfully while needing more memory it doesn't seem to continuously
grow like it did on the old version (which forced us to do semi frequent
service restarts). We will have to monitor Gerrit to ensure it is
properly stable over time.

We could not create new projects in Gerrit. This is because Gerrit 2.12
dropped the --name argument from the create-project command which
Gerritlib was using. We have updated Gerritlib to check the Gerrit
version and pass the correct arguments to create-project.

Unfortunately, we still can't create new projects just yet, this is
related to puppet not running on review.openstack.org right now. The
gerrit server itself is fine and would puppet except that we force
puppet to run on our git mirror farm first to ensure proper mirroring of
repos and those have been failing since the CentOS 7.4 release. Once
we've got puppet happy we can get back to creating new projects in
Gerrit.

All the details can be found at
https://etherpad.openstack.org/p/gerrit-2.13-issues.

Thank you for your patience,
Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] review.openstack.org downtime and Gerrit upgrade TODAY 15:00 UTC - 23:59 UTC

2017-09-18 Thread Clark Boylan
On Mon, Sep 18, 2017, at 06:43 AM, Andreas Jaeger wrote:
> Just a friendly reminder that the upgrade will happen TODAY, Monday
> 18th, starting at 15:00 UTC. The infra team expects that it takes 8
> hours, so until 2359 UTC.

This work was functionally completed at 23:43 UTC. We are now running
Gerrit 2.13.9. There are some cleanup steps that need to be performed in
Infra land, mostly to get puppet running properly again.

You will also notice that newer Gerrit behaves in some new and exciting
ways. Most of these should be improvements like not needing to reapprove
changes that already have a +1 Workflow but also have a +1 Verified;
recheck should now work for these cases. If you find a new behavior that
looks like a bug please let us know, but we should also work to file
them upstream so that newer Gerrit can address them.

Feel free to ask us questions if anything else comes up.

Thank you to everyone that helped with the upgrade. Seems like these get
more and more difficult with each Gerrit release so all the help is
greatly appreciated.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] review.openstack.org downtime and Gerrit upgrade

2017-09-15 Thread Clark Boylan
On Wed, Aug 2, 2017, at 03:57 PM, Clark Boylan wrote:
> Hello,
> 
> The Infra team is planning to upgrade review.openstack.org from Gerrit
> 2.11 to Gerrit 2.13 on September 18, 2017. This downtime will begin at
> 1500UTC and is expected to take many hours as we have to perform an
> offline update of Gerrit's secondary indexes. The outage should be
> complete by 2359UTC.
> 
> This upgrade is a relatively minor one for users. You'll find that
> mobile use of Gerrit is slightly better (though still not great). The
> bug that forces us to reapply Approval votes rather than just rechecking
> has also been addressed. If you'd like to test out Gerrit 2.13 you can
> do so at https://review-dev.openstack.org.
> 
> The date we have chosen is the Monday after the PTG. The
> expectation/hope is that many people will still be traveling or
> otherwise recovering from the PTG so demand for Gerrit will be low. By
> doing it on Monday we also hope that there will be load on the service
> the following day which should help shake out any issues quickly (in the
> past we've done it on weekends then had to wait a couple days before
> problems are noticed).
> 
> If you have any concerns or feedback please let the Infra team know.

As a friendly reminder we are planning to move ahead with this upgrade
on Monday at 1500UTC. We are reviewing the process and getting some
final preparations done on our last day at the PTG.

Thank you for your patience,
Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] CI design session at the PTG

2017-08-28 Thread Clark Boylan


On Mon, Aug 28, 2017, at 07:19 AM, Paul Belanger wrote:
> On Mon, Aug 28, 2017 at 09:42:45AM -0400, David Moreau Simard wrote:
> > Hi,
> > 
> > (cc whom I would at least like to attend)
> > 
> > The PTG would be a great opportunity to talk about CI design/layout
> > and how we see things moving forward in TripleO with Zuul v3, upstream
> > and in review.rdoproject.org.
> > 
> > Can we have a formal session on this scheduled somewhere ?
> > 
> Wednesday onwards likely is best for me, otherwise, I can find time
> during
> Mon-Tues if that is better.

The Zuul v3 stuff may be appropriate during the Infra team helproom on
Monday and Tuesday. There will be an afternoon "Zuul v3 for OpenStack
devs" session in Vail at 2pm Monday, but I think we generally plan on
helping with Zuul v3 during the entire helproom time.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] review.openstack.org downtime and Gerrit upgrade

2017-08-23 Thread Clark Boylan
On Wed, Aug 2, 2017, at 03:57 PM, Clark Boylan wrote:
> Hello,
> 
> The Infra team is planning to upgrade review.openstack.org from Gerrit
> 2.11 to Gerrit 2.13 on September 18, 2017. This downtime will begin at
> 1500UTC and is expected to take many hours as we have to perform an
> offline update of Gerrit's secondary indexes. The outage should be
> complete by 2359UTC.
> 
> This upgrade is a relatively minor one for users. You'll find that
> mobile use of Gerrit is slightly better (though still not great). The
> bug that forces us to reapply Approval votes rather than just rechecking
> has also been addressed. If you'd like to test out Gerrit 2.13 you can
> do so at https://review-dev.openstack.org.
> 
> The date we have chosen is the Monday after the PTG. The
> expectation/hope is that many people will still be traveling or
> otherwise recovering from the PTG so demand for Gerrit will be low. By
> doing it on Monday we also hope that there will be load on the service
> the following day which should help shake out any issues quickly (in the
> past we've done it on weekends then had to wait a couple days before
> problems are noticed).
> 
> If you have any concerns or feedback please let the Infra team know.
> 
> Thank you,
> Clark

This is a friendly reminder that we will be upgrading Gerrit on
review.openstack.org the Monday after PTG. Expect a prolonged service
outage. Once again let us know if you have any questions or concerns.

Thank you,
Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [distributions][devstack][requirements] Upper constraints broken in stable/ocata on CentOS 7/RHEL 7

2017-08-14 Thread Clark Boylan
On Mon, Aug 14, 2017, at 05:58 PM, Arun SAG wrote:
> Hi,
> 
> RHEL/CentOS 7 released new libvirt packages (upgraded from 2.x to 3.x)
> which broke bootstrapping devstack on stable/ocata branch. I have a
> review
> up here to fix this in upper-requirements.txt
> https://review.openstack.org/#/c/491032/
> 
> On the review i was asked to check if upgrading to libvirt-python 3.5.0
> will break any distributions, Is there a way to reach out to people
> running
> various distributions officially? If there are distro/vendor liaisons in
> this list, can you help me by reaching out to your internal teams to
> check
> whether it is okay to upgrade libvirt-python to 3.5.x in stable/ocata?
> 
> Thanks
> -- 
> Arun S A G
> http://zer0c00l.in/

My understanding is newer libvirt-python is supposed to work with older
libvirt. And on master we (the gate) use new libvirt-python (3.5.0)
against older libvirt (2.5.0) on Ubuntu Xenial which seems to support
this. My hunch is this bump should be perfectly fine, people will just
have to rebuild libvirt-python if installing from source. Not a bad idea
to get more confirmation though.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][docs] Concerns with docs migration

2017-08-07 Thread Clark Boylan
On Wed, Aug 2, 2017, at 01:44 PM, Clark Boylan wrote:
> On Wed, Aug 2, 2017, at 11:37 AM, Sean Dague wrote:
> > On 08/02/2017 12:28 PM, Clark Boylan wrote:
> > > On Wed, Aug 2, 2017, at 07:55 AM, Matt Riedemann wrote:
> > >> Now that Stephen Finucane is back from enjoying his youth and 
> > >> gallivanting all over Europe, and we talked about a few things in IRC 
> > >> this morning on the docs migration for Nova, I wanted to dump my 
> > >> concerns here for broader consumption.
> > >>
> > >> 1. We know we have to fix a bunch of broken links by adding in redirects 
> > >> [1] which sdague started here [2]. However, that apparently didn't catch 
> > >> everything, e.g. [3], so I'm concerned we're missing other broken links. 
> > >> Is there a way to find out?
> > > 
> > > The infra team can generate lists of 404 urls fairly easily on the docs
> > > server. This won't show you everything but will show you what urls
> > > people are finding/using that 404.
> > 
> > If we could get a weekly report of 404 urls posted somewhere public,
> > that would be extremely useful, because the easy ones based on git
> > renames are done, and everything else is going to require human
> > inspection to figure out what the right landing target is.
> > 
> 
> I've pushed https://review.openstack.org/#/c/490175/ which will generate
> a report each day for roughly the last day's worth of 404s. You should
> be able to see them at https://docs.openstack.org/404s once the change
> merges and the cron job fires.
> 
> You can see what that will look like (from my test runs) at
> http://paste.openstack.org/show/617322/. Note that this isn't a complete
> file because paste.openstack.org truncated it, but you'll get the full
> data from the webserver once this change merges.

http://files.openstack.org/docs-404s/ is now live (note it moved to
http://files.openstack.org/docs-404s because that is where we are
hosting raw bits of utility info for this hosting service). The current
content there was just generated by running the scripts manually, but it
should update daily at 0700UTC going forward.

Hope this helps,
Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all][ptl][tc] IMPORTANT upcoming change to technical elections

2017-08-07 Thread Clark Boylan
On Mon, Aug 7, 2017, at 10:01 AM, Ken'ichi Ohmichi wrote:
> Hi
> 
> My name is on the nonmembers list and I guess that could be because of
> "Current Member Level: Speaker", not "Current Member Level: Foundation
> Member".
> Can I know how to change this member level?
> 
> Thanks
> Ken'ichi Ohmichi
> 

If you login to your foundation profile there should be a button that
says "Make me a foundation member" or similar. There are more details on
that process in the answer for
https://ask.openstack.org/en/question/56720/cannot-store-contact-information-when-updating-info-in-openstack-gerrit/.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [python-openstacksdk] Status of python-openstacksdk project

2017-08-04 Thread Clark Boylan
On Fri, Aug 4, 2017, at 02:37 PM, Kevin L. Mitchell wrote:
> On Fri, 2017-08-04 at 12:26 -0700, Boris Pavlovic wrote:
> > By the way stevedore is really providing very bad plugin experience
> > and should not be used definitely. 
> 
> Perhaps entrypointer[1]? ;)
> 
> [1] https://pypi.python.org/pypi/entrypointer
> -- 
> Kevin L. Mitchell 

The problems seem to be more with the use of entrypoints and the
incredible runtime cost of using them (which you've hinted at in
entrypointer's README). I don't think switching from $plugin lib to
$otherplugin lib changes much for tools like openstackclient unless we
first fix entrypoints (or avoid entrypoints altogether). Until then you
must still rely on entrypoints to scan your python path, which is slow.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] review.openstack.org downtime and Gerrit upgrade

2017-08-03 Thread Clark Boylan
On Thu, Aug 3, 2017, at 09:11 AM, Markus Zoeller wrote:
> On 03.08.2017 00:57, Clark Boylan wrote:
> > Hello,
> > 
> > The Infra team is planning to upgrade review.openstack.org from Gerrit
> > 2.11 to Gerrit 2.13 on September 18, 2017. 
> > [...]
> > 
> > If you have any concerns or feedback please let the Infra team know.
> > 
> > Thank you,
> > Clark
> 
> Does this maybe include the "hashtag" feature [1] to assign a custom
> taxonomy to changes? We talked about that back in 2015 [2].
> After reading [3] I was still a little clueless, TBH.
> 
> [1] https://en.wikipedia.org/wiki/Tag_(metadata)
> [2]
> http://lists.openstack.org/pipermail/openstack-dev/2015-November/079835.html
> [3] https://phabricator.wikimedia.org/T37534
> 

My understanding is that the Gerrit hashtag feature requires NoteDb,
which is part of the 2.14 release, not the 2.13 release. We won't have
it after this upgrade but should have it once we get to 2.14.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] review.openstack.org downtime and Gerrit upgrade

2017-08-03 Thread Clark Boylan
On Thu, Aug 3, 2017, at 03:04 AM, Andrey Kurilin wrote:
> hi Clark,
> 
> Do you have any plans to update to 2.14 release which has a new polymer
> based user interface?
>

We decided not to try and go straight to 2.14 because it requires newer
Java and introduces NoteDb, which is a fairly large change in how Gerrit
stores information. The hope is that by going to 2.13 we can get an
upgrade done more quickly, which can then hopefully snowball into an
easier upgrade to 2.14 in the not too distant future.

Once 2.13 is done my goal is to upgrade the database behind Gerrit and
the base operating system which will put us in good position to upgrade
to 2.14. But it is still too early to know how quickly that can all
happen.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [all] review.openstack.org downtime and Gerrit upgrade

2017-08-02 Thread Clark Boylan
Hello,

The Infra team is planning to upgrade review.openstack.org from Gerrit
2.11 to Gerrit 2.13 on September 18, 2017. This downtime will begin at
1500UTC and is expected to take many hours as we have to perform an
offline update of Gerrit's secondary indexes. The outage should be
complete by 2359UTC.

This upgrade is a relatively minor one for users. You'll find that
mobile use of Gerrit is slightly better (though still not great). The
bug that forces us to reapply Approval votes rather than just rechecking
has also been addressed. If you'd like to test out Gerrit 2.13 you can
do so at https://review-dev.openstack.org.

The date we have chosen is the Monday after the PTG. The
expectation/hope is that many people will still be traveling or
otherwise recovering from the PTG so demand for Gerrit will be low. By
doing it on Monday we also hope that there will be load on the service
the following day which should help shake out any issues quickly (in the
past we've done it on weekends then had to wait a couple days before
problems are noticed).

If you have any concerns or feedback please let the Infra team know.

Thank you,
Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][docs] Concerns with docs migration

2017-08-02 Thread Clark Boylan
On Wed, Aug 2, 2017, at 11:37 AM, Sean Dague wrote:
> On 08/02/2017 12:28 PM, Clark Boylan wrote:
> > On Wed, Aug 2, 2017, at 07:55 AM, Matt Riedemann wrote:
> >> Now that Stephen Finucane is back from enjoying his youth and 
> >> gallivanting all over Europe, and we talked about a few things in IRC 
> >> this morning on the docs migration for Nova, I wanted to dump my 
> >> concerns here for broader consumption.
> >>
> >> 1. We know we have to fix a bunch of broken links by adding in redirects 
> >> [1] which sdague started here [2]. However, that apparently didn't catch 
> >> everything, e.g. [3], so I'm concerned we're missing other broken links. 
> >> Is there a way to find out?
> > 
> > The infra team can generate lists of 404 urls fairly easily on the docs
> > server. This won't show you everything but will show you what urls
> > people are finding/using that 404.
> 
> If we could get a weekly report of 404 urls posted somewhere public,
> that would be extremely useful, because the easy ones based on git
> renames are done, and everything else is going to require human
> inspection to figure out what the right landing target is.
> 

I've pushed https://review.openstack.org/#/c/490175/ which will generate
a report each day for roughly the last day's worth of 404s. You should
be able to see them at https://docs.openstack.org/404s once the change
merges and the cron job fires.

You can see what that will look like (from my test runs) at
http://paste.openstack.org/show/617322/. Note that this isn't a complete
file because paste.openstack.org truncated it, but you'll get the full
data from the webserver once this change merges.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][docs] Concerns with docs migration

2017-08-02 Thread Clark Boylan
On Wed, Aug 2, 2017, at 07:55 AM, Matt Riedemann wrote:
> Now that Stephen Finucane is back from enjoying his youth and 
> gallivanting all over Europe, and we talked about a few things in IRC 
> this morning on the docs migration for Nova, I wanted to dump my 
> concerns here for broader consumption.
> 
> 1. We know we have to fix a bunch of broken links by adding in redirects 
> [1] which sdague started here [2]. However, that apparently didn't catch 
> everything, e.g. [3], so I'm concerned we're missing other broken links. 
> Is there a way to find out?

The infra team can generate lists of 404 urls fairly easily on the docs
server. This won't show you everything but will show you what urls
people are finding/using that 404.
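
For a rough sense of what that report generation involves: assuming
standard Apache combined-format access logs on the docs server (the log
path below is illustrative), it comes down to a filter along these
lines:

  # request path is field 7, status code is field 9 in combined logs
  awk '$9 == 404 {print $7}' /var/log/apache2/docs.openstack.org_access.log \
      | sort | uniq -c | sort -rn | head -100

That gives a count-ordered list of 404ing paths which humans can then
map to redirects.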

> 

snip

> [1] 
> http://lists.openstack.org/pipermail/openstack-dev/2017-August/120418.html
> [2] https://review.openstack.org/#/c/489650/
> [3] https://review.openstack.org/#/c/489641/

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra][python3][congress] locally successful devstack setup fails in check-job

2017-07-19 Thread Clark Boylan
On Tue, Jul 18, 2017, at 12:47 PM, Eric K wrote:
> Hi all, looking for some hints/tips. Thanks so much in advance.
> 
> My local python3 devstack setup [2] succeeds, but in check-job a
> similarly
> configured devstack setup [1] fails for not installing congress client.
> 
> ./stack.sh:1439:check_libs_from_git
> /opt/stack/new/devstack/inc/python:401:die
> [ERROR] /opt/stack/new/devstack/inc/python:401 The following
> LIBS_FROM_GIT
> were not installed correct: python-congressclient
> 
> 
> It seems that the devstack setup in check-job never attempted to install
> congress client. Comparing the log [4] in my local run to the log in
> check-job [3], all these steps in my local log are absent from the
> check-job log:
> ++/opt/stack/congress/devstack/settings:source:9
> CONGRESSCLIENT_DIR=/opt/stack/python-congressclient
> 
> ++/opt/stack/congress/devstack/settings:source:52
> CONGRESSCLIENT_REPO=git://git.openstack.org/openstack/python-congressclient.git
> 
> Cloning into '/opt/stack/python-congressclient'...

You won't see this logged by devstack because devstack-gate does all of
the git repo setup beforehand to ensure that the correct git refs are
checked out.

> 
> Check python version for : /opt/stack/python-congressclient
> Automatically using 3.5 version to install
> /opt/stack/python-congressclient based on classifiers
> 
> 
> Installing collected packages: python-congressclient
>   Running setup.py develop for python-congressclient
> Successfully installed python-congressclient
> 
> 
> [1] Check-job config:
> https://github.com/openstack-infra/project-config/blob/master/jenkins/jobs/congress.yaml#L65
> https://github.com/openstack-infra/project-config/blob/master/jenkins/jobs/congress.yaml#L111
> 
> [2] Local devstack local.conf:
> https://pastebin.com/qzuYTyAE   
> 
> [3] Check-job devstack log:
> http://logs.openstack.org/49/484049/1/check/gate-congress-dsvm-py35-api-mysql-ubuntu-xenial-nv/7ae2814/logs/devstacklog.txt.gz
> 
> [4] Local devstack log:
> https://ufile.io/c9jhm

My best guess at what is happening here is that python-congressclient
is being installed into python2 from source, so when devstack checks
whether python-congressclient is installed properly against python3 it
fails. You'll want to make sure that whatever is installing
python-congressclient is doing so against the appropriate python.
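
If it helps with debugging, a quick way to see which interpreter
actually got the package is something like the following (this assumes
both pips are present and that the importable module is named
congressclient):

  # ask each pip what it thinks is installed
  pip2 show python-congressclient
  pip3 show python-congressclient
  # or try importing it under each interpreter
  python2 -c 'import congressclient; print(congressclient.__file__)'
  python3 -c 'import congressclient; print(congressclient.__file__)'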

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [docs][all][ptl] Contributor Portal and Better New Contributor On-boarding

2017-06-26 Thread Clark Boylan
On Mon, Jun 26, 2017, at 10:31 AM, Boris Pavlovic wrote:
> Mike,
> 
> I was recently helping one Intern to join OpenStack community and make
> some
> contribution.
> 
> And I found that current workflow is extremely complex and I think not
> all
> people that want to contribute can pass it..
> 
> Current workflow is:
> - Go to Gerrit sign in
> - Find how to contribute to Gerrit (fail with this because no ssh key)
> - Find in Gerrit where to upload ssh (because no agreement)
> - Find in Gerrit where to accept License agreement  (fail because your
> agreement is invalid and contact info should be provided in Gerrit)
> - Server can't accept contact information (is what you see in gerrit)
> - Go to OpenStack.org sign in (to fix the problem with Gerrit)
> - Update contact information
> - When you try to contribute your first commit (if you already created
> it, you won't be able to until you do git commit --amend, so git review
> will add a change-id)

Git review should automatically do this last step for you if a change id
is missing.

> 
> Overall it would take 1-2 days for people not familiar with OpenStack.
> 
> 
> What about if one makes a "Sign-Up" page:
> 
> 1) A few steps: provide Username, Contact info, Agreement, SSH key
> (and it will do all the work for you to set up Gerrit, OpenStack, ...)
> 2) After one finishes the form they get instructions for their OS on
> how to set up and properly run git review
> 3) Maybe a few tutorials (how to find some bug, how to test it and
> where the docs, devstack, ... are)
> 
> That would simplify the onboarding process...

I think that Jeremy (fungi) has work in progress to tie electoral rolls
to foundation membership via an external lookup api that was recently
added to the foundation membership site. This means that we shouldn't
need to check that gerrit account info matches foundation account info
at CLA signing time anymore (at least this is my understanding, Jeremy
can correct me if I am wrong).

If this is the case it should make account setup much much simpler. You
just add an ssh key and sign the cla without worrying about account
details lining up.
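
Once the key is uploaded, a quick sanity check that Gerrit accepts it is
something like the following (substitute your own Gerrit username):

  # should print the Gerrit version if the ssh key and username are right
  ssh -p 29418 <username>@review.openstack.org gerrit version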

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all][tc] Moving away from "big tent" terminology

2017-06-21 Thread Clark Boylan
On Wed, Jun 21, 2017, at 08:48 AM, Dmitry Tantsur wrote:
> On 06/19/2017 05:42 PM, Chris Hoge wrote:
> > 
> > 
> >> On Jun 15, 2017, at 5:57 AM, Thierry Carrez  wrote:
> >>
> >> Sean Dague wrote:
> >>> [...]
> >>> I think those are all fine. The other term that popped into my head was
> >>> "Friends of OpenStack" as a way to describe the openstack-hosted efforts
> >>> that aren't official projects. It may be too informal, but I do think
> >>> the OpenStack-Hosted vs. OpenStack might still mix up in people's head.
> >>
> >> My original thinking was to call them "hosted projects" or "host
> >> projects", but then it felt a bit incomplete. I kinda like the "Friends
> >> of OpenStack" name, although it seems to imply some kind of vetting that
> >> we don't actually do.
> > 
> > Why not bring back the name Stackforge and apply that
> > to unofficial projects? It’s short, descriptive, and unambiguous.
> 
> Just keep in mind that people always looked at stackforge projects as
> "immature 
> experimental projects". I remember getting questions "when is
> ironic-inspector 
> going to become a real project" because of our stackforge prefix back
> then, even 
> though it was already used in production.

A few days ago I suggested a variant of Thierry's suggestion below. Get
rid of the 'openstack' prefix entirely for hosting and use stackforge
for everything. Then officially governed OpenStack projects are hosted
just like any other project within infra under the stackforge (or Opium)
name. The problem with the current "flat" namespace is that OpenStack
means something specific and we have overloaded it for hosting. But we
could flip that upside down and host OpenStack within a different flat
namespace that represented "project hosting using OpenStack infra
tooling".

The hosting location isn't meant to convey anything beyond the project
is hosted on a Gerrit run by infra and tests are run by Zuul.
stackforge/ is not an (anti)endorsement (and neither is openstack/).

Unfortunately, I expect that doing this will also result in a bunch of
confusion around "why is OpenStack being renamed", "what is happening to
OpenStack governance", etc.

> >> An alternative would be to give "the OpenStack project infrastructure"
> >> some kind of a brand name (say, "Opium", for OpenStack project
> >> infrastructure ultimate madness) and then call the hosted projects
> >> "Opium projects". Rename the Infra team to Opium team, and voilà!
> >> -- 
> >> Thierry Carrez (ttx)

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [requirements] Do we care about pypy for clients (broken by cryptography)

2017-05-31 Thread Clark Boylan
On Wed, May 31, 2017, at 07:39 AM, Sean McGinnis wrote:
> On Wed, May 31, 2017 at 09:47:37AM -0400, Doug Hellmann wrote:
> > Excerpts from Monty Taylor's message of 2017-05-31 07:34:03 -0500:
> > > On 05/31/2017 06:39 AM, Sean McGinnis wrote:
> > > > On Wed, May 31, 2017 at 06:37:02AM -0500, Sean McGinnis wrote:
> > > >>
> > > >> I am not aware of anyone using pypy, and there are other valid working
> > > >> alternatives. I would much rather just drop support for it than redo 
> > > >> our
> > > >> crypto functions again.
> > > >>
> > > 
> > 
> > This question came up recently for the Oslo libraries, and I think we
> > also agreed that pypy support was not being actively maintained.
> > 
> > Doug
> > 
> 
> Thanks Doug. If oslo does not support pypy, then I think that makes the
> decision for me. I will put up a patch to get rid of that job and stop
> wasting infra resources on it.

Just a couple things (I don't think it changes the decision made).

Cryptography does at least claim to support PyPy (see
https://pypi.python.org/pypi/cryptography/ trove identifiers), so
possibly a bug on their end that should get filed?

Also, our user clients should probably run under as many interpreters
as possible to make life easier for end users; however, they currently
depend on oslo, and if oslo doesn't support pypy then supporting it in
the user clients is likely not reasonable.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Documenting config drive - what do you want to see?

2017-05-24 Thread Clark Boylan


On Wed, May 24, 2017, at 07:39 AM, Matt Riedemann wrote:
> Rocky tipped me off to a request to document config drive which came up 
> at the Boston Forum, and I tracked that down to Clark's wishlist 
> etherpad [1] (L195) which states:
> 
> "Document the config drive. The only way I have been able to figure out 
> how to make a config drive is by either reading nova's source code or by 
> reading cloud-init's source code."
> 
> So naturally I have some questions, and I'm looking to flesh the idea / 
> request out a bit so we can start something in the in-tree nova devref.
> 
> Question the first: is this existing document [2] helpful? At a high 
> level, that's more about 'how' rather than 'what', as in what's in the 
> config drive.

This is helpful, but I think it is targeted to the deployer of OpenStack
and not the consumer of OpenStack.

> Question the second: are people mostly looking for documentation on the 
> content of the config drive? I assume so, because without reading the 
> source code you wouldn't know, which is the terrible part.

I'm (being a cloud user myself) mostly noticing the lack of information
on why cloud users might use config drive and how to consume it.
Documentation for the content of the config drive is a major piece of
what is missing. What do the key value pairs mean and how can I use them
to configure my nova instances to operate properly?

But also general information like: config drive can be more reliable
than the metadata service as it is directly attached to the instance.
The trade-off is possibly no live migration for the instance (under what
circumstances does live migration work, and as a user is that
discoverable?). What filesystems are valid that I need to handle in my
instance images? Will the device id always be config-2? And so on. The
user guide doc you linked does try to address some of this, but seems to
do so from the perspective of the person deploying a cloud: "do this if
you want to avoid dhcp in your cloud", "install these things on compute
hosts".

> Based on this, I can think of a few things we can do:
> 
> 1. Start documenting the versions which come out of the metadata API 
> service, which regardless of whether or not you're using it, is used to 
> build the config drive. I'm thinking we could start with something like 
> the in-tree REST API version history [3]. This would basically be a 
> change log of each version, e.g. in 2016-06-30 you got device tags, in 
> 2017-02-22 you got vlan tags, etc.

I like this as it should enable cloud users to implement tooling that
knows what it needs and can error properly if it ends up on a cloud too
old to contain the required information.
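
For reference, from inside an instance the supported versions are
already discoverable today; roughly something like this (the fixed
metadata IP is the standard one, and the specific version shown is just
an example):

  # list the date-based metadata versions this cloud supports
  curl http://169.254.169.254/openstack/
  # then fetch a specific version's data
  curl http://169.254.169.254/openstack/2017-02-22/meta_data.json

Having the change log would tell users what to expect from each of
those versions.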

> 2. Start documenting the contents similar to the response tables in the 
> compute API reference [4]. For example, network_data.json has an example 
> response in this spec [5]. So have an example response and a table with 
> an explanation of fields in the response, so describe 
> ethernet_mac_address and vif_id, their type, whether or not they are 
> optional or required, and in which version they were added to the 
> response, similar to how we document microversions in the compute REST 
> API reference.

++

> 
> --
> 
> Are there other thoughts here or things I'm missing? At this point I'm 
> just trying to gather requirements so we can get something started. I 
> don't have volunteers to work on this, but I'm thinking we can at least 
> start with some basics and then people can help flesh it out over time.

I like this, starting small to produce something useful then going from
there makes sense to me.

Another idea I've had: a tool that collects (or is fed) the information
that goes into a config drive and produces the device to attach to a VM
would be nice. The reason for this is that while config drive grew out
of nova/OpenStack, you often want to boot the same images with tools
other than Nova, so making it easy for those other tools to work
properly too would be nice. In the simple case I build images locally,
then boot them with kvm to test that they work before pushing things
into OpenStack, and config drive makes that somewhat complicated.
Ideally this would be the same code that nova uses to generate the
config drives, just with a command line front end.
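
To make the shape of that concrete, a hand-rolled config drive is
roughly an ISO9660 volume labeled config-2 with JSON files in known
paths; a minimal sketch (the file contents here are illustrative and
far from complete, nova writes much more):

  mkdir -p cd/openstack/latest
  cat > cd/openstack/latest/meta_data.json <<'EOF'
  {"uuid": "00000000-0000-0000-0000-000000000001", "hostname": "test"}
  EOF
  printf '#cloud-config\n' > cd/openstack/latest/user_data
  # build the volume with the label cloud-init and friends look for
  genisoimage -o config-drive.iso -ldots -allow-lowercase -allow-multidot \
      -l -J -r -V config-2 cd/
  # attach config-drive.iso as a cdrom when booting the image with kvm

The tool I have in mind would essentially automate producing that image
from whatever metadata you feed it, using the same code nova uses.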

> 
> [1] https://etherpad.openstack.org/p/openstack-user-api-improvements
> [2] https://docs.openstack.org/user-guide/cli-config-drive.html
> [3]
> https://docs.openstack.org/developer/nova/api_microversion_history.html
> [4] https://developer.openstack.org/api-ref/compute/
> [5] 
> https://specs.openstack.org/openstack/nova-specs/specs/liberty/implemented/metadata-service-network-info.html#rest-api-impact

Thank you for bringing this up,
Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [neutron][heat] - making Neutron more friendly for orchestration

2017-05-19 Thread Clark Boylan
On Fri, May 19, 2017, at 02:03 PM, Kevin Benton wrote:
> I split this conversation off of the "Is the pendulum swinging on PaaS
> layers?" thread [1] to discuss some improvements we can make to Neutron
> to
> make orchestration easier.
> 
> There are some pain points that heat has when working with the Neutron
> API.
> I would like to get them converted into requests for enhancements in
> Neutron so the wider community is aware of them.
> 
> Starting with the port/subnet/network relationship - it's important to
> understand that IP addresses are not required on a port.
> 
> >So knowing now that a Network is a layer-2 network segment and a Subnet
> is... effectively a glorified DHCP address pool
> 
> Yes, a Subnet controls IP address allocation as well as setting up
> routing
> for routers, which is why routers reference subnets instead of networks
> (different routers can route for different subnets on the same network).
> It
> essentially dictates things related to L3 addressing and provides
> information for L3 reachability.

One thing that is odd about this is that when creating a router you
specify the gateway information using a network, which is L2 not L3.
Seems like it would be more correct to use a subnet rather than a
network there?
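
To illustrate from the user side (the network and subnet names here are
just examples), the current CLI flow looks roughly like:

  # the gateway is specified by naming a network...
  openstack router create router1
  openstack router set --external-gateway public router1
  # ...while internal interfaces are attached by subnet
  openstack router add subnet router1 private-subnet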

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Is the pendulum swinging on PaaS layers?

2017-05-19 Thread Clark Boylan
On Fri, May 19, 2017, at 05:59 AM, Duncan Thomas wrote:
> On 19 May 2017 at 12:24, Sean Dague  wrote:
> 
> > I do get the concerns of extra logic in Nova, but the decision to break
> > up the working compute with network and storage problem space across 3
> > services and APIs doesn't mean we shouldn't still make it easy to
> > express some pretty basic and common intents.
> 
> Given that we've similar needs for retries and race avoidance in and
> between glance, nova, cinder and neutron, and a need to orchestrate
> between at least these three (arguably other infrastructure projects
> too, I'm not trying to get into specifics), maybe the answer is to put
> that logic in a new service, that talks to those four, and provides a
> nice simple API, while allowing the cinder, nova etc APIs to remove
> things like internal retries?

The big issue with trying to solve the problem this way is that various
clouds won't deploy this service, and then your users are stuck with the
"base" APIs anyway or with deploying this service themselves. This is
mostly ok until you realize that we rarely build services to run "on"
cloud rather than "in" cloud, so I as the user can't sanely deploy a new
service this way. And even if I can, I'm stuck deploying it for the 6
clouds and 15 regions (numbers not exact) because even more rarely do we
write software that is multicloud/region aware.

We need to be very careful if this is the path we take because it often
doesn't actually make the user experience better.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Development workflow for bunch of patches

2017-04-19 Thread Clark Boylan
On Wed, Apr 19, 2017, at 01:11 AM, Sławek Kapłoński wrote:
> Hello,
> 
> I have a question about how to deal with a bunch of patches which
> depend on one another.
> I did a patch to neutron (https://review.openstack.org/#/c/449831/)
> which is not merged yet but I wanted to also start another patch which
> depends on this one (https://review.openstack.org/#/c/457816/).
> Currently I was trying to do something like:
> 1. git review -d 
> 2. git checkout -b new_branch_for_second_patch
> 3. Make second patch, commit all changes
> 4. git review <— this will ask me if I really want to push two patches to
> gerrit so I answered „yes”
> 
> Everything is easy for me as long as I'm not doing more changes in the
> first patch. How should I work with it if, let's say, I want to change
> something in the first patch and later make another change to the
> second patch?
> IIRC when I tried to do something like that and I ran „git review” to
> push changes in the second patch, the first one was also updated (and I
> lost the changes made for it in another branch).
> How should I work with something like that? Is there any guide about
> that (I couldn't find one)?

The way I work is to always edit the tip of the series then "squash
back" edits as necessary.
So let's say we already have A <- B <- C and now I want to edit A and
push everything back so that it is up to date.

To do this I make a new commit such that A <- B <- C <- D, then run `git
rebase -i HEAD~4` and edit the rebase todo so that I have:

  pick A
  squash D
  pick B
  pick C

Then after rebase I end up with A' <- B' <- C' and when I git review all
three are updated properly in gerrit. The basic idea here is that you
are working on a series not a single commit so any time you make changes
you curate the entire series.

Jim Blair even wrote a tool called git-restack to make this sort of
workflow easy. You can pip install it.
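
Spelled out as commands (the change number is a placeholder, and git
restack is optional since plain git rebase -i works too):

  pip install git-restack
  git review -d <change number of C>   # fetch the tip of the series
  # edit files for the fix to A, then record it as a temporary commit D
  git commit -a -m "fixup for A"
  git restack                          # or: git rebase -i HEAD~4
  # in the editor move D directly under A and mark it 'squash'
  git review                           # re-uploads A', B' and C' together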

Hope this helps,
Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [oslo] Can we stop global requirements update?

2017-04-19 Thread Clark Boylan
On Wed, Apr 19, 2017, at 05:54 AM, Julien Danjou wrote:
> Hoy,
> 
> So the Gnocchi gate is all broken (again) because it depends on "pbr"
> and some new release of oslo.* depends on pbr!=2.1.0.
> 
> Neither Gnocchi nor Oslo cares about whatever bug there is in pbr 2.1.0
> that got it banished by the requirements Gods. It does not prevent it
> from being used e.g. to install the software or get version
> information. But it does break anything that is not in OpenStack
> because, well, pip installs the latest pbr (2.1.0) and then oslo.* is
> unhappy about it.

It actually breaks everything, including OpenStack. Shade and others are
affected by this as well. The specific problem here is that PBR is a
setup_requires which means it gets installed by easy_install before
anything else. This means that the requirements restrictions are not
applied to it (neither are the constraints). So you get the latest PBR
from easy_install, then later when something checks the requirements
(pkg_resources console script entrypoints?) it breaks because the latest
PBR isn't allowed.

We need to stop pinning PBR and more generally stop pinning any
setup_requires (there are a few more now since setuptools itself is
starting to use that to list its deps rather than bundling them).

> So I understand the culprit is probably pip installation scheme, and we
> can blame him until we fix it. I'm also trying to push pbr 2.2.0 to
> avoid the entire issue.

Yes, a new release of PBR undoing the "pin" is the current sane step
forward for fixing this particular issue. Monty also suggested that we
gate global requirements changes on a check that changes do not pin any
setup_requires.

> But for the future, could we stop updating the requirements in oslo libs
> for no good reason? just because some random OpenStack project hit a bug
> somewhere?
> 
> For example, I've removed requirements update on tooz¹ for more than a
> year now, which did not break *anything* in the meantime, proving that
> this process is giving more problem than solutions. Oslo libs doing that
> automatic update introduce more pain for all consumers than anything (at
> least not in OpenStack).

You are likely largely shielded by the constraints list here, which is
derived from the global requirements list. Basically, by using
constraints you get distilled global requirements, and even without
being part of the requirements updates you'd be shielded from breakages
when installing via something like devstack or another deployment
method that uses constraints.
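
Concretely, installing against the distilled list outside of devstack
looks roughly like this (the branch in the URL is whatever you are
targeting):

  pip install tooz \
      -c https://git.openstack.org/cgit/openstack/requirements/plain/upper-constraints.txt

and anyone who wants different pins can point -c at their own copy of
that file instead.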

> So if we care about Oslo users outside OpenStack, I beg us to stop this
> craziness. If we don't, we'll just spend time getting rid of Oslo over
> the long term…

I think we know from experience that just stopping (e.g. reverting to
the situation we had before requirements and constraints) would lead to
sadness. Installations would frequently be impossible due to some
unresolvable error in dependency resolution. Do you have some
alternative in mind? Perhaps we loosen the in-project requirements and
explicitly state that constraints are known to work due to testing, and
that users should use constraints? That would give users control to manage
their own constraints list too if they wish. Maybe we do this in
libraries while continuing to be more specific in applications?

> 
> My 2c,
> 
> Cheers,
> 
> ¹ Unless some API changed in a dep and we needed to raise the dep,
> obviously.
> 
> -- 
> Julien Danjou
> # Free Software hacker
> # https://julien.danjou.info

I don't have all the answers, but am fairly certain the situation we
have today is better than the one from several years ago. It is just not
perfect. I think we are better served by refining the current setup or
replacing it with something better but not by reverting.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra] is it ok to create extra users with sudo permissions during devstack run

2017-04-10 Thread Clark Boylan
On Mon, Apr 10, 2017, at 08:31 AM, Pavlo Shchelokovskyy wrote:
> Hi infra team,
> 
> In order to test a piece of functionality I am developing, during the
> devstack plugin run I need to create an extra user with password-less
> sudo
> permissions. As I am not sure of intricacies of our infra setup, I'd like
> to clarify if it is acceptable.
> 
> TL;DR
> There is 'openstack/networking-generic-switch' project that mainly aims
> to
> provide a Neutron ML2 plugin suitable to manage cheap HW switches that
> only
> allow configuration over SSH. The problem with those is that these
> switches
> usually have limitations on the number of concurrent SSH sessions open on
> the switch.
> 
> In order to overcome this, I am attempting to introduce DLM to
> networking-generic-switch to globally limit the number of active SSH
> connections to a given switch across all threads of neutron-service on
> all
> hosts [0].
> 
> To test this locally in my Xenial DevStack VM, I am creating and
> configuring "ovsmanager" user with password-less sudo permissions (so it
> is
> able to manage OVS), limit the number of allowed sessions for that user
> in
> /etc/security/limits.d/ and configure networking-generic-switch to access
> localhost via that user to simulate a switch with limited number
> of allowed SSH sessions.
> 
> My question is: is it ok if I replicate this logic in the devstack plugin
> of networking-generic-switch to set up gate testing for this feature?
> In the end it seems to boil down to whether infra re-uses the VMs it
> creates to run gate jobs for anything else, and whether such changes
> could affect whatever re-uses these VMs, but I might be missing
> something else.
> 
> [0] https://review.openstack.org/#/c/452959/

The test instances that run devstack are single use VMs that are not
reused. Every job running on these instances starts with sudo access,
but by default we remove sudo access for the stack user which is what
the OpenStack services run as. We do this to force those services to use
rootwrap (or its equivalent) and test that the rootwrap rules are
functional.

As long as you create your new user during the initial devstack run you
shouldn't have any issues with this.

It wasn't mentioned above, but I would run a separate sshd for this
rather than modify the existing one as ssh is the control mechanism for
the test jobs. Better to run a separate independent service that won't
conflict with job control.
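
A minimal sketch of what those plugin bits might look like (the user
name, port and limits here are illustrative, and whether the limits
actually apply to ssh sessions depends on the pam configuration):

  # create the switch-management user with password-less sudo
  sudo useradd -m ovsmanager
  echo "ovsmanager ALL=(ALL) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/ovsmanager
  echo "ovsmanager hard maxlogins 2" \
      | sudo tee /etc/security/limits.d/ovsmanager.conf
  # run a second sshd on its own port so the sshd used for job control
  # is left completely alone
  sudo /usr/sbin/sshd -p 2222 -o PidFile=/var/run/sshd-ngs.pid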

Hope this helps,
Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

