Re: [openstack-dev] [all][tc] establishing project-wide goals

2016-08-02 Thread Clint Byrum
Excerpts from Jay Pipes's message of 2016-08-01 10:23:57 -0400:
> On 08/01/2016 08:33 AM, Sean Dague wrote:
> > On 07/29/2016 04:55 PM, Doug Hellmann wrote:
> >> One of the outcomes of the discussion at the leadership training
> >> session earlier this year was the idea that the TC should set some
> >> community-wide goals for accomplishing specific technical tasks to
> >> get the projects synced up and moving in the same direction.
> >>
> >> After several drafts via etherpad and input from other TC and SWG
> >> members, I've prepared the change for the governance repo [1] and
> >> am ready to open this discussion up to the broader community. Please
> >> read through the patch carefully, especially the "goals/index.rst"
> >> document which tries to lay out the expectations for what makes a
> >> good goal for this purpose and for how teams are meant to approach
> >> working on these goals.
> >>
> >> I've also prepared two patches proposing specific goals for Ocata
> >> [2][3].  I've tried to keep these suggested goals for the first
> >> iteration limited to "finish what we've started" type items, so
> >> they are small and straightforward enough to be able to be completed.
> >> That will let us experiment with the process of managing goals this
> >> time around, and set us up for discussions that may need to happen
> >> at the Ocata summit about implementation.
> >>
> >> For future cycles, we can iterate on making the goals "harder", and
> >> collecting suggestions for goals from the community during the forum
> >> discussions that will happen at summits starting in Boston.
> >>
> >> Doug
> >>
> >> [1] https://review.openstack.org/349068 describe a process for managing 
> >> community-wide goals
> >> [2] https://review.openstack.org/349069 add ocata goal "support python 3.5"
> >> [3] https://review.openstack.org/349070 add ocata goal "switch to oslo 
> >> libraries"
> >
> > I like the direction this is headed. And I think for the test items, it
> > works pretty well.
> 
> I commented on the reviews, but I disagree with both the direction and 
> the proposed implementation of this.
> 
> In short, I think there's too much stick and not enough carrot. We 
> should create natural incentives for projects to achieve desired 
> alignment in certain areas, but placing mandates on project teams in a 
> diverse community like OpenStack is not useful.
> 
> The consequences of a project team *not* meeting these proposed mandates 
> have yet to be decided (and I made that point on the governance patch 
> review). But let's say that the consequences are that a project is 
> removed from the OpenStack big tent if they fail to complete these 
> "shared objectives".
> 
> What will we do when Swift decides that they have no intention of using 
> oslo.messaging or oslo.config because they can't stand fundamentals 
> about those libraries? Are we going to kick Swift, a founding project of 
> OpenStack, out of the OpenStack big tent?

Membership in the tent is the carrot, and ejection is the stick. The
big tent was an acknowledgement that giving out carrots makes everyone
stronger (all these well fed projects have led to a bigger supply of
carrots in general).

I think this proposal is an attempt to manage the ensuing chaos. We've
all seen carrot farmers abandon their farms, as well as duplicated effort
leading to a confusing experience for consumers of OpenStack's products.

I think there's room to build consensus around diversity in implementation
and even culture. We don't need to be a monolith. Our Swift development
community is bringing strong, powerful insight to the overall effort,
and strengthens the OpenStack brand considerably.  Certainly we can
support projects doing things their own way in some instances if they
so choose. What we don't want, however, is projects that operate in
relative isolation, without any cohesion, even loose cohesion, with the
rest.



Re: [openstack-dev] [Kolla] [Fuel] [tc] Looks like Mirantis is getting Fuel CCP (docker/k8s) kicked off

2016-07-28 Thread Clint Byrum
Excerpts from Fox, Kevin M's message of 2016-07-27 22:56:56 +:
> I think that would be true, if the container api was opinionated: for 
> example, trying to map only a subset of the openstack config options to 
> docker environment variables. This would make the containers specific to what 
> you're talking about: which business rules to support, what hardware, etc.
> 
> But the api is a fairly small one. It's mostly a standardized way to pass 
> config files in through docker volumes and get them to land in the right 
> places in the container. You should be able to use any system you want 
> (puppet, chef, jinja, shell scripts) to deal with the business logic and 
> such, to generate the config files, then use the standard api contract to 
> ensure that whatever way you launch the container (puppet, chef, heat, 
> docker run, kubelet, kubernetes, etc), it behaves the same way your 
> generated config file specifies.
> 
> Kolla has provided many different variants of each of the containers (centos, 
> ubuntu, etc), showing that the api contract is pretty flexible.
> 
> A similar thing is going on with kolla-kubernetes.
> 

I appreciate your optimism; however, Kolla is not "the deployment of
OpenStack". It is a set of tools to deploy OpenStack with a set of options
available. If it were a small thing to do, people would choose it. But
instead, they know the combinatorial matrix of options is staggering,
and one is much better off specializing if they don't fit into the
somewhat generic model that any tool like Kolla provides.

I'd say focus on _keeping things like Kolla focused_ rather than
worrying about making it interoperable.



Re: [openstack-dev] [Kolla] [Fuel] [tc] Looks like Mirantis is getting Fuel CCP (docker/k8s) kicked off

2016-07-27 Thread Clint Byrum
Excerpts from Fox, Kevin M's message of 2016-07-27 21:51:15 +:
> It's a standard way of launching a given openstack service container with 
> specified config, regardless of whether it's backed with a redhat or ubuntu or 
> source-based package set, that the Operator can rely on having a standardized 
> interface to. Distro packages don't guarantee that kind of thing and don't want 
> to.
> 
> To me, it's an abstraction api, kind of like nova is to kvm vs xen. The nova 
> user shouldn't have to care which backend is chosen.
> 

You're not wrong, and I do believe there is programming happening to
these interfaces. However the surface area of the API you describe is
_WAY_ too big to justify the work to maintain it as a single entity.

This is really why deployment tooling is so diverse. Hardware, networks,
business rules, operating systems, licensing, regulatory constraints...
all of those are part of a real deployment, and trying to make an API
that allows covering all of those bases, versus just having a bunch of
specific-ish implementations, has so far resulted in acceptance of more
implementations nearly every time.



Re: [openstack-dev] [Kolla] [Fuel] [tc] Looks like Mirantis is getting Fuel CCP (docker/k8s) kicked off

2016-07-27 Thread Clint Byrum
Excerpts from Ed Leafe's message of 2016-07-27 10:59:06 -0500:
> On Jul 27, 2016, at 10:51 AM, Joshua Harlow  wrote:
> 
> >> Whether to have competing projects in the big tent was debated by the TC
> >> at the time and my recollection is that we decided that was a good thing
> >> -- if someone wanted to develop a Nova replacement, then let them do it
> >> in public with the community. It would either win or lose based on its
> >> merits. Why is this not something which can happen here as well?
> > 
> > For real, I (or someone) can start a nova replacement without getting 
> > rejected (or yelled at or ...) by the TC saying it's a competing project??? 
> > Wow, this is news to me...
> 
> No, you can’t start a Nova replacement and still call yourself OpenStack.
> 

Is that true? I thought the thing would be that you'd have to be
running Nova to still use the mark. As long as you're running Nova and
the users can use Nova (the real one), if you also had a competitor to
Nova available, it would just need to pass the relatively low bar of
big tent membership to still be a part of "OpenStack".

> The sense I have gotten over the years from the TC is that gratuitous 
> competition is strongly discouraged. When the Monasca project was being 
> considered for the big tent, there was a *lot* of concern expressed over the 
> partial overlap with Ceilometer. It was only after much reassurance that the 
> overlap was not fundamental that these objections were dropped.
> 
> I have no stake in either Fuel or Kolla, so my only concern is duplication of 
> effort. You can always achieve more working together, though it will never 
> happen as fast as when you go it alone. It’s a trade-off: the needs of the 
> vendor vs. the health of the community.
> 
> 
> -- Ed Leafe
> 



Re: [openstack-dev] [tc][all] Big tent? (Related to Plugins for all)

2016-07-20 Thread Clint Byrum
Excerpts from Fox, Kevin M's message of 2016-07-20 20:12:48 +:
> And maybe this raises an interesting definition mismatch in the 
> conversation.
> 
> There is architectural stuff like: do we support 7 different web frameworks, 
> or do we standardize on flask... python vs go.
> 

Yeah, meh, those are developer-centric implementation details, and I think
not very interesting. To me the architectural questions are deeper. "How
do we do locking?", "How should we enable inter-process and inter-host
communication?", "How do we handle distributed transactions?" and "What
concurrency model should we use?".

> There's also the architectural stuff at the level of what interactive surface 
> you expose to users/operators. How consistent is it? Does it have all the 
> features, no matter where they are implemented, to do the work?

I believe this is the goal of the API-WG. But again, they're not there
to compel, they're there to advise, assist, and work. Ultimately, if an
API is created and is poor, the community, like Linus, can definitely say
"No" and refuse to use that API.



Re: [openstack-dev] [tc][all] Big tent? (Related to Plugins for all)

2016-07-20 Thread Clint Byrum
Excerpts from James Bottomley's message of 2016-07-20 08:31:34 -0700:
> So this is where the Open Source method takes over.  Change is produced
> by those people who most care about it because they're invested.  To
> take your Cinder example, you're unlikely to find them within Cinder
> because any project has inertia that resists change.  It takes the
> energy of the outside group X to force the change to Y, but this means
> that X often gets to propose, develop and even code Y.  Essentially
> they become drive by coders for Cinder.  This is where Open Source
> differs from Enterprise because you have the code and the access, you
> can do this.  However, you have to remember the inertia problem and
> build what you're trying to do as incrementally as possible: the larger
> the change, the bigger the resistance to it.  It's also a good test of
> the value of the change: if group X can't really be bothered to code it
> (and Cinder doesn't want it) then perhaps there's not really enough
> value in it anyway and it shouldn't happen.
> 

Thanks so much for stating this James. I couldn't agree more. A group
that can actually _accomplish_ change, and not just suggest it, is
exactly what we're working to start with the architecture WG. There are
plenty of people with the will to change, and I feel strongly that if
those people are given a structure and support, then they'll find the
time and space to complete these objectives.

I just want to make one nuance point about Cinder changes: the recent
DLM work, done outside any architecture working group, did actually come
from both inside and outside Cinder. The Cinder team realized something
was happening that would perhaps affect everyone, and raised it to the
cross-project level, which helped external individuals get involved. So
these initiatives can come from either direction. The key is that enough
momentum builds up to counter the natural inertia that you mentioned in
your message.



Re: [openstack-dev] [tc][all] Big tent? (Related to Plugins for all)

2016-07-19 Thread Clint Byrum
Excerpts from Fox, Kevin M's message of 2016-07-19 21:59:29 +:
> Yeah. I'm not saying any project has done it out of malice. Everyone's just 
> doing what's best for their project. But it does not seem like there is an 
> overarching entity watching over the whole, or (pushing, encouraging, 
> enticing, whatever word is appropriate here) projects to work on the 
> commons, anymore. It used to be that incubating projects were pushed to help 
> the other projects integrate with them, by the incubating project being 
> strongly encouraged to write the integrations themselves as part of the 
> incubation process. Now it seems like each project just spawns and then hopes 
> someone else will do the legwork. Using the carrot of incubation to further 
> the commons is not an ideal solution, but it was at least something.
> 
> Linux has an overarching entity, Linus, for that task. He's there to make sure 
> that someone is really paying attention to integration of the whole thing 
> into a cohesive, usable whole. Sometimes he pushes back and makes sure 
> commons work happens as part of features getting in to ensure commons work 
> still gets done. I'm not advocating a benevolent dictator for OpenStack 
> though.
> 
> But what I thought the TC's job was, was to be benevolent dictators, to which 
> each subproject (or subsystem, in linux terms) is required to give up final 
> say, so that sometimes the projects have to sacrifice a bit so that the 
> whole can flourish, and those benevolent dictators are elected for a time by 
> the OpenStack community. (Actually, I think that kind of makes it a 
> Democratic Republic... but I digress.) Maybe I misunderstood what the TC's 
> about. But I think we still do need some folks elected by the community to be 
> involved in making sure OpenStack as a whole has a cohesive technical 
> architecture that actually addresses user problems, and that have some power 
> to stop the "this feature belongs in this project", "no, it belongs in 
> that project", "no, let's spawn 3 new projects to cover that case" kinds of 
> things, make the difficult decision, and ask a project to help the community 
> out by going along with "a solution" so we all can move on. Critical features 
> have been stuck in this space for years, and OpenStack's competitors have had 
> solutions for years. 

You're right, this is the TC's job. However, the TC does it more by
exception, rather than by default. So while Linus and the subsystem
leaders in the kernel look after changes in general, the TC is there to
catch things that bubble out of the processes in place. So, I think the
TC needs contributors to bring _specific_ things that need to be handled,
and they will. They're just not going to be able to stand at the gate
and review every spec... this process only scales to the velocity and
breadth that OpenStack has if we get contributors involved.



Re: [openstack-dev] [tc][all] Big tent? (Related to Plugins for all)

2016-07-19 Thread Clint Byrum
Excerpts from Julien Danjou's message of 2016-07-19 09:30:36 +0200:
> On Mon, Jul 18 2016, Joshua Harlow wrote:
> 
> > Thus why I think the starting of the architecture working group is a good
> > thing; because I have a belief that people are forgetting, among all of this,
> > that such a group holds a lot of the keys to the kingdom in openstack
> > (whether you, the reader, want to admit that or not is, well, the reader's
> > problem) (sorry, and no disrespect to independent folks & contributors), but
> > most of us work for large companies that have architects (and leads), and if
> > those architects (and leads) can get together cross-company to aggregate and
> > agree on and solve actual problems, then that really is IMHO the only way to
> > keep our projects healthy (assuming we can even do that at this stage).
> 
> I think it is a bit naive to think any working group is going to fix
> architectural problems. You know first hand what happened¹ with the Nova
> service group and tooz for example.
> 

Perhaps if we form and start working together as a group, we can dissect
why nothing happened, build consensus on the most important thing to do
next, and actually fix some architectural problems. The social structure
that teams have is a huge part of the deadlock we find ourselves in
with certain controversial changes. The idea is to unroll the dependency
loop and start _somewhere_ rather than where a lot of these efforts die:
starting _everywhere_.



Re: [openstack-dev] [grenade] upgrades vs rootwrap

2016-07-06 Thread Clint Byrum
Excerpts from Matthew Treinish's message of 2016-07-06 11:55:53 -0400:
> On Wed, Jul 06, 2016 at 10:34:49AM -0500, Matt Riedemann wrote:
> > On 6/27/2016 6:24 AM, Sean Dague wrote:
> > > On 06/26/2016 10:02 PM, Angus Lees wrote:
> > > > On Fri, 24 Jun 2016 at 20:48 Sean Dague wrote:
> > > > 
> > > > On 06/24/2016 05:12 AM, Thierry Carrez wrote:
> > > > > I'm adding Possibility (0): change Grenade so that rootwrap
> > > > filters from
> > > > > N+1 are put in place before you upgrade.
> > > > 
> > > > If you do that as a general course, what you are saying is that every
> > > > installer and install process includes overwriting all of rootwrap
> > > > before every upgrade. Keep in mind we do upstream upgrades as offline,
> > > > which means that we've fully shut down the cloud. This would remove the
> > > > testing requirement that rootwrap configs were even compatible between N
> > > > and N+1. And if you think this is theoretical, you should see the patches
> > > > I've gotten over the years to grenade because people didn't see an issue
> > > > with that at all. :)
> > > > 
> > > > I do get that people don't like the constraints we've self-imposed, but
> > > > we've done that for very good reasons. The #1 complaint from operators,
> > > > forever, has been the pain and danger of upgrading. That's why we are
> > > > still trademarking new Juno clouds. When you upgrade Apache, you don't
> > > > have to change your config files.
> > > > 
> > > > 
> > > > In case it got lost, I'm 100% on board with making upgrades safe and
> > > > straightforward, and I understand that grenade is merely a tool to help
> > > > us test ourselves against our process and not an enemy to be worked
> > > > around.  I'm an ops guy proud and true and hate you all for making
> > > > openstack hard to upgrade in the first place :P
> > > > 
> > > > Rootwrap configs need to be updated in line with new rootwrap-using code
> > > > - that's just the way the rootwrap security mechanism works, since the
> > > > security "trust" flows from the root-installed rootwrap config files.
> > > > 
> > > > I would like to clarify what our self-imposed upgrade rules are so that
> > > > I can design code within those constraints, and no-one is answering my
> > > > question so I'm just getting more confused as this thread progresses...
> > > > 
> > > > ***
> > > > What are we trying to impose on ourselves for upgrades for the present
> > > > and near future (ie: while rootwrap is still a thing)?
> > > > ***
> > > > 
> > > > A. Sean says above that we do "offline" upgrades, by which I _think_ he
> > > > means a host-by-host (or even global?) "turn everything (on the same
> > > > host/container) off, upgrade all files on disk for that host/container,
> > > > turn it all back on again".  If this is the model, then we can trivially
> > > > update rootwrap files during the "upgrade" step, and I don't see any
> > > > reason why we need to discuss anything further - except how we implement
> > > > this in grenade.
> > > > 
> > > > B. We need to support a mix of old + new code running on the same
> > > > host/container, running against the same config files (presumably
> > > > because we're updating service-by-service, or want to minimise the
> > > > service-unavailability during upgrades to literally just a process
> > > > restart).  So we need to think about how and when we stage config vs
> > > > code updates, and make sure that any overlap is appropriately allowed
> > > > for (expand-contract, etc).
> > > > 
> > > > C. We would like to just never upgrade rootwrap (or other config) files
> > > > ever again (implying a freeze in as_root command lines, effective ~a
> > > > year ago).  Any config update is an exception dealt with through
> > > > case-by-case process and release notes.
> > > > 
> > > > 
> > > > I feel like the grenade check currently implements (B) with a 6 month
> > > > lead time on config changes, but the "theory of upgrade" doc and our
> > > > verbal policy might actually be (C) (see this thread, eg), and Sean
> > > > above introduced the phrase "offline" which threw me completely into
> > > > thinking maybe we're aiming for (A).  You can see why I'm looking for
> > > > clarification  ;)
> > > 
> > > Ok, there is theory of what we are striving for, and there is what is
> > > viable to test consistently.
> > > 
> > > The thing we are shooting for is making the code Continuously
> > > Deployable, which means the upgrade process should be "pip install -U
> > > $foo && $foo-manage db-sync" on the API surfaces and "pip install -U
> > > $foo; service restart" on everything else.
> > > 
> > > Logic we can put into the python install process is common logic shared
> > > by all deployment tools, and we can encode it in there. So all
> > > installers just get it.
> 

Re: [openstack-dev] New Python35 Jobs coming

2016-07-03 Thread Clint Byrum
Excerpts from Henry Gessau's message of 2016-07-03 15:26:23 -0400:
> Clark Boylan  wrote:
> > The infra team is working on taking advantage of the new Ubuntu Xenial
> > release including running unittests on python35. The current plan is to
> > get https://review.openstack.org/#/c/336272/ merged next Tuesday (July
> > 5, 2016). This will add non voting python35 tests restricted to >=
> > master/Newton on all projects that had python34 testing.
> > 
> > The expectation is that in many cases python35 tests will just work if
> > python34 testing was also working. If this is the case for your project
> > you can propose a change to openstack-infra/project-config to make these
> > jobs voting against your project. You should only need to edit
> > jenkins/jobs/projects.yaml and zuul/layout.yaml and remove the '-nv'
> > portion of the python35 jobs to do this.
> > 
> > We do however expect that there will be a large group of failed tests
> > too. If your project has a specific tox.ini py34 target to restrict
> > python3 testing to a specific list of tests you will need to add a tox
> > target for py35 that does the same thing as the py34 target. We have
> > also seen bug reports against some projects whose tests rely on stable
> > error messages from Python itself which isn't always the case across
> > version changes so these tests will need to be updated as well.
> > 
> > Note this change will not add python35 jobs for cases where projects
> > have special tox targets. This is restricted just to the default py35
> > unittesting.
> > 
> > As always, let us know if you have questions,
> > Clark
> 
> How soon can projects replace py34 with py35?
> 
> I tried py35 for neutron locally, and it ran without errors.
> 

I think we should be aggressive on python 3.5 vs. 3.4, since anywhere
that shipped 3.4 also shipped 2.7. Otherwise we end up wasting time on
whatever subtle differences there are.
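
For the error-message issue Clark mentions above, the usual fix is to stop
asserting on the interpreter's exact wording. A rough sketch of the pattern
(the test class and values here are made up for illustration, not taken from
any particular project):

    import unittest

    class TestOptionParsing(unittest.TestCase):
        def test_bad_value_rejected(self):
            # Messages produced by the interpreter itself are not
            # guaranteed stable across 3.4 and 3.5, so compare exception
            # types (or text our own code produces) instead of matching
            # the interpreter's exact message string.
            with self.assertRaises(ValueError):
                int('x')

    if __name__ == '__main__':
        unittest.main()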



Re: [openstack-dev] [all] Proposal: Architecture Working Group

2016-06-30 Thread Clint Byrum
Excerpts from Mike Perez's message of 2016-06-30 14:10:30 -0700:
> On 09:02 Jun 30, Clint Byrum wrote:
> > Excerpts from Mike Perez's message of 2016-06-30 07:50:42 -0700:
> > > On 11:31 Jun 20, Clint Byrum wrote:
> > > > Excerpts from Joshua Harlow's message of 2016-06-17 15:33:25 -0700:
> > > > > Thanks for getting this started Clint,
> > > > > 
> > > > > I'm happy and excited to be involved in helping try to guide the 
> > > > > whole 
> > > > > ecosystem together (it's also why I like being in oslo) to an 
> > > > > architecture that is more cohesive (and is more of something that we 
> > > > > can 
> > > > > say to our current or future children that we were all involved and 
> > > > > proud to be involved in creating/maturing...).
> > > > > 
> > > > > At a start, for said first meeting, any kind of agenda come to mind, 
> > > > > or 
> > > > > will it be more a informal gathering to start (either is fine with 
> > > > > me)?
> > > > > 
> > > > 
> > > > I've been hesitant to fill this in too much as I'm still forming the
> > > > idea, but here are the items I think are most compelling to begin with:
> > > > 
> > > > * DLM's across OpenStack -- This is already under way[1], but it seems 
> > > > to
> > > >   have fizzled out. IMO that is because there's no working group who
> > > >   owns it. We need to actually write some plans.
> > > 
> > > Not meaning to nitpick, but I don't think this is a compelling reason for 
> > > the
> > > architecture working group. We need a group that wants to spend time on
> > > reviewing the drivers being proposed. This is like saying we need the
> > > architecture working group because no working group is actively reshaping 
> > > quotas
> > > cross-project. 
> > > 
> > 
> > That sounds like a reasoned deep argument, not a nitpick, so thank you
> > for making it.
> > 
> > However, I don't think lack of drivers is standing in the way of a DLM
> > effort. It is a lack of coordination. There was a race to the finish line
> > to make Consul and etcd drivers, but then, like the fish in finding Nemo,
> > the drivers are in bags floating in the bay.. now what?
> 
> Some drivers are still in review, or likely abandoned efforts so it's not
> really a bay of options as you're describing it.
> 

Heh, that kind of sounds like the same thing.. not a bay of options,
just options stuck between the fish tank and the bay.

> Cinder has continued forward with being the guinea pig as planned with Tooz.
> [1] I don't think this is a great example for your argument because
> 
> 1) Not all projects need this.
> 
> 2) This was discussed in Tokyo and just done in Mitaka for Cinder. Why not 
> give
>projects time to evaluate when they're ready?
> 
> > Nobody owns this effort. Everybody gets busy. Nothing gets done. We
> > continue to bring it up in the hallway and wish we had time.
>
> I don't ever foresee a release where we say "All projects support DLM". In 
> fact
> I see things going as planned because:
> 
> 1) We have a project that carried it forward as planned.
> 2) We're purposely not repeating the MQ mess. Only DLM drivers with support from
>members of the community are surfacing up.
> 
> I would ask you instead, how exactly are you measuring success here?
>

That's a great question. I think the community did what I'd like to
see the working group do as its first order of business: mapped the
territory, and provided a plan to improve it. So to your point, there's no
need for an architecture working group if this always happens as planned
in all instances. I'd personally like to see it happen this way all the
time, which is the primary reason I'm motivated to coordinate this
group.

As a second order of business, I think this group would have a hard time
keeping momentum if all it did were write architectural plans. Each of
the designs it helps create needs to be backed up with actual work. Who
cares if you drew a picture of a bridge: show me the bridge. :)

> > This is just a place to have a meeting and some people who get together
> > and say "hey is that done yet? Do you need help? is that still a
> > priority?". Could we do this as part of Oslo? Yes! But, I want this to
> > be about going one step higher, and actually taking the implementations
> > into the respective projects.
> 
> How about calling a cross-project meeting? [2] I have already spent the time
> orga

Re: [openstack-dev] [all] Proposal: Architecture Working Group

2016-06-30 Thread Clint Byrum
Excerpts from Mike Perez's message of 2016-06-30 07:50:42 -0700:
> On 11:31 Jun 20, Clint Byrum wrote:
> > Excerpts from Joshua Harlow's message of 2016-06-17 15:33:25 -0700:
> > > Thanks for getting this started Clint,
> > > 
> > > I'm happy and excited to be involved in helping try to guide the whole 
> > > ecosystem together (it's also why I like being in oslo) to an 
> > > architecture that is more cohesive (and is more of something that we can 
> > > say to our current or future children that we were all involved and 
> > > proud to be involved in creating/maturing...).
> > > 
> > > At a start, for said first meeting, any kind of agenda come to mind, or 
> > > will it be more a informal gathering to start (either is fine with me)?
> > > 
> > 
> > I've been hesitant to fill this in too much as I'm still forming the
> > idea, but here are the items I think are most compelling to begin with:
> > 
> > * DLM's across OpenStack -- This is already under way[1], but it seems to
> >   have fizzled out. IMO that is because there's no working group who
> >   owns it. We need to actually write some plans.
> 
> Not meaning to nitpick, but I don't think this is a compelling reason for the
> architecture working group. We need a group that wants to spend time on
> reviewing the drivers being proposed. This is like saying we need the
> architecture working group because no working group is actively reshaping 
> quotas
> cross-project. 
> 

That sounds like a reasoned deep argument, not a nitpick, so thank you
for making it.

However, I don't think lack of drivers is standing in the way of a DLM
effort. It is a lack of coordination. There was a race to the finish line
to make Consul and etcd drivers, but then, like the fish in finding Nemo,
the drivers are in bags floating in the bay.. now what?
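
For anyone who hasn't looked at those drivers, the surface they expose is
small. A minimal sketch of taking a distributed lock through tooz (the
backend URL, member id, and lock name below are placeholders for
illustration, not from any project's code):

    from tooz import coordination

    # The backend is chosen purely by this connection string; consul://,
    # zookeeper://, etcd://, etc. plug in behind the same interface.
    coordinator = coordination.get_coordinator('etcd://127.0.0.1:2379', b'node-1')
    coordinator.start()

    # Every process that asks for the same lock name, on any host,
    # contends for the same distributed lock.
    lock = coordinator.get_lock(b'resize-instance-42')
    with lock:
        # Only one member across the deployment runs this block at a time.
        print('holding the lock')

    coordinator.stop()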

Nobody owns this effort. Everybody gets busy. Nothing gets done. We
continue to bring it up in the hallway and wish we had time.

This is just a place to have a meeting and some people who get together
and say "hey is that done yet? Do you need help? is that still a
priority?". Could we do this as part of Oslo? Yes! But, I want this to
be about going one step higher, and actually taking the implementations
into the respective projects.

> With that said, I can see the architecture working group providing information
> to a group actually reviewing/writing drivers for DLM and saying "Doing
> mutexes with the mysql driver is crazy, I brought it into an environment and have
> such information to support that it is not reliable". THAT is useful and I
> don't feel like people do enough of that.
> 

Ugh, no, I don't want it to be a group of information providers. I'm
not talking about an Architecture Review Board.

It's a group for doers. People who design together, and build with
others. The DLM spec process was actually one of the reasons I wanted
to create this group. We did such a great job on the design side, but
we didn't really stick together on it and push it all the way through.
This group is my idea of how we stick together and complete work like
that.

> My point is call your working group whatever you want (The Purple Parrots), 
> and
> just go spearhead DLM, but don't make it about one of the most compelling
> reasons for the existence of this group.
> 

The Purple Parrots is already taken -- it's my new band. We're playing
at the House of Blues on August 3rd.



Re: [openstack-dev] [all] Proposal: Architecture Working Group

2016-06-28 Thread Clint Byrum
Thanks everyone for participating and remaining positive and focused
on improving OpenStack. I've posted a review, and I'd like to encourage
everyone to move any future discussion of the Architecture Working group
to that review.

https://review.openstack.org/335141

Excerpts from Clint Byrum's message of 2016-06-17 14:52:43 -0700:
> ar·chi·tec·ture
> ˈärkəˌtek(t)SHər/
> noun
> noun: architecture
> 
> 1. the art or practice of designing and constructing buildings.
>    synonyms: building design, building style, planning, building,
>    construction; formal: architectonics
>    "modern architecture"
> 
>    the style in which a building is designed or constructed, especially
>    with regard to a specific period, place, or culture.
>    plural noun: architectures
>    "Victorian architecture"
> 
> 2. the complex or carefully designed structure of something.
>    "the chemical architecture of the human brain"
> 
>    the conceptual structure and logical organization of a computer or
>    computer-based system.
>    "a client/server architecture"
>    synonyms: structure, construction, organization, layout, design,
>    build, anatomy, makeup; informal: setup
>    "the architecture of a computer system"
> 
> 
> Introduction
> =
> 
> OpenStack is a big system. We have debated what it actually is [1],
> and there are even t-shirts to poke fun at the fact that we don't have
> good answers.
> 
> But this isn't what any of us wants. We'd like to be able to point
> at something and proudly tell people "This is what we designed and
> implemented."
> 
> And for each individual project, that is a possibility. Neutron can
> tell you they designed how their agents and drivers work. Nova can
> tell you that they designed the way conductors handle communication
> with API nodes and compute nodes. But when we start talking about how
> they interact with each other, it's clearly just a coincidental mash of
> de-facto standards and specs that don't help anyone make decisions when
> refactoring or adding on to the system.
> 
> Oslo and cross-project initiatives have brought some peace and order
> to the implementation and engineering processes, but not to the design
> process. New ideas still start largely in the project where they are
> needed most, and often conflict with similar decisions and ideas in other
> projects [dlm, taskflow, tooz, service discovery, state machines, glance
> tasks, messaging patterns, database patterns, etc. etc.]. Oftentimes this
> creates a logjam where none of the projects adopt a solution that would
> align with others. Most of the time, when things finally come to a head,
> these things get done in a piecemeal fashion, where it's half done here,
> 1/3 over there, 1/4 there, and 3/4 over there..., which to the outside
> looks like chaos, because that's precisely what it is.
> 
> And this isn't always a technical design problem. OpenStack, for instance,
> isn't really a micro service architecture. Of course, it might look like
> that in diagrams [2], but we all know it really isn't. The compute node is
> home to agents for every single concern, and the API interactions between
> the services are too tightly woven to consider many of them functional
> without the same lockstep version of other services together. A game to
> play is to ask yourself what would happen if a service was isolated on its
> own island: how functional would its API be, if at all? Is this something
> that we want? No. But there doesn't seem to be a place where we can go
> to actually design, discuss, debate, and ratify changes that would help
> us get to the point of gathering the necessary will and capability to
> enact these efforts.
> 
> Maybe nova-compute should be isolated from nova, with an API that
> nova, cinder and neutron talk to. Maybe we should make the scheduler
> cross-project aware and capable of scheduling more than just nova
> instances. Maybe we should have experimental groups that can look at how
> some of this functionality could perhaps be delegated to non-openstack
> projects. We hear about Mesos, for example, as something that could help
> with the scheduling aspects, but how do we discuss these ideas outside of
> hijacking threads on the mailing list? These are things that we all discuss
> in the hallways and bars and parties at the summit, but because they cross
> projects at the design level, and are inherently a lot of social and
> technical and exploratory work, many of us fear we never get to a place of turning
> our dreams into reality.
> 
> So, with that, I'd like to propose the creation of an Architecture Working
> Group. This group's charge would not be design by committee, but a place
> for architects to share their designs and gain support across projects
> to move forward with and ratify architectural decisions. That includes
> coordinating exploratory work that may become the basis of further
> architectural decisions for OpenStack. I 

Re: [openstack-dev] [all] Proposal: Architecture Working Group

2016-06-24 Thread Clint Byrum
Excerpts from Zhipeng Huang's message of 2016-06-24 18:15:30 +0200:
> Hi Clint and Amrith,
> 
> Are you guys already working on the proposal? Is there any public access
> to see the first draft?
> 

I've started writing something up, and I hope to submit it for review
next week.



Re: [openstack-dev] [all] Status of the OpenStack port to Python 3

2016-06-24 Thread Clint Byrum
Excerpts from Amrith Kumar's message of 2016-06-24 10:13:37 +:
> 
> > -Original Message-
> > From: Doug Hellmann [mailto:d...@doughellmann.com]
> > Sent: Thursday, June 23, 2016 5:16 PM
> > To: openstack-dev 
> > Subject: Re: [openstack-dev] [all] Status of the OpenStack port to Python 3
> > 
> [... snip ...]
> > 
> > Let's see what PTLs have to say about planning, but I think if not
> > Ocata then we'd want to set one for the P release. We're running
> > out of supported lifetime for Python 2.7.
> > 
> > Doug
> > 
> 
> Doug, I believe py27 will be supported till end of 2020 but distribution 
> vendors (os people) are not yet deploying py3 as the default.
> 
> Could someone share the best information on when we may see Python3 be the 
> default from the various os distribution providers? The date of 2020 for EOS 
> leads me to believe that we're good until about the U or V release (assuming 
> two per year) but I don't believe that's the correct way to plan for this, 
> yes?
> 

Fedora, Ubuntu, and Gentoo are already defaulting to python3 for all
of their OS tools, so you can have a fully functioning system without
python2.



Re: [openstack-dev] [all] what do you work on upstream of OpenStack?

2016-06-23 Thread Clint Byrum
Excerpts from Doug Hellmann's message of 2016-06-23 08:37:04 -0400:
> Excerpts from Doug Hellmann's message of 2016-06-13 15:11:17 -0400:
> > I'm trying to pull together some information about contributions
> > that OpenStack community members have made *upstream* of OpenStack,
> > via code, docs, bug reports, or anything else to dependencies that
> > we have.
> > 
> > If you've made a contribution of that sort, I would appreciate a
> > quick note.  Please reply off-list, there's no need to spam everyone,
> > and I'll post the summary if folks want to see it.
> > 
> > Thanks,
> > Doug
> > 
> 
> I've summarized the results of all of your responses (on and off
> list) on a blog post this morning [1]. I removed individual names
> because I was concentrating on the community as a whole, rather than
> individual contributions.
> 
> I'm sure there are projects not listed, either because I missed
> something in my summary or because someone didn't reply. Please feel
> free to leave a comment on the post with references to other projects.
> It's not necessary to link to specific commits or bugs or anything like
> that, unless there's something you would especially like to highlight.
> 
> Thank you for your input into the survey. I'm impressed with the
> breadth of the results. I'm very happy to know that our community,
> which so often seems to be focused on building new projects, also
> contributes to existing projects that we all rely on.
> 
> [1] 
> https://doughellmann.com/blog/2016/06/23/openstack-contributions-to-other-open-source-projects/

That is pretty cool.

I forgot to reply to your original request, but I did a lot of python3
porting on the pysaml2 project in support of Keystone's python3 port.

https://github.com/rohe/pysaml2/commits?author=SpamapS



Re: [openstack-dev] [all] Proposal: Architecture Working Group

2016-06-22 Thread Clint Byrum
Excerpts from Amrith Kumar's message of 2016-06-22 13:15:03 +:
> Clint,
> 
> In your original email, you proposed "So, with that, I'd like to propose the 
> creation of an Architecture Working Group. This group's charge would not be 
> design by committee, but a place for architects to share their designs and 
> gain support across projects to move forward with and ratify architectural 
> decisions."
> 
> I like parts of this, and parts of this trouble me. But, I volunteered to be 
> part of this activity because I have a real problem that this group could 
> help me solve and I bet there are others who have this problem as well.
> 
> As you say, there are often problems, questions, and challenges of an 
> architectural nature, that have a scope larger than a single project. I would 
> like there to be a forum whose primary focus is to provide an avenue where 
> these can be discussed. I would like it to be a place less noisy than "take 
> it to the ML" and be a place where one could pose these questions and 
> potentially discuss solutions that other projects have adopted.
> 
> The part I'm uncomfortable with is the implied decision-making authority of 
> "ratifying architectural decisions". To ratify implies the ability to make 
> official, the ability to "approve and sanction formally" and I ask whence 
> came this power and authority?
> 
> Who currently has this power and authority, and is that individual or group 
> delegating it to this working group?
> 

When I say ratify there, what I mean is that this group would have regular
members who work on the group. To ratify something, a majority of them
would at least agree that this was something worth the group's time, and
that the group should publish their architecture decisions publicly. The
membership, I think, should be voluntary, and the only requirement be
that members regularly participate in discussions and voting.

Formality is a useful tool here, which is the reason I chose the word
'ratify'. It asks that those who want to propose new ideas do so in an
efficient manner that doesn't make noise on the mailing list and force
everyone to come up with an opinion on the spot or forever lose the
idea. We get a log of proposals, objections, and reasoning, to go along
with our log of successes and failures in taking the proposals to reality.

But the only power this group wields is over itself. Of course,
collaboration with the project teams is _critical_ for the success
of these proposals. And if we succeed in improving some projects, but
others resist, then it's up to those projects that have been improved
to support us pushing forward or not.

This isn't all that different than the way Oslo specs and OpenStack
specs work now. It's just that we'll have a group that organizes the
efforts and keeps pressure on them.

> While this ML thread is very interesting, it is also beginning to fragment 
> and I would like to propose a spec in Gerrit with a draft charter for this 
> working group and a review there.
> 

You're spot on Amrith. I've been noodling on a mission statement and
was going to bring it up at next week's TC meeting, but we don't have to
wait for that. Let's draft a charter now. Any suggestions on where that
should be submitted? openstack-specs I suppose? Governance?



Re: [openstack-dev] [nova] ability to set metadata on instances (but config drive is not updated)

2016-06-21 Thread Clint Byrum
Excerpts from Joshua Harlow's message of 2016-06-21 15:12:09 -0700:
> Agreed, it appears supported right-now (whether intentional or not),
> 
> So the question at that point is what can we do to make it better...
> 
> I think we all agree that the config-drive probably shouldn't have the 
> equivalent of the metadata service in it; because if the metadata 
> service can change over time, and the config-drive can't, then it's a bad 
> equivalent and that should be rectified, either by making it an 
> equivalent or making it be accepted that it is not, and ideally trimming 
> the data in it to not confuse people (ie by providing static networking 
> configuration only, and leaving the items that can be dynamic to the 
> dynamic metadata service).
> 
> Of course this whole mess IMHO gets into 'why don't we just have an 
> agreed-upon agent that can be used'; because that's what the metadata 
> service is starting to become (a crappy version of an agent); but I digress.
> 

Not sure I agree with your initial stipulation. It shouldn't be a
surprise, at all, to any user, that an HTTP service is updated live,
while an ISO attached to an instance is kept static.
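
From inside a guest the distinction is easy to see: the config drive is a
filesystem image generated once, while the metadata service is plain HTTP
that can be re-read whenever the guest likes. A minimal sketch of the latter
(the link-local URL is the standard OpenStack metadata endpoint; error
handling and retries are omitted for brevity):

    import json
    import urllib.request

    # Well-known metadata endpoint reachable from inside an instance.
    URL = 'http://169.254.169.254/openstack/latest/meta_data.json'

    def fetch_metadata():
        # Each call returns the current state, so properties updated via
        # the API after boot show up here, unlike on the static ISO.
        with urllib.request.urlopen(URL, timeout=5) as resp:
            return json.loads(resp.read().decode('utf-8'))

    if __name__ == '__main__':
        md = fetch_metadata()
        print(md.get('uuid'), md.get('meta', {}))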



Re: [openstack-dev] [diskimage-builder] ERROR: embedding is not possible, but this is required for cross-disk install

2016-06-21 Thread Clint Byrum
Excerpts from Jay Pipes's message of 2016-06-21 14:20:07 -0400:
> On 06/21/2016 02:01 PM, Andre Florath wrote:
> > Hello Jay,
> >
> > Yes - the partition alignment is a problem:
> > grub2 needs at least 63 blocks between the MBR and the first
> > partition.  Here for you the partition directly starts at block 1,
> > therefore grub has no way to put its data on the disk.
> >
> > The root cause is, that all the partitioning tools I found (like
> > parted or sfdisk) make some 'optimization': they do not what you
> > state but what they think you want. (And I have the impression that
> > their 'thinking' includes the moon phases and the biorhythm of the
> > user :-) .)
> >
> > Example in your case: sfdisk called with '1 - - *' creates on Ubuntu
> > Xenial a partition starting from block 1. On Debian 8.4 (my
> > development machine) it creates a partition starting from 2087.  This
> > gives some room for grub, but it horrible when it comes to alignment.
> >
> > Some possible workarounds:
> > o Use another host for creating the Ubuntu VM
> >(and hope, that sfdisk behaves 'better'.)
> > o Use a more recent version of diskimage-builder:
> >some time ago 'sfdisk' was replaced by 'parted'
> >(and hope, that 'parted' does a 'better' job for you).
> > o Under Ubuntu Xenial execute with 1.0.0 installed:
> >$ sudo vi 
> > /usr/share/diskimage-builder/elements/vm/block-device.d/10-partition
> >In line 23 replace
> >   1 - - *
> >with
> >   2048 - - *
> >(Note that this really only works on Ubuntu Xenial.)
> >
> > Hope this works and helps - was not able to test these things.
> >
> > If you are interested in some more background information:
> > I stumbled over the mostly random behavior of these tools last week.
> > One aspect is, that they optimize things for the current system (e.g.
> > reading some kernel parameters; especially IO buffer sizes).  These
> > sizes can be completely different on the target system - which might
> > lead to very poor disk performance.
> > During the last days I reworked my patch (which originally used
> > parted) to directly write the needed info to the boot records. [1]
> > More details can be found in the comments of MBR.py [2].
> >
> > Kind regards
> >
> > Andre
> 
> Thanks for the great information and help, all! I upgraded dib to 1.17, 
> re-ran the same command et voila, problem solved. :)
> 
> So, looks like parted vs. sfdisk is indeed the issue, and is solved in 
> modern dib versions. (Time to update Xenial's PPA for dib? ;)
> 

Just looks like it hasn't been uploaded in a while:

https://tracker.debian.org/pkg/python-diskimage-builder

Once 1.17+ gets into yakkety (The future Ubuntu 16.10) we can propose it
for xenial-backports.
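
As a side note, the arithmetic behind the 2048 start sector discussed above
is easy to check by hand; a quick illustrative calculation (nothing here is
dib-specific):

    SECTOR_SIZE = 512   # bytes per logical sector
    GRUB_GAP = 63       # sectors grub2 needs after the MBR, per Andre's note

    start = 2048        # suggested first-partition start sector

    # 2048 * 512 bytes puts the partition on a 1 MiB boundary, which also
    # satisfies typical 4K physical-sector alignment.
    assert start * SECTOR_SIZE == 1024 * 1024

    # And it leaves far more than the minimum gap grub2 needs for its core image.
    assert start - 1 >= GRUB_GAP

    print('first partition starts at byte %d' % (start * SECTOR_SIZE))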



Re: [openstack-dev] [all] Proposal: Architecture Working Group

2016-06-21 Thread Clint Byrum
Excerpts from Jay Pipes's message of 2016-06-21 12:47:46 -0400:
> On 06/21/2016 04:25 AM, Chris Dent wrote:
> > However, I worry deeply that it could become astronauts with finger
> > paints.
> 
> Yes. This.
> 
> I will happily take software design suggestions from people that 
> demonstrate with code and benchmarks that their suggestion actually 
> works outside of the theoretical/academic/MartinFowler landscape and 
> actually solves a real problem better than existing code.
> 

So, I want to be careful not to descend too far into reductionism.
Nobody is suggesting that an architecture WG goes off into the corner
and applies theory to problems without practical application.

However, some things aren't measured in how fast, or even how reliable
a piece of code is. Some things are measured in how understandable the
actual code and deployment is.

If we architect a DLM solution, refactor out all of the one-off DLM-esque
things, and performance and measured reliability stay flat, did we fail?
What about when the locks break, and an operator has _one_ way to fix
locks, instead of 5? How do we measure that?

So I think what I want to focus on is: We have to have some real basis
on which to agree that an architecture that is selected is worth the
effort to refactor things around it. But I can't promise that every
problem has an objectively measurable solution. Part of the point of
having a working group is to develop a discipline for evaluating hard
problems like that.



Re: [openstack-dev] [all] Proposal: Architecture Working Group

2016-06-21 Thread Clint Byrum
Excerpts from Jay Pipes's message of 2016-06-21 13:29:32 -0400:
> On 06/21/2016 12:53 PM, Doug Wiegley wrote:
> > Don’t get me wrong, I welcome this initiative. I find it mildly
> > disconcerting that the role I thought we were electing folks to fill will
> > instead be filled by others, but the vacuum does need
> > to be filled, and I thank Clint for stepping up.
> 
> I don't particularly think it's a vacuum that can be filled by a small 
> group of "architects". My experience is that unless said architects are 
> actually involved in building the software at hand and have a good 
> understanding of why certain design choices were originally made in the 
> various projects, the end deliverable of such a group tends to be the 
> software equivalent of silly putty -- fun to play with but in the end, 
> relatively free of practical value.
> 

I so agree with this. I don't want this group to turn into an ivory
tower shaped fountain of suggestions. That is my nightmare actually!

What I want it to be is a group that facilitates the people who build
the components of the system coming together to collaborate on large
refactoring efforts and improving our ability to design the system
together, rather than in silos.



Re: [openstack-dev] [all] Proposal: Architecture Working Group

2016-06-21 Thread Clint Byrum
Excerpts from Ian Cordasco's message of 2016-06-21 11:12:40 -0500:
>  
> 
> -Original Message-
> From: Michael Krotscheck <krotsch...@gmail.com>
> Reply: OpenStack Development Mailing List (not for usage questions) 
> <openstack-dev@lists.openstack.org>
> Date: June 21, 2016 at 10:18:25
> To: OpenStack Development Mailing List (not for usage questions) 
> <openstack-dev@lists.openstack.org>
> Subject:  Re: [openstack-dev] [all] Proposal: Architecture Working Group
> 
> > On Mon, Jun 20, 2016 at 10:59 AM Clint Byrum wrote:
> >  
> > >
> > > As you should be, and we all must be. It's not going to happen if we
> > > just dream it. That's kind of the point. Let's write down a design _for
> > > the group that writes down designs_.
> > >
> >  
> > If I had any confidence that this group would have a significant impact on
> > making OpenStack more consistent, easy to work on, and easier to build apps
> > against, I'd be happy to help out.
> >  
> > The only thing that would give me that confidence is if the WG charter had
> > a TC-enforced section of: "If you do not implement this *thing* within two
> > cycles, your project will get kicked out of OpenStack".
> 
> I don't think that will or *should* ever happen. The documents produced by 
> this WG, to me, would be the set of best practices that should be aimed for, 
> not mandatory. Forcing someone to refactor some complex piece of architecture 
> in their project in < 1 year so the project can remain part of OpenStack 
> seems like someone begging for horrible bugs and regressions in critical 
> behaviour.
> 
> This is worse than saying "All projects should stop disabling hacking rules 
> in two cycles or they'll stop receiving OpenStack Infra resources for 
> testing." or "All projects need to write new versions of their API just to 
> conform with the API WG."
> 

Just to be clear, I'm thinking more "This is how a thing should work",
not "best practices". The difference being that we'd write down something
like " message busses should look like X, Y, or Z based on factors a, b,
and/or c", and then we'd actually go fix or document the divergent cases
in OpenStack.

A best practices group would not actually fix anything IMO. It has to be
a group of action. Of course, we'll ask for help from any project teams
that are involved, and coordinate on things like release schedules so
we don't complicate short term plans. But I don't want this to just be a
standards body. I want this to be the group that gets the ball rolling,
and then helps it keep rolling.

There's no such thing as a perfect system, so we need to work toward
a process that produces _good_ results. Enforcement without merit will
just drive a wedge into that process.



Re: [openstack-dev] [all] Proposal: Architecture Working Group

2016-06-21 Thread Clint Byrum
Excerpts from Chris Dent's message of 2016-06-21 09:25:44 +0100:
> On Mon, 20 Jun 2016, Doug Wiegley wrote:
> 
> > So, it sounds like you’ve just described the job of the TC. And they
> > have so far refused to define OpenStack, leading to a series of
> > derivative decisions that seem … inconsistent over time.
> 
> Thanks for writing down what I was thinking. I agree that OpenStack
> needs some architectural vision, direction, leadership, call it what
> you will. Every time I've voted for the _Technical_ Committee that
> leadership is what I've wanted my vote to be creating.
> 

I think that's still part of it. The difference is that the TC is using
their expertise to resolve conflict and ratify decisions, while a working
group is creating a source of data and taking actions that hopefully
prevent the conflict from ever arising.

> It may be that an architecture working group can provide some
> guidance that people will find useful. Against the odds I think
> those of us in the API-WG have actually managed to have a positive
> influence. We've not shaken things down to the foundations from
> which a great and glorious future may be born -- a lot of compromises
> have been made and not everybody wants to play along -- but things
> are going in the right direction, for some people, in some projects.
> Maybe a similar thing can happen with architecture.
> 

I definitely saw what was happening with the API-WG and wanted to have a
similar effect on the design process.

> However, I worry deeply that it could become astronauts with finger
> paints. In the API working group at least we have the HTTP RFCs as
> foundational sources of authority to guide us. In something so
> fraught with opinion what are the sources of authority?
> 
> I was pointed at this a while ago
> 
>  https://wiki.openstack.org/wiki/BasicDesignTenets
> 
> It's full of lots of great rules that are frequently broken.
> 


This is a great point, Chris. I definitely think we need to have some
authority to fall back on. The link above is a great start. I'd like to
invite others to think on that and share their sources of authority for
distributed system design.



Re: [openstack-dev] [nova] Placement API WSGI code -- let's just use flask

2016-06-21 Thread Clint Byrum
Excerpts from Sean Dague's message of 2016-06-21 08:00:50 -0400:
> On 06/21/2016 07:39 AM, Jay Pipes wrote:
> > On 06/21/2016 05:43 AM, Sylvain Bauza wrote:
> >> Le 21/06/2016 10:04, Chris Dent a écrit :
> >>> On Mon, 20 Jun 2016, Jay Pipes wrote:
> >>>
>  Flask seems to be the most widely used and known WSGI framework so
>  for consistency's sake, I'm recommending we just use it and not rock
>  this boat. There are more important things to get hung up on than
>  this battle right now.
> >>>
> >>> That seems perfectly reasonable. My main goal in starting the
> >>> discussion was to ensure that we reach some kind of consensus,
> >>> whatever it might be[1]. It won't be too much of an ordeal to
> >>> turn the existing pure WSGI stuff into Flask stuff.
> >>>
> >>> From my standpoint doing the initial development in straight WSGI
> >>> was a win as it allowed for a lot of clarity from the inside out.
> >>> Now that that development has shown the shape of the API we can
> >>> do what we need to do to make it clear from outside in.
> >>>
> >>> Next question: There's some support for not using Paste and
> >>> paste.ini. Is anyone opposed to that?
> >>>
> >>
> >> Given Flask is not something we support yet in Nova, could we discuss
> >> that during either a Nova meeting, or maybe wait for the midcycle?
> > 
> > I really don't want to wait for the mid-cycle. Happy to discuss in the
> > Nova meeting, but my preference is to have Chris just modify his patch
> > series to use Flask now and review it.
> > 
> >> To be honest, Chris and you were saying that you don't like Flask, and
> >> I somewhat agree with you. Why is it now a good possibility?
> > 
> > Because Doug persuaded me that the benefits of being consistent with
> > what the community is using outweigh my (and Chris') personal misgivings
> > about the particular framework.
> 
> Just to be clear
> 
> http://codesearch.openstack.org/?q=Flask%3E%3D0.10=nope==
> 
> Flask is used by 2 (relatively new) projects in OpenStack
> 
> If we look at the iaas base layer:
> 
> Keystone - custom WSGI with Routes / Paste
> Glance - WSME + Routes / Paste
> Cinder - custom WSGI with Routes / Paste
> Neutron - pecan + Routes / Paste
> Nova - custom WSGI with Routes / Paste
> 

When I see "custom WSGI" I have a few thoughts:

* custom == special snowflake. But REST APIs aren't exactly novel.

* If using a framework means not writing or cargo culting any custom
WSGI code, that seems like a win for maintainability from the get go.

* If using a framework means handling errors more consistently, that
seems like a win for operators (a small sketch of that follows below).

* I don't have a grasp on how much custom WSGI code is actually
involved. That would help us all evaluate the meaning of the statements
above (both yours, and mine).
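
To make that error-handling point concrete, here is a minimal sketch (not
the placement code; the route name is invented for illustration) of how a
framework centralizes the error path that custom WSGI code tends to repeat:

# Minimal sketch only: one JSON error handler for the whole service,
# instead of per-handler WSGI error plumbing. The route is hypothetical.
from flask import Flask, jsonify

app = Flask(__name__)


@app.errorhandler(404)
def not_found(error):
    # Every unknown URL gets the same JSON error shape.
    return jsonify({'error': 'not found'}), 404


@app.route('/resource_providers')  # invented route, for illustration
def list_resource_providers():
    return jsonify({'resource_providers': []})


if __name__ == '__main__':
    app.run(port=8080)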



Re: [openstack-dev] [nova] Placement API WSGI code -- let's just use flask

2016-06-21 Thread Clint Byrum
Excerpts from Sean Dague's message of 2016-06-21 09:10:00 -0400:
> The amount of wsgi glue above Routes / Paste is pretty minimal (after
> you get rid of all the extensions facilities).
> 
> Templating and Session handling are things we don't need. We're not a
> webapp, we're a REST service. Saying that using a web app framework is
> better than a little bit of wsgi glue seems weird to me.
> 

Actually we do have sessions. We just call them "tokens".



Re: [openstack-dev] Version header for OpenStack microversion support

2016-06-21 Thread Clint Byrum
Excerpts from Edward Leafe's message of 2016-06-20 20:41:56 -0500:
> On Jun 18, 2016, at 9:03 AM, Clint Byrum <cl...@fewbar.com> wrote:
> 
> > Whatever API version is used behind the compute API is none of the user's
> > business.
> 
> Actually, yeah, it is.
> 
> If I write an app or a tool that expects to send information in a certain 
> format, and receive responses in a certain format, I don't want that to 
> change when the cloud operator upgrades their system. I only want things to 
> change when I specifically request that they change by specifying a new 
> microversion.
> 

The things I get back in the compute API are the purview of the compute
API, and nothing else.

Before we go too far down this road, is there actually an example of
one API providing a proxy to another directly? If so, is it something
we think is actually a good idea?

Because otherwise, the API I'm talking to needs to be clear about what
it does and does not emit and/or accept. That contract would just be
the microversion of the API I'm talking to.



Re: [openstack-dev] [diskimage-builder] ERROR: embedding is not possible, but this is required for cross-disk install

2016-06-20 Thread Clint Byrum
Excerpts from Gregory Haynes's message of 2016-06-20 17:24:28 -0500:
> On Mon, Jun 20, 2016, at 03:52 PM, Jay Pipes wrote:
> > Hi dib-gurus,
> > 
> > I'm trying to build a simple ubuntu VM image on a local Gigabyte BRIX 
> > with a AMD A8-5557M APU with Ubuntu 16.04 installed and getting an odd 
> > error. Hoping someone has some ideas...
> > 
> > The command I am running is:
> > 
> > disk-image-create -o /tmp/ubuntu.qcow2 --image-size=10 ubuntu vm
> > 
> > Everything goes smoothly until trying to write the MBR, at which point I 
> > get the following error:
> > 
> > + /usr/sbin/grub-install '--modules=biosdisk part_msdos' 
> > --target=i386-pc /dev/loop0
> > Installing for i386-pc platform.
> > /usr/sbin/grub-install: warning: this msdos-style partition label has no 
> > post-MBR gap; embedding won't be possible.
> > /usr/sbin/grub-install: error: embedding is not possible, but this is 
> > required for cross-disk install.
> > /dev/loop0: [0047]:3 (/tmp/image.hk8wiFJe/image.raw)
> > 
> > Anybody got any ideas?
> > 
> > Thanks in advance!
> > -jay
> 
> Hey Jay,
> 
> I just tried to reproduce this on my 14.04 box and wasn't able to so I
> am betting there's some kind of new bug with us on 16.04. Do you get the
> same error if you run without --image-size=10? Last time we had an issue
> like this a new grub version changed the default behavior, so I'd
> suspect something along those lines.
> 
> I am trying out a new run on a 16.04 box but its going to be a bit
> before the cloud image downloads (cloud-image.ubuntu.com is pretty
> slow)...

I just completed a run on 16.04, so it's not that.

I wonder if it might be sensitive to disk geometry accidentally.



Re: [openstack-dev] [diskimage-builder] ERROR: embedding is not possible, but this is required for cross-disk install

2016-06-20 Thread Clint Byrum
Ahh derp, that's not an ARM CPU. I read "A8" and "APU" and my brain
immediately leapt to that. Ignore me.

Excerpts from Clint Byrum's message of 2016-06-20 15:12:23 -0700:
> Excerpts from Jay Pipes's message of 2016-06-20 16:52:38 -0400:
> > Hi dib-gurus,
> > 
> > I'm trying to build a simple ubuntu VM image on a local Gigabyte BRIX 
> > with a AMD A8-5557M APU with Ubuntu 16.04 installed and getting an odd 
> > error. Hoping someone has some ideas...
> > 
> > The command I am running is:
> > 
> > disk-image-create -o /tmp/ubuntu.qcow2 --image-size=10 ubuntu vm
> > 
> > Everything goes smoothly until trying to write the MBR, at which point I 
> > get the following error:
> > 
> > + /usr/sbin/grub-install '--modules=biosdisk part_msdos' 
> > --target=i386-pc /dev/loop0
> > Installing for i386-pc platform.
> > /usr/sbin/grub-install: warning: this msdos-style partition label has no 
> > post-MBR gap; embedding won't be possible.
> > /usr/sbin/grub-install: error: embedding is not possible, but this is 
> > required for cross-disk install.
> > /dev/loop0: [0047]:3 (/tmp/image.hk8wiFJe/image.raw)
> 
> I think you found a bug. The ARCH should be armhf, but
> elements/bootloader/finalise.d/50-bootloader doesn't know what to do with
> armhf, so it falls back to x86_64/amd64. Also, grub-pc is the package
> that bootloader claims to need in its pkg-map file, but you really need
> grub-efi-arm on arm boxes.
> 
> We may need to add the ability for pkg-maps to differentiate based on
> ARCH.
> 
> If you intended to build an amd64 image, you need to set ARCH=amd64
> before running disk-image-create, so it won't detect it by running
> 'dpkg --print-architecture'.
> 



Re: [openstack-dev] [diskimage-builder] ERROR: embedding is not possible, but this is required for cross-disk install

2016-06-20 Thread Clint Byrum
Excerpts from Jay Pipes's message of 2016-06-20 16:52:38 -0400:
> Hi dib-gurus,
> 
> I'm trying to build a simple ubuntu VM image on a local Gigabyte BRIX 
> with a AMD A8-5557M APU with Ubuntu 16.04 installed and getting an odd 
> error. Hoping someone has some ideas...
> 
> The command I am running is:
> 
> disk-image-create -o /tmp/ubuntu.qcow2 --image-size=10 ubuntu vm
> 
> Everything goes smoothly until trying to write the MBR, at which point I 
> get the following error:
> 
> + /usr/sbin/grub-install '--modules=biosdisk part_msdos' 
> --target=i386-pc /dev/loop0
> Installing for i386-pc platform.
> /usr/sbin/grub-install: warning: this msdos-style partition label has no 
> post-MBR gap; embedding won't be possible.
> /usr/sbin/grub-install: error: embedding is not possible, but this is 
> required for cross-disk install.
> /dev/loop0: [0047]:3 (/tmp/image.hk8wiFJe/image.raw)

I think you found a bug. The ARCH should be armhf, but
elements/bootloader/finalise.d/50-bootloader doesn't know what to do with
armhf, so it falls back to x86_64/amd64. Also, grub-pc is the package
that bootloader claims to need in its pkg-map file, but you really need
grub-efi-arm on arm boxes.

We may need to add the ability for pkg-maps to differentiate based on
ARCH.

If you intended to build an amd64 image, you need to set ARCH=amd64
before running disk-image-create, so it won't detect it by running
'dpkg --print-architecture'.



Re: [openstack-dev] [all] Proposal: Architecture Working Group

2016-06-20 Thread Clint Byrum
Excerpts from Joshua Harlow's message of 2016-06-17 15:33:25 -0700:
> Thanks for getting this started Clint,
> 
> I'm happy and excited to be involved in helping try to guide the whole 
> ecosystem together (it's also why I like being in oslo) to a 
> architecture that is more cohesive (and is more of something that we can 
> say to our current or future children that we were all involved and 
> proud to be involved in creating/maturing...).
> 
> At a start, for said first meeting, any kind of agenda come to mind, or 
> will it be more a informal gathering to start (either is fine with me)?
> 

I've been hesitant to fill this in too much as I'm still forming the
idea, but here are the items I think are most compelling to begin with:

* DLMs across OpenStack -- This is already under way[1], but it seems to
  have fizzled out. IMO that is because there's no working group who
  owns it. We need to actually write some plans (a brief tooz sketch
  follows this list as an illustration).

* Messaging patterns -- There was a recent discussion about nuances in
  oslo.messaging's implementation that vary driver to driver. I'd like to
  make sure we have all of the ways messaging is used written down and
  make sure groups have guidance on how each one should (or shouldn't)
  be used, including documentation of anti-patterns, and a plan for
  simplifying communication between processes in OpenStack where
  possible.

* True Microservices -- OpenStack's services are heavily
  interdependent. It makes it hard for an engineer to make an impact
  without becoming an expert on all of them, and it also leads to a heavy
  burden on operators and QA as they end up having to debug the system
  as a whole. We should write down how the system is intertwined now, and
  develop plans for unwinding that and allowing each service to stand on
  its own, including having stronger testing in isolation. (This includes
  exploring the thought that nova-compute could become its own project
  that all of the other things communicate with over well defined APIs).
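
To make the DLM item concrete: the spec in [1] points at tooz as the
common abstraction, so a minimal sketch, with the backend URL, member id,
and lock name all invented for illustration, looks roughly like this:

# Minimal tooz sketch for the DLM agenda item above. The backend URL,
# member id, and lock name are illustrative assumptions, not settings
# from any real deployment.
from tooz import coordination

coordinator = coordination.get_coordinator(
    'memcached://127.0.0.1:11211', b'example-member-1')
coordinator.start()

lock = coordinator.get_lock(b'example-resource-lock')
with lock:
    # Only one member across the cluster runs this block at a time.
    pass

coordinator.stop()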

These could keep a small group of architects and engineers busy for a
year or more. I'm sure others have things they'd like to design and
improve in OpenStack at this level.

Once we have broad agreement on such a group, and perhaps some guidance
from the TC, we can propose an agenda and prioritize efforts as part of
the first few meetings.

[1] 
http://specs.openstack.org/openstack/openstack-specs/specs/chronicles-of-a-dlm.html
> -Josh
> 
> Clint Byrum wrote:
> > ar·chi·tec·ture
> > ˈärkəˌtek(t)SHər/
> > noun
> > noun: architecture
> >
> >  1.
> >
> >  the art or practice of designing and constructing buildings.
> >
> >  synonyms: building design, building style, planning, building,
> > construction;
> >
> >  formal: architectonics
> >
> >  "modern architecture"
> >
> >  the style in which a building is designed or constructed, especially 
> > with regard to a specific period, place, or culture.
> >
> >  plural noun: architectures
> >
> >  "Victorian architecture"
> >
> >  2.
> >
> >  the complex or carefully designed structure of something.
> >
> >  "the chemical architecture of the human brain"
> >
> >  the conceptual structure and logical organization of a computer or 
> > computer-based system.
> >
> >  "a client/server architecture"
> >
> >  synonyms: structure, construction, organization, layout, design, build,
> > anatomy, makeup;
> >
> >  informal: setup
> >
> >  "the architecture of a computer system"
> >
> >
> > Introduction
> > ============
> >
> > OpenStack is a big system. We have debated what it actually is [1],
> > and there are even t-shirts to poke fun at the fact that we don't have
> > good answers.
> >
> > But this isn't what any of us wants. We'd like to be able to point
> > at something and proudly tell people "This is what we designed and
> > implemented."
> >
> > And for each individual project, that is a possibility. Neutron can
> > tell you they designed how their agents and drivers work. Nova can
> > tell you that they designed the way conductors handle communication
> > with API nodes and compute nodes. But when we start talking about how
> > they interact with each other, it's clearly just a coincidental mash of
> > de-facto standards and specs that don't help anyone make decisions when
> > refactoring or adding on to the system.
> >
> > Oslo and cross-project initiatives have brought some peace and order
> > to the implementation and engineering processes, but not to the design

Re: [openstack-dev] [all] Proposal: Architecture Working Group

2016-06-20 Thread Clint Byrum
Excerpts from Jesse Cook's message of 2016-06-20 16:58:48 +:
> +1
> 
> The points about the PWG and TC are worth some consideration.
> 
> From my perspective, I think it would make sense for the PWG to define the
> expected behaviors of the system, which would be an input to the
> architecture group. The architecture group would define both prescriptive
> (where we'd like to be) and descriptive (where we actually are...roughly)
> architectures. This would provide both the vision for the future state and
> understanding of current state that is necessary for us to all swim in the
> same general direction instead of constantly running into each other. I
> don't see the architecture as something you push down, but rather something
> that helps each contributor ask, "Does that get us closer to where we are
> trying to go?" I absolutely think this is something that would provide a
> huge benefit to the organization.
> 

That sounds about right Jesse. Thanks for bringing up the PWG. I
definitely don't think an Architecture WG would want to answer "what
is OpenStack" alone. More like "What should the OpenStack we described
actually look like?".



Re: [openstack-dev] [all] Proposal: Architecture Working Group

2016-06-20 Thread Clint Byrum
Excerpts from Michael Krotscheck's message of 2016-06-20 15:26:20 +:
> I like the idea in principle, but am bearish on the implementation.
> 

As you should be, and we all must be. It's not going to happen if we
just dream it. That's kind of the point. Let's write down a design _for
the group that writes down designs_.

> For example: we have the API-WG which fulfills part of an Architectural
> mission, as well as the Cross-Project WG which fulfills a different part.
> Yet there's no incentive, carrot or stick, that drives adoption of the
> approved specs, other than "Well if you want to do the work, have fun". In
> several cases, I've run into the excuses of "That's Hard/That breaks
> backwards compatibility/That's not profitable/I'm not paid to do that".
> 

If we write a bunch of specs which are too much of a burden to implement,
we're doing it wrong. The idea isn't to rip everything out.  It's to
acknowledge where there is a complete lack of design, write one that
is practical, and organize work to get it done.

> What's going to prevent [Insert Project Here] from coming along and saying
> "Oh, well, we don't like those decisions, so are going to ignore them."
> What provides the incentive for a project to adopt these recommendations?
> 

Absolutely nothing will prevent that, and I don't think putting any kind
of hard requirements for following this group's designs would be
productive until it has proven that it can actually improve OpenStack.
Perhaps if we do succeed in designing parts of the system, and some
teams find that useful, we can look at adding some of the parts of the
design to our more stringent requirements (like "use python or C"). But
that's not something I'd seek from day 1. That's just a recipe for
revolt.



Re: [openstack-dev] [all] Proposal: Architecture Working Group

2016-06-20 Thread Clint Byrum
Excerpts from Doug Wiegley's message of 2016-06-20 10:40:56 -0600:
> So, it sounds like you’ve just described the job of the TC. And they have so 
> far refused to define OpenStack, leading to a series of derivative decisions 
> that seem … inconsistent over time.
> 
> How is this body going to be different?
> 
> How will it have any teeth, and not just end up with the standard entrenched 
> projects ignoring it?
> 

Hi Doug, thanks for your candid reply. I understand that there is a
concern and I want to address it. However, I feel like answering this
directly will start this working group out on the wrong foot.

We shouldn't need teeth. This isn't an adversarial working group that
will be challenging engineers any more than an architect of a building
challenges the builders while they work. An architect that ignores their
engineers is not going to complete many projects. Of course, engineers
may disagree on the cost of an architectural decision. But to disagree,
first we need to actually _make_ a decision on a design.

The goal of this group would be to provide a detailed architecture and
plans for the way the system should work and fit together as a whole. Like
any complex system, during implementation, things that architects weren't
aware of come to light. Something that seems simple turns out to be
complex. Something that seemed absolutely necessary can be factored out.

Nobody is suggesting designing OpenStack from the ground up, just that
where there isn't an agreed upon design, let's write down how the system
works now, and then make a design and a plan to actually improve it.

Engineers have no effective place to turn to right now when there is
a lack of design. The TC could of course do it, but what I want to do
is have a more open and free-flowing group that are laser focused on
providing support for the design of the system. I want to work out with
the community at large how we add weight to the designs we choose, and
one good option is for the Architecture Working Group to make proposals
to the openstack-specs repo, which the TC would ultimately approve.
That's not a new process, we already have it:

http://docs.openstack.org/project-team-guide/cross-project.html

I'm just suggesting a group that actually _focuses_ on the design
aspects of that process.

Without this, we are designing in real time and taking the shortest path
to achieve short term goals. This has positive and negative effects. I
think we've reached a point in OpenStack's evolution where the positive
effects of that are mostly realized, and now we should work to pay
down some of the negative effects by adopting some designs and beginning
refactoring of the system. It's not a fast process, so the longer we wait,
the longer we pay interest on that debt.



Re: [openstack-dev] Version header for OpenStack microversion support

2016-06-18 Thread Clint Byrum
Excerpts from Henry Nash's message of 2016-06-18 13:14:17 +0100:
> > On 18 Jun 2016, at 11:32, Jamie Lennox  wrote:
> > 
> > Quick question: why do we need the service type or name in there? You 
> > really should know what API you're talking to already and it's just 
> > something that makes it more difficult to handle all the different APIs in 
> > a common way.
> >

< moved question to be inline for readability>

> …I think it is so you can have a header in a request that, once issued, can 
> be passed for service to service, e.g.:
> 
> OpenStack-API-Version: identity 3.7, compute 2.11
>

Whatever API version is used behind the compute API is none of the user's
business. Nova will support whatever identity API versions it supports,
and if you've passed it something that isn't compatible with the APIs it
knows how to speak, asking for it to support a version that it doesn't
isn't going to change the fact that this is an impossible request already.

That said, I kind of like the idea of specifying the name there for the
one API you are speaking, just in case something goes awry and you try
to send an identity request to compute, it's more clear that this is the
wrong API you're talking to. It's not like it's hard to split a string
on white space.
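
For what it's worth, that parsing really is trivial; here's a rough sketch
(illustration only, not code from any SDK) of handling the header format
in Henry's example:

# Rough sketch of parsing the proposed header format from the example
# above ("OpenStack-API-Version: identity 3.7, compute 2.11").
def parse_api_version_header(value):
    versions = {}
    for entry in value.split(','):
        service, _, version = entry.strip().partition(' ')
        versions[service] = version
    return versions


print(parse_api_version_header('identity 3.7, compute 2.11'))
# {'identity': '3.7', 'compute': '2.11'}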



[openstack-dev] [all] Proposal: Architecture Working Group

2016-06-17 Thread Clint Byrum
ar·chi·tec·ture
ˈärkəˌtek(t)SHər/
noun
noun: architecture

1. 

the art or practice of designing and constructing buildings.

synonyms: building design, building style, planning, building, construction;

formal: architectonics

"modern architecture"

the style in which a building is designed or constructed, especially with 
regard to a specific period, place, or culture.

plural noun: architectures

"Victorian architecture"

2. 

the complex or carefully designed structure of something.

"the chemical architecture of the human brain"

the conceptual structure and logical organization of a computer or 
computer-based system.

"a client/server architecture"

synonyms: structure, construction, organization, layout, design, build,
anatomy, makeup;
anatomy, makeup; 

informal: setup

"the architecture of a computer system"


Introduction
============

OpenStack is a big system. We have debated what it actually is [1],
and there are even t-shirts to poke fun at the fact that we don't have
good answers.

But this isn't what any of us wants. We'd like to be able to point
at something and proudly tell people "This is what we designed and
implemented."

And for each individual project, that is a possibility. Neutron can
tell you they designed how their agents and drivers work. Nova can
tell you that they designed the way conductors handle communication
with API nodes and compute nodes. But when we start talking about how
they interact with each other, it's clearly just a coincidental mash of
de-facto standards and specs that don't help anyone make decisions when
refactoring or adding on to the system.

Oslo and cross-project initiatives have brought some peace and order
to the implementation and engineering processes, but not to the design
process. New ideas still start largely in the project where they are
needed most, and often conflict with similar decisions and ideas in other
projects [dlm, taskflow, tooz, service discovery, state machines, glance
tasks, messaging patterns, database patterns, etc. etc.]. Often times this
creates a log jam where none of the projects adopt a solution that would
align with others. Most of the time when things finally come to a head
these things get done in a piecemeal fashion, where it's half done here,
1/3 over there, 1/4 there, and 3/4 over there..., which to the outside
looks like chaos, because that's precisely what it is.

And this isn't always a technical design problem. OpenStack, for instance,
isn't really a micro service architecture. Of course, it might look like
that in diagrams [2], but we all know it really isn't. The compute node is
home to agents for every single concern, and the API interactions between
the services is too tightly woven to consider many of them functional
without the same lockstep version of other services together. A game to
play is ask yourself what would happen if a service was isolated on its
own island, how functional would its API be, if at all. Is this something
that we want? No. But there doesn't seem to be a place where we can go
to actually design, discuss, debate, and ratify changes that would help
us get to the point of gathering the necessary will and capability to
enact these efforts.

Maybe nova-compute should be isolated from nova, with an API that
nova, cinder and neutron talk to. Maybe we should make the scheduler
cross-project aware and capable of scheduling more than just nova
instances. Maybe we should have experimental groups that can look at how
some of this functionality could perhaps be delegated to non-openstack
projects. We hear that Mesos, for example, could help with the scheduling
aspects, but how do we discuss ideas like that outside of hijacking threads
on the mailing list? These are things that we all discuss in the hallways
and bars and parties at the summit, but because they cross projects at
the design level, and are inherently a lot of social and technical and
exploratory work, many of us fear we never get to a place of turning
our dreams into reality.

So, with that, I'd like to propose the creation of an Architecture Working
Group. This group's charge would not be design by committee, but a place
for architects to share their designs and gain support across projects
to move forward with and ratify architectural decisions. That includes
coordinating exploratory work that may turn into being the base of further
architectural decisions for OpenStack. I would expect that the people
in this group would largely be senior at the companies involved and,
if done correctly, they can help prioritize this work by advocating for
people/fellow engineers to actually make it 'real'. This will give weight
to specs and implementation changes to make these designs a reality,
and thus I believe this group would do well to work closely with the
Oslo Team, where many of the cross-cutting efforts will need to happen.

How to get involved
===================

If the idea is well received, I'd like to propose 

Re: [openstack-dev] [TripleO] Proposed TripleO core changes

2016-06-17 Thread Clint Byrum
Excerpts from Steven Hardy's message of 2016-06-09 15:03:51 +0100:
> Also, while reviewing the core group[2] I noticed the following members who
> are no longer active and should probably be removed:
> 
> - Radomir Dopieralski
> - Martyn Taylor
> - Clint Byrum
> 
> I know Clint is still involved with DiB (which has a separate core group),
> but he's indicated he's no longer going to be directly involved in other
> tripleo development, and AFAIK neither Martyn or Radomir are actively
> involved in TripleO reviews - thanks to them all for their contribution,
> we'll gladly add you back in the future should you wish to return :)
> 

Indeed, I'd like to remain involved with diskimage-builder, but the rest
of TripleO isn't in my focus anymore.

If I need to step up diskimage-builder reviews let me know. But I'd hope
the simplest course of action would be to just move me over to the
diskimage builder core group.



Re: [openstack-dev] [keystone] Changing the project name uniqueness constraint

2016-06-13 Thread Clint Byrum
Excerpts from Dolph Mathews's message of 2016-06-13 20:11:57 +:
> On Fri, Jun 10, 2016 at 12:20 PM Clint Byrum <cl...@fewbar.com> wrote:
> 
> > Excerpts from Henry Nash's message of 2016-06-10 14:37:37 +0100:
> > > On further reflection, it seems to me that we can never simply enable
> > either of these approaches in a single release. Even a v4.0 version of the
> > API doesn’t help - since presumably a server supporting v4 would want to be
> > able to support v3.x for a significant time; and, as already discussed, as
> > soon as you allow multiple node-names to have the same name, you can no
> > longer guarantee to support the current API.
> > >
> > > Hence the only thing I think we can do (if we really do want to change
> > the current functionality) is to do this over several releases with a
> > typical deprecation cycle, e.g.
> > >
> > > 1) At release 3.7 we allow you to (optionally) specify path names for
> > auth….but make no changes to the uniqueness constraints. We also change the
> > GET /auth/projects to return a path name. However, you can still auth
> > exactly the way we do today (since there will always only be a single
> > project of a given node-name). If however, you do auth without a path (to a
> > project that isn’t a top level project), we log a warning to say this is
> > deprecated (2 cycles, 4 cycles?)
> > > 2) If you connect with a 3.6 client, then you get the same as today for
> > GET /auth/projects and cannot use a path name to auth.
> > > 3) At sometime in the future, we deprecate the “auth without a path”
> > capability. We can debate as to whether this has to be a major release.
> > >
> > > If we take this gradual approach, I would be pushing for the “relax
> > project name constraints” approach…since I believe this leads to a cleaner
> > eventual solution (and there is no particular advantage with the
> > hierarchical naming approach) - and (until the end of the deprecation)
> > there is no break to the existing API.
> >
> >
> Please don't ever break the API - with or without a supposed "deprecation"
> period.
> 
> > This seems really complicated.
> >
> > Why don't users just start using paths in project names, if they want
> > paths in project names?
> >
> > And then in v3.7 you can allow them to specify paths relative to parent of
> > the user:
> >
> > So just allow this always:
> >
> > {"name": "finance/dev"}
> >
> > And then add this later once users are aware of what the / means:
> >
> > {"basename": "dev"}
> >
> > What breaks by adding that?
> >
> 
> if I'm following your approach, then I should point out that we already
> allow forward slashes in project names, so what breaks is any user that
> already has forward slashes in their project names, but have no awareness
> of, or intention to consume, hierarchical multitenancy.
> 

Pretty simple solution to that: they use the API they've always used,
which doesn't care about the hierarchy.



Re: [openstack-dev] [keystone] Changing the project name uniqueness constraint

2016-06-10 Thread Clint Byrum
Excerpts from Henry Nash's message of 2016-06-10 14:37:37 +0100:
> On further reflection, it seems to me that we can never simply enable either 
> of these approaches in a single release. Even a v4.0 version of the API 
> doesn’t help - since presumably a server supporting v4 would want to be able 
> to support v3.x for a significant time; and, as already discussed, as soon as 
> you allow multiple node-names to have the same name, you can no longer 
> guarantee to support the current API.
> 
> Hence the only thing I think we can do (if we really do want to change the 
> current functionality) is to do this over several releases with a typical 
> deprecation cycle, e.g.
> 
> 1) At release 3.7 we allow you to (optionally) specify path names for 
> auth….but make no changes to the uniqueness constraints. We also change the 
> GET /auth/projects to return a path name. However, you can still auth exactly 
> the way we do today (since there will always only be a single project of a 
> given node-name). If however, you do auth without a path (to a project that 
> isn’t a top level project), we log a warning to say this is deprecated (2 
> cycles, 4 cycles?)
> 2) If you connect with a 3.6 client, then you get the same as today for GET 
> /auth/projects and cannot use a path name to auth.
> 3) At sometime in the future, we deprecate the “auth without a path” 
> capability. We can debate as to whether this has to be a major release.
> 
> If we take this gradual approach, I would be pushing for the “relax project 
> name constraints” approach…since I believe this leads to a cleaner eventual 
> solution (and there is no particular advantage with the hierarchical naming 
> approach) - and (until the end of the deprecation) there is no break to the 
> existing API.
> 

This seems really complicated.

Why don't users just start using paths in project names, if they want
paths in project names?

And then in v3.7 you can allow them to specify paths relative to parent of
the user:

So just allow this always:

{"name": "finance/dev"}

And then add this later once users are aware of what the / means:

{"basename": "dev"}

What breaks by adding that?
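
To illustrate where those keys would sit, here is a rough sketch of the
project scope in a v3 auth request under this suggestion; the domain id
and names are placeholders, and "basename" is the proposed key, not an
existing one:

# Rough illustration of the suggestion above: the client keeps using the
# ordinary v3 project scope, just with a '/'-delimited name.
scope_with_path = {
    'project': {
        'domain': {'id': 'default'},
        'name': 'finance/dev',    # full path, allowed from the start
    }
}

# And the later addition, once users know what '/' means:
scope_with_basename = {
    'project': {
        'domain': {'id': 'default'},
        'basename': 'dev',        # proposed key, resolved relative to parent
    }
}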



Re: [openstack-dev] Reasoning behind my vote on the Go topic

2016-06-09 Thread Clint Byrum
Excerpts from Michael Barton's message of 2016-06-09 15:59:24 -0500:
> On Thu, Jun 9, 2016 at 2:49 PM, Clint Byrum <cl...@fewbar.com> wrote:
> >
> > Agreed it isn't done in uvloop. But it is done in libuv and the uvloop
> > devs agree it should be done. So this is the kind of thing where the
> > community can invest in python + C to help solve problems thought only
> > solvable by other languages.
> 
> 
> I mean, if someone wants to figure out a file server in python that can
> compete in any way with a go version, I'm totally down for rewriting swift
> to target some other python-based architecture.
> 
> But personally, my desire to try to build a universe where such a thing is
> possible is pretty low.  Because I've been fighting with it for years, and
> go already works great and there's nothing wrong with it.
> 

Mike, the whole entire crux of this thread, and Monty's words, is that
this sort of sentiment is hard to ignore, but it's even harder to ignore
the massive amount of inertia and power there is in having a community
that can all work on each others' code without investing a lot of time
in learning a new language.

That inertia is entirely the reason why other languages have surpassed
Python in some areas like concurrency. It takes longer to turn a
massive community going really hard in one direction than it does to
just start off heading in that direction in the first place. But that
turn is starting, and I for one think it's worth everyone's time to take
a hard look at whether or not we can in fact get it done together.

Nobody will force you to, but what I think Monty, and the rest of the
TC members who have voted to stay the course, are asking us all to do,
is to try to throw what we can at python solutions for these problems.



Re: [openstack-dev] Reasoning behind my vote on the Go topic

2016-06-09 Thread Clint Byrum
Excerpts from Michael Barton's message of 2016-06-09 14:01:11 -0500:
> On Thu, Jun 9, 2016 at 9:58 AM, Ben Meyer  wrote:
> 
> >
> > uvloop (first commit 2015-11-01) is newer than Swift's hummingbird
> > (2015-04-20, based on
> >
> > https://github.com/openstack/swift/commit/a0e300df180f7f4ca64fc1eaf3601a1a73fc68cb
> > and github network graph) so it would not have been part of the
> > consideration.
> >
> 
> And it still wouldn't be, since it doesn't solve the problem.
> 

Agreed it isn't done in uvloop. But it is done in libuv and the uvloop
devs agree it should be done. So this is the kind of thing where the
community can invest in python + C to help solve problems thought only
solvable by other languages.
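
As a rough sketch of that python + C direction, uvloop (a libuv-based
event loop) simply drops in behind asyncio; the echo server below is
invented for illustration and is not Swift code:

import asyncio

import uvloop

# Install the libuv-backed event loop policy; asyncio uses it from here on.
asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())


async def handle(reader, writer):
    data = await reader.read(4096)
    writer.write(data)  # echo back whatever arrived
    await writer.drain()
    writer.close()


async def main():
    server = await asyncio.start_server(handle, '127.0.0.1', 8888)
    async with server:
        await server.serve_forever()

asyncio.run(main())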



Re: [openstack-dev] [ironic] using ironic as a replacement for existing datacenter baremetal provisioning

2016-06-07 Thread Clint Byrum
Excerpts from Joshua Harlow's message of 2016-06-07 08:46:28 -0700:
> Clint Byrum wrote:
> > Excerpts from Kris G. Lindgren's message of 2016-06-06 20:44:26 +:
> >> Hi ironic folks,
> >> As I'm trying to explore how GoDaddy can use ironic I've created the 
> >> following in an attempt to document some of my concerns, and I'm wondering 
> >> if you folks could help myself identity ongoing work to solve these (or 
> >> alternatives?)
> >
> > Hi Kris. I've been using Ironic in various forms for a while, and I can
> > answer a few of these things.
> >
> >> List of concerns with ironic:
> >>
> >> 1.) Nova <-> ironic interactions generally seem terrible?
> >
> > I don't know if I'd call it terrible, but there's friction. Things that
> > are unchangable on hardware are just software configs in vms (like mac
> > addresses, overlays, etc), and things that make no sense in VMs are
> > pretty standard on servers (trunked vlans, bonding, etc).
> >
> > One way we've gotten around it is by using Ironic standalone via
> > Bifrost[1]. This deploys Ironic in wide open auth mode on 127.0.0.1,
> > and includes playbooks to build config drives and deploy images in a
> > fairly rudimentary way without Nova.
> >
> > I call this the "better than Cobbler" way of getting a toe into the
> > Ironic waters.
> >
> > [1] https://github.com/openstack/bifrost
> 
> Out of curiosity, why ansible vs turning 
> https://github.com/openstack/nova/blob/master/nova/virt/ironic/driver.py 
> (or something like it) into a tiny-wsgi-app (pick useful name here) that 
> has its own REST api (that looks pretty similar to the public functions 
> in that driver file)?

That's an interesting idea. I think a reason Bifrost doesn't just import
nova virt drivers is that they're likely _not_ a supported public API
(despite not having _'s at the front). Also, a lot of the reason Bifrost
exists is to enable users to get the benefits of all the baremetal
abstraction work done in Ironic without having to fully embrace all of
OpenStack's core. So while you could get a little bit of the stuff from
nova (like config drive building), you'd still need to handle network
address assignment, image management, etc. etc., and pretty soon you
start having to run a tiny glance and a tiny neutron. The Bifrost way
is the opposite: I just want a tiny Ironic, and _nothing_ else.

> 
> That seems almost easier than building a bunch of ansible scripts that 
> appear (at a glance) to do similar things; and you get the benefit of 
> using an actual programming language vs a 
> half-programming-ansible-yaml-language...
> 
> A realization I'm having is that I'm really not a fan of using ansible 
> as a half-programming-ansible-yaml-language, which it seems like people 
> start to try to do after a while (because at some point you need 
> something like if statements, then things like [1] get created), no 
> offense to the authors, but I guess this is my personal preference (it's 
> also one of the reasons taskflow is directly a library in python, because 
> people don't need to learn a new language).
> 

We use python in Ansible all the time:

http://git.openstack.org/cgit/openstack/bifrost/tree/playbooks/library

The reason to use Ansible is that it has already implemented all of
the idempotency and error handling and UI needs that one might need for
running workflows.

I've tried multiple times to understand taskflow, and to me, Ansible is
the anti-taskflow. It's easy to pick up, easy to read the workflows,
doesn't require deep surgery on your code to use (just execute
ansible-playbook), and is full of modules to support nearly anything
your deployment may need.
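
For anyone who hasn't seen that pattern, the modules in that library
directory are plain Python; a minimal sketch of the shape (the module's
single argument is invented for illustration, not a real bifrost module):

#!/usr/bin/python
# Minimal sketch of a Python module callable from an Ansible playbook.
# The argument is invented; see the bifrost library directory above for
# real examples.
from ansible.module_utils.basic import AnsibleModule


def main():
    module = AnsibleModule(argument_spec=dict(
        node_name=dict(required=True, type='str'),
    ))
    # A real module would talk to Ironic here; this one just echoes.
    module.exit_json(changed=False, node_name=module.params['node_name'])


if __name__ == '__main__':
    main()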



Re: [openstack-dev] [ironic] using ironic as a replacement for existing datacenter baremetal provisioning

2016-06-06 Thread Clint Byrum
Excerpts from Kris G. Lindgren's message of 2016-06-06 20:44:26 +:
> Hi ironic folks,
> As I'm trying to explore how GoDaddy can use ironic I've created the 
> following in an attempt to document some of my concerns, and I'm wondering if 
> you folks could help myself identity ongoing work to solve these (or 
> alternatives?)

Hi Kris. I've been using Ironic in various forms for a while, and I can
answer a few of these things.

> List of concerns with ironic:
> 
> 1.) Nova <-> ironic interactions generally seem terrible?

I don't know if I'd call it terrible, but there's friction. Things that
are unchangable on hardware are just software configs in vms (like mac
addresses, overlays, etc), and things that make no sense in VMs are
pretty standard on servers (trunked vlans, bonding, etc).

One way we've gotten around it is by using Ironic standalone via
Bifrost[1]. This deploys Ironic in wide open auth mode on 127.0.0.1,
and includes playbooks to build config drives and deploy images in a
fairly rudimentary way without Nova.

I call this the "better than Cobbler" way of getting a toe into the
Ironic waters.

[1] https://github.com/openstack/bifrost

>   - How to accept raid config and partitioning(?) from end users? Seems to not 
> be an agreed-upon method between nova/ironic yet.

AFAIK accepting it from the users just isn't solved. Administrators
do have custom ramdisks that they boot to pre-configure RAID during
enrollment.

>-How to run multiple conductors/nova-computes?   Right now as far as I can 
> tell all of ironic is fronted by a single nova-compute, which I will have to 
> manage via a cluster technology between two or more nodes.  Because of this 
> and the way host-agregates work I am unable to expose fault domains for 
> ironic instances (all of ironic can only be under a single AZ (the az that is 
> assigned to the nova-compute node)). Unless I create multiple nova-compute 
> servers and manage multiple independent ironic setups.  This makes 
> on-boarding/query of hardware capacity painful.

The nova-compute does almost nothing. It really just talks to the
scheduler to tell it what's going on in Ironic. If it dies, deploys
won't stop. You can run many many conductors and spread load and fault
tolerance among them easily. I think for multiple AZs though, you're
right, there's no way to expose that. Perhaps it can be done with cells,
which I think Rackspace's OnMetal uses (but I'll let them refute or
confirm that).

Seems like the virt driver could be taught to be AZ-aware and some
metadata in the server record could allow AZs to go through to Ironic.

>   - Nova appears to be forcing a we are "compute" as long as "compute" is 
> VMs, means that we will have a baremetal flavor explosion (ie the mismatch 
> between baremetal and VM).
>   - This is a feeling I got from the ironic-nova cross project meeting in 
> Austin.  General exmaple goes back to raid config above. I can configure a 
> single piece of hardware many different ways, but to fit into nova's world 
> view I need to have many different flavors exposed to end-user.  In this way 
> many flavors can map back to a single piece of hardware with just a slightly 
> different configuration applied. So how am I suppose to do a single server 
> with 6 drives as either: Raid 1 + Raid 5, Raid 5, Raid 10, Raid 6, or JBOD.  
> Seems like I would need to pre-mark out servers that were going to be a 
> specific raid level.  Which means that I need to start managing additional 
> sub-pools of hardware to just deal with how the end users wants the raid 
> configured, this is pretty much a non-starter for us.  I have not really 
> heard of whats being done on this specific front.

You got that right. Perhaps people are comfortable with this limitation.
It is at least simple.

> 
> 2.) Inspector:
>   - IPA service doesn't gather port/switching information
>   - Inspection service doesn't process port/switching information, which 
> means that it wont add it to ironic.  Which makes managing network swinging 
> of the host a non-starter.  As I would inspect the host – then modify the 
> ironic record to add the details about what port/switch the server is 
> connected to from a different source.  At that point why wouldn't I just 
> onboard everything through the API?
>   - Doesn't grab hardware disk configurations, If the server has multiple 
> raids (r1 + r5) only reports boot raid disk capacity.
>   - Inspection is geared towards using a different network and dnsmasq 
> infrastructure than what is in use for ironic/neutron.  Which also means that 
> in order to not conflict with dhcp requests for servers in ironic I need to 
> use different networks.  Which also means I now need to handle swinging 
> server ports between different networks.
> 
> 3.) IPA image:
>   - Default build stuff is pinned to extremely old versions due to gate 
> failure issues. So I can not work without a fork for onboard of servers due 
> to the fact that IPMI modules 

Re: [openstack-dev] [keystone][all] Incorporating performance feedback into the review process

2016-06-06 Thread Clint Byrum
Excerpts from Brant Knudson's message of 2016-06-03 15:16:20 -0500:
> On Fri, Jun 3, 2016 at 2:35 PM, Lance Bragstad  wrote:
> 
> > Hey all,
> >
> > I have been curious about impact of providing performance feedback as part
> > of the review process. From what I understand, keystone used to have a
> > performance job that would run against proposed patches (I've only heard
> > about it so someone else will have to keep me honest about its timeframe),
> > but it sounds like it wasn't valued.
> >
> >
> We had a job running rally for a year (I think) that nobody ever looked at
> so we decided it was a waste and stopped running it.
> 
> > I think revisiting this topic is valuable, but it raises a series of
> > questions.
> >
> > Initially it probably only makes sense to test a reasonable set of
> > defaults. What do we want these defaults to be? Should they be determined
> > by DevStack, openstack-ansible, or something else?
> >
> >
> A performance test is going to depend on the environment (the machines,
> disks, network, etc), the existing data (tokens, revocations, users, etc.),
> and the config (fernet, uuid, caching, etc.). If these aren't consistent
> between runs then the results are not going to be usable. (This is the
> problem with running rally on infra hardware.) If the data isn't realistic
> (1000s of tokens, etc.) then the results are going to be at best not useful
> or at worst misleading.
> 

That's why I started the counter-inspection spec:

http://specs.openstack.org/openstack/qa-specs/specs/devstack/counter-inspection.html

It just tries to count operations, and graph those. I've, unfortunately,
been pulled off to other things of late, but I do intend to loop back
and hit this hard over the next few months to try and get those graphs.

What we'd get initially is just graphs of how many messages we push
through RabbitMQ, and how many rows/queries/transactions we push through
mysql. We may also want to add counters like how many API requests
happened, and how many retries happen inside the code itself.

There's a _TON_ we can do now to ensure that we know what the trends are
when something gets "slow", so we can look for a gradual "death by 1000
papercuts" trend or a hockey stick that can be tied to a particular
commit.
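
As a sketch of the kind of counting that implies (the statsd prefix and
metric names here are invented for illustration, not what the spec
prescribes):

import statsd

# Counters land in the infra graphite/statsd cluster; names are illustrative.
stats = statsd.StatsClient('localhost', 8125, prefix='counter_inspection')


def count_db_operation(operation):
    # e.g. counter_inspection.mysql.select is bumped once per SELECT
    stats.incr('mysql.%s' % operation)


count_db_operation('select')
count_db_operation('insert')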

> What does the performance test criteria look like and where does it live?
> > Does it just consist of running tempest?
> >
> >
> I don't think tempest is going to give us numbers that we're looking for
> for performance. I've seen a few scripts and have my own for testing
> performance of token validation, token creation, user creation, etc. which
> I think will do the exact tests we want and we can get the results
> formatted however we like.
> 

Agreed that tempest will only give a limited view. Ideally one would
also test things like "after we've booted 1000 vms, do we end up reading
1000 more rows, or 1000 * 1000 more rows?"

> From a contributor and reviewer perspective, it would be nice to have the
> > ability to compare performance results across patch sets. I understand that
> > keeping all performance results for every patch for an extended period of
> > time is unrealistic. Maybe we take a daily performance snapshot against
> > master and use that to map performance patterns over time?
> >
> >
> Where are you planning to store the results?
> 

Infra has a graphite/statsd cluster which is made for collecting metrics
on tests. It might need to be expanded a bit, but it should be
relatively cheap to do so given the benefit of having some of these
numbers.



Re: [openstack-dev] [TripleO][diskimage-builder] Proposing Stephane Miller to dib-core

2016-06-02 Thread Clint Byrum
Excerpts from Gregory Haynes's message of 2016-06-01 12:50:19 -0500:
> Hello everyone,
> 
> I'd like to propose adding Stephane Miller (cinerama) to the
> diskimage-builder core team. She has been a huge help with our reviews
> for some time now and I think she would make a great addition to our
> core team. I know I have benefited a lot from her bash expertise in many
> of my reviews and I am sure others have as well :).
> 
> I've spoken with many of the active cores privately and only received
> positive feedback on this, so rather than use this as an all out vote
> (although feel free to add your ++'s) I'd like to use this as a final
> call out in case any objections are wanting to be made. If none have
> been made by next Wednesday (6/8) I'll go ahead and add her to dib-core.

Sorry but I won't deny Stephanie her parade of +1's. ;)

+1. Stephanie has been doing a great job, and getting more reviews done
than me lately. :)



Re: [openstack-dev] [nova] I'm going to expire open bug reports older than 18 months.

2016-05-30 Thread Clint Byrum
(Top posting as a general reply to the thread)

Bugs are precious data. As much as it feels like the bug list is full of
cruft that won't ever get touched, one thing that we might be missing in
doing this is that the user who encounters the bug and takes the time
to actually find the bug tracker and report a bug, may be best served
by finding that somebody else has experienced something similar. If you
close this bug, that user is now going to be presented with the "I may
be the first person to report this" flow instead of "yeah I've seen that
error too!". The former can be a daunting task, but the latter provides
extra incentive to press forward, since clearly there are others who
need this, and more data is helpful to triagers and fixers.

I 100% support those who are managing bugs doing whatever they need
to do to make sure users' issues are being addressed as well as can be
done with the resources available. However, I would also urge everyone
to remember that the bug tracker is not only a way for developers to
manage the bugs, it is also a way for the community of dedicated users
to interact with the project as a whole.

Excerpts from Markus Zoeller's message of 2016-05-23 13:02:29 +0200:
> TL;DR: Automatic closing of 185 bug reports which are older than 18
> months in the week R-13. Skipping specific bug reports is possible. A
> bug report comment explains the reasons.
> 
> 
> I'd like to get rid of more clutter in our bug list to make it more
> comprehensible by a human being. For this, I'm targeting our ~185 bug
> reports which were reported 18 months ago and still aren't in progress.
> That's around 37% of open bug reports which aren't in progress. This
> post is about *how* and *when* I do it. If you have very strong reasons
> to *not* do it, let me hear them.
> 
> When
> ----
> I plan to do it in the week after the non-priority feature freeze.
> That's week R-13, at the beginning of July. Until this date you can
> comment on bug reports so they get spared from this cleanup (see below).
> Beginning from R-13 until R-5 (Newton-3 milestone), we should have
> enough time to gain some overview of the rest.
> 
> I also think it makes sense to make this a repeated effort, maybe after
> each milestone/release or monthly or daily.
> 
> How
> ---
> The bug reports which will be affected are:
> * in status: [new, confirmed, triaged]
> * AND without assignee
> * AND created at: > 18 months
> A preview of them can be found at [1].
> 
> You can spare bug reports if you leave a comment there which says
> one of these (case-sensitive flags):
> * CONFIRMED FOR: NEWTON
> * CONFIRMED FOR: MITAKA
> * CONFIRMED FOR: LIBERTY
> 
> The expired bug report will have:
> * status: won't fix
> * assignee: none
> * importance: undecided
> * a new comment which explains *why* this was done
> 
> The comment the expired bug reports will get:
> This is an automated cleanup. This bug report got closed because
> it is older than 18 months and there is no open code change to
> fix this. After this time it is unlikely that the circumstances
> which led to the observed issue can be reproduced.
> If you can reproduce it, please:
> * reopen the bug report
> * AND leave a comment "CONFIRMED FOR: <release>"
>   Only still supported release names are valid.
>   valid example: CONFIRMED FOR: LIBERTY
>   invalid example: CONFIRMED FOR: KILO
> * AND add the steps to reproduce the issue (if applicable)
> 
> 
> Let me know if you think this comment gives enough information on how to
> handle this situation.
> 
> 
> References:
> [1] http://45.55.105.55:8082/bugs-dashboard.html#tabExpired
> 
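
For illustration, the selection criteria above map to a fairly small
launchpadlib query; this is an untested sketch (the script name and the
18-month day math are approximations), and a real run would also have
to honor the "CONFIRMED FOR" comments:

    from datetime import datetime, timedelta, timezone

    from launchpadlib.launchpad import Launchpad

    lp = Launchpad.login_with('expire-old-bugs', 'production')
    nova = lp.projects['nova']
    cutoff = datetime.now(timezone.utc) - timedelta(days=18 * 30)

    for task in nova.searchTasks(status=['New', 'Confirmed', 'Triaged']):
        if task.assignee is not None:
            continue
        if task.bug.date_created > cutoff:
            continue
        # Candidate for expiration; a real script would check for a
        # "CONFIRMED FOR: <release>" comment before touching it.
        print(task.web_link)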



Re: [openstack-dev] [ironic][neutron] bonding?

2016-05-24 Thread Clint Byrum
Excerpts from Jim Rollenhagen's message of 2016-05-24 07:51:21 -0400:
> Hi,
> 
> There's rumors floating around about Neutron having a bonding model in
> the near future. Are there any solid plans for that?
> 
> For context, as part of the multitenant networking work, ironic has a
> portgroup concept proposed, where operators can configure bonding for
> NICs in a baremetal machine. There are ML2 drivers that support this
> model and will configure a bond.
> 
> Some folks have concerns about landing this code if Neutron is going to
> support bonding as a first-class citizen. So before we delay any
> further, I'd like to find out if there's any truth to this, and what the
> timeline for that might look like.
> 

FYI we've been playing with bonding and Ironic using Bifrost and glean.
There are some patches up for the format we've used: 

https://review.openstack.org/#/c/318940/

Just a thought: it would be good if we could get that metadata to be
the same for Neutron, or at least get an early idea of the spec so we
can make sure glean supports it ASAP.
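
For anyone who hasn't looked at the patch, the rough shape under
discussion is a bond entry in a network_data.json-style document,
something like this (illustrative values only, not the contents of that
review):

    bond_link = {
        'id': 'bond0',
        'type': 'bond',
        'bond_links': ['eth0', 'eth1'],  # member NICs by link id
        'bond_mode': '802.3ad',          # e.g. LACP
        'ethernet_mac_address': 'aa:bb:cc:dd:ee:ff',
    }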



Re: [openstack-dev] [all][tc] Languages vs. Scope of "OpenStack"

2016-05-24 Thread Clint Byrum
Excerpts from Geoff O'Callaghan's message of 2016-05-24 15:31:28 +1000:
> 
> > On 24 May 2016, at 3:13 PM, Clint Byrum <cl...@fewbar.com> wrote:
> > 
> > 
> [snip]
> 
> > those other needs. Grab a python developer, land some code, and your
> > feature is there.
> 
> s/python/whateverlanguage/
> 

But that's just the point. If you have lots of languages, you have to
find developers who know whichever one your feature or bug fix needs to
be written in.

> > 
> >> I also never said, ship the source code and say ‘good luck’.   What I did 
> >> imply  was, due to a relaxing of coding platform requirements we might be 
> >> able to deliver a function at this performance point that  we may not have 
> >> been able to do otherwise.   We should always provide support and the 
> >> code,  but as to what language it’s written it i’m personally not fussed 
> >> and I have to deal with a variety of languages already so maybe that’s why 
> >> I don’t see it as a big problem.
> > 
> > This again assumes that one only buys software and does not ever
> > participate in its development on an ongoing basis. There's nothing
> > wrong with that, but this particular community is highly focused on
> > people who do want to participate and think that the ability to
> > adapt this cloud they've invested in to their changing business needs is
> > more important than any one feature.
> 
> No I didn’t say that at all and I don’t believe it’s assumed.I just said 
> I wasn’t fussed about what language it’s written in and just wanted 
> developers to be able to contribute if they had something to contribute.   
> 

Not being fussed about the language means not being fussed about who can
develop on it, so I took that to mean not being interested in developing
on it. I'm not sure either of us was "wrong" here, but I apologize for
assuming that's what you meant, if that's indeed not what you meant.

> > 
> >> 
> >> I understand there will be integration challenges and I agree with 
> >> cohesiveness being a good thing, but I also believe we must deliver value 
> >> more than cohesiveness.   The value is what operators want,  the 
> >> cohesiveness is what the developers may or may not want.
> >> 
> > 
> > We agree that delivering value to end users and operators is the #1
> > priority. I think we just disagree on the value of an open development
> > model and cohesion in the _community_.
> 
> It’s not open if you restrict developers based on programming language.
> Trust me I get cohesion and it’s value, we’ve reached the stage now where 
> cohesion is being questioned.  The questioning is a good thing and it is a 
> measure of the health of the community.

So, there's a funny principle here, where the word open is _so open_
that one can use it to classify any number of aspects, while ignoring
others, and still be correct.

I qualified the development model with the word open, because the way
we govern it, the way code and change move through the system, are 100%
transparent and available to anyone who wants to participate. But I agree,
it is less available to those who want to participate using languages
we've chosen to avoid. They have to begin at the governance level,
which they have, in fact, done by approaching the TC. But they may be
shut out, and that would make developing on OpenStack closed to them.

However, I don't think the TC takes this lightly, and they understand
that having it open to _more_ contribution is the goal. What I think may
not make sense to all parties is that closing it to some will keep it
open to many others. And what I think Thierry did by opening this thread
was try to figure out how many stand on either side of that decision.



Re: [openstack-dev] [all][tc] Languages vs. Scope of "OpenStack"

2016-05-23 Thread Clint Byrum
Excerpts from Geoff O'Callaghan's message of 2016-05-24 14:34:46 +1000:
> 
> > On 24 May 2016, at 2:04 PM, Clint Byrum <cl...@fewbar.com> wrote:
> > 
> > Excerpts from Geoff O'Callaghan's message of 2016-05-24 10:59:13 +1000:
> >> Surely openstack is defined by it’s capabilities and interfaces rather 
> >> than it’s internals.  Given the simplistic view that openstack is a 
> >> collection of micro services connected by well defined api’s does it 
> >> really matter what code is used inside that micro service (or macro 
> >> service )?   If there is a community willing to bring and support code in 
> >> language ‘x’,  isn’t that better than worrying about the off chance of 
> >> existing developers wishing to move projects and not knowing the target 
> >> language?Is there a fear that we’ll end up with a fork of nova (or 
> >> others) written in rust ?
> >> If we’re not open to evolving then we’ll die.
> >> 
> >> Just throwing in a different perspective.
> > 
> > Thanks Geoff. Your perspective is one that has been considered many
> > times. That is an engineering perspective though, and ignores the people
> > and businesses that are the users of OpenStack. We don't just shove the
> > code out the door and say "good luck!”.
> 
> Hey Clint,  That is exactly what I wasn’t saying.   Businesses and people out 
> there want the platform to have the features they want and work.  They could 
> care less about what it’s written in.   You tend to care when it doesn’t work 
> and / or it doesn’t have the features you want.   So I can understand for 
> operators now they have a vested interest in making sure they can debug what 
> is given to them as we don’t meet Geoff’s rule # 1 - the code must work and 
> it must do what I want.
> 

I completely respect that you may be in a situation where you can
actually define all of the things you want for your IT department today.

My experience is that those things change constantly, and there is no
product that will ever add all of the features that everyone needs. But
if one can hit 80%, and 100% of the startup features, then being open
source, there's no worry that some vendor will simply refuse to respect
those other needs. Grab a python developer, land some code, and your
feature is there.

> I also never said, ship the source code and say ‘good luck’.   What I did 
> imply  was, due to a relaxing of coding platform requirements we might be 
> able to deliver a function at this performance point that  we may not have 
> been able to do otherwise.   We should always provide support and the code,  
> but as to what language it’s written it i’m personally not fussed and I have 
> to deal with a variety of languages already so maybe that’s why I don’t see 
> it as a big problem.

This again assumes that one only buys software and does not ever
participate in its development on an ongoing basis. There's nothing
wrong with that, but this particular community is highly focused on
people who do want to participate and think that the ability to
adapt this cloud they've invested in to their changing business needs is
more important than any one feature.

> 
> I understand there will be integration challenges and I agree with 
> cohesiveness being a good thing, but I also believe we must deliver value 
> more than cohesiveness.   The value is what operators want,  the cohesiveness 
> is what the developers may or may not want.
> 

We agree that delivering value to end users and operators is the #1
priority. I think we just disagree on the value of an open development
model and cohesion in the _community_.



Re: [openstack-dev] [all][tc] Languages vs. Scope of "OpenStack"

2016-05-23 Thread Clint Byrum
Excerpts from Geoff O'Callaghan's message of 2016-05-24 10:59:13 +1000:
> Surely openstack is defined by it’s capabilities and interfaces rather than 
> it’s internals.  Given the simplistic view that openstack is a collection of 
> micro services connected by well defined api’s does it really matter what 
> code is used inside that micro service (or macro service )?   If there is a 
> community willing to bring and support code in language ‘x’,  isn’t that 
> better than worrying about the off chance of existing developers wishing to 
> move projects and not knowing the target language?Is there a fear that 
> we’ll end up with a fork of nova (or others) written in rust ?
> If we’re not open to evolving then we’ll die.
> 
> Just throwing in a different perspective.

Thanks Geoff. Your perspective is one that has been considered many
times. That is an engineering perspective though, and ignores the people
and businesses that are the users of OpenStack. We don't just shove the
code out the door and say "good luck!".

So, each new aspect of the internals that the people and businesses must
evaluate is another hurdle to adoption at some level. If OpenStack is
the fastest open source cloud platform ever, but only gets adopted by
the 5 largest cloud users on the planet, and smaller organizations are
forced to invest their money in closed source appliance based clouds,
then we'll all suffer, as neither of those options will stand a chance
against the large scale efforts of the established players like Amazon,
Google, and Microsoft.

Likewise, if OpenStack is adopted by the bulk of small/medium businesses
needing clouds, but nobody can run it at scale, we will also be crushed
as users develop elastic workloads that need to be moved onto a large
scale cloud.

And if there are simply two, one for big clouds, and one for small,
they'll never be fully compatible. It will be like POSIX... moderately
useful for a limited set of applications, but most things will have to
be optimized for each different implementation, wasting everyone's time
porting when they should be developing.

So, while this doesn't mean we should just force everything onto Python,
it does mean that we should remain as cohesive as we can when making
choices like this. So the question "What is OpenStack" needs to be
asked, so we can evaluate what should be kept close together, and what
might be better off as an independent component.



Re: [openstack-dev] 答复: [Heat][Glance] Can't migrate to glance v2 completely

2016-05-20 Thread Clint Byrum
Excerpts from Erno Kuvaja's message of 2016-05-20 13:20:11 +0100:
> 
> The only reliable way to create Glance images for consumption in general
> manner is to make sure that we use the normal workflows (currently
> uploading the image to Glance and in future the supported manners of Image
> Import) and let Glance and Glance only to deal with it's backends.
> 

Sounds good to me; Glance needs to be the gateway for images, not
anything else.

I wonder if the shade library would be useful here.

If Heat were to use shade, which hides the complexities of not only
v1 vs v2, but also v2 with import vs. v2 with upload through glance,
then one could have a fairly generic image type.

Ansible has done this, which is quite nice, because now ansible users
can be confident that whatever OpenStack cloud they're interacting with,
they can just use the os_image module with the same arguments.
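
For reference, the shade call itself is tiny; a rough sketch assuming
shade's current API (the cloud and file names are made up):

    import shade

    # Credentials come from clouds.yaml; 'mycloud' is a placeholder.
    cloud = shade.openstack_cloud(cloud='mycloud')

    # shade decides between the v1 and v2 APIs, and between direct upload
    # and the task/import flow, behind this one call.
    image = cloud.create_image('fedora-23', filename='fedora-23.qcow2',
                               disk_format='qcow2', container_format='bare',
                               wait=True)
    print(image.id)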

Anyway, if shade isn't used, Heat will need to do something similar,
which is to provide template users a way to port their templates to the
APIs available. Perhaps the provider template system could be used so
each Heat operator can provide a single "image upload" template snippet
that gets used.

That would bring up a second question, which is how one even uploads
a large file to Heat. This is non-trivial, and I think the only way Heat
can reasonably support this is by handing the user a Swift tempurl in
the create/update stack API whenever a file is referenced. That swift
object would then be used by the engine to proxy that file into glance if
import isn't supported, or if it is, to tell glance where to import from.
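
Generating such a tempurl is straightforward; a sketch with
python-swiftclient (the path, key and host below are placeholders Heat
would fill in from its own Swift account):

    from swiftclient.utils import generate_temp_url

    path = '/v1/AUTH_projectid/heat-staging/big-image.qcow2'
    key = 'temp-url-secret-key'

    url = generate_temp_url(path, seconds=3600, key=key, method='PUT')
    print('https://swift.example.com' + url)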



Re: [openstack-dev] [tc] supporting Go

2016-05-20 Thread Clint Byrum
Excerpts from Thomas Goirand's message of 2016-05-20 12:42:09 +0200:
> On 05/11/2016 04:17 PM, Dean Troyer wrote:
> > The big difference with Go here is that the dependency work happens at
> > build time, not deploy/runtime in most cases.  That shifts much of the
> > burden to people (theoretically) better suited to manage that work.
> 
> I am *NOT* buying that doing static linking is a progress. We're back 30
> years in the past, before the .so format. It is amazing that some of us
> think it's better. It simply isn't. It's a huge regression, for package
> maintainers, system admins, production/ops, and our final users. The
> only group of people who like it are developers, because they just don't
> need to care about shared library API/ABI incompatibilities and
> regressions anymore.
> 

Static linking is _a_ model. Dynamic linking is _a_ model.

There are aspects of each model, and when you lay different values over
any model, it will appear better or worse depending on those values.

Debian values slow, massively reusable change. Everyone advances at
around the same pace, and as a result, the whole community has a net
positive improvement. This is fantastically useful and is not outdated
in any way IMO. I love my stable OS.

But there is more software being written now than ever before, and that
growth does not have a downward curve. As people write more software,
and the demands on them get more intense, they have fewer reasons to
reuse a wider set of libraries, and have more of a need to reuse a
narrow subset, in a specific way. This gives rise to the continuous
delivery model where one ships a narrow subset all together and tests it
deeply, rather than testing the broader tools in isolation. That means
sometimes they go faster than the rest of the community in one area,
and slower in others. They give up the broad long term efficiency for
short term agility.

That may sound crass, like it's just fast and loose with no regard for the
future. But without the agility, they will just get run over by somebody
else more agile. When somebody chooses this, they're choosing it because
they have to, not because they don't understand what they're giving up.

Whichever model is chosen, it doesn't mean one doesn't care about the
greater community. It simply means one has a set of challenges when
contributing alongside those with conflicting values.

But it's not a regression, it is simply people with a different set of
values, finding the same old solutions useful again for different reasons.

So, I'd urge that we all seek to find some empathy with people in other
positions, and compromise when we can. Debian has done so already,
with a Built-Using helper now for go programs. When libraries update,
one can just rebuild the packages that are built using it. So, rather
than fighting this community, Debian seems to be embracing it. Only
tradition stands in the way of harmony in this case.



Re: [openstack-dev] [tc][ansible][ironic] Reusing Ansible code in OpenStack projects

2016-05-19 Thread Clint Byrum
Excerpts from Pavlo Shchelokovskyy's message of 2016-05-19 15:28:03 +0300:
> Hi all,
> 
> I have a question re FOSS licenses interplay. I am pretty sure that
> OpenStack community (e.g. openstack-ansible) has already faced such
> questions and I would really appreciate any advice.
> 
> We are developing a new ansible-based deployment driver for Ironic [0] and
> would like to use some parts of ansible-lib Python API to avoid boilerplate
> code in custom Ansible modules and callbacks we are writing, and in the
> future probably use Ansible Python API to launch playbooks themselves.
> 
> The problem is Ansible and ansible-lib in particular are licensed under GPL
> v3 [1] "or later" [2]. According to [3] Apache 2.0 license is only one way
> compatible with GPL v3 (GPL v3-licensed code can include Apache
> 2.0-licensed code, but not vice versa).
> 
> I am by far not a legal expert, so my questions are:
> 
> Does it mean that the moment I do "from ansible import ..." in my Python
> code, which AFAIU means I am "linking" to it, I am required to use a
> GPLv3-compliant license for my code too (in particular not Apache 2.0)?
> What problems might that imply in respect with including such code in an
> OpenStack project (e.g. submitting it to Ironic repo) and distributing the
> project?

Yes, that's what it means. You can write modules under any license you want
because AnsibleModule is BSD 2-clause, but plugins must be GPLv3.
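
To make that distinction concrete, a minimal custom module only touches
the BSD 2-clause AnsibleModule helper, so the rest of the file can stay
Apache 2.0 (the module name and arguments below are made up):

    #!/usr/bin/python
    # An Apache 2.0 header could go here; only the import below is Ansible code.
    from ansible.module_utils.basic import AnsibleModule

    def main():
        module = AnsibleModule(argument_spec=dict(
            node=dict(required=True),
        ))
        # ... do the deployment work for the given node here ...
        module.exit_json(changed=True, node=module.params['node'])

    if __name__ == '__main__':
        main()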

> If there are indeed problems with that, would it be safer to keep the code
> in a separate project and also distribute it separately?
> Even when distributed separately, will merely using (dynamically importing
> at run-time) a GPLv3-licensed driver from ApacheV2-licensed Ironic
> constitute any license violation?
> 

I think your options are to make it function without the plugins and
distribute the plugins separately (so a bare-bones version comes with
Ironic, but it works better with the GPLv3 plugins), or to distribute
the whole thing separately.

Long term, you might approach Ansible about possibly making their plugin
interface LGPL so that people can write non-GPL plugins. But, it may be
part of a broader strategy to ensure that contribution happens in the
open. As an OSS hippie, I applaud them for choosing a strong copyleft
license. :)



Re: [openstack-dev] [tc] [all] [glance] Answers to some questions about Glance

2016-05-18 Thread Clint Byrum
Excerpts from Nikhil Komawar's message of 2016-05-18 11:03:45 -0400:
> 
> On 5/18/16 2:15 AM, Clint Byrum wrote:
> > Excerpts from Robert Collins's message of 2016-05-18 14:57:05 +1200:
> >> On 18 May 2016 at 00:54, Brian Rosmaita <brian.rosma...@rackspace.com> 
> >> wrote:
> >>
> >>>> Couple of examples:
> >>>> 1. switching from "is_public=true" to "visibility=public"
> >>>
> >>> This was a major version change in the Images API.  The 'is_public' 
> >>> boolean
> >>> is in the original Images v1 API, 'visibility' was introduced with the
> >>> Images v2 API in the Folsom release.  You just need an awareness of which
> >>> version of the API you're talking to.
> >> So I realise this is ancient history, but this is really a good
> >> example of why Monty has been pushing on 'never break our APIs': API
> >> breaks hurt users, major versions or not. Keeping the old attribute as
> >> an alias to the new one would have avoided the user pain for a very
> >> small amount of code.
> >>
> >> We are by definition an API - doesn't matter that its HTTP vs Python -
> >> when we break compatibility, there's a very long tail of folk that
> >> will have to spend time updating their code; 'Microversions' are a
> >> good answer to this, as long as we never raise the minimum version we
> >> support. glibc does a very similar thing with versioned symbols - and
> >> they support things approximately indefinitely.
> > +1, really well said. As Nikhil said, assumptions are bad, and assuming
> 
> You have only conveniently picked up one things I've said in my email,
> why not choose the other parts of resolving those assumptions correctly.
> Please do not paraphrase me when the propaganda is orthogonal to what I
> proposed.
> 
> > that nobody's using that, or that they'll just adapt, is not really a
> > great way to establish a relationship with the users.
> 
> It's not that assumption that users are not using it:
> 
> The very _assumption_ people who are using it is that glance v1 is ok to
> be public facing API which was never designed to be one. So, that's the
> assumption you need to take into account and not the one you like to
> pick. That's the part where I talk about being engaged missing in your
> message.
> 

It doesn't really matter what it was designed for; once something is
released as a public-facing API and users build on top of it, there
are consequences for breaking it.

There's nothing personal about this problem. It happens. My message is
simple, and consistent with the other thread (and I do see how they are
linked): We don't get to pick how people consume the messages we send.
Whether in docs, code, mailing list threads, or even a room with humans
face to face, mistakes will be made.

So lets be frank about that, and put aside our egos, and accept that
mistakes were made _by all parties_ in this specific case, and nobody is
"mad" about this, but we'd all like to avoid making them again.



Re: [openstack-dev] [tc] [all] [glance] Answers to some questions about Glance

2016-05-18 Thread Clint Byrum
Excerpts from Robert Collins's message of 2016-05-18 14:57:05 +1200:
> On 18 May 2016 at 00:54, Brian Rosmaita  wrote:
> 
> >> Couple of examples:
> >> 1. switching from "is_public=true" to "visibility=public"
> >
> >
> > This was a major version change in the Images API.  The 'is_public' boolean
> > is in the original Images v1 API, 'visibility' was introduced with the
> > Images v2 API in the Folsom release.  You just need an awareness of which
> > version of the API you're talking to.
> 
> So I realise this is ancient history, but this is really a good
> example of why Monty has been pushing on 'never break our APIs': API
> breaks hurt users, major versions or not. Keeping the old attribute as
> an alias to the new one would have avoided the user pain for a very
> small amount of code.
> 
> We are by definition an API - doesn't matter that its HTTP vs Python -
> when we break compatibility, there's a very long tail of folk that
> will have to spend time updating their code; 'Microversions' are a
> good answer to this, as long as we never raise the minimum version we
> support. glibc does a very similar thing with versioned symbols - and
> they support things approximately indefinitely.

+1, really well said. As Nikhil said, assumptions are bad, and assuming
that nobody's using that, or that they'll just adapt, is not really a
great way to establish a relationship with the users.
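
To underline how small that alias could have been, something along
these lines at the serialization layer would do it (generic Python for
illustration, not actual Glance code):

    class ImageView(object):
        def __init__(self, visibility):
            self.visibility = visibility

        @property
        def is_public(self):
            # Old v1 attribute kept as a read-only alias of the new field.
            return self.visibility == 'public'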



Re: [openstack-dev] [Fuel][MySQL][DLM][Oslo][DB][Trove][Galera][operators] Multi-master writes look OK, OCF RA and more things

2016-05-17 Thread Clint Byrum
I missed your reply originally, so sorry for the 2 week lag...

Excerpts from Mike Bayer's message of 2016-04-30 15:14:05 -0500:
> 
> On 04/30/2016 10:50 AM, Clint Byrum wrote:
> > Excerpts from Roman Podoliaka's message of 2016-04-29 12:04:49 -0700:
> >>
> >
> > I'm curious why you think setting wsrep_sync_wait=1 wouldn't help.
> >
> > The exact example appears in the Galera documentation:
> >
> > http://galeracluster.com/documentation-webpages/mysqlwsrepoptions.html#wsrep-sync-wait
> >
> > The moment you say 'SET SESSION wsrep_sync_wait=1', the behavior should
> > prevent the list problem you see, and it should not matter that it is
> > a separate session, as that is the entire point of the variable:
> 
> 
> we prefer to keep it off and just point applications at a single node 
> using master/passive/passive in HAProxy, so that we don't have the 
> unnecessary performance hit of waiting for all transactions to 
> propagate; we just stick on one node at a time.   We've fixed a lot of 
> issues in our config in ensuring that HAProxy definitely keeps all 
> clients on exactly one Galera node at a time.
> 

Indeed, haproxy does a good job at shifting over rapidly. But it's not
atomic, so you will likely have a few seconds where commits landed on
the newly demoted backup.

> >
> > "When you enable this parameter, the node triggers causality checks in
> > response to certain types of queries. During the check, the node blocks
> > new queries while the database server catches up with all updates made
> > in the cluster to the point where the check was begun. Once it reaches
> > this point, the node executes the original query."
> >
> > In the active/passive case where you never use the passive node as a
> > read slave, one could actually set wsrep_sync_wait=1 globally. This will
> > cause a ton of lag while new queries happen on the new active and old
> > transactions are still being applied, but that's exactly what you want,
> > so that when you fail over, nothing proceeds until all writes from the
> > original active node are applied and available on the new active node.
> > It would help if your failover technology actually _breaks_ connections
> > to a presumed dead node, so writes stop happening on the old one.
> 
> If HAProxy is failing over from the master, which is no longer 
> reachable, to another passive node, which is reachable, that means that 
> master is partitioned and will leave the Galera primary component.   It 
> also means all current database connections are going to be bounced off, 
> which will cause errors for those clients either in the middle of an 
> operation, or if a pooled connection is reused before it is known that 
> the connection has been reset.  So failover is usually not an error-free 
> situation in any case from a database client perspective and retry 
> schemes are always going to be needed.
> 

There are some really big assumptions above, so I want to enumerate
them:

1. You assume that a partition between haproxy and a node is a partition
   between that node and the other galera nodes.
2. You assume that I never want to fail over on purpose, smoothly.

In the case of (1), there are absolutely times when the load balancer
thinks a node is dead while it is quite happily chugging along doing its
job. In this scenario, transactions will already have been committed that
have not yet propagated, and there may be more than one load balancer,
with only one of them thinking that node is dead.

For the limited partition problem, having wsrep_sync_wait turned on
would result in consistency, and the lag would only be minimal as the
transactions propagate onto the new primary server.
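
For anyone who hasn't used it, wsrep_sync_wait is a per-session knob; a
minimal sketch of setting it from an application with SQLAlchemy
(connection details are placeholders):

    from sqlalchemy import create_engine, text

    engine = create_engine('mysql+pymysql://nova:secret@db-vip/nova')

    with engine.connect() as conn:
        # Block this session's reads until the node has applied every
        # write the cluster has already committed.
        conn.execute(text('SET SESSION wsrep_sync_wait = 1'))
        count = conn.execute(text('SELECT COUNT(*) FROM instances')).scalar()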

For the multiple haproxy problem, lag would be _horrible_ on all nodes
that are getting reads as long as there's another one getting writes,
so a solution for making sure only one is specified would need to be
developed using a leader election strategy. If haproxy is able to query
wsrep status, that might be ideal, as galera will in fact elect leaders
for you (assuming all of your wsrep nodes are also mysql nodes, which
is not the case if you're using 2 nodes + garbd for example).

This is, however, a bit of a strawman, as most people don't need
active/active haproxy nodes, so the simplest solution is to go
active/passive on your haproxy nodes with something like UCARP handling
the failover there. As long as they all use the same primary/backup
ordering, then a new UCARP target should just result in using the same
node, and a very tiny window for inconsistency and connection errors.

The second assumption is handled by leader election as well. If there's
always one leader node that load balancers send traffic to, then one
should be able to force promotion of a different node

Re: [openstack-dev] [tc] [all] [glance] On operating a high throughput or otherwise team

2016-05-16 Thread Clint Byrum
Excerpts from Nikhil Komawar's message of 2016-05-14 17:42:16 -0400:
> Hi all,
> 
> 
> Lately I have been involved in discussions that have resulted in giving
> a wrong idea to the approach I take in operating the (Glance) team(s).
> While my approach is consistency, coherency and agility in getting
> things done (especially including the short, mid as well as long term
> plans), it appears that it wasn't something evident. So, I have decided
> to write this email so that I can collectively gather feedback and share
> my thoughts on the right(eous) approach.
> 

I find it rather odd that you or anyone believes there is a "right"
approach that would work for over 1500 active developers and 200+
companies.

We can definitely improve upon the aspects of what we have now, by
incremental change or revolution. But I doubt we'll ever have a
community that is "right" for everyone.

> 
> My experience has been that OpenStack is relatively slow. In fact the
> feedback I get from people who are secondary (short span contributors)
> is that it's very slow. There's a genuine reason for that and it's not
> as simple as you are an Open Source/Community project or that people are
> unreasonable or that there's lot of bike-shedding, etc.
> 

It's slow like a freight train. Sure, time from point A to point B for
any one interest can be agonizingly slow. But the aggregate number of
changes (both in design, as well as code) is _staggering_ given the
number of competing interests involved.

> 
> We are developing something that is usable, operationally friendly and
> that it's easier to contribute & maintain but, many strong influencers
> are missing on the most important need for OpenStack -- efficient way of
> communication. I think we have the tools and right approach on paper and
> we've mandated it in the charter too, but that's not enough to operate
> things. Also, many people like to work on the assumption that all the
> tools of communication are equivalent or useful and there are no
> side-effects of using them ever. I strongly disagree. Please find the
> reason below:
> 

I'd be interested to see evidence of anyone believing something close
to that, much less "many people".

I do believe people don't take into account everyone's perspective and
communication style when choosing how to communicate. But we can't really
know all of the ways anything we do in a distributed system affects all
of the parts. We can reason about it, and I think you've done a fine job
of reasoning through some of the points. But you can't know, nor can I,
and I don't think anyone is laboring under the illusion that they can
know this.

> 
> Let me start from scratch:-
> 
> 
> * What is code really?
> 
> Code is nothing but a way to communicate your decisions. These decisions
> (if, then, else, while, etc.) are nothing but a way to consistently
> produce a repeatable output using a machine.  (
> https://en.wikipedia.org/wiki/Turing_machine )
> 

We'll have to agree to disagree here. _YES_ code does as much
communicating with humans as it does controlling computers. However,
there's a huge difference between the way communication works
(influence) and the way computing works (control).

> 
> * If it's that simple, why is there even a problem?
> 
> Decisions when taken in tandem or in parallel can result into a more
> complex phenomenon that is not perceptibly evident. That results into
> assumptions.
> 

Isn't it funny how our communication system has the same problems as our
software? [1]

[1] https://en.wikipedia.org/wiki/Conway's_law

> 
> * So, what can be the blocker?
> 
> Nothing, but working with these assumptions is really the blocker. That
> is exactly why many people in their feedback say we have a "people
> problem" in OpenStack. But it's not really the people problem, it is the
> assumption problem.
> 
> Assumptions are very very bad:
> 
> With 'n' problems in a domain and 'm' people working on all those
> problems, individually, we have the assumption problem of the order of
> O((m*e)^n) where you can think of 'e' as the convergence factor.
> Convergence factor being the ability of a group to come to an agreement
> of the order of 'agree to agree', 'agree to disagree' (add percentages
> to each for more granularity). There is also another assumption (for the
> convergence factor) that everyone wants to work in the best interest of
> solving the problems in that domain.
> 
> 

rAmen brother. We can't assume to know the motivations of anyone, though
we can at least decide how much to trust what people say. So if they say
that they're interested in solving the problems in a domain, I certainly
will give them space to prove that right or wrong.

> * How do I attempt to solve this situation?
> 
> I think the first and foremost step is understanding the 'intent' behind
> every step -- whether it is a proposal, code, email, etc.
> 
> Another important step is to reduce the communication gap -- be it be
> meetings, emails, chats, etc. I 

Re: [openstack-dev] [tc] supporting Go

2016-05-14 Thread Clint Byrum
Excerpts from Dieterly, Deklan's message of 2016-05-14 01:18:20 +:
> Python 2.x will not be supported for much longer, and let's face it,
> Python is easy, but it just does not scale. Nor does Python have the
> performance characteristics that large, distributed systems require. Maybe
> Java could replace Python in OpenStack as the workhorse language.

Which is why we've been pushing toward python 3 for years now. It's the
default for python apps in distros, gates are holding the line at the
unit test level, so we just need a push toward integration testing, and
I truly believe we'll be seeing people use python3 and pypy to run
OpenStack in the next year.

And regarding not scaling: That's precisely what's being discussed,
and it seems like there are plenty of options for pushing python further
that aren't even half explored yet. Meanwhile, if enough people agree,
perhaps go is a good option for those areas where we just can't push
Python further without it already looking like another language anyway.



Re: [openstack-dev] [tc] supporting Go

2016-05-13 Thread Clint Byrum
Excerpts from Dmitry Tantsur's message of 2016-05-13 01:14:02 -0700:
> On 05/11/2016 09:50 PM, Eric Larson wrote:
> > To contrast that, the go POC was able to use a well tested go DNS
> > library and implement the same documented interface that was then
> > testable via the same functional tests. It also allowed an extremely
> > simple deployment and had a minimal impact for our CI systems. Finally,
> > as other go code has been written on our small team, getting Python
> > developers up to speed has been trivial. Memory management, built in
> > concurrency primitives, and similar language constructs have made using
> > Go feel natural.
> 
> This is pretty subjective, I would say. I personally don't feel Go 
> (especially its approach to error handling) any natural (at least no 
> more than Rust or Scala, for example). If familiarity for Python 
> developers is an argument here, mastering Cython or making OpenStack run 
> on PyPy must be much easier for a random Python developer out there to 
> seriously bump the performance. And it would not require introducing a 
> completely new language to the picture.
> 

I have been told before that eventlet isn't going to benefit from most
of pypy's advantages. Can anyone confirm that? It seems like the
built-in greenlet support would be a good fit, but maybe there's a layer
between greenlets and eventlet that I missed.



Re: [openstack-dev] [tc] supporting Go

2016-05-10 Thread Clint Byrum
Excerpts from Rayson Ho's message of 2016-05-10 07:19:23 -0700:
> On Tue, May 10, 2016 at 2:42 AM, Tim Bell  wrote:
> > I hope that the packaging technologies are considered as part of the TC
> evaluation of a new language. While many alternative approaches are
> available, a language which could not be packaged into RPM or DEB would be
> an additional burden for distro builders and deployers.
> >
> 
> I mentioned in earlier replies but I may as well mention it again: a
> package manager gives you no advantage in a language toolchain like Go (or
> Rust). In fact, when you code is written in Go, you will be spared from
> dependency hell.
> 

Package managers don't just resolve dependencies. You may have forgotten
the days _before_ apt-get, when they just _expressed_ dependencies, but
it was up to you to find and download and install them all together.

There is also integration. Having an init script or systemd unit that
expresses when it makes sense to start this service is quite useful.

Source packages assist users in repeating the build the way the binary
they're using was built. If you do need to patch, patching the version
you're using, with the flags it used, means less entropy to deal with.
The alternative is going full upstream, which is great, and should be
done for anything you intend to have a deep relationship with, but may not be
appropriate in every case.

Finally, the chain of trust is huge. Knowing that the binary in that
package is the one that was built by developers who understand your OS
is something we take for granted every time we 'yum install' or 'apt-get
install'. Of course, a go binary distributor can make detached pgp
signatures of their binaries, or try to claim their server is secured
and https is enough. But that puts the onus on the user to figure out
how to verify, or places trust in the global PKI, which is usually fine
(and definitely better than nothing at all!) but far inferior to the
signed metadata/binary approach distros use.

> And, while not the official way to install Go, the Go toolchain can be
> packaged and in fact it is in Ubuntu 16.04 LTS:
> 
> https://launchpad.net/ubuntu/xenial/+source/golang-1.6
> 
> 
> IMO, the best use case of not using a package manager is when deploying
> into containers -- would you prefer to just drop a static binary of your Go
> code, or you would rather install "apt-get" into a container image, and
> then install the language runtime via apt-get, and finally your
> application?? I don't know about you, but many startup companies like Go as
> it would give them much faster time to react.
> 
> Lastly, I would encourage anyone who has never even used Go to at least
> download the Go toolchain and install it in your home directory (or if you
> are committed enough, system wide), and then compile a few hello world
> programs and see what Go gives you. Go won't give you everything, but I am
> a "pick the right tool for the right job" guy and I am pretty happy about
> Go.
> 

Go's fine. But so is Python. I think the debate here is whether Go has
enough strengths vs. Python to warrant endorsement by OpenStack.



Re: [openstack-dev] [tc] supporting Go

2016-05-09 Thread Clint Byrum
Excerpts from Edward Leafe's message of 2016-05-09 12:17:40 -0700:
> On May 9, 2016, at 1:58 PM, Hayes, Graham  wrote:
> 
> > This is not a "Go seems cool - lets go try that" decision from us - we
> > know we have a performance problem with one of our components, and we
> > have come to the conclusion that Go (or something like it) is the
> > solution.
> 
> Whenever I hear claims that Python is “too slow for X”, I wonder what’s so 
> special about X that makes it so much more demanding than, say, serving up 
> YouTube. YouTube is written nearly entirely in Python, and has been for many, 
> many years. The only parts that aren’t are those that were identified as 
> particular performance bottlenecks, such as some parts of the encoding 
> process. These were then written in C, which is drop-in compatible with 
> Python using ctypes.
> 

NO, we should paint it yellow!



Re: [openstack-dev] [tc] supporting Go

2016-05-09 Thread Clint Byrum
Excerpts from Hayes, Graham's message of 2016-05-09 11:58:38 -0700:
> On 09/05/2016 19:39, Ben Swartzlander wrote:
> > On 05/09/2016 02:15 PM, Clint Byrum wrote:
> >> Excerpts from Pete Zaitcev's message of 2016-05-09 08:52:16 -0700:
> >>> On Mon, 9 May 2016 09:06:02 -0400
> >>> Rayson Ho <raysonlo...@gmail.com> wrote:
> >>>
> >>>> Since the Go toolchain is pretty self-contained, most people just follow
> >>>> the official instructions to get it installed... by a one-step:
> >>>>
> >>>> # tar -C /usr/local -xzf go$VERSION.$OS-$ARCH.tar.gz
> >>>
> >>> I'm pretty certain the humanity has moved on from this sort of thing.
> >>> Nowadays "most people" use packaged language runtimes that come with
> >>> the Linux they're running.
> >>>
> >>
> >> Perhaps for mature languages. But go is still finding its way, and that
> >> usually involves rapid changes that are needed faster than the multi-year
> >> cycle Linux distributions offer.
> >
> > This statement right here would be the nail in the coffin of this idea
> > if I were deciding. As a community we should not be building software
> > based on unstable platforms and languages.
> >
> > I have nothing against golang in particular but I strongly believe that
> > mixing 2 languages within a project is always the wrong decision, and
> > doubly so if one of those languages is a niche language. The reason is
> > simple: it's hard enough to find programmers who are competent in one
> > language -- finding programmers who know both languages well will be
> > nearly impossible. You'll end up with core reviewers who can't review
> > half of the code and developers who can only fix bugs in half the code.
> >
> > If you want to write code in a language that's not Python, go start
> > another project. Don't call it OpenStack. If it ends up being a better
> > implementation than the reference OpenStack Swift implementation, it
> > will win anyways and perhaps Swift will start to look more like the rest
> > of the projects in OpenStack with a standardized API and multiple
> > plugable implementations.
> >
> 
> Sure - the Designate team could maintain 2 copies of our DNS server,
> one in python as a reference, and one externally in Golang / C / C++ /
> Rust / $language, which would in reality need to be used by anything
> over a medium size deployment.
> 
> That seems less than ideal for our users though.
> 
> This is not a "Go seems cool - lets go try that" decision from us - we
> know we have a performance problem with one of our components, and we
> have come to the conclusion that Go (or something like it) is the
> solution.
> 
>  From a deck about "the rise and fall of Bind 10" [0] -
> 
>"Python is awesome, but too damn slow for DNS"
> 
> 0 - 
> https://ripe68.ripe.net/presentations/208-The_Decline_and_Fall_of_BIND_10.pdf
> 

I love this: first a bikeshed statement, "Python is too damn slow for
DNS", and a few lines later, "Perfect bikeshed topic".

There are all kinds of reasons to pick languages, but I think it would
be foolish of OpenStack to ignore the phenomenon of actual deployers
choosing Go to address OpenStack's shortcomings. Whether they're right,
I'm not sure, but I do know that the community wants to invest in things
that aren't Python, and so, that fact should at least be considered
fully before making any long term decisions.



Re: [openstack-dev] [tc] supporting Go

2016-05-09 Thread Clint Byrum
Excerpts from Pete Zaitcev's message of 2016-05-09 08:52:16 -0700:
> On Mon, 9 May 2016 09:06:02 -0400
> Rayson Ho  wrote:
> 
> > Since the Go toolchain is pretty self-contained, most people just follow
> > the official instructions to get it installed... by a one-step:
> > 
> > # tar -C /usr/local -xzf go$VERSION.$OS-$ARCH.tar.gz
> 
> I'm pretty certain the humanity has moved on from this sort of thing.
> Nowadays "most people" use packaged language runtimes that come with
> the Linux they're running.
> 

Perhaps for mature languages. But go is still finding its way, and that
usually involves rapid changes that are needed faster than the multi-year
cycle Linux distributions offer.

Also worth noting is that go is not a "language runtime" but a compiler
(that happens to statically link in a runtime to the binaries it
produces...).

The point here though, is that the versions of Python that OpenStack
has traditionally supported have been directly tied to what the Linux
distributions carry in their repositories (case in point, Python 2.6
was dropped from most things as soon as RHEL7 was available with Python
2.7). With Go, there might need to be similar restrictions.



Re: [openstack-dev] [tc] supporting Go

2016-05-05 Thread Clint Byrum
Excerpts from Hayes, Graham's message of 2016-05-05 07:26:26 -0700:
> On 04/05/2016 00:32, Hayes, Graham wrote:
> > On 03/05/2016 17:03, John Dickinson wrote:
> >> TC,
> >>
> >> In reference to 
> >> http://lists.openstack.org/pipermail/openstack-dev/2016-May/093680.html 
> >> and Thierry's reply, I'm currently drafting a TC resolution to update 
> >> http://governance.openstack.org/resolutions/20150901-programming-languages.html
> >>  to include Go as a supported language in OpenStack projects.
> >>
> >> As a starting point, what would you like to see addressed in the document 
> >> I'm drafting?
> >>
> >> --John
> >>
> >>
> >>
> >
> > Great - I was about to write a thread like this :)
> >
> > Designate is looking to move a single component of ours to Go - and we
> > were wondering what was the best way to do it.
> >
> > The current policy does allow for the TC to bless different languages
> > on a case by case basis - do we need to go from just python and JS to
> > allowing all projects to use go, or should the TC approve (or
> > disapprove) the swift and designate requests?
> >
> > I think the swift and designate changes might be a good test case to
> > see how the build / mirroring / packaging / artifact / library issues
> > shake out.
> >
> > - Graham
> >
> 
> So, as part of the update to that policy, should we have an
> "OpenStack Go developer guide" that lays out how we think we will
> implement go in our stack.
> 
> I am sure it will not be complete, or even correct for what we actually
> do in the end, but it could give us a place to try and come to a shared
> understanding.
> 

I think I share your desire to have something concrete on the table
before the TC (I'm not a TC member) considers this. However, I would
want to make sure the barrier isn't so high as to discourage progress.

I would love for the team(s) proposing Go code to provide whatever they
have now as a base for such a document. I would support the minimum
amount of effort to just reformat and reorganize that document so that
it resembles our python oriented guides, and start from there.

I doubt anyone not writing Go regularly will have much practical to
say about the details, but the community as a whole can review it and
be sure that the teams making the proposal share the same orientation
toward developers as we have for python development.



Re: [openstack-dev] [nova] Distributed Database

2016-05-04 Thread Clint Byrum
Excerpts from Mark Doffman's message of 2016-05-03 17:05:54 -0700:
> This thread has been a depressing read.
> 

First, I apologize if any of my actions have caused you any undue stress.

> I understand that the content is supposed to be distributed databases 
> but for me it has become an inquisition of cellsV2.
> 

That word, inquisition, is a bit loaded with cultural significance,
though I think the sterile definition applies accurately. It's not my
intent to bring any of the unfortunate aspects of it into this process
though. My main concern is that the actual details haven't even been
thought through at a high level, and we maybe shouldn't be pinning all
our scaling hopes on something that may well end up changing radically
in practice.

> Our question has clearly become "Should we continue efforts on 
> cellsV2?", which I will address head-on.
> 
> We shouldn't be afraid to abandon CellsV2. If there are designs that are 
> proven to be a better solution then our current momentum shouldn't keep 
> us from an abrupt change. As someone who is working on this I have an 
> attachment to the current design, but Its important for me to keep an 
> open mind.
> 
> Here are my *main* reasons for continuing work on CellsV2.
> 
> 1. It provides a proven solution to an immediate message queue problem.
> 
> Yes CellsV2 is different to CellsV1, but the previous solution showed 
> that application-level sharding of the message queue can work. CellsV2 
> provides this solution with a (moderately) easy upgrade path for 
> existing deployments. These deployments may not be comfortable with 
> changing MQ technologies or may already be using CellsV1. Application 
> level sharding of the message queue is not pretty, but will work.
> 

Indeed, one advantage of using a broker for RPC is that you only have
to ensure connectivity from nodes -> brokers. I can totally understand
a hesitance to ask people to ensure connectivity from (class of
nodes)<->(class of nodes), for each class of nodes that need it. That is
what 0mq asks one to do.

I was witness to a brief presentation from one of the QPID community
members about how they've addressed brokerless comms with a very simple,
non-broker "router daemon", and it was impressive how it straddled this
line nicely, allowing one to basically replace a broker with a set of
relatively stupid daemons that simply pass messages along in realtime,
using some clever techniques borrowed from OSPF and the like.

Both of these, 0mq and brokerless AMQP 1.0, can be taken advantage of
_today_ with oslo.messaging drivers that exist already. However, they
require some battle hardening, so I respect that there are some who'd
rather we change OpenStack around its own battle tested choices than
start experimenting with new solutions that are outside of OpenStack.

The point of my persistence here is to make it clear that I don't think
Cells V2 is settled, and I don't think it will be a generally consumable
solution any time soon. I think for those of us with immediate concerns,
who are not interested in taking on cells v1 at this time, we should
look to experiment with these other options.

> 2. The 'complexity' of CellsV2 is vastly overstated.
> 
> Sure there is a-lot of *work* to do for cellsv2, but this doesn't imply 
> increased complexity: any refactoring requires work. CellsV1 added 
> complexity to our codebase, Cellsv2 does not. In-fact by clearly 
> separating data that is 'owned'by the different services we have I 
> believe that we are improving the modularity and encapsulation present 
> in Nova.
> 

I think the complexity is entirely unknown, and that the design should
fill its gaps, even at high levels, so that we can actually reason about
the complexity. Right now, there's hand waving in places that concern
me.

> 3. CellsV2 does not prohibit *ANY* of the alternative scaling methods
> mentioned in this thread.
> 
> Really, it doesn't. Both message queue and database switching are 
> completely optional. Both in the sense of running a single cell, and 
> even when running multiple cells. If anything, the ability to run 
> separate message queues and database connections could give us the 
> ability to trial these alternative technologies within a real, running, 
> cloud.
> 
> Just imagine the ability to set up a cell in your existing cloud that 
> runs 0mq rather than rabbit. How about a NewSQL database integrated in 
> to an existing cloud? Both of these things may (With some work) be possible.
> 

Prohibit is definitely not the word I would use either. But I'm not sure
I'd get too excited about enabling multiple drivers across cells either.
What I'd really like is a simple solution, and I truly do hope that
cells v2 becomes that some day.

> 
> 
> I could go on, but I won't. These are my main reasons and I'll stick to 
> them.
> 
> It's difficult to be proven wrong, but sometimes necessary to get the 
> best product that we can. I don't think that the existence of 
> alternative 

Re: [openstack-dev] [nova] Distributed Database

2016-05-03 Thread Clint Byrum
Excerpts from Andrew Laski's message of 2016-05-03 14:46:08 -0700:
> 
> On Mon, May 2, 2016, at 01:13 PM, Edward Leafe wrote:
> > On May 2, 2016, at 10:51 AM, Mike Bayer  wrote:
> > 
> > >> Concretely, we think that there are three possible approaches:
> > >> 1) We can use the SQLAlchemy API as the common denominator between a 
> > >> relational and non-relational implementation of the db.api component. 
> > >> These two implementation could continue to converge by sharing a large 
> > >> amount of code.
> > >> 2) We create a new non-relational implementation (from scratch) of 
> > >> the db.api component. It would require probably more work.
> > >> 3) We are also studying a last alternative: writing a SQLAlchemy 
> > >> engine that targets NewSQL databases (scalability + ACID):
> > >>  - https://github.com/cockroachdb/cockroach
> > >>  - https://github.com/pingcap/tidb
> > > 
> > > Going with a NewSQL backend is by far the best approach here.   That way, 
> > > very little needs to be reinvented and the application's approach to data 
> > > doesn't need to dramatically change.
> > 
> > I’m glad that Matthieu responded, but I did want to emphasize one thing:
> > of *course* this isn’t an ideal approach, but it *is* a practical one.
> > The biggest problem in any change like this isn’t getting it to work, or
> > to perform better, or anything else except being able to make the change
> > while disrupting as little of the existing code as possible. Taking an
> > approach that would be more efficient would be a non-starter since it
> > wouldn’t provide a clean upgrade path for existing deployments.
> 
> I would like to point out that this same logic applies to the current
> cellsv2 effort. It is a very practical set of changes which allows Nova
> to move forward with only minor effort on the part of deployers. And it
> moves towards a model that is already used and well understood by large
> deployers of Nova while also learning from the shortcomings of the
> previous architecture. In short, much of this is already battle tested
> and proven.
> 
> If we started Nova from scratch, I hear golang is lovely for this sort
> of thing, would we do things differently? Probably. However that's not
> the position we're in. And we're able to make measurable progress with
> cellsv2 at the moment and have a pretty clear idea of the end state. I
> can recall conversations about NoSQL as far back as the San Diego
> summit, which was my first so I can't say they didn't happen previously,
> and this is the first time I've seen any measurable progress on moving
> forward with it. But where it would go is not at all clear.
> 

I beg to differ about "pretty clear idea of the end state".

* There's no clear answer about scheduling. It's a high level "we'll
  give it a scheduler/resource tracker database of its own". But that's a
  massive amount of work just to design the migrations and solidify the
  API. I understand some of that work is ongoing and unrelated to cells
  v2, but it's not done or clear yet.
  
* This also doesn't address the fact that for cellsv1 users a move like
  that will _regress_ scheduler scalability since now we can only have
  one scheduler and resource tracker instead of many. For those of us
  just now ramping up, it leaves us with no way to get high throughput
  on our scheduler.

* Further, if there's a central scheduler, that means all of the sort of
  clever scheduling hacks that people have achieved with cells v1 (a
  cell of baremetal, a cell of SSD, etc) will need to be done via other
  means, which is more design work that needs to happen.

* There's no clear way to efficiently list and sort results from lots of
  cells. The discussion came up with a few experiments to try, but the
  problem is _fundamental_ to sharding, and the cells v1 answer was a
  duplication of data which obviously cells v2 wants to avoid, and I
  would assume with good reason.

I have a huge amount of respect for what has been achieved with cells v1,
and I totally understand the hesitance to promote the way it works given
what cells v1 has taught its users. However, the design of v2 is quite
a bit different than v1, enough so that I think it should be treated as
an experiment until someone has a solid design of the whole thing and
can assert that it actually addresses scale without regressing things
significantly.

Meanwhile, there are other things deployers can do to address scale that
will likely cause less churn in Nova, and may even help other projects
scale to a similar size. I intend to return to my pursuit of actual
experiment results for these things now that I understand the state of
cells v2. I hope others will consider this path as well, so we can
collaborate on things like 0mq and better database connection handling.


Re: [openstack-dev] [keystone] Token providers and Fernet as the default

2016-05-03 Thread Clint Byrum
Excerpts from Morgan Fainberg's message of 2016-05-03 11:13:38 -0700:
> On Tue, May 3, 2016 at 10:28 AM, Monty Taylor <mord...@inaugust.com> wrote:
> 
> > On 05/03/2016 11:47 AM, Clint Byrum wrote:
> >
> >> Excerpts from Monty Taylor's message of 2016-05-03 07:59:21 -0700:
> >>
> >>> On 05/03/2016 08:55 AM, Clint Byrum wrote:
> >>>
> >>>>
> >>>> Perhaps we have different perspectives. How is accepting what we
> >>>> previously emitted and told the user would be valid sneaky or wrong?
> >>>> Sounds like common sense due diligence to me.
> >>>>
> >>>
> >>> I agree - I see no reason we can't validate previously emitted tokens.
> >>> But I don't agree strongly, because re-authing on invalid token is a
> >>> thing users do hundreds of times a day. (these aren't oauth API Keys or
> >>> anything)
> >>>
> >>>
> >> Sure, one should definitely not be expecting everything to always work
> >> without errors. On this we agree for sure. However, when we do decide to
> >> intentionally induce errors for reasons we have not done so before, we
> >> should weigh the cost of avoiding that with the cost of having it
> >> happen. Consider this strawman:
> >>
> >> - User gets token, it says "expires_at Now+4 hours"
> >> - User starts a brief set of automation tasks in their system
> >>that does not use python and has not failed with invalid tokens thus
> >>far.
> >> - Keystone nodes are all updated at one time (AMAZING cloud ops team)
> >> - User's automation jobs fail at next OpenStack REST call
> >> - User begins debugging, wasting hours of time figuring out that
> >>their tokens, which they stored and show should still be valid, were
> >>rejected.
> >>
> >
> > Ah - I guess this is where we're missing each other, which is good and
> > helpful.
> >
> > I would argue that any user that is _storing_ tokens is doing way too much
> > work. If they are doing short tasks, they should just treat them as
> > ephemeral. If they are doing longer tasks, they need to deal with timeouts.
> > SO, this:
> >
> >
> > - User gets token, it says "expires_at Now+4 hours"
> > - User starts a brief set of automation tasks in their system
> >that does not use python and has not failed with invalid tokens thus
> >far.
> >
> > should be:
> >
> > - User starts a brief set of automation tasks in their system
> > that does not use python and has not failed with invalid tokens thus
> > far.
> >
> > "Get a token" should never be an activity that anyone ever consciously
> > performs.
> >
> >
> This is my view. Never, ever, ever assume your token is good until
> expiration. Assume the token might be broken at any request and know how to
> re-auth.
> 
> > And now they have to refactor their app, because this may happen again,
> >> and they have to make sure that invalid token errors can bubble up to the
> >> layer that has the username/password, or accept rolling back and
> >> retrying the whole thing.
> >>
> >> I'm not saying anybody has this system, I'm suggesting we're putting
> >> undue burden on users with an unknown consequence. Falling back to UUID
> >> for a while has a known cost of a little bit of code and checking junk
> >> tokens twice.
> >>
> >
> Please do not advocate "falling back" to UUID. I am actually against making
> fernet the default (very, very strongly), if we have to have this
> "fallback" code. It is the wrong kind of approach, we already have serious
> issues with complex code paths that produce subtly different results. If
> the options are:
> 
> 1) Make Fernet Default and have "fallback" code
> 
> or
> 
> 2) Leave UUID default and highly recommend fernet (plus gate on fernet
> primarily, default in devstack)
> 
> I will jump on my soapbox and be very loudly in favor of the 2nd option. If
> we communicate this is a change that will happen (hey, maybe throw an
> error/make the config option "none" so it has to be explicit) in Newton,
> and then move to a Fernet default in O - I'd be ok with that.
> 
> >
> > Totally. I have no problem with the suggestion that keystone handle this.
> > But I also think that users should quite honestly stop thinking about
> > tokens at all. Tokens are an implementation detail that if any user thinks
> > abou

Re: [openstack-dev] [nova] Distributed Database

2016-05-03 Thread Clint Byrum
Excerpts from Mike Bayer's message of 2016-05-03 09:04:00 -0700:
> 
> On 05/02/2016 01:48 PM, Clint Byrum wrote:
> >>
> >
> > FWIW, I agree with you. If you're going to use SQLAlchemy, use it to
> > take advantage of the relational model.
> >
> > However, how is what you describe a win? Whether you use SELECT .. FOR
> > UPDATE, or a stored procedure, the lock is not distributed, and thus, will
> > still suffer rollback failures in Galera. For single DB server setups, you
> > don't have to worry about that, and SELECT .. FOR UPDATE will work fine.
> 
> Well it's a "win" vs. the lesser approach considered which also did not 
> include a distributed locking system like Zookeeper.   It is also a win 
> even with a Zookeeper-like system in place because it allows a SQL query 
> to be much smarter about selecting data that involves IP numbers and 
> CIDRs, without the need to pull data into memory and process it there. 
> This is the most common mistake in SQL programming, not taking advantage 
> of SQL's set-based nature and instead pulling data into memory 
> unnecessarily.
> 

Indeed, we use relational databases so we don't have to deal with lots
of data that doesn't make sense to us at the time we want it.

> Also, the "federated MySQL" approach of Cells V2 would still be OK with 
> pessimistic locking, since this lock is not "distributed" across the 
> entire dataspace.   Only the usual Galera caveats apply, e.g. point to 
> only one galera "master" at a time and/or wait for Galera to support 
> "SELECT FOR UPDATE" across the cluster.
> 

Right, of course it would work. It's just a ton of code for not much
improvement in scalability or resilience.

> >
> > Furthermore, any logic that happens inside the database server is extra
> > load on a much much much harder resource to scale, using code that is
> > much more complicated to update.
> 
> So I was careful to use the term "stored function" and not "stored 
> procedure".   As ironic as it is for me to defend both the ORM 
> business-logic-in-the-application-not-the-database position, *and* the 
> let-the-database-do-things-not-the-application at the same time, using 
> database functions to allow new kinds of math and comparison operations 
> to take place over sets is entirely reasonable, and should not be 
> confused with the old-school big-business approach of building an entire 
> business logic layer as a huge wall of stored procedures, this is 
> nothing like that.
> 

Indeed, it's a complicated and nuanced position, but I think I
understand where you're going with it. My reluctance to put intelligence
in the database is just that, reluctance, not some hard and fast rule I
can quote.

> The Postgresql database has INET and CIDR types native which include the 
> same overlap logic we are implementing here as a MySQL stored function, 
> so the addition of math functions like these shouldn't be controversial. 
>The "load" of this function is completely negligible (however I would 
> be glad to assist in load testing it to confirm), especially compared to 
> pulling the same data across the wire, processing it in Python, then 
> sending just a tiny portion of it back again after we've extracted the 
> needle from the haystack.
> 

It's death by 1000 paper cuts when you talk about scaling. Of course it
will be faster, but the slices of CPU on the database server are still a
limited resource, whereas slices of CPU on stateless API/conductor nodes
are virtually limitless and far cheaper to scale elastically.

> In pretty much every kind of load testing scenario we do with Openstack, 
> the actual "load" on the database barely pushes anything.   The only 
> database "resource" issue we have is Openstack using far more idle 
> connections than it should, which is on my end to work on improvements 
> to the connection pooling system which does not scale well across 
> Openstack's tons-of-processes model.
> 

Indeed, pooling is something we should improve upon. But even more, we
need to improve upon error handling and resilience.

> >
> > To be clear, it's not the amount of data, but the size of the failure
> > domain. We're more worried about what will happen to those 40,000 open
> > connections from our 4000 servers when we do have to violently move them.
> 
> That's a really big number and I will admit I would need to dig into 
> this particular problem domain more deeply to understand what exactly 
> the rationale of that kind of scale would be here.   But it does seem 
> like if you were using SQL databases, and the 4000 server system is in 
> fact grouped into hundreds of "silos" that only

Re: [openstack-dev] [keystone] Token providers and Fernet as the default

2016-05-03 Thread Clint Byrum
Excerpts from Adam Young's message of 2016-05-03 07:21:52 -0700:
> On 05/03/2016 09:55 AM, Clint Byrum wrote:
> > When the operator has configured a new token format to emit, they should
> > also be able to allow any previously emitted formats to be validated to
> > allow users a smooth transition to the new format. We can then make the
> > default behavior for one release cycle to emit Fernet, and honor both
> > Fernet and UUID.
> >
> > Perhaps ignore the other bit that I put in there about switching formats
> > just because you have fernet keys. Let's say the new pseudo code only
> > happens in validation:
> >
> > try:
> >    self._validate_fernet_token()
> > except NotAFernetToken:
> >    self._validate_uuid_token()
> 
> I was actually thinking of a different migration strategy, exactly the 
> opposite:  for a while, run with the uuid tokens, but store the Fernet 
> body.  After while, switch from validating the uuid token body to the 
> stored Fernet.  Finally, switch to validating the Fernet token from the 
> request.  That way, we always have only one token provider, and the 
> migration can happen step by step.
> 
> It will not help someone that migrates from Icehouse to Ocata. Then 
> again, the dual plan you laid out above will not either;  at some point, 
> people will have to dump the token table to make major migrations.
> 

Your plan has a nice aspect that it allows validating Fernet tokens on
UUID-configured nodes too, which means operators don't have to be careful
to update all nodes at one time. So I think what you describe above is
an even better plan.

Either way, the point is to avoid an immediate mass token invalidation
event on change of provider.



Re: [openstack-dev] [keystone] Token providers and Fernet as the default

2016-05-03 Thread Clint Byrum
Excerpts from Lance Bragstad's message of 2016-05-03 07:42:43 -0700:
> If we were to write a uuid/fernet hybrid provider, it would only be
> expected to support something like stable/liberty to stable/mitaka, right?
> This is something that we could contribute to stackforge, too.
> 

If done the way Adam Young described, with Fernet content as UUIDs,
one could in theory update from any UUID-aware provider, since the
Fernet-emitting nodes would just be writing their Fernet tokens into
the database that the UUID nodes read from, allowing the UUID-only nodes
to validate the new tokens. However, we never support jumping more than
one release at a time, so that is somewhat moot.

Also, stackforge isn't a thing, but I see what you're saying. It could
live out of tree, but let's not abandon all hope that we can collaborate
on something that works for users who would rather not face a mass
token invalidation window on update.



Re: [openstack-dev] [keystone] Token providers and Fernet as the default

2016-05-03 Thread Clint Byrum
Excerpts from Monty Taylor's message of 2016-05-03 07:59:21 -0700:
> On 05/03/2016 08:55 AM, Clint Byrum wrote:
> >
> > Perhaps we have different perspectives. How is accepting what we
> > previously emitted and told the user would be valid sneaky or wrong?
> > Sounds like common sense due diligence to me.
> 
> I agree - I see no reason we can't validate previously emitted tokens. 
> But I don't agree strongly, because re-authing on invalid token is a 
> thing users do hundreds of times a day. (these aren't oauth API Keys or 
> anything)
> 

Sure, one should definitely not be expecting everything to always work
without errors. On this we agree for sure. However, when we do decide to
intentionally induce errors for reasons we have not done so before, we
should weigh the cost of avoiding that with the cost of having it
happen. Consider this strawman:

- User gets token, it says "expires_at Now+4 hours"
- User starts a brief set of automation tasks in their system
  that does not use python and has not failed with invalid tokens thus
  far.
- Keystone nodes are all updated at one time (AMAZING cloud ops team)
- User's automation jobs fail at next OpenStack REST call
- User begins debugging, wasting hours of time figuring out that
  their tokens, which they stored and show should still be valid, were
  rejected.

And now they have to refactor their app, because this may happen again,
and they have to make sure that invalid token errors can bubble up to the
layer that has the username/password, or accept rolling back and
retrying the whole thing.

I'm not saying anybody has this system, I'm suggesting we're putting
undue burden on users with an unknown consequence. Falling back to UUID
for a while has a known cost of a little bit of code and checking junk
tokens twice.



Re: [openstack-dev] [nova] Distributed Database

2016-05-03 Thread Clint Byrum
Excerpts from Edward Leafe's message of 2016-05-03 08:20:36 -0700:
> On May 3, 2016, at 6:45 AM, Miles Gould  wrote:
> 
> >> This DB could be an RDBMS or Cassandra, depending on the deployer's 
> >> preferences
> > AFAICT this would mean introducing and maintaining a layer that abstracts 
> > over RDBMSes and Cassandra. That's a big abstraction, over two quite 
> > different systems, and it would be hard to write code that performs well in 
> > both cases. If performance in this layer is critical, then pick whichever 
> > DB architecture handles the expected query load better and use that.
> 
> Agreed - you simply can’t structure the data the same way. When I read 
> criticisms of Cassandra that include “you can’t do joins” or “you can’t 
> aggregate”, it highlights this fact: you have to think about (and store) your 
> data completely differently than you would in an RDBMS. You cannot simply 
> abstract out the differences.
> 

Right, once one accepts that fact, Cassandra looks a lot less like a
revolutionary database, and a lot more like a sharding toolkit.



Re: [openstack-dev] [keystone] Token providers and Fernet as the default

2016-05-03 Thread Clint Byrum
Excerpts from Steve Martinelli's message of 2016-05-02 19:56:15 -0700:
> Comments inline...
> 
> On Mon, May 2, 2016 at 7:39 PM, Matt Fischer <m...@mattfischer.com> wrote:
> 
> > On Mon, May 2, 2016 at 5:26 PM, Clint Byrum <cl...@fewbar.com> wrote:
> >
> >> Hello! I enjoyed very much listening in on the default token provider
> >> work session last week in Austin, so thanks everyone for participating
> >> in that. I did not speak up then, because I wasn't really sure of this
> >> idea that has been bouncing around in my head, but now I think it's the
> >> case and we should consider this.
> >>
> >> Right now, Keystones without fernet keys, are issuing UUID tokens. These
> >> tokens will be in the database, and valid, for however long the token
> >> TTL is.
> >>
> >> The moment that one changes the configuration, keystone will start
> >> rejecting these tokens. This will cause disruption, and I don't think
> >> that is fair to the users who will likely be shown new bugs in their
> >> code at a very unexpected moment.
> >>
> >
> > This will reduce the interruption and will also as you said possibly catch
> > bugs. We had bugs in some custom python code that didn't get a new token
> > when the keystone server returned certain code, but we found all those in
> > our dev environment.
> >
> > From an operational POV, I can't imagine that any operators will go to
> > work one day and find out that they have a new token provider because of a
> > new default. Wouldn't the settings in keystone.conf be under some kind of
> > config management? I don't know what distros do with new defaults however,
> > maybe that would be the surprise?
> >
> 
> With respect to upgrades, assuming we default to Fernet tokens in the
> Newton release, it's only an issue if the the deployer has no token format
> specified (since it defaulted to UUID pre-Newton), and relied on the
> default after the upgrade (since it'll switches to Fernet in Newton).
> 

Assume all users are using defaults.

> I'm glad Matt outlines his reasoning above since that is nearly exactly
> what Jesse Keating said at the Fernet token work session we had in Austin.
> The straw man we come up with of a deployer that just upgrades without
> checking then config files is just that, a straw man. Upgrades are well
> planned and thought out before being performed. None of the operators in
> the room saw this as an issue. We opened a bug to prevent keystone from
> starting if fernet setup had not been run, and Fernet is the
> selected/defaulted token provider option:
> https://bugs.launchpad.net/keystone/+bug/1576315
> 


Right, I responded there, but just to be clear, this is not about
_operators_ being inconvenienced, it is about _users_.

> For all new installations, deploying your cloud will now have two extra
> steps, running "keystone-manage fernet_setup" and "keystone-manage
> fernet_rotate". We will update the install guide docs accordingly.
> 
> With all that said, we do intend to default to Fernet tokens for the Newton
> release.
> 

Great! They are supremely efficient and I love that we're moving
forward. However, users really do not care about something that just
makes the operator's life easier if it causes all of their stuff to blow
up in non-deterministic ways (since their new jobs won't have that fail,
it will be a really fun day in the debug chair).

> >
> >
> >>
> >> I wonder if one could merge UUID and Fernet into a provider which
> >> handles this transition gracefully:
> >>
> >> if self._fernet_keys:
> >>   return self._issue_fernet_token()
> >> else:
> >>   return self._issue_uuid_token()
> >>
> >> And in the validation, do the same, but also with an eye toward keeping
> >> the UUID tokens alive:
> >>
> >> if self._fernet_keys:
> >>   try:
> >> self._validate_fernet_token()
> >>   except InvalidFernetFormatting:
> >> self._validate_uuid_token()
> >> else:
> >>   self._validate_uuid_token()
> >>
> >
> This just seems sneaky/wrong to me. I'd rather see a failure here than
> switch token formats on the fly.
> 

You say "on the fly" I say "when the operator has configured things
fully".

Perhaps we have different perspectives. How is accepting what we
previously emitted and told the user would be valid sneaky or wrong?
Sounds like common sense due diligence to me.

Anyway, the idea could use a few kicks, and I think perhaps a better
way to state what I'm thinking is this:

When the ope

Re: [openstack-dev] [keystone] Token providers and Fernet as the default

2016-05-03 Thread Clint Byrum
Excerpts from Matt Fischer's message of 2016-05-02 16:39:02 -0700:
> On Mon, May 2, 2016 at 5:26 PM, Clint Byrum <cl...@fewbar.com> wrote:
> 
> > Hello! I enjoyed very much listening in on the default token provider
> > work session last week in Austin, so thanks everyone for participating
> > in that. I did not speak up then, because I wasn't really sure of this
> > idea that has been bouncing around in my head, but now I think it's the
> > case and we should consider this.
> >
> > Right now, Keystones without fernet keys, are issuing UUID tokens. These
> > tokens will be in the database, and valid, for however long the token
> > TTL is.
> >
> > The moment that one changes the configuration, keystone will start
> > rejecting these tokens. This will cause disruption, and I don't think
> > that is fair to the users who will likely be shown new bugs in their
> > code at a very unexpected moment.
> >
> 
> This will reduce the interruption and will also as you said possibly catch
> bugs. We had bugs in some custom python code that didn't get a new token
> when the keystone server returned certain code, but we found all those in
> our dev environment.
> 
> From an operational POV, I can't imagine that any operators will go to work
> one day and find out that they have a new token provider because of a new
> default. Wouldn't the settings in keystone.conf be under some kind of
> config management? I don't know what distros do with new defaults however,
> maybe that would be the surprise?
> 

"Production defaults" is something we used to mention a lot. One would
hope you can run a very nice Keystone with only the required settings
such as database connection details.

Agreed that upgrades will be conscious decisions by operators, no doubt!

However, the operator is not the one who gets the surprise. It is the
user who doesn't expect their tokens to be invalidated until their TTL
is up. The cloud changes when the operator decides it changes. And if
that is in the middle of something important, the operator has just
induced unnecessary complication on the user.



Re: [openstack-dev] [nova] Distributed Database

2016-05-02 Thread Clint Byrum
Excerpts from Jay Pipes's message of 2016-05-02 10:43:21 -0700:
> On 05/02/2016 11:51 AM, Mike Bayer wrote:
> > On 05/02/2016 07:38 AM, Matthieu Simonin wrote:
> >> As far as we understand the idea of an ORM is to hide the relational
> >> database with an Object oriented API.
> >
> > I actually disagree with that completely.  The reason ORMs are so
> > maligned is because of this misconception; developer attempts to use an
> > ORM so that they will need not have to have any awareness of their
> > database, how queries are constructed, or even its schema's design;
> > witness tools such as Django ORM and Rails ActiveRecord which promise
> > this.   You then end up with an inefficient and unextensible mess
> > because the developers never considered anything about how the database
> > works or how it is queried, nor do they even have easy ways to monitor
> > or control it while still making use of the tool.   There are many blog
> > posts and articles that discuss this and it is in general known as the
> > "object relational impedance mismatch".
> >
> > SQLAlchemy's success comes from its rejection of this entire philosophy.
> >   The purpose of SQLAlchemy's ORM is not to "hide" anything but rather
> > to apply automation to the many aspects of relational database
> > communication as well as row->object mapping that otherwise express
> > themselves in an application as either a large amount of repetitive
> > boilerplate throughout an application or as an awkward series of ad-hoc
> > abstractions that don't really do the job very well.   SQLAlchemy is
> > designed to expose both the schema design as well as the structure of
> > queries completely.   My talk at [1] goes into this topic in detail
> > including specific API architectures that facilitate this concept.
> >
> > It's for that reason that I've always rejected notions of attempting to
> > apply SQLAlchemy directly on top of a datastore that is explicitly
> > non-relational.   By doing so, you remove a vast portion of the
> > functionality that relational databases provide and there's really no
> > point in using a tool like SQLAlchemy that is very explicit about DDL
> > and SQL on top of that kind of database.
> >
> > To effectively put SQLAlchemy on top of a non-relational datastore, what
> > you really want to do is build an entire SQL engine on top of it.  This
> > is actually feasible; I was doing work for the now-defunct FoundationDB
> > (was bought by Apple) who had a very good implementation of
> > SQL-on-top-of-distributed keystore going, and the Cockroach and TiDB
> > projects you mention are definitely the most appropriate choice to take
> > if a certain variety of distribution underneath SQL is desired.
> 
> Well said, Mike, on all points above.
> 
> 
> 
> > But also, w.r.t. Cells there seems to be some remaining debate over why
> > exactly a distributed approach is even needed.  As others have posted, a
> > single MySQL database, replicated across Galera or not, scales just fine
> > for far more data than Nova ever needs to store.  So it's not clear why
> > the need for a dramatic rewrite of its datastore is called for.
> 
> Cells(v1) in Nova *already has* completely isolated DB/MQ for each cell, 
> and there are bunch of 
> duplicated-but-slightly-different-and-impossible-to-maintain code paths 
> in the scheduler and compute manager. Part of the cellsv2 effort is to 
> remove these duplicated code paths.
> 
> Cells are, as much as anything else, an answer to an MQ scaling problem, 
> less so an answer to a DB scaling problem. Having a single MQ bus for 
> tens of thousands of compute nodes is just not tenable -- at least with 
> the message passing patterns and architecture that we use today...
> 
> Finally, Cells also represent a failure domain for the control plane. If 
> a network partition occurs between a cell and the top-level API layer, 
> no other cell is affected by the disruption.
> 
> Now, what does all this mean with regards to whether to use a single 
> distributed database solution versus a single RDBMS versus many isolated 
> RDBMS instances per cell? Not sure. Arguments can be made for all three 
> approaches clearly. Depending on what folks' priorities are with regards 
> to simplicity, scale, and isolation of failure domains, the "right" 
> choice is tough to determine.
> 
> On the one hand, using a single distributed datastore like Cassandra for 
> everything would make things conceptually easy to reason about and make 
> OpenStack clouds much easier to deploy at scale.
> 
> On the other hand, porting things to Cassandra (or any other NoSQL 
> solution) would require a complete overhaul of the way *certain* data is 
> queried in the Nova subsystems. Examples of Cassandra's poor fit for 
> some types of data are quota and resource usage aggregation queries. 
> While Cassandra does support some aggregation via CQL in recent 
> Cassandra versions, Cassandra simply wasn't built for this kind of data 
> access pattern and 

[openstack-dev] [keystone] Token providers and Fernet as the default

2016-05-02 Thread Clint Byrum
Hello! I enjoyed very much listening in on the default token provider
work session last week in Austin, so thanks everyone for participating
in that. I did not speak up then, because I wasn't really sure of this
idea that has been bouncing around in my head, but now I think it's the
case and we should consider this.

Right now, Keystones without fernet keys are issuing UUID tokens. These
tokens will be in the database, and valid, for however long the token
TTL is.

The moment that one changes the configuration, keystone will start
rejecting these tokens. This will cause disruption, and I don't think
that is fair to the users who will likely be shown new bugs in their
code at a very unexpected moment.

I wonder if one could merge UUID and Fernet into a provider which
handles this transition gracefully:

if self._fernet_keys:
  return self._issue_fernet_token()
else:
  return self._issue_uuid_token()

And in the validation, do the same, but also with an eye toward keeping
the UUID tokens alive:

if self._fernet_keys:
  try:
self._validate_fernet_token()
  except InvalidFernetFormatting:
self._validate_uuid_token()
else:
  self._validate_uuid_token()

So that while one is rolling out new keystone nodes and syncing fernet
keys, all tokens issued would validate properly, with minimal extra
cost to support both (basically just a number of UUID tokens will need
to be parsed twice, once as Fernet, and once as UUID).

Thoughts? I think doing this would make changing the default fairly
uncontroversial.
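
To illustrate, here is a rough sketch of how such a transitional
provider could hang together. The class and method names are made up
for this example (this is not keystone's actual provider interface),
but it shows the shape of the idea:

class NotAFernetToken(Exception):
    pass


class TransitionalTokenProvider(object):
    """Issue the configured format, but validate both for a cycle."""

    def __init__(self, fernet_provider, uuid_provider, fernet_keys):
        self.fernet = fernet_provider
        self.uuid = uuid_provider
        self.fernet_keys = fernet_keys

    def issue_token(self, *args, **kwargs):
        # Only emit Fernet once the operator has distributed keys.
        if self.fernet_keys:
            return self.fernet.issue_token(*args, **kwargs)
        return self.uuid.issue_token(*args, **kwargs)

    def validate_token(self, token):
        # Previously issued UUID tokens stay valid until their TTL is up,
        # so flipping the default never mass-invalidates live tokens.
        if not self.fernet_keys:
            return self.uuid.validate_token(token)
        try:
            return self.fernet.validate_token(token)
        except NotAFernetToken:
            return self.uuid.validate_token(token)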



Re: [openstack-dev] [nova] Distributed Database

2016-05-02 Thread Clint Byrum
Excerpts from Mike Bayer's message of 2016-05-02 08:51:58 -0700:
> 
> Well IMO that's actually often a problem.  My goal across Openstack 
> projects in general is to allow them to make use of SQL more effectively 
> than they do right now; for example, in Neutron I am helping them to 
> move a block of code that inefficiently needs to load a block of data 
> into memory, scan it for CIDR overlaps, and then push data back out. 
> This approach prevents it from performing a single UPDATE statement and 
> ushers in the need for pessimistic locking against concurrent 
> transactions.  Instead, I've written for them a simple stored function 
> proof-of-concept [2] that will allow the entire operation to be 
> performed on the database side alone in a single statement.  Wins like 
> these are much less feasible if not impossible when a project decides it 
> wants to split its backend store between dramatically different 
> databases which don't offer such features.
> 

FWIW, I agree with you. If you're going to use SQLAlchemy, use it to
take advantage of the relational model.

However, how is what you describe a win? Whether you use SELECT .. FOR
UPDATE, or a stored procedure, the lock is not distributed, and thus, will
still suffer rollback failures in Galera. For single DB server setups, you
don't have to worry about that, and SELECT .. FOR UPDATE will work fine.

So to me, this is something where you need a distributed locking system
(ala ZooKeeper) to actually solve the problem for multiple database
servers.
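
For illustration, a heavily simplified sketch of that kind of lock using
ZooKeeper via kazoo (the lock path, identifier and connection string are
invented for the example):

from kazoo.client import KazooClient

zk = KazooClient(hosts="zk1:2181,zk2:2181,zk3:2181")
zk.start()

# Every writer that takes this lock serializes its read-modify-write
# cycle, regardless of which Galera node it happens to talk to.
lock = zk.Lock("/neutron/locks/subnet-10.0.0.0-24", identifier="worker-42")
with lock:  # blocks until the lock is acquired
    pass    # read current state, compute, then write in one transaction

zk.stop()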

Furthermore, any logic that happens inside the database server is extra
load on a much much much harder resource to scale, using code that is
much more complicated to update. For those reasons I'm generally opposed
to using any kind of stored procedures in large scale systems. It's the
same reason I dislike foreign key enforcement: you're expending a limited
resource to mitigate a problem which _can_ be controlled and addressed
with non-stateful resources that are easier and simpler to scale.

> >
> > Concretely, we think that there are three possible approaches:
> >  1) We can use the SQLAlchemy API as the common denominator between a 
> > relational and non-relational implementation of the db.api component. These 
> > two implementation could continue to converge by sharing a large amount of 
> > code.
> >  2) We create a new non-relational implementation (from scratch) of the 
> > db.api component. It would require probably more work.
> >  3) We are also studying a last alternative: writing a SQLAlchemy 
> > engine that targets NewSQL databases (scalability + ACID):
> >   - https://github.com/cockroachdb/cockroach
> >   - https://github.com/pingcap/tidb
> 
> Going with a NewSQL backend is by far the best approach here.   That 
> way, very little needs to be reinvented and the application's approach 
> to data doesn't need to dramatically change.
> 
> But also, w.r.t. Cells there seems to be some remaining debate over why 
> exactly a distributed approach is even needed.  As others have posted, a 
> single MySQL database, replicated across Galera or not, scales just fine 
> for far more data than Nova ever needs to store.  So it's not clear why 
> the need for a dramatic rewrite of its datastore is called for.
> 

To be clear, it's not the amount of data, but the size of the failure
domain. We're more worried about what will happen to those 40,000 open
connections from our 4000 servers when we do have to violently move them.

That particular problem isn't as scary if you have a large
Cassandra/MongoDB/Riak/ROME cluster, as the client libraries are
generally connecting to all or most of the nodes already, and will
simply use a different connection if the initial one fails. However,
these other systems also bring a whole host of new problems which the
simpler SQL approach doesn't have.

So it's worth doing an actual analysis of the failure handling before
jumping to the conclusion that a pile of cells/sharding code or a rewrite
to use a distributed database would be of benefit.



Re: [openstack-dev] [Fuel][MySQL][DLM][Oslo][DB][Trove][Galera][operators] Multi-master writes look OK, OCF RA and more things

2016-04-30 Thread Clint Byrum
Excerpts from Roman Podoliaka's message of 2016-04-29 12:04:49 -0700:
> Hi Bogdan,
> 
> Thank you for sharing this! I'll need to familiarize myself with this
> Jepsen thing, but overall it looks interesting.
> 
> As it turns out, we already run Galera in multi-writer mode in Fuel
> unintentionally in the case, when the active MySQL node goes down,
> HAProxy starts opening connections to a backup, then the active goes
> up again, HAProxy starts opening connections to the original MySQL
> node, but OpenStack services may still have connections opened to the
> backup in their connection pools - so now you may have connections to
> multiple MySQL nodes at the same time, exactly what you wanted to
> avoid by using active/backup in the HAProxy configuration.
> 
> ^ this actually leads to an interesting issue [1], when the DB state
> committed on one node is not immediately available on another one.
> Replication lag can be controlled  via session variables [2], but that
> does not always help: e.g. in [1] Nova first goes to Neutron to create
> a new floating IP, gets 201 (and Neutron actually *commits* the DB
> transaction) and then makes another REST API request to get a list of
> floating IPs by address - the latter can be served by another
> neutron-server, connected to another Galera node, which does not have
> the latest state applied yet due to 'slave lag' - it can happen that
> the list will be empty. Unfortunately, 'wsrep_sync_wait' can't help
> here, as it's two different REST API requests, potentially served by
> two different neutron-server instances.
> 

I'm curious why you think setting wsrep_sync_wait=1 wouldn't help.

The exact example appears in the Galera documentation:

http://galeracluster.com/documentation-webpages/mysqlwsrepoptions.html#wsrep-sync-wait

The moment you say 'SET SESSION wsrep_sync_wait=1', the behavior should
prevent the list problem you see, and it should not matter that it is
a separate session, as that is the entire point of the variable:

"When you enable this parameter, the node triggers causality checks in
response to certain types of queries. During the check, the node blocks
new queries while the database server catches up with all updates made
in the cluster to the point where the check was begun. Once it reaches
this point, the node executes the original query."

In the active/passive case where you never use the passive node as a
read slave, one could actually set wsrep_sync_wait=1 globally. This will
cause a ton of lag while new queries happen on the new active and old
transactions are still being applied, but that's exactly what you want,
so that when you fail over, nothing proceeds until all writes from the
original active node are applied and available on the new active node.
It would help if your failover technology actually _breaks_ connections
to a presumed dead node, so writes stop happening on the old one.
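
For what it's worth, a small sketch of what that looks like from a
service's point of view (connection details and the table name are made
up; the SET SESSION statement is the whole trick):

import pymysql

conn = pymysql.connect(host="galera-vip", user="neutron",
                       password="secret", db="neutron")
with conn.cursor() as cur:
    # Block this session's next reads until this node has applied
    # everything committed cluster-wide up to this point.
    cur.execute("SET SESSION wsrep_sync_wait = 1")
    cur.execute("SELECT id, floating_ip_address FROM floatingips "
                "WHERE floating_ip_address = %s", ("198.51.100.10",))
    rows = cur.fetchall()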

Also, if you thrash back and forth a bit, that could cause your app to
virtually freeze, but HAProxy and most other failover technologies allow
tuning timings so that you can stay off of a passive server long enough
to calm it down and fail more gracefully to it.

Anyway, this is why sometimes I do wonder if we'd be better off just
using MySQL with DRBD and good old pacemaker.



Re: [openstack-dev] [nova] Distributed Database

2016-04-29 Thread Clint Byrum
Excerpts from Matt Riedemann's message of 2016-04-29 05:38:17 -0700:
> 
> So, we're all here in person this week (with 1 day left). The Nova team 
> has a meetup session all day (Salon A in the Hilton). Clint/Ed, can you 
> guys show up to that and bring these issues up in person so we can 
> actually talk through this? Preferably in the morning since people are 
> starting to leave after lunch.
> 

Unfortunately, I'm already back home in Los Angeles due to family
needs. But I do hope you all can have a discussion this morning, and
I'm happy to join via phone/skype/hangouts/etc. after 11:00am Austin time.

The reason I didn't bring any of this up while there was mostly that I
spent the time learning what was actually planned with Cells v2, which
I think I had gotten wrong 3 times before. It has taken me another day
or so to be able to articulate why I think we may want to separate the
concept of cells from the concept of scaling.



Re: [openstack-dev] [nova] Distributed Database

2016-04-28 Thread Clint Byrum
Excerpts from Mike Bayer's message of 2016-04-28 22:16:54 -0500:
> 
> On 04/28/2016 08:25 PM, Edward Leafe wrote:
> 
> > Your own tests showed that a single RDBMS instance doesn’t even break a 
> > sweat
> > under your test loads. I don’t see why we need to shard it in the first
> > place, especially if in doing so we add another layer of complexity and
> > another dependency in order to compensate for that choice. Cells are a 
> > useful
> > concept, but this proposed implementation is adding way too much complexity
> > and debt to make it worthwhile.
> 
> now that is a question I have also.  Horizontal sharding is usually for 
> the case where you need to store say, 10B rows, and you'd like to split 
> it up among different silos.  Nothing that I've seen about Nova suggests 
> this is a system with any large data requirements, or even medium size 
> data (a few million rows in relational databases is nothing). I 
> didn't have the impression that this was the rationale behind Cells, it 
> seems like this is more of some kind of logical separation of some kind 
> that somehow suits some environments (but I don't know how). 
> Certainly, if you're proposing a single large namespace of data across a 
> partition of nonrelational databases, and then the data size itself is 
> not that large, as long as "a single namespace" is appropriate then 
> there's no reason to break out of more than one MySQL database.  There's 
> not much reason to transparently shard unless you are concerned about 
> adding limitless storage capacity.   The Cells sharding seems to be 
> intentionally explicit and non-transparent.
> 

There's a bit more to it than the number of rows. There's also a desire
to limit failure domains. IMO, that is entirely unfounded, as I've run
thousands of servers that depended on a single pair of MySQL servers
using simple DRBD and pacemaker with a floating IP for failover. This
is the main reason MySQL is a thing... it can handle 100,000 concurrent
connections just fine, and the ecosystem around detecting and handling
failure/maintenance is mature.

The whole cells conversation, IMO, stems from the way we use RabbitMQ.
We should just stop doing that. I know as I move forward with our scaling
efforts, I'll be trying several RPC drivers and none of them will go
through RabbitMQ.



Re: [openstack-dev] [nova] Distributed Database

2016-04-28 Thread Clint Byrum
Excerpts from Jay Pipes's message of 2016-04-28 13:09:29 -0500:
> On 04/28/2016 08:44 AM, Edward Leafe wrote:
> > On Apr 24, 2016, at 3:28 PM, Robert Collins  
> > wrote:
> >> For instance, the things I think are essential for a distributed
> >> database based datastore:
> >> - good single-machine developer story. Must not need a physical
> >> cluster to hack on OpenStack
> >> - deal gracefully with single node/rack/site failures (when deployed
> >> appropriately) - allow limiting failure domain impact
> >> - straightforward programming model: wrong uses should be obvious to 
> >> reviewers
> >> - low latency performance with big datasets: e.g. nova list as an
> >> admin should be able to get the Nth page as rapidly as the 2nd or 3rd.
> 
> nova list as an admin (or any user frankly) should be a proxy call to 
> Project Searchlight and elasticsearch.
> 
> elasticsearch is a great interface for this kind of operation. We should 
> use it.
> 
> The Cells architecture, which allows the operator to both scale the 
> message queue *and* limit the scope of failure domains, is a good one. 
> Having a database that stores only the local (to the cell) information 
> is perfectly fine given the top-level API database's index/mapping 
> tables. Where this design has short-comings, as Ed and others point out, 
> are things like doing list operations across dozens of separate cells. 
> So, let's not use the Nova database for that and instead use a solution 
> that works very well for it: Project Searchlight.
> 
> And for those that have small clouds and/or don't want/need to install 
> elasticsearch, OK, cool, stick with a single cell and a single RDBMS 
> instance.
> 


Why are we inventing more things in OpenStack?

- 0MQ allows decentralized RPC and exists today! There's very little
  need for RabbitMQ sharding if it is just handling notifications,
  but if that's a concern, Kafka is also available in oslo.messaging
  and scales out naturally.
  - And now AMQP 1.0 / proton has a legitimate contender for an
alternative to 0MQ with the dispatch-router approach [1]
  - Or, crazy idea, we could just make RPC happen over http(s).

- Vitess [2] is a proven technology that serves _every_ request to
  Youtube, and provides a familiar SQL interface with sharding built
  in. Shard by project ID and you can just use regular index semantics
  (see the brief sketch after this list).
  Or if that's unacceptable (IMO it's fine since Vitess provides enough
  redundancy that one shard has plenty of failure-domain reliability),
  you can also use the built-in Hadoop support they have for doing
  exactly what has been described (merge sorting the result of cross-cell
  queries).

- If we adopted those, the only reason for cells would be to allow
  setting up new batches of hosts to pre-test before unleashing the
  world on them.  Except, we could do that with host aggregates and
  permissions that hide/expose the right flavors for testing.
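
For the Vitess bullet above, here is the kind of thing I mean (hosts,
ports, schema and column names are all placeholders): vtgate speaks the
MySQL protocol, so the application keeps issuing ordinary SQL and the
sharding key routes the query.

import pymysql

conn = pymysql.connect(host="vtgate.example.com", port=15306,
                       user="nova", password="secret", db="nova")
with conn.cursor() as cur:
    # project_id is the sharding key, so this stays a single-shard
    # lookup and uses normal secondary indexes within that shard.
    cur.execute("SELECT uuid, hostname FROM instances WHERE project_id = %s",
                ("a1b2c3d4e5f6",))
    instances = cur.fetchall()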


So, I have to ask: why is cells v2 being pushed so hard without looking
outside OpenStack for actual existing solutions, which, IMO, are
_numerous_, battle hardened, and simpler than cells?

[1] http://qpid.apache.org/components/dispatch-router/
[2] http://vitess.io/



Re: [openstack-dev] [nova] Distributed Database

2016-04-25 Thread Clint Byrum
Excerpts from Andrew Laski's message of 2016-04-22 14:32:59 -0700:
> 
> On Fri, Apr 22, 2016, at 04:27 PM, Ed Leafe wrote:
> > OK, so I know that Friday afternoons are usually the worst times to
> > write a blog post and start an email discussion, and that the Friday
> > immediately before a Summit is the absolute worst, but I did it anyway.
> > 
> > http://blog.leafe.com/index.php/2016/04/22/distributed_data_nova/
> > 
> > Summary: we are creating way too much complexity by trying to make Nova
> > handle things that are best handled by a distributed database. The
> > recent split of the Nova DB into an API database and separate cell
> > databases is the glaring example of going down the wrong road.
> > 
> > Anyway, read it on your flight (or, in my case, drive) to Austin, and
> > feel free to pull me aside to explain just how wrong I am. ;-)
> 
> I agree with a lot of what Monty wrote in his response. And agree that
> given a greenfield there are much better approaches that could be taken
> rather than partitioning the database.
> 
> However I do want to point out that cells v2 is not just about dealing
> with scale in the database. The message queue is another consideration,
> and as far as I know there is not an analog to the "distributed
> database" option available for the persistence layer.
> 

It's not even scale, it is failure domain isolation. I'm pretty
confident I can back 1000 busy compute nodes with a single 32-core,
128GB RabbitMQ server. But doing so is basically pure madness because of
the failover costs. Having 10 8-core RabbitMQ servers, as Cells v2 wants
to do, means the disruption caused by losing any one of them should be
containable to 1/10th of the running instances. However, that assumes
the complexity of the implementation won't leak out to the unaffected
servers.

Anyway, for messaging, part of the problem is until somewhat recently,
we thought RPC and Notifications were the same thing. They're VASTLY
different. For things like notifications, you don't need to look beyond
Apache Kafka to see that scale-out solutions exist. Also, if you
actually separate these two, you'll find that a single tiny RabbitMQ
cluster can handle the notifications without breaking a tiny sweat,
because it uses RabbitMQ for what it was actually designed for (Lots of
messages, few topics).
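
As a hedged sketch of what separating the two looks like in practice
(URLs and IDs are placeholders), oslo.messaging already lets
notifications ride their own transport, independent of whatever RPC
ends up using:

from oslo_config import cfg
import oslo_messaging

conf = cfg.CONF

# A tiny RabbitMQ cluster dedicated to notifications is plenty for this
# traffic pattern (lots of messages, few topics).
notif_transport = oslo_messaging.get_notification_transport(
    conf, url="rabbit://openstack:secret@notify-rmq:5672/")
notifier = oslo_messaging.Notifier(
    notif_transport, publisher_id="compute.host-01", driver="messagingv2")

# notifier.info({}, "compute.instance.create.end", {"instance_id": "..."})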

RPC being a different animal, we're, frankly, abusing RabbitMQ in silly
ways. There are a _massive_ pile of simpler things just waiting to
be tried:

- 0MQ - There's this fear of change and a bit of chicken/egg preventing
  this from becoming the default choice for RPC any time soon. I for one
  want to look into it, but keep getting sidetracked because RMQ is
  "good enough" for now, and the default.
  
- Direct HTTP for RPC - I've always wondered why we don't do this for
  RPC. Same basic idea as 0MQ, but even more familiar to all of us.
  
- Thrift

- gRPC/protobuf

The basic theme for RPC is simple: just send the messages to the
services directly.
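
To show how little machinery the "direct" theme needs, here is a toy
sketch of HTTP-as-RPC from the caller's side (the endpoint, port and
payload shape are all invented for the example):

import json
from urllib import request


def call(host, method, **kwargs):
    """Synchronous RPC: POST the method and args straight to the node."""
    body = json.dumps({"method": method, "args": kwargs}).encode("utf-8")
    req = request.Request("http://%s:8775/rpc" % host, data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req, timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


# e.g. call("compute-01.cell3.example.com", "ping", hello="world")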

> Additionally with v1 we found that deployers have enjoyed being able to
> group their hardware with cells. Baremetal goes in this cell, SSD filled
> computes over here, and spinning disks over there. And beyond that
> there's the ability to create a cell, fill it with hardware, and then
> test it without plugging it up to the production API. Cells provides an
> entry point for poking at things that isn't available without it.
> 
> I don't want to get too sidetracked on talking about cells. I just
> wanted to point out that cells v2 did not come to fruition due to a fear
> of distributed databases.
> 

I'd love for the feature described to be separate from scaling. It's a
great feature, but it's just a happy accident that it helps with scale,
and would work just as well if we called it "host aggregates" and actually
made host aggregates work well.



Re: [openstack-dev] Summit Core Party after Austin

2016-04-22 Thread Clint Byrum
Excerpts from Chivers, Doug's message of 2016-04-22 10:17:45 -0700:
> The Vancouver core party was a fantastic opportunity to meet some very smart 
> people and learn a lot about the projects they worked on. It was probably one 
> of the most useful parts of the summit, certainly more so than the greasy 
> marketing party, and arguably a much better use of developer time.
> 
> An opportunity to chill out and talk to technical people over a quiet beer? 
> Long may that continue, even if it is not the core party in its current form.
> 

Honestly, a common theme I see in all of this is "can we just have more
relaxed evening events?"

Perhaps this will happen naturally if/when we split design summit and
conference. But in the mean time, maybe we can just send this message
to party planners: Provide us with interesting spaces to converse and
bond in, and we will be happier.



Re: [openstack-dev] Summit Core Party after Austin

2016-04-22 Thread Clint Byrum
Excerpts from Thierry Carrez's message of 2016-04-21 09:22:53 -0700:
> Michael Krotscheck wrote:
> > So, HPE is seeking sponsors to continue the core party. The reasons are
> > varied - internal sponsors have moved to other projects, the Big Tent
> > has drastically increased the # of cores, and the upcoming summit format
> > change creates quite a bit of uncertainty on everything surrounding the
> > summit.
> >
> > Furthermore, the existence of the Core party has been... contentious.
> > Some believe it's exclusionary, others think it's inappropriate, yet
> > others think it's a good way to thank those of us who agree to be
> > constantly pestered for code reviews.
> >
> > I'm writing this message for two reasons - mostly, to kick off a
> > discussion on whether the party is worthwhile. Secondly, to signal to
> > other organizations that this promotional opportunity is available.
> >
> > Personally, I appreciate being thanked for my work. I do not necessarily
> > need to be thanked in this fashion, however as the past venues have been
> > far more subdued than the Tuesday night events (think cocktail party),
> > it's a welcome mid-week respite for this overwhelmed little introvert. I
> > don't want to see it go, but I will understand if it does.
> >
> > Some numbers, for those who like them (Thanks to Mark Atwood for
> > providing them):
> >
> > Total repos: 1010
> > Total approvers: 1085
> > Repos for official teams: 566
> > OpenStack repo approvers: 717
> > Repos under release management: 90
> > Managed release repo approvers: 281
> 
> I think it's inappropriate because it gives a wrong incentive to become 
> a core reviewer. Core reviewing should just be a duty you sign up to, 
> not necessarily a way to get into a cool party. It was also a bit 
> exclusive of other types of contributions.
> 
> Apparently in Austin the group was reduced to only release:managed 
> repositories. This tag is to describe which repositories the release 
> team is comfortable handling. I think it's inappropriate to reuse it to 
> single out a subgroup of cool folks, and if that became a tradition the 
> release team would face pressure from repositories to get the tag that 
> are totally unrelated to what the tag describes.
> 
> So.. while I understand the need for calmer parties during the week, I 
> think the general trend is to have fewer parties and more small group 
> dinners. I would be fine with HPE sponsoring more project team dinners 
> instead :)
> 

I echo all your thoughts above, Thierry, though I'd like to keep one
aspect of these parties around.

Some of these parties have been fantastic for learning about the local
culture of each city, so I want to be clear: that is something that
_does_ embody the spirit of the summit. Being in different cities brings
different individuals, and also puts all of us in a different frame
of mind, which I think opens us up to more collaboration. As has been
stated before, some of our more introverted collaborators welcome the
idea of a smaller party, but still one where introductions can be made,
and new social networks can be built.

Since part of this process is spending more money per person to produce a
deeper cultural experience, I wonder if a fairer system for attendance
could be devised. Instead of limiting it to cores on release:managed
repositories, could we randomize the selection? In doing so, we could also
include a percentage of people who are not core reviewers but have
expressed interest in attending.

Anyway, I suppose this just boils down to a suggestion for whoever
decides to pick up the bill. Thanks for your consideration, whoever you
are. :)



Re: [openstack-dev] [release][requirements][packaging][summit] input needed on summit discussion about global requirements

2016-04-19 Thread Clint Byrum
Excerpts from Thomas Goirand's message of 2016-04-19 05:59:19 -0700:
> On 04/19/2016 01:01 PM, Chris Dent wrote:
> > We also, however, need to consider what the future might look like and
> > at least for some people and situations
> 
> I agree.
> 
> > the future does not involve
> > debs or rpms of OpenStack: packages and distributions are more trouble
> > than they are worth when you want to be managing your infrastructure
> > across multiple, isolated environments.
> 
> But here, I don't. It is my view that best, for containers, is to build
> them using distribution packages.
> 
> > In that case you want
> > puppet, ansible, chef, docker, k8s, etc.
> > 
> > Sure all those things _can_ use packages to do their install but for
> > at least _some_ people that's redundant: deploy from a tag, branch,
> > commit or head of a repo.
> > 
> > That's for _some_ people.
> 
> This thinking (i.e. using pip and trunk, always) was one of the reasons
> for TripleO to fail, and they went back to use packages. Can we learn
> from the past?
> 

I want to clarify something here because what you say above implies that
the reason a whole lot of us stopped contributing to TripleO was that we
"failed", or that the approach was wrong.

There was never an intention to preclude packages entirely. Those of
us initially building TripleO did so with continuous delivery as the
sole purpose that _we_ had. Red Hat's contributors joined in with a
stable-release focus, and eventually, as our sponsors changed their minds
about what they wanted from TripleO, we stopped pushing the CD model
because we weren't even working on the project anymore. That left a void,
and Red Hat's stable-release model was obviously better served by the
packaging tools they were already familiar with, so the project now looks
like it pivoted. But really, it's that one set of contributors stopped
working on one use case, and another set ramped up contribution on a
different one.

So please, do not use the comment above to color this discussion.
Continuous delivery is a model that people will use to great success,
and in that model, dependency management is _very_ different.



Re: [openstack-dev] [release][requirements][packaging][summit] input needed on summit discussion about global requirements

2016-04-19 Thread Clint Byrum
Excerpts from Matthew Thode's message of 2016-04-18 11:22:38 -0700:
> On 04/18/2016 12:33 PM, Doug Hellmann wrote:
> > Excerpts from Matthew Thode's message of 2016-04-18 10:23:37 -0500:
> >> On 04/18/2016 08:24 AM, Hayes, Graham wrote:
> >>> On 18/04/2016 13:51, Sean Dague wrote:
>  On 04/18/2016 08:22 AM, Chris Dent wrote:
> > On Mon, 18 Apr 2016, Sean Dague wrote:
> >
> >> So if you have strong feelings and ideas, why not get them out in email
> >> now? That will help in the framing of the conversation.
> >
> > I won't be at summit and I feel pretty strongly about this topic, so
> > I'll throw out my comments:
> >
> > I agree with the basic premise: In the big tent universe co-
> > installability is holding us back and is a huge cost in terms of spent
> > energy. In a world where service isolation is desirable and common
> > (whether by virtualenv, containers, different hosts, etc) targeting an
> > all-in-one install seems only to serve the purposes of all-in-one rpm-
> > or deb-based installations.
> >
> > Many (most?) people won't be doing those kinds of installations. If 
> > all-in-
> > one installations are important to the rpm- and deb- based distributions
> > then _they_ should be resolving the dependency issues local to their own
> > infrastructure (or realizing that it is too painful and start
> > containerizing or otherwise as well).
> >
> > I think making these changes will help to improve and strengthen the
> > boundaries and contracts between services. If not technically then
> > at least socially, in the sense that the negotiations that people
> > make to get things to work are about what actually matters in their
> > services, not unwinding python dependencies and the like.
> >
> > A lot of the basics of getting this to work are already in place in
> > devstack. One challenge I've run into in the past is when devstack
> > plugin A has made an assumption about having access to a python
> > script provided by devstack plugin B, but it's not on $PATH or its
> > dependencies are not in the site-packages visible to the current
> > context. The solution here is to use full paths _into_ virtualenvs.
> 
>  As Chris said, doing virtualenvs on the Devstack side for services is
>  pretty much there. The team looked at doing this last year, then stopped
>  due to operator feedback.
> 
>  One of the things that gets a little weird (when using devstack for
>  development) is if you actually want to see the impact of library
>  changes on the environment. As you'll need to make sure you loop and
>  install those libraries into every venv where they are used. This
>  forward reference doesn't really exist. So some tooling there will be
>  needed.
> 
>  Middleware that's pushed from one project into another (like Ceilometer
>  -> Swift) is also a funny edge case that I think get funnier here.
> 
>  Those are mostly implementation details, that probably have work
>  arounds, but would need people on them.
> 
> 
>   From a strategic perspective this would basically make traditional Linux
>  Packaging of OpenStack a lot harder. That might be the right call,
>  because traditional Linux Packaging definitely suffers from the fact
>  that everything on a host needs to be upgraded at the same time. For
>  large installs of OpenStack (especially public cloud cases) traditional
>  packages are definitely less used.
> 
>  However Linux Packaging is how a lot of people get exposed to software.
>  The power of onboarding with apt-get / yum install is a big one.
> 
>  I've been through the ups and downs of both approaches so many times now
>  in my own head, I no longer have a strong preference beyond the fact
>  that we do one approach today, and doing a different one is effort to
>  make the transition.
> 
>  -Sean
> 
> >>>
> >>> It is also worth noting that according to the OpenStack User Survey [0]
> >>> 56% of deployments use "Unmodifed packages from the operating system".
> >>>
> >>> Granted it was a small sample size (302 responses to that question)
> >>> but it is worth keeping this in mind as we talk about moving the burden
> >>> to packagers.
> >>>
> >>> 0 - 
> >>> https://www.openstack.org/assets/survey/April-2016-User-Survey-Report.pdf 
> >>> (page 
> >>> 36)
> >>>
> >>>
> >> To add to this, I'd also note that I as a packager would likely stop
> >> packaging Openstack at whatever release this goes into.  While the
> >> option to 

Re: [openstack-dev] [release][requirements][packaging][summit] input needed on summit discussion about global requirements

2016-04-19 Thread Clint Byrum
Excerpts from Michał Jastrzębski's message of 2016-04-18 10:29:20 -0700:
> What I meant is if you have liberty Nova and liberty Cinder, and you
> want to upgrade Nova to Mitaka, you also upgrade Oslo to Mitaka and
> Cinder which was liberty either needs to be upgraded or is broken,
> therefore during upgrade you need to do cinder and nova at the same
> time. DB can be snapshotted for rollbacks.
> 

If we're breaking backward compatibility even across one release, that
is a bug. You should be able to run Liberty components with Mitaka
libraries. Unfortunately, the testing matrix for all of the combinations
is huge, and nobody is suggesting we try to solve that equation.

However, to the point of distros: partial upgrades are not the model
distro packages work under. They upgrade what they can, whether they're
a rolling release or a seven-year-cycle LTS. When the operator says "give
me the new release", the packages that can be upgraded will be upgraded.
And if Mitaka Nova depends on something outside the upper constraints of
another package on the system, the distro will just hold Nova back.



Re: [openstack-dev] [magnum][keystone][all] Using Keystone /v3/credentials to store TLS certificates

2016-04-13 Thread Clint Byrum
Excerpts from Clayton O'Neill's message of 2016-04-13 07:37:16 -0700:
> On Wed, Apr 13, 2016 at 10:26 AM, rezroo  wrote:
> > Hi Kevin,
> >
> > I understand that this is how it is now. My question is how bad would it be
> > to wrap the Barbican client library calls in another class and claim, for
> > all practical purposes, that Magnum has no direct dependency on Barbican?
> > What is the negative of doing that?
> >
> > Anyone who wants to use another mechanism should be able to do that with a
> > simple change to the Magnum conf file. Nothing more complicated. That's the
> > essence of my question.
> 
> For us, the main reason we’d want to be able to deploy without
> Barbican is mostly to lower the initial barrier of entry.  We’re not
> running anything else that would require Barbican for a multi-node
> deployment, so for us to do a realistic evaluation of Magnum, we’d
> have to get two “new to us” services up and running in a development
> environment.  Since we’re not running Barbican or Magnum, that’s a big
> time commitment for something we don’t really know if we’d end up
> using.  From that perspective, something that’s less secure might be
> just fine in the short term.  For example, I’d be completely fine with
> storing certificates in the Magnum database as part of an evaluation,
> knowing I had to switch from that before going to production.
> 

I'd say there's already a perfectly reasonable option for evaluation
purposes, and that is the existing file-based backend. For multiple
nodes, I wonder how poorly an evaluation would really go if one simply
rsynced that directory every few minutes.
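
For illustration only, here's what that throwaway evaluation hack could
look like -- the certificate directory and peer hostnames are made up,
and this is obviously not something to run in production:

    # Evaluation-only sketch: paths and hosts are hypothetical.
    import subprocess
    import time

    CERT_DIR = '/var/lib/magnum/certificates/'
    PEERS = ['magnum-2.example.com', 'magnum-3.example.com']

    def sync_certs_forever(interval=300):
        """Push the local file-based cert store to the other nodes."""
        while True:
            for peer in PEERS:
                subprocess.call(
                    ['rsync', '-az', '--delete', CERT_DIR,
                     '%s:%s' % (peer, CERT_DIR)])
            time.sleep(interval)

    if __name__ == '__main__':
        sync_certs_forever()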



Re: [openstack-dev] [magnum][keystone][all] Using Keystone /v3/credentials to store TLS certificates

2016-04-13 Thread Clint Byrum
Excerpts from Douglas Mendizábal's message of 2016-04-13 10:01:21 -0700:
> Hash: SHA512
> 
> Hi Reza,
> 
> The Barbican team has already abstracted python-barbicanclient into a
> general purpose key-storage library called Castellan [1]
> 
> There are a few OpenStack projects that have planned to integrate or
> are currently integrating with Castellan to avoid a hard dependency on
> Barbican.
> 
> There are some tradeoffs to choosing Castellan over
> python-barbicanclient and Castellan may not be right for everyone.
> Also, the only complete implementation of Castellan is currently the
> Barbican implementation, so even though integrating with Castellan
> does not result in a direct dependency, there is still work to be done
> to have a working non-barbican solution.

From an outsider's perspective with no real stake in this debate,
this sounds like a very reasonable way for Magnum to proceed, with the
pre-dependency that they would move their file-based approach into
Castellan.
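
To sketch what "moving the file-based approach behind the abstraction"
could look like -- the interface and classes below are invented for
illustration and are not Castellan's actual classes; Castellan is the
real library that fills this role -- the point is that the service codes
against one small key-storage interface and the backend is picked in
configuration:

    # Hypothetical sketch: CertStore/FileCertStore are made-up names.
    import abc
    import os

    class CertStore(abc.ABC):
        """Minimal key/cert storage interface the service would call."""

        @abc.abstractmethod
        def store(self, name, pem_data):
            """Persist a certificate, return an opaque reference."""

        @abc.abstractmethod
        def get(self, ref):
            """Return the PEM data for a previously stored certificate."""

    class FileCertStore(CertStore):
        """Evaluation backend: certificates on local disk."""

        def __init__(self, base_dir='/var/lib/magnum/certificates'):
            self.base_dir = base_dir
            os.makedirs(base_dir, exist_ok=True)

        def store(self, name, pem_data):
            path = os.path.join(self.base_dir, name + '.pem')
            with open(path, 'w') as f:
                f.write(pem_data)
            return path

        def get(self, ref):
            with open(ref) as f:
                return f.read()

    # A Barbican-backed class implementing the same interface would be
    # the production choice, selected via the conf file.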



Re: [openstack-dev] [all][stackalytics] Gaming the Stackalytics stats

2016-04-10 Thread Clint Byrum
Excerpts from Morgan Fainberg's message of 2016-04-10 16:47:28 -0700:
> On Sun, Apr 10, 2016 at 4:37 PM, Clint Byrum <cl...@fewbar.com> wrote:
> 
> > Excerpts from Matt Riedemann's message of 2016-04-09 06:42:54 -0700:
> > > There is also disincentive in +1ing a change that you don't understand
> > > and is wrong and then a core comes along and -1s it (you get dinged for
> > > the disagreement). And there is disincentive in -1ing a change for the
> > > wrong reasons (silly nits or asking questions for understanding). I ask
> > > a lot of questions in a lot of changes and I don't vote on those because
> > > it would be inappropriate.
> > >
> >
> > Why is disagreement a negative thing? IMO, reviewers who agree too much
> > are just part of the echo chamber.
> >
> >
> There is no problem with disagreement IMHO. However, we track it as a stat,
> and people don't want to feel as though they are in disagreement with the
> cores. I think this is just some level of psychology.
> 
> I very, very rarely look at disagreement stat for anything (now or when I
> was PTL).
> 

Agreed, as a number, it can be highly misleading and is especially hard
to compare to any of the other numbers.

However, in meta-reviews, I found actual occurrences very useful to
analyze how a reviewer handles confronting the other cores and how
confident they are in their understanding of the code base. So it worries
me that new people might be somehow discouraged from disagreement.

So let me just say it here: disagreeing with the core reviewers when
there is a valid reason _is what somebody who wants to be a core reviewer
should be doing_.



Re: [openstack-dev] [all][stackalytics] Gaming the Stackalytics stats

2016-04-10 Thread Clint Byrum
Excerpts from Matt Riedemann's message of 2016-04-09 06:42:54 -0700:
> There is also disincentive in +1ing a change that you don't understand 
> and is wrong and then a core comes along and -1s it (you get dinged for 
> the disagreement). And there is disincentive in -1ing a change for the 
> wrong reasons (silly nits or asking questions for understanding). I ask 
> a lot of questions in a lot of changes and I don't vote on those because 
> it would be inappropriate.
> 

Why is disagreement a negative thing? IMO, reviewers who agree too much
are just part of the echo chamber.



Re: [openstack-dev] [TripleO] FreeIPA integration

2016-04-07 Thread Clint Byrum
Excerpts from Adam Young's message of 2016-04-05 19:02:58 -0700:
> On 04/05/2016 11:42 AM, Fox, Kevin M wrote:
> > Yeah, and they just deprecated vendor data plugins too, which 
> > eliminates my other workaround. :/
> >
> > We need to really discuss this problem at the summit and get a viable 
> > path forward. Its just getting worse. :/
> >
> > Thanks,
> > Kevin
> > 
> > *From:* Juan Antonio Osorio [jaosor...@gmail.com]
> > *Sent:* Tuesday, April 05, 2016 5:16 AM
> > *To:* OpenStack Development Mailing List (not for usage questions)
> > *Subject:* Re: [openstack-dev] [TripleO] FreeIPA integration
> >
> >
> >
> > On Tue, Apr 5, 2016 at 2:45 PM, Fox, Kevin M wrote:
> >
> > This sounds suspiciously like, "how do you get a secret to the
> > instance to get a secret from the secret store" issue :)
> >
> > Yeah, sounds pretty familiar. We were using the nova hooks mechanism 
> > for this means, but it was deprecated recently. So bummer :/
> >
> >
> > Nova instance user spec again?
> >
> > Thanks,
> > Kevin
> >
> 
> Yep, and we need a solution.  I think the right solution is a keypair 
> generated on the instance, public key posted by the instance to the 
> hypervisor and stored with the instance data in the database.  I wrote 
> that to the mailing list earlier today.
> 

If you log your public SSH host key to the console, this already
happens. No need for hypervisor magic, just scrape your console.
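
A small sketch of that, for illustration -- it assumes 'nova' is an
already-authenticated python-novaclient client, and the BEGIN/END marker
strings are the ones cloud-init usually prints, so treat them as an
assumption about your guest image:

    # Sketch: the only novaclient call used is get_console_output().
    import re

    HOST_KEY_BLOCK = re.compile(
        r'-----BEGIN SSH HOST KEY KEYS-----\n'
        r'(.*?)-----END SSH HOST KEY KEYS-----', re.S)

    def host_keys_from_console(nova, server_id):
        """Scrape the instance console log for the logged SSH host keys."""
        console = nova.servers.get_console_output(server_id)
        match = HOST_KEY_BLOCK.search(console)
        return match.group(1).splitlines() if match else []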



Re: [openstack-dev] [nova][glance] Proposal to remove `nova image-*` commands from novaclient

2016-04-06 Thread Clint Byrum
Excerpts from Nikhil Komawar's message of 2016-04-06 10:46:28 -0700:
> Need an inline clarification.
> 
> On 4/6/16 10:58 AM, Flavio Percoco wrote:
> > On 06/04/16 08:26 -0400, Sean Dague wrote:
> >> On 04/06/2016 04:13 AM, Markus Zoeller wrote:
> >>> +1 for deprecation and removal
> >>>
> >>> To be honest, when I started with Nova during Kilo, I didn't get
> >>> why we have those passthrough APIs. They looked like convenience APIs.
> >>> A short history lesson, why they got introduced, would be cool. I only
> >>> found commit [1] which looks like they were there from the beginning.
> >>>
> >>> References:
> >>> [1]
> >>> https://github.com/openstack/python-novaclient/commit/7304ed80df265b3b11a0018a826ce2e38c052572#diff-56f10b3a40a197d5691da75c2b847d31R33
> >>>
> >>
> >> The short history lesson is nova image API existed before glance. Glance
> >> was a spin out from Nova of that API. Doing so doesn't immediately make
> >> that API go away however. Especially as all these things live on
> >> different ports with different end points. So the image API remained as
> >> a proxy (as did volumes, baremetal, and even, to some extent, networks).
> >>
> >> It's not super clear how you deprecate and remove these things without
> >> breaking a lot of people, as a lot of the libraries implement the nova
> >> image resources -
> >> https://github.com/fog/fog-openstack/blob/master/lib/fog/openstack/compute.rb
> >>
> >
> > We can deprecate it without removing it. We make it work with v2 and
> > start
> > warning people that the API is not supported anymore. We don't fix
> > bugs in that
> > API but tell people to use the newer version.
> >
> > I think that should do it, unless I'm missing something.
> > Flavio
> >
> 
> Is it a safe practice to not fix bugs on a publicly exposed API? What
> are the recommendations for such cases?
> 

I don't think you can make a blanket statement that no bugs will be
fixed.

There are going to be evolutions behind this API that make a small bug
today into a big bug tomorrow. The idea is to push the user off the API
when they try to do more with it, not when we forcibly explode their
working code.

"We don't break userspace". I know _we_ didn't say that about our
project. But I like to think we understand the wisdom behind that, and can
start at least pretending we believe in ourselves and our users enough
to hold to it for some things, even if we don't really like some of the
more dark and dingy corners of userspace that we have put out there.



Re: [openstack-dev] [Neutron][LBaaS]Removing LBaaS v1 - are we ready?

2016-03-08 Thread Clint Byrum
Excerpts from Samuel Bercovici's message of 2016-03-02 07:06:30 -0800:
> 2.  HEAT Support - will it be ready in Mitaka?

Side note, please remember, Heat is not an acronym.



Re: [openstack-dev] [all] A proposal to separate the design summit

2016-03-01 Thread Clint Byrum
Excerpts from Eoghan Glynn's message of 2016-03-01 02:08:00 -0800:
> 
> > > Current thinking would be to give preferential rates to access the 
> > > main
> > > summit to people who are present to other events (like this new
> > > separated contributors-oriented event, or Ops midcycle(s)). That would
> > > allow for a wider definition of "active community member" and reduce
> > > gaming.
> > >
> > 
> >  I think reducing gaming is important. It is valuable to include those
> >  folks who wish to make a contribution to OpenStack, I have confidence
> >  the next iteration of entry structure will try to more accurately
> >  identify those folks who bring value to OpenStack.
> > >>>
> > >>> There have been a couple references to "gaming" on this thread, which
> > >>> seem to imply a certain degree of dishonesty, in the sense of bending
> > >>> the rules.
> > >>>
> > >>> Can anyone who has used the phrase clarify:
> > >>>
> > >>>  (a) what exactly they mean by gaming in this context
> > >>>
> > >>> and:
> > >>>
> > >>>  (b) why they think this is a clear & present problem demanding a
> > >>>  solution?
> > >>>
> > >>> For the record, landing a small number of patches per cycle and thus
> > >>> earning an ATC summit pass as a result is not, IMO at least, gaming.
> > >>>
> > >>> Instead, it's called *contributing*.
> > >>>
> > >>> (on a small scale, but contributing none-the-less).
> > >>>
> > >>> Cheers,
> > >>> Eoghan
> > >>
> > >> Sure I can tell you what I mean.
> > >>
> > >> In Vancouver I happened to be sitting behind someone who stated "I'm
> > >> just here for the buzz." Which is lovely for that person. The problem is
> > >> that the buzz that person is there for is partially created by me and I
> > >> create it and mean to offer it to people who will return it in kind, not
> > >> just soak it up and keep it to themselves.
> > >>
> > >> Now I have no way of knowing who this person is and how they arrived at
> > >> the event. But the numbers for people offering one patch to OpenStack
> > >> (the bar for a summit pass) is significantly higher than the curve of
> > >> people offering two, three or four patches to OpenStack (patches that
> > >> are accepted and merged). So some folks are doing the minimum to get a
> > >> summit pass rather than being part of the cohort that has their first
> > >> patch to OpenStack as a means of offering their second patch to 
> > >> OpenStack.
> > >>
> > >> I consider it an honour and a privilege that I get to work with so many
> > >> wonderful people everyday who are dedicated to making open source clouds
> > >> available for whoever would wish to have clouds. I'm more than a little
> > >> tired of having my energy drained by folks who enjoy feeding off of it
> > >> while making no effort to return beneficial energy in kind.
> > >>
> > >> So when I use the phrase gaming, this is the dynamic to which I refer.
> > > 
> > > Thanks for the response.
> > > 
> > > I don't know if drive-by attendance at design summit sessions by under-
> > > qualified or uninformed summiteers is encouraged by the availability of
> > > ATC passes. But as long as those individuals aren't actively derailing
> > > the conversation in sessions, I wouldn't consider their buzz soakage as
> > > a major issue TBH.
> > > 
> > > In any case, I would say that just meeting the bar for an ATC summit pass
> > > (by landing the required number of patches) is not bending the rules or
> > > misrepresenting in any way.
> > > 
> > > Even if specifically motivated by the ATC pass (as opposed to scratching
> > > a very specific itch) it's still simply an honest and rational response
> > > to an incentive offered by the foundation.
> > > 
> > > One could argue whether the incentive is mis-designed, but that doesn't
> > > IMO make a gamer of any contributor who simply meets the required 
> > > threshold
> > > of activity.
> > > 
> > > Cheers,
> > > Eoghan
> > > 
> > 
> > No I'm not saying that. I'm saying that the larger issue is one of
> > motivation.
> > 
> > Folks who want to help (even if they don't know how yet) carry an energy
> > of intention with them which is nourishing to be around. Folks who are
> > trying to get in the door and not be expected to help and hope no one
> > notices carry an entirely different kind of energy with them. It is a
> > non-nourishing energy.
> 
> Personally I don't buy into that notion of the wrong sort of people
> sneaking in the door of summit, keeping their heads down and hoping
> no-one notices.
> 
> We have an open community that conducts its business in public. Not
> wanting folks with the wrong sort of energy to be around when that
> business is being done, runs counter to our open ethos IMO.
> 
> There are a whole slew of folks who work fulltime on OpenStack but
> contribute mainly in the background: operating clouds, managing
> engineering teams, supporting customers, designing product roadmaps,
> training new users etc. TBH 

Re: [openstack-dev] [heat] convergence cancel messages

2016-02-24 Thread Clint Byrum
Excerpts from Anant Patil's message of 2016-02-23 23:08:31 -0800:
> Hi,
> 
> I would like to discuss various approaches towards fixing bug
> https://launchpad.net/bugs/1533176
> 
> When convergence is on, and if the stack is stuck, there is no way to
> cancel the existing request. This feature was not implemented in
> convergence, as the user can again issue an update on an in-progress
> stack. But if a resource worker is stuck, the new update will wait
> for-ever on it and the update will not be effective.
> 
> The solution is to implement cancel request. Since the work for a stack
> is distributed among heat engines, the cancel request will not work as
> it does in legacy way. Many or all of the heat engines might be running
> worker threads to provision a stack.
> 
> I could think of two options which I would like to discuss:
> 
> (a) When a user triggered cancel request is received, set the stack
> current traversal to None or something else other than current
> traversal. With this the new check-resources/workers will never be
> triggered. This is okay as long as the worker(s) is not stuck. The
> existing workers will finish running, and no new check-resource
> (workers) will be triggered, and it will be a graceful cancel.  But the
> workers that are stuck will be stuck for-ever till stack times-out.  To
> take care of such cases, we will have to implement logic of "polling"
> the DB at regular intervals (may be at each step() of scheduler task)
> and bail out if the current traversal is updated. Basically, each worker
> will "poll" the DB to see if the current traversal is still valid and if
> not, stop itself. The drawback of this approach is that all the workers
> will be hitting the DB and incur a significant overhead.  Besides, all
> the stack workers irrespective of whether they will be cancelled or not,
> will keep on hitting DB. The advantage is that it probably is easier to
> implement. Also, if the worker is stuck in particular "step", then this
> approach will not work.
> 
> (b) Another approach is to send cancel message to all the heat engines
> when one receives a stack cancel request. The idea is to use the thread
> group manager in each engine to keep track of threads running for a
> stack, and stop the thread group when a cancel message is received. The
> advantage is that the messages to cancel stack workers is sent only when
> required and there is no other over-head. The draw-back is that the
> cancel message is 'broadcasted' to all heat engines, even if they are
> not running any workers for the given stack, though, in such cases, it
> will be a just no-op for the heat-engine (the message will be gracefully
> discarded).

Oh hah, I just sent (b) as an option to avoid (a) without really
thinking about (b) again.

I don't think the cancel broadcasts are all that much of a drawback. I
do think you need to rate limit cancels though, or you give users the
chance to DDoS the system.
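
A minimal sketch of what such a guard could look like wherever the
cancel API call is accepted -- the names and limits are made up, and a
real implementation would probably sit behind the existing API
rate-limiting machinery:

    # Hypothetical sketch: per-tenant throttling of cancel broadcasts.
    import time
    from collections import defaultdict, deque

    class CancelRateLimiter(object):
        """Allow at most `limit` cancel broadcasts per tenant per window."""

        def __init__(self, limit=5, window=60.0):
            self.limit = limit
            self.window = window
            self._events = defaultdict(deque)

        def allow(self, tenant_id):
            now = time.time()
            events = self._events[tenant_id]
            while events and now - events[0] > self.window:
                events.popleft()
            if len(events) >= self.limit:
                return False
            events.append(now)
            return True

    limiter = CancelRateLimiter()
    if limiter.allow('tenant-123'):
        pass  # broadcast the cancel message to all heat-engines
    else:
        pass  # reject or defer this cancel request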



Re: [openstack-dev] [heat] convergence cancel messages

2016-02-24 Thread Clint Byrum
Excerpts from Anant Patil's message of 2016-02-24 00:56:34 -0800:
> On 24-Feb-16 13:12, Clint Byrum wrote:
> > Excerpts from Anant Patil's message of 2016-02-23 23:08:31 -0800:
> >> Hi,
> >>
> >> I would like to discuss various approaches towards fixing bug
> >> https://launchpad.net/bugs/1533176
> >>
> >> When convergence is on, and if the stack is stuck, there is no way to
> >> cancel the existing request. This feature was not implemented in
> >> convergence, as the user can again issue an update on an in-progress
> >> stack. But if a resource worker is stuck, the new update will wait
> >> for-ever on it and the update will not be effective.
> >>
> >> The solution is to implement cancel request. Since the work for a stack
> >> is distributed among heat engines, the cancel request will not work as
> >> it does in legacy way. Many or all of the heat engines might be running
> >> worker threads to provision a stack.
> >>
> >> I could think of two options which I would like to discuss:
> >>
> >> (a) When a user triggered cancel request is received, set the stack
> >> current traversal to None or something else other than current
> >> traversal. With this the new check-resources/workers will never be
> >> triggered. This is okay as long as the worker(s) is not stuck. The
> >> existing workers will finish running, and no new check-resource
> >> (workers) will be triggered, and it will be a graceful cancel.  But the
> >> workers that are stuck will be stuck for-ever till stack times-out.  To
> >> take care of such cases, we will have to implement logic of "polling"
> >> the DB at regular intervals (may be at each step() of scheduler task)
> >> and bail out if the current traversal is updated. Basically, each worker
> >> will "poll" the DB to see if the current traversal is still valid and if
> >> not, stop itself. The drawback of this approach is that all the workers
> >> will be hitting the DB and incur a significant overhead.  Besides, all
> >> the stack workers irrespective of whether they will be cancelled or not,
> >> will keep on hitting DB. The advantage is that it probably is easier to
> >> implement. Also, if the worker is stuck in particular "step", then this
> >> approach will not work.
> >>
> > 
> > I think this is the simplest option. And if the polling gets to be too
> > much, you can implement an observer pattern where one worker is just
> > assigned to poll the traversal and if it changes, RPC to the known
> > active workers that they should cancel any jobs using a now-cancelled
> > stack version.
> > 
> > 
> 
> Hi Clint,
> 
> I see that observer pattern is simple, but IMO it too is not efficient.
> To implement it, we will have to note down in DB the worker to engine-id
> relationship for all the workers, and then go through all of them and
> send targeted cancel messages. This will also need us to have thread
> group manager in each engine so that it can stop the thread group
> running workers for the stack.
> 

You have to have that thread group manager anyway, or you can't ever
cancel anything in progress. That same thread group manager could also
be managing timeouts.

Apologies for my lack of understanding of where the implementation has
gone; I thought you would already have that mapping in the DB. If that's
a problem, though, for this case you can have a notification channel for
cancellations and have the management thread listen to it, with its own
local awareness of what is being worked on.



Re: [openstack-dev] [heat] convergence cancel messages

2016-02-23 Thread Clint Byrum
Excerpts from Anant Patil's message of 2016-02-23 23:08:31 -0800:
> Hi,
> 
> I would like to discuss various approaches towards fixing bug
> https://launchpad.net/bugs/1533176
> 
> When convergence is on, and if the stack is stuck, there is no way to
> cancel the existing request. This feature was not implemented in
> convergence, as the user can again issue an update on an in-progress
> stack. But if a resource worker is stuck, the new update will wait
> for-ever on it and the update will not be effective.
> 
> The solution is to implement cancel request. Since the work for a stack
> is distributed among heat engines, the cancel request will not work as
> it does in legacy way. Many or all of the heat engines might be running
> worker threads to provision a stack.
> 
> I could think of two options which I would like to discuss:
> 
> (a) When a user triggered cancel request is received, set the stack
> current traversal to None or something else other than current
> traversal. With this the new check-resources/workers will never be
> triggered. This is okay as long as the worker(s) is not stuck. The
> existing workers will finish running, and no new check-resource
> (workers) will be triggered, and it will be a graceful cancel.  But the
> workers that are stuck will be stuck for-ever till stack times-out.  To
> take care of such cases, we will have to implement logic of "polling"
> the DB at regular intervals (may be at each step() of scheduler task)
> and bail out if the current traversal is updated. Basically, each worker
> will "poll" the DB to see if the current traversal is still valid and if
> not, stop itself. The drawback of this approach is that all the workers
> will be hitting the DB and incur a significant overhead.  Besides, all
> the stack workers irrespective of whether they will be cancelled or not,
> will keep on hitting DB. The advantage is that it probably is easier to
> implement. Also, if the worker is stuck in particular "step", then this
> approach will not work.
> 

I think this is the simplest option. And if the polling gets to be too
much, you can implement an observer pattern where one worker is just
assigned to poll the traversal and, if it changes, RPC to the known
active workers telling them to cancel any jobs using the now-cancelled
stack version.
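
A rough sketch of that observer -- the three callables are stand-ins
for Heat's real DB and RPC helpers, passed in so the sketch stays
self-contained; none of this is the actual heat-engine code:

    # Hypothetical sketch of the "one observer per in-progress stack" idea.
    import time

    def watch_traversal(stack_id, started_traversal,
                        get_current_traversal,     # stack_id -> traversal id
                        active_engines_for_stack,  # engines on old traversal
                        rpc_cancel,                # tell an engine to stop
                        interval=5):
        """Poll the stack's current traversal; once it changes, RPC the
        engines still running the old traversal to cancel their workers."""
        while True:
            if get_current_traversal(stack_id) != started_traversal:
                for engine_id in active_engines_for_stack(stack_id,
                                                          started_traversal):
                    rpc_cancel(engine_id, stack_id, started_traversal)
                return
            time.sleep(interval)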



Re: [openstack-dev] [nova] A prototype implementation towards the "shared state scheduler"

2016-02-23 Thread Clint Byrum
Excerpts from Jay Pipes's message of 2016-02-23 16:10:46 -0800:
> On 02/22/2016 04:23 AM, Sylvain Bauza wrote:
> > I won't argue against performance here. You made a very nice PoC for
> > testing scaling DB writes within a single python process and I trust
> > your findings. While I would be naturally preferring some shared-nothing
> > approach that can horizontally scale, one could mention that we can do
> > the same with Galera clusters.
> 
> a) My benchmarks aren't single process comparisons. They are 
> multi-process benchmarks.
> 
> b) The approach I've taken is indeed shared-nothing. The scheduler 
> processes do not share any data whatsoever.
> 

I think this is a matter of perspective. What I read from Sylvain's
message was that the approach you've taken shares state in a database,
and shares access to all compute nodes.

I also read into Sylvain's comments that what he was referring to was
a system where the compute nodes divide up the resources and never share
anything at all.

> c) Galera isn't horizontally scalable. Never was, never will be. That 
> isn't its strong-suit. Galera is best for having a 
> synchronously-replicated database cluster that is incredibly easy to 
> manage and administer but it isn't a performance panacea. It's focus is 
> on availability not performance :)
> 

I also think this is a matter of perspective. Galera is actually
fantastically horizontally scalable in any situation where you have a
very high ratio of reads to writes with a need for consistent reads.

However, for OpenStack's needs, we are typically pretty low on that ratio.

> > That said, most of the operators run a controller/compute situation
> > where all the services but the compute node are hosted on 1:N hosts.
> > Implementing the resource-providers-scheduler BP (and only that one)
> > will dramatically increase the number of writes we do on the scheduler
> > process (ie. on the "controller" - quoting because there is no notion of
> > a "controller" in Nova, it's just a deployment choice).
> 
> Yup, no doubt about it. It won't increase the *total* number of writes 
> the system makes, just the concentration of those writes into the 
> scheduler processes. You are trading increased writes in the scheduler 
> for the challenges inherent in keeping a large distributed cache system 
> valid and fresh (which itself introduces a different kind of writes).
> 

Funny enough, I think of Galera as a large distributed cache that is
always kept valid and fresh. The challenges of doing this for a _busy_
cache are not unique to Galera.

> > That's a big game changer for operators who are currently capping their
> > capacity by adding more conductors. It would require them to do some DB
> > modifications to be able to scale their capacity. I'm not against that,
> > I just say it's a big thing that we need to consider and properly
> > communicate if agreed.
> 
> Agreed completely. I will say, however, that on a 1600 compute node 
> simulation (~60K variably-sized instances), an untuned stock MySQL 5.6 
> database with 128MB InnoDB buffer pool size barely breaks a sweat on my 
> local machine.
> 

That agrees with what I've seen as well. We're talking about tables of
integers for the most part, so your least expensive SSDs can keep up
with this load for many, many thousands of computes.

I'd actually also be interested in whether this has the potential to reduce
the demand on the message bus. I've been investigating this for a while,
and I found that RabbitMQ will happily consume five high-end CPU cores on a
single box just to serve the needs of 1000 idle compute nodes. I'm sorry
that I haven't read enough of the details in your proposal, but doesn't
this mean there'd be quite a bit less load on the MQ if messages only flow
for direct RPC dispatches and error reporting?



Re: [openstack-dev] [all] A proposal to separate the design summit

2016-02-23 Thread Clint Byrum
Excerpts from Eoghan Glynn's message of 2016-02-22 15:06:01 -0800:
> 
> > Hi everyone,
> > 
> > TL;DR: Let's split the events, starting after Barcelona.
> > 
> > Long long version:
> > 
> > In a global and virtual community, high-bandwidth face-to-face time is
> > essential. This is why we made the OpenStack Design Summits an integral
> > part of our processes from day 0. Those were set at the beginning of
> > each of our development cycles to help set goals and organize the work
> > for the upcoming 6 months. At the same time and in the same location, a
> > more traditional conference was happening, ensuring a lot of interaction
> > between the upstream (producers) and downstream (consumers) parts of our
> > community.
> > 
> > This setup, however, has a number of issues. For developers first: the
> > "conference" part of the common event got bigger and bigger and it is
> > difficult to focus on upstream work (and socially bond with your
> > teammates) with so much other commitments and distractions. The result
> > is that our design summits are a lot less productive than they used to
> > be, and we organize other events ("midcycles") to fill our focus and
> > small-group socialization needs. The timing of the event (a couple of
> > weeks after the previous cycle release) is also suboptimal: it is way
> > too late to gather any sort of requirements and priorities for the
> > already-started new cycle, and also too late to do any sort of work
> > planning (the cycle work started almost 2 months ago).
> > 
> > But it's not just suboptimal for developers. For contributing companies,
> > flying all their developers to expensive cities and conference hotels so
> > that they can attend the Design Summit is pretty costly, and the goals
> > of the summit location (reaching out to users everywhere) do not
> > necessarily align with the goals of the Design Summit location (minimize
> > and balance travel costs for existing contributors). For the companies
> > that build products and distributions on top of the recent release, the
> > timing of the common event is not so great either: it is difficult to
> > show off products based on the recent release only two weeks after it's
> > out. The summit date is also too early to leverage all the users
> > attending the summit to gather feedback on the recent release -- not a
> > lot of people would have tried upgrades by summit time. Finally a common
> > event is also suboptimal for the events organization : finding venues
> > that can accommodate both events is becoming increasingly complicated.
> > 
> > Time is ripe for a change. After Tokyo, we at the Foundation have been
> > considering options on how to evolve our events to solve those issues.
> > This proposal is the result of this work. There is no perfect solution
> > here (and this is still work in progress), but we are confident that
> > this strawman solution solves a lot more problems than it creates, and
> > balances the needs of the various constituents of our community.
> > 
> > The idea would be to split the events. The first event would be for
> > upstream technical contributors to OpenStack. It would be held in a
> > simpler, scaled-back setting that would let all OpenStack project teams
> > meet in separate rooms, but in a co-located event that would make it
> > easy to have ad-hoc cross-project discussions. It would happen closer to
> > the centers of mass of contributors, in less-expensive locations.
> > 
> > More importantly, it would be set to happen a couple of weeks /before/
> > the previous cycle release. There is a lot of overlap between cycles.
> > Work on a cycle starts at the previous cycle feature freeze, while there
> > is still 5 weeks to go. Most people switch full-time to the next cycle
> > by RC1. Organizing the event just after that time lets us organize the
> > work and kickstart the new cycle at the best moment. It also allows us
> > to use our time together to quickly address last-minute release-critical
> > issues if such issues arise.
> > 
> > The second event would be the main downstream business conference, with
> > high-end keynotes, marketplace and breakout sessions. It would be
> > organized two or three months /after/ the release, to give time for all
> > downstream users to deploy and build products on top of the release. It
> > would be the best time to gather feedback on the recent release, and
> > also the best time to have strategic discussions: start gathering
> > requirements for the next cycle, leveraging the very large cross-section
> > of all our community that attends the event.
> > 
> > To that effect, we'd still hold a number of strategic planning sessions
> > at the main event to gather feedback, determine requirements and define
> > overall cross-project themes, but the session format would not require
> > all project contributors to attend. A subset of contributors who would
> > like to participate in this sessions can collect and relay feedback to
> > other 

Re: [openstack-dev] [all] A proposal to separate the design summit

2016-02-23 Thread Clint Byrum
Excerpts from Sean McGinnis's message of 2016-02-22 11:48:50 -0800:
> On Mon, Feb 22, 2016 at 05:20:21PM +, Amrith Kumar wrote:
> > Thierry and all of those who contributed to putting together this write-up, 
> > thank you very much.
> > 
> > TL;DR: +0
> > 
> > Longer version:
> > 
> > While I definitely believe that the new proposed timing for "OpenStack 
> > Summit" which is some months after the release, is a huge improvement, I am 
> > not completely enamored of this proposal. Here is why.
> > 
> > As a result of this proposal, there will still be four events each year, 
> > two "OpenStack Summit" events and two "MidCycle" events. The material 
> > change is that the "MidCycle" event that is currently project specific will 
> > become a single event inclusive of all projects, not unlike our current 
> > "Design Summit".
> > 
> > I contrast this proposal with a mid-cycle two weeks ago for the Trove 
> > project. Thanks to the folks at Red Hat who hosted us in Raleigh, we had a 
> > dedicated room, with high bandwidth internet and the ability to have people 
> > join us remotely via audio and video (which we used mostly for screen 
> > sharing). The previous mid-cycle similarly had excellent facilities 
> > provided us by HP (in California), Rackspace (in Austin) and at MIT in 
> > Cambridge when we (Tesora) hosted the event.
> > 
> > At these "simpler, scaled-back settings", would we be able to provide the 
> > same kind of infrastructure for each project?
> > 
> > Given the number of projects, and leaving aside high bandwidth internet and 
> > remote participation, providing dedicated meeting room for the duration of 
> > the MidCycle event for each project is a considerable undertaking. I 
> > believe therefore that the consequence is that the MidCycle event will end 
> > up being of comparable scale to the current Design Summit or larger, and 
> > will likely need a similar venue.
> > 
> > I also believe that it is important that OpenStack continue to grow not 
> > only a global customer base but also a global contributor base. As others 
> > have already commented, this proposal risks the "design summit" become US 
> > based, maybe Europe once in a long while. But I find it much harder to 
> > believe that these design summits would be truly global. And this I think 
> > would be an unwelcome consequence.
> > 
> > At the current OpenStack Summit, there is an opportunity for contributors, 
> > customers and operators to interact, not just in technical meetings, but 
> > also in a social setting. I think this is valuable, even though there seems 
> > to be a number of people who believe that this is not necessarily the case.
> > 
> > Those are the three concerns I have with the proposal. 
> > 
> > Thanks again to Thierry and all who contributed to putting this proposal 
> > together.
> > 
> > -amrith
> 
> I agree with a lot of the concerns raised here. I wonder if we're not
> just shifting some of the problems and causing others.
> 
> While the timing of things isn't ideal right now, I'm also afraid the
> timing of these changes would also interupt our development flow and
> cause distractions when we need folks focused on getting things done.
> 
> I'm also very concerned about losing our midcycles. At least for Cinder,
> the midcycle events have been hugely successful and well worth the time
> and travel expense, IMO. To me, the design summit event is good for
> cross-project communication and getting more operator input. But the
> midcycles have been where we've really been able to focus and figure out
> issues.
> 

I do understand this concern, but the difference is in the way a
development-summit-only event is attended versus a conference+summit.
When you don't have keynotes every morning consuming people's time, and
you don't have people ducking out of discussions to give their talks,
this immediately adds a calm focus to the discussions that feels a
lot more like a mid-cycle. When there's no booth for your company to
ask you to come by and man for a while to meet customers and partners,
suddenly every developer can spend the whole of the event talking to
other developers and operators who have come to participate directly.

I did not attend the first few summits, my first one being the Boston
event, but I did attend quite a few Ubuntu Developer Summits, which were
much more about development discussions, and almost completely devoid of
conference semantics. It always felt like a series of productive meetings,
and not like a series of rushed, agitated, nervous brain dumps, which
frankly is what a lot of Tokyo felt like.

> Even if we still have a colocated "midcycle" now, I would be afraid that
> there would be too many distractions from everything else going on for
> us to be able to really tackle some of the things we've been able to in
> our past midcycles.
> 

I _DO_ share your concern here. The mid-cycles are productive because
they're focused. Putting one at the conference will just 
