Re: [openstack-dev] Unified Guest Agent proposal

2013-12-12 Thread Sylvain Bauza
Why couldn't the notifications be handled by Marconi?

It would be up to Marconi's team to handle the security issues, since
delivering a messaging service between VMs is part of their mission
statement.
On 12 Dec 2013 22:09, "Fox, Kevin M" wrote:

> Yeah, I think the extra nic is unnecessary too. There already is a working
> route to 169.254.169.254, and a metadata proxy -> server running on it.
>
> So... let's brainstorm for a minute and see if there are enough pieces
> already to do most of the work.
>
> We already have:
>   * An http channel out from private VMs, past network namespaces, all the
> way to the node running the neutron-metadata-agent.
>
> We need:
>   * Some way to send a command, plus arguments, to the VM to execute some
> action and get a response back.
>
> OpenStack has focused on REST APIs for most things and I think that is a
> great tradition to continue. This allows the custom agent plugins to be
> written in any language that can speak HTTP (all of them?) on any platform.
>
> A REST API running in the VM wouldn't be accessible from the outside,
> though, on a private network.
>
> Random thought: could some "unified guest agent" glue be written to bridge
> the gap?
>
> How about something like the following:
>
> The "unified guest agent" starts up, makes an http request to
> 169.254.169.254/unified-agent//connect
> If at any time the connection returns, it will auto reconnect.
> It will block as long as possible and the data returned will be an http
> request. The request will have a special header with a request id.
> The http request will be forwarded to localhost:
> and the response will be posted to
> 169.254.169.254/unified-agent/cnc_type/response/
>
> The neutron-proxy-server would need to be modified slightly so that, if it
> sees a /unified-agent/<cnc_type>/* request, it:
> looks in the unified-agent section of its config file, finds the ip/port to
> contact for a given <cnc_type>, and forwards the request to that server
> instead of the regular metadata one.
>
> Once this is in place, Savanna or Trove can have their web API registered
> with the proxy as the server for the "savanna" or "trove" cnc_type. They
> will be contacted by the clients as they come up, and will be able to make
> web requests to them and get responses back.
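>
> A rough Python sketch of the agent loop described above (the endpoint
> paths, the cnc_type, the header name, and the local port are all
> placeholders from this proposal, not an existing implementation):
>
>     import requests
>
>     METADATA = "http://169.254.169.254/unified-agent"
>     CNC_TYPE = "savanna"      # assumed channel name registered with the proxy
>     LOCAL_PORT = 8080         # assumed port of the in-guest REST service
>
>     def run():
>         while True:
>             try:
>                 # Block until the proxy has a command for us; reconnect on
>                 # any failure or timeout.
>                 cmd = requests.get("%s/%s/connect" % (METADATA, CNC_TYPE),
>                                    timeout=300)
>             except requests.RequestException:
>                 continue
>             request_id = cmd.headers.get("x-unified-agent-request-id")
>             # Forward the embedded request to the local REST API...
>             resp = requests.post("http://localhost:%d/" % LOCAL_PORT,
>                                  data=cmd.content)
>             # ...and post the response back through the metadata proxy.
>             requests.post("%s/%s/response/%s" % (METADATA, CNC_TYPE,
>                                                  request_id),
>                           data=resp.content)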
>
> What do you think?
>
> Thanks,
> Kevin
>
> 
> From: Ian Wells [ijw.ubu...@cack.org.uk]
> Sent: Thursday, December 12, 2013 11:02 AM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] Unified Guest Agent proposal
>
> On 12 December 2013 19:48, Clint Byrum <cl...@fewbar.com> wrote:
> Excerpts from Jay Pipes's message of 2013-12-12 10:15:13 -0800:
> > On 12/10/2013 03:49 PM, Ian Wells wrote:
> > > On 10 December 2013 20:55, Clint Byrum <cl...@fewbar.com> wrote:
> > I've read through this email thread with quite a bit of curiosity, and I
> > have to say what Ian says above makes a lot of sense to me. If Neutron
> > can handle the creation of a "management vNIC" that has some associated
> > iptables rules governing it that provides a level of security for guest
> > <-> host and guest <-> $OpenStackService, then the transport problem
> > domain is essentially solved, and Neutron can be happily ignorant (as it
> > should be) of any guest agent communication with anything else.
> >
>
> Indeed I think it could work, however I think the NIC is unnecessary.
>
> Seems likely even with a second NIC that said address will be something
> like 169.254.169.254 (or the ipv6 equivalent?).
>
> There *is* no ipv6 equivalent, which is one standing problem.  Another is
> that (and admittedly you can quibble about this problem's significance) you
> need a router on a network to be able to get to 169.254.169.254 - I raise
> that because the obvious use case for multiple networks is to have a net
> which is *not* attached to the outside world so that you can layer e.g. a
> private DB service behind your app servers.
>
> Neither of these are criticisms of your suggestion as much as they are
> standing issues with the current architecture.
> --
> Ian.
>
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Ceilometer] [Rally] Does Ceilometer affect instance creation?

2013-12-12 Thread Nicolas Barcet
On Tue, Dec 10, 2013 at 3:59 PM, Julien Danjou  wrote:

> > Please Julien correct me if I'm wrong. And maybe Ceilometer has
> > recommendations for production deployment and I just missed it?
>
> There's no recommendation per se, except what we've added to the
> documentation so far.


For reference, it can be found here:
http://docs.openstack.org/developer/ceilometer/install/dbreco.html
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Announcing Fuel

2013-12-12 Thread Robert Collins
On 13 December 2013 03:31, Mike Scherbakov  wrote:
> Folks,
>
>
> Most of you by now have heard of Fuel, which we’ve been working on as a
> related OpenStack project for a period of time - see
> https://launchpad.net/fuel and https://wiki.openstack.org/wiki/Fuel. The aim
> of the project is to provide a distribution agnostic and plug-in agnostic
> engine for preparing, configuring and ultimately deploying various “flavors”
> of OpenStack in production. We’ve also used Fuel in most of our customer
> engagements to stand up an OpenStack cloud.
...
> We’d love to open discussion on this and hear everybody’s thoughts on this
> direction.

+1 on more collaboration :). I think the general strategy of finding
what program in OpenStack fits the features you'd like to contribute,
and then working with that program to get the features into it - as a
new project for the program, or as patches to an existing project, or
even as a replacement implementation of the project - is exactly the
right approach to break down the silos and make your stuff generally
available and something we're all collaborating on. The specific
examples you gave also make sense to me.

Let me know if/when there are further or more detailed discussions we
need to have - I'll be more than happy to participate!

-Rob

-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-12 Thread Robert Collins
On 13 December 2013 06:24, Will Foster  wrote:

> I just wanted to add a few thoughts:

Thank you!

> For some comparative information here "from the field" I work
> extensively on deployments of large OpenStack implementations,
> most recently with a ~220-node/9-rack deployment (scaling up to 42 racks /
> 1024 nodes soon).  My primary role is of a DevOps/sysadmin nature, not a
> specific development area, so rapid provisioning/tooling/automation is an
> area I almost exclusively work within (mostly API-driven, using
> Foreman/Puppet).  The infrastructure our small team designs/builds
> supports our development and business.
>
> I am the target user base you'd probably want to cater to.

Absolutely!

> I can tell you the philosophy and mechanics of Tuskar/OOO are great,
> something I'd love to start using extensively but there are some needed
> aspects in the areas of control that I feel should be added (though arguably
> less for me and more for my ilk who are looking to expand their OpenStack
> footprint).
>
> * ability to 'preview' changes going to the scheduler

What does this give you? How detailed a preview do you need? What
information is critical there? Have you seen the proposed designs for
a heat template preview feature - would that be sufficient?

> * ability to override/change some aspects within node assignment

What would this be used to do? How often do those situations turn up?
What's the impact if you can't do that?

> * ability to view at least minimal logging from within Tuskar UI

Logging of what - the deployment engine? The heat event-log? Nova
undercloud logs? Logs from the deployed instances? If it's not there
in V1, but you can get (or already have) credentials for the instances
that hold the logs you wanted, would that be a big adoption
blocker, or just a nuisance?


> Here's the main reason - most new adopters of OpenStack/IaaS are going to be
> running legacy/mixed hardware, and while they might have an initiative to
> explore and invest, and even a decent budget, most of them are not going to
> have completely identical hardware, isolated/flat networks, and things set
> aside in such a way that blind auto-discovery/deployment will just work all
> the time.

That's great information (and something I more or less expected, to
a degree). We have a hard dependency on no wildcard DHCP servers in
the environment (or we can't deploy). Autodiscovery is something we
don't have yet, but certainly debugging deployment failures is a very
important use case and one we need to improve both at the plumbing
layer and in the stories around it in the UI.

> There will be a need to sometimes adjust, and those coming from a more
> vertically-scaling infrastructure (most large orgs.) will not have
> 100% matching standards in place for vendor, machine spec and network design,
> which may make Tuskar/OOO seem inflexible and 'one-way'.  This may just be a
> carry-over or fear of the old ways of deployment, but nonetheless it
> is present.

I'm not sure what you mean by matching standards here :). Ironic is
designed to support extremely varied environments with arbitrary mixes
of IPMI/drac/ilo/what-have-you, and abstract that away for us. From a
network perspective I've been arguing the following:

 - we need routable access to the mgmt cards
 - if we don't have that (say there are 5 different mgmt domains with
no routing between them) then we install 5 deployment layers (5
underclouds) which could be as small as one machine each.
 - within the machines that are served by one routable region of mgmt
cards, we need no wildcard DHCP servers, so that our DHCP server can serve
PXE to the machines (for the PXE driver in Ironic).
 - building a single region overcloud from multiple undercloud regions
will involve manually injecting well known endpoints (such as the
floating virtual IP for API endpoints) into some of the regions, but
it's in principle straightforward to do and use with the plumbing
layer today.

> In my case, we're lucky enough to have dedicated, near-identical
> equipment and a flexible network design we've architected prior that
> makes Tuskar/OOO a great fit.  Most people will not have this
> greenfield ability and will use what they have lying around initially
> as to not make a big investment until familiarity and trust of
> something new is permeated.
>
> That said, I've been working with Jaromir Coufal on some UI mockups of
> Tuskar with some of this 'advanced' functionality included and from
> my perspective it looks like something to consider pulling in sooner than
> later if you want to maximize the adoption of new users.

So, for Tuskar my short term goals are to support RH in shipping a
polished product while still architecting and building something
sustainable and suitable for integration into the OpenStack ecosystem.
(For instance, one of the requirements for integration is that we
don't [significantly] overlap other projects - and that's why I've been
pushing so hard on the do

Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-12 Thread Robert Collins
On 13 December 2013 10:05, Jay Dobies  wrote:
>> Maybe this is a valid use case?

> You mention three specific nodes, but what you're describing is more likely
> three concepts:
> - Balanced Nodes
> - High Disk I/O Nodes
> - Low-End Appliance Nodes
>
> They may have one node in each, but I think your example of three nodes is
> potentially *too* simplified to be considered a proper sample size. I'd
> guess there are commonly more than three in play, in which case the breakdown
> into concepts starts to be more appealing.
>
> I think the disk flavor in particular has quite a few use cases, especially
> until SSDs are ubiquitous. I'd want to flag those (in Jay terminology, "the
> disk hotness") as hosting the data-intensive portions, but where I had
> previously been viewing that as manual allocation, it sounds like the
> approach is to properly categorize them for what they are and teach Nova how
> to use them.
>
> Robert - Please correct me if I misread any of what your intention was; I
> don't want to drive people down the wrong path if I'm misinterpreting
> anything.

You nailed it, no butchering involved at all!

-Rob


-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-12 Thread Robert Collins
On 13 December 2013 06:13, Keith Basil  wrote:
> On Dec 11, 2013, at 3:42 PM, Robert Collins wrote:

>>> My question is - can't we help them now? To enable users to use our app even
>>> when we don't have enough smartness to help them the 'auto' way?
>>
>> I understand the question: but I can't answer it until we have *an*
>> example that is both real and not deliverable today. At the moment the
>> only one we know of is HA, and that's certainly an important feature on
>> the nova scheduled side, so doing manual control to deliver a future
>> automatic feature doesn't make a lot of sense to me. Crawl, walk, run.
>
> Maybe this is a valid use case?
>
> Cloud operator has several core service nodes of differing configuration
> types.
>
> [node1]  <-- balanced mix of disk/cpu/ram for general core services
> [node2]  <-- lots of disks for Ceilometer data storage
> [node3]  <-- low-end "appliance-like" box for a specialized/custom core service
>  (SIEM box for example)
>
> All nodes [1,2,3] are in the same deployment grouping ("core services"). As such,
> this is a heterogeneous deployment grouping. Heterogeneity in this case is
> defined by differing roles and hardware configurations.
>
> This is a real use case.
>
> How do we handle this?

Ok, so node1 gets flavor A, node2 gets flavor B, node3 gets flavor C.

We have three disk images, one with general core services on it
(imageA), one with ceilometer backend storage (imageB), one with SIEM
on it (imageC).
And we have three service groups, one that binds imageA to {flavors:
[FlavorA], count:1}, one that binds imageB to {flavors:[FlavorB],
count:1}, one that binds imageC to {flavors:[FlavorC], count:1}

That's doable by the plumbing today, without any bypass of the Nova scheduler.
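
As a plain-data illustration (not actual Tuskar/TripleO code) of the three
bindings above:

    # Each service group binds one image to a list of flavors and a count.
    service_groups = [
        {"image": "imageA", "flavors": ["FlavorA"], "count": 1},  # core services
        {"image": "imageB", "flavors": ["FlavorB"], "count": 1},  # ceilometer storage
        {"image": "imageC", "flavors": ["FlavorC"], "count": 1},  # SIEM appliance
    ]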

FlavorB might be the same as the flavor for gluster boxes for
instance, in which case you'll get commonality - if one fails, we can
schedule onto another.

-Rob

-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-12 Thread Robert Collins
On 13 December 2013 05:35, Keith Basil  wrote:
> On Dec 10, 2013, at 5:09 PM, Robert Collins wrote:






 unallocated | available | undeployed
>>>
>>> +1 unallocated
>>
>> I think available is most accurate, but undeployed works too. I really
>> don't like unallocated, sorry!
>
> Would "available" introduce/denote that the service is deployed
> and operational?

It could lead to that confusion. Jaromir suggested 'free' in the other
thread; I think that would work well and avoid the confusion with
'working service' that 'available' has.


>> Brainstorming: role is something like 'KVM compute', but we may have
>> two differing only in configuration sets of that role. In a very
>> technical sense it's actually:
>> image + configuration -> scaling group in Heat.
>> So perhaps:
>> Role + Service group ?
>> e.g. GPU KVM Hypervisor would be a service group, using the KVM
>> Compute role aka disk image.
>>
>> Or perhaps we should actually surface image all the way up:
>>
>> Image + Service group ?
>> image = what things we build into the image
>> service group = what runtime configuration we're giving, including how
>> many machines we want in the group
>>
> How about just leaving it as Resource Class?  The things you've
> brainstormed about are in line with the original thinking around
> the resource class concept.
>
> role (assumes role specific image) +
> service/resource grouping +
> hardware that can provide that service/resource

So, Resource Class as originally communicated really is quite
different to me, though obviously there is some overlap. I can drill
into that if you want ... however, the implications of the words and
how folk can map from them back to the plumbing are what really
concern me, so that's what I'll focus on here.

Specifically: Resource Class was focused on the resources being
offered into the overcloud, but the image + (service config/service
group/group config) idea applies to all things we deploy equally -
it's relevant to management instances, control plane instances, as
well as Nova and Cinder. So the Resource part of it doesn't really
fit. Using 'Class' is just jargon - I would expect it to be pretty
impenetrable to non-programmers.

Ideally I think we want something that:
 - has a fairly obvious mapping back to Nova/Heat terminology (e.g. if
the concepts are the same, lets call them the same)
 - doesn't overlap other terms unless they are compatible.

For instance, Heat has a concept 'resourcegroup', where resource means
'the object that Heat has created and is managing' and the group
refers to scaling to some N of them. This is what we will eventually
back a particular image + config onto - that becomes one resourcegroup
in Heat; using resource class to refer to that, when the resource
referred to is the delivered service rather than 'Instances' (the Nova
baremetal instances we create through the resourcegroup), is going to
cause significant confusion at minimum :)

-Rob


-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Keystone] policy has no effect because of hard coded assert_admin?

2013-12-12 Thread Qiu Yu
On Fri, Dec 13, 2013 at 2:40 AM, Morgan Fainberg  wrote:

> As Dolph stated, V3 is where the policy file protects.  This is one of the
> many reasons why I would encourage movement to using V3 Keystone over V2.
>
> The V2 API is officially deprecated in the Icehouse cycle, I think that
> moving the decorator potentially could cause more issues than not as stated
> for compatibility.  I would be very concerned about breaking compatibility
> with deployments and maintaining the security behavior with the
> encouragement to move from V2 to V3.  I am also not convinced passing the
> context down to the manager level is the right approach.  Making a move on
> where the protection occurs likely warrants a deeper discussion (perhaps in
> Atlanta?).
>
>
Thanks for the background info. However, after a quick read-through of the
Keystone V3 API and existing blueprints, two questions still confuse me
regarding policy enforcement.

#1 The V3 policy API [1] seems to have nothing to do with the policy rules.
It appears to deal with access / secret keys only, so my understanding is
that it might be used for access-key authentication and related control.

Is there any use case / example for the V3 policy API? Is it even
related to the policy rules in the JSON file?

#2 I found these slides [2] by Adam Young online. On page 27, he mentions
that "isAdmin", currently in Nova, actually belongs in Keystone.

Some pointers would be really appreciated: an ML discussion or a blueprint
(I haven't found any so far), etc.

[1] http://api.openstack.org/api-ref-identity.html#Policy_Calls
[2] http://www.slideshare.net/kamesh001/openstack-keystone

Thanks,
--
Qiu Yu
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] State of the Gate - Dec 12

2013-12-12 Thread Anita Kuno

On 12/12/2013 08:20 AM, Sean Dague wrote:
> Current Gate Length: 12hrs*, 41 deep
> 
> (top of gate entered 12hrs ago)
> 
> It's been an *exciting* week this week. For people not paying
> attention we had 2 external events which made things terrible
> earlier in the week.
> 
> == Event 1: sphinx 1.2 complete breakage -
> MOSTLY RESOLVED ==
> 
> It turns out sphinx 1.2 + distutils (which the pbr magic calls through to)
> means total sadness. The fix for this was a requirements pin to
> sphinx < 1.2, and until a project has taken that, it will fail in
> the gate.
> 
> It also turns out that tox installs pre-release software by
> default (a terrible default behavior), so you also need a tox.ini
> change like this -
> https://github.com/openstack/nova/blob/master/tox.ini#L9 - otherwise
> local users will install things like sphinx 1.2b3. They will also
> break in other ways.
> 
> Not all projects have merged this. If you are a project that
> hasn't, please don't send any other jobs to the gate until you do.
> A lot of delay was added to the gate yesterday by Glance patches
> being pushed to the gate before their doc jobs were done.
> 
> == Event 2: apt.puppetlabs.com outage -
> RESOLVED ==
> 
> We use that apt repository to set up the devstack nodes in nodepool
> with puppet. We were triggering an issue with grenade where its
> apt-get calls were failing, because it does apt-get update once to
> make sure life is good. This only triggered in grenade (not other
> devstack runs) because we do set -o errexit aggressively.
> 
> A fix in grenade to ignore these errors was merged yesterday
> afternoon (see the purple line at
> http://status.openstack.org/elastic-recheck/ for where it
> showed up).
> 
> == Top Gate Bugs 
> ==
> 
> We normally do this as a list, and you can see the whole list here
> - http://status.openstack.org/elastic-recheck/ (now sorted by
> number of FAILURES in the last 2 weeks)
> 
> That being said, our biggest race bug is currently this one -
> https://bugs.launchpad.net/tempest/+bug/1253896 - and if you want
> to merge patches, fixing that one bug will be huge.
We have been trying to make progress on this one all day.

Salvatore Orlando was able to dig a bit more before he had to sign off
for some sleep; see comment #25.

Brent Eagles is working on this one; thanks for the reviews, dkranz and
dims: https://review.openstack.org/#/c/59517/2
It isn't expected to entirely fix the bug, but hopefully it will reduce
its frequency somewhat.

Don Kehn is trying to work on 1253896 to see what he can see; he
hasn't looked at this one before, so he is just getting familiar with it.

I wish I had something better to report. I have to go to bed soon
myself so I thought I would share the status we have in the hopes that
those rising soon will read and carry on.

Thanks to everyone with your help addressing this,
Anita.
> 
> Basically, you can't ssh into guests that get created. That's sort
> of a fundamental property of a cloud. It shows up more frequently
> on neutron jobs, possibly due to actually testing the metadata
> server path. There have been many attempts at retry logic for this:
> we actually retry for 196 seconds to get in and only fail once we
> can't get in, so waiting isn't helping. It doesn't seem like the
> env is under that much load.
> 
> Until we resolve this, life will not be good in landing patches.
> 
> -Sean
> 
> 
> 
> ___ OpenStack-dev
> mailing list OpenStack-dev@lists.openstack.org 
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Terminology

2013-12-12 Thread Robert Collins
On 12 December 2013 21:59, Jaromir Coufal  wrote:
> On 2013/12/12 01:21, Robert Collins wrote:
>>
>> On 12 December 2013 08:15, Tzu-Mainn Chen wrote:
>>>   * MANAGEMENT NODE - a node that has been mapped with an
>>> undercloud role
>>
>>
>> Pedantically, this is 'A node with an instance of a management role
>> running on it'. I think calling it 'management node' is too sticky.
>> What if we cold migrate it to another machine when a disk fails and we
>> want to avoid data loss if another disk were to fail?
>>
>> Management instance?
>
> I think the difference here is if I am looking on Nodes as HW stuff or if I
> am interested in services running on it. In the first case, I want to see
> 'Management Node', in the second case I want to see 'Management Instance'.
> So in the terms of Resources/Nodes it is valid to say 'Management Node'.

mmm, I don't really agree, I mean, I agree that if you're looking at
HW you want to look at Nodes. But we might migrate services between
nodes while keeping the same instance. Nodes should only be surfaced
when folk are actually addressing hardware IMO.

>>>   * SERVICE NODE - a node that has been mapped with an overcloud
>>> role
>>
>>
>> Again, the binding to node is too sticky IMNSHO.
>>
>> Service instance? Cloud instance?
>
> Same as above - depends on context of what I want to see.
>
> Service Instance is misleading. One service instance is for example
> nova-scheduler, not the whole node itself.
>
> I would avoid 'cloud' wording here. Service Node sounds fine for me, since
> the context is within Nodes/Resources.

Avoiding cloud - ack.

However, on instance - 'instance' is a very well defined term in Nova
and thus OpenStack: Nova boot gets you an instance, nova delete gets
rid of an instance, nova rebuild recreates it, etc. Instances run
[virtual|baremetal] machines managed by a hypervisor. So
nova-scheduler is not ever going to be confused with instance in the
OpenStack space IMO. But it brings up a broader question, which is -
what should we do when terms that are well defined in OpenStack - like
Node, Instance, Flavor - are not so well defined for new users? We
could use different terms, but that may confuse 'stackers, and will
mean that our UI needs its own dedicated terminology to map back to
e.g. the manuals for Nova and Ironic. I'm inclined to suggest that as
a principle, where there is a well defined OpenStack concept, that we
use it, even if it is not ideal, because the consistency will be
valuable.

...
>>>  * BLOCK STORAGE NODE - a service node that has been mapped
>>> to an overcloud block storage role
>>
>>
>> s/Node/instance/ ?
>
> Within deployment section, +1 for substitution. However with respect to my
> note above (service instance meaning).

Yeah - but see above, I don't think there is room for confusion with
e.g. nova-compute. However there may be room for confusion between
instance-running-on-baremetal and instance-deployed-as-virt - but TBH
I don't think that matters too much, if we go to docker or something
in future, we'd have physical and container instances both serving
stuff out, but it's not clear to me that we'd want to show these
things up as separate at the top level.


> -1. Availability is very broad term and might mean various things. I can
> have assigned nodes with some role which are available for me - in terms of
> reachability for example.
>
> I vote for unallocated, unassigned, free?

"Free nodes" works well IMO. It's a positive, direct statement.

>>>   * INSTANCE - A role deployed on a node - this is where work
>>> actually happens.
>
> Yes. However this term is overloaded as well. Can we find something better?

See above - it is, but I think something different would cause
confusion, not reduce it.

>>> * DEPLOYMENT
>>>   * SIZE THE ROLES - the act of deciding how many nodes will need to
>>> be assigned to each role
>>> * another option - DISTRIBUTE NODES (?)
>>>   - (I think the former is more accurate,
>>> but perhaps there's a better way to say it?)
>>
>>
>> Perhaps 'Size the cloud' ? "How big do you want your cloud to be?"
>
> * Design the deployment?
>
> (I am sorry for the aversion for 'cloud' - it's just used everywhere :))

I get that - thats fine.

>>>   * SCHEDULING - the process of deciding which role is deployed on
>>> which node
>>
>>
>> This possible should be a sub step of deployment.
>>
>>>   * SERVICE CLASS - a further categorization within a service role
>>> for a particular deployment.
>>
>>
>> See the other thread where I suggested perhaps bringing the image +
>> config aspects all the way up - I think that renames 'service class'
>> to 'Role configuration'. KVM Compute is a role configuration. KVM
>> compute(GPU) might be another.
>
> Role configuration sounds good to me.
>
> My only concern is - if/when we add multiple classes, role configuration
> doesn't sound accurate to me. Because Compute is a Role and if I have
> mul

Re: [openstack-dev] [Nova] Support for Pecan in Nova

2013-12-12 Thread Christopher Yeoh
On Fri, Dec 13, 2013 at 4:12 AM, Jay Pipes  wrote:

> On 12/11/2013 11:47 PM, Mike Perez wrote:
>
>> On 10:06 Thu 12 Dec , Christopher Yeoh wrote:
>>
>>> On Thu, Dec 12, 2013 at 8:59 AM, Doug Hellmann wrote:
>>>
>>>


  On Wed, Dec 11, 2013 at 3:41 PM, Ryan Petrello <ryan.petre...@dreamhost.com> wrote:
>>
>>>
  Hello,
>
> I’ve spent the past week experimenting with using Pecan for
> Nova’s
>
 API
>>
>>> and have opened an experimental review:
>
> https://review.openstack.org/#/c/61303/6
>
> …which implements the `versions` v3 endpoint using pecan (and
>
 paves the
>>
>>> way for other extensions to use pecan).  This is a *potential*
>
>  approach
>>
>>> I've considered for gradually moving the V3 API, but I’m open
> to other suggestions (and feedback on this approach).  I’ve
> also got a few open questions/general observations:
>
> 1.  It looks like the Nova v3 API is composed *entirely* of
> extensions (including “core” API calls), and that extensions
> and their routes are discoverable and extensible via installed
> software that registers
>
 itself
>>
>>> via stevedore.  This seems to lead to an API that’s composed of
>
>  installed
>>
>>> software, which in my opinion, makes it fairly hard to map out
> the
>
 API (as
>>
>>> opposed to how routes are manually defined in other WSGI
>
 frameworks).  I
>>
>>> assume at this time, this design decision has already been
>
 solidified for
>>
>>> v3?
>
>
 Yeah, I brought this up at the summit. I am still having some
 trouble understanding how we are going to express a stable core
 API for compatibility testing if the behavior of the API can be
 varied so significantly by deployment decisions. Will we just
 list each

>>> "required"
>>
>>> extension, and forbid any extras for a compliant cloud?


>>>  Maybe the issue is caused by me misunderstanding the term
 "extension," which (to me) implies an optional component but is
 perhaps reflecting a technical implementation detail instead?


  Yes and no :-) As Ryan mentions, all API code is a plugin in the V3
>>> API. However, some must be loaded or the V3 API refuses to start
>>> up. In nova/api/openstack/__init__.py we have
>>> API_V3_CORE_EXTENSIONS which hard codes which extensions must be
>>> loaded and there is no config option to override this (blacklisting
>>> a core plugin will result in the V3 API not starting up).
>>>
>>> So for compatibility testing I think what will probably happen is
>>> that we'll be defining a minimum set (API_V3_CORE_EXTENSIONS) that
>>> must be implemented and clients can rely on that always being
>>>
>> present
>>
>>> on a compliant cloud. But clients can also then query through
>>> /extensions what other functionality (which is backwards compatible
>>> with respect to core) may also be present on that specific cloud.
>>>
>>
>> This really seems similar to the idea of having a router class, some
>> controllers and you map them. From my observation at the summit,
>> calling everything an extension creates confusion. An extension
>> "extends" something. For example, Chrome has extensions, and they
>> extend the idea of the core features of a browser. If you want to do
>> more than back/forward, go to an address, stop, etc, that's an
>> extension. If you want it to play an audio clip "stop, hammer time"
>> after clicking the stop button, that's an example of an extension.
>>
>> In OpenStack, we use extensions to extend core. Core are the
>> essential feature(s) of the project. In Cinder for example, core is
>> volume. In core you can create a volume, delete a volume, attach a
>> volume, detach a volume, etc. If you want to go beyond that, that's
>> an extension. If you want to do volume encryption, that's an example
>> of an extension.
>>
>> I'm worried by the discrepancies this will create among the programs.
>> You mentioned maintainability being a plus for this. I don't think
>> it'll be great from the deployers perspective when you have one
>> program that thinks everything is an extension and some of them have
>> to be enabled that the deployer has to be mindful of, while the rest
>> of the programs consider all extensions to be optional.
>>
>
> +1. I agree with most of what Mike says above. The idea that there are
> core "extensions" in Nova's v3 API doesn't make a whole lot of sense to me.
>
>
So would it help if we used the term "plugin" to talk about the framework
that the API is implemented with,
and extensions when talking about things which extend the core API? So the
whole of the API is implemented
using plugins, while the core plugins are not considered to be extensions.
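
A minimal sketch (not the actual Nova code) of that distinction: every piece
of the API is discovered and loaded as a plugin, but a hard-coded core set
must be present or the API refuses to start, while anything beyond that set
is an optional extension. The namespace and plugin names here are
illustrative only.

    from stevedore import extension

    API_V3_CORE_PLUGINS = {"servers", "flavors", "images"}  # illustrative names

    def load_v3_api(namespace="nova.api.v3.extensions"):
        mgr = extension.ExtensionManager(namespace=namespace)
        missing = API_V3_CORE_PLUGINS - set(mgr.names())
        if missing:
            # Core plugins are mandatory; refuse to start without them.
            raise RuntimeError("missing core API plugins: %s"
                               % ", ".join(sorted(missing)))
        return mgr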

Chris
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [openstack-dev] [Nova] Support for Pecan in Nova

2013-12-12 Thread Christopher Yeoh
On Fri, Dec 13, 2013 at 8:12 AM, Jonathan LaCour <jonathan-li...@cleverdevil.org> wrote:

>
> On December 11, 2013 at 2:34:07 PM, Doug Hellmann (
> doug.hellm...@dreamhost.com) wrote:
>
> > On Wed, Dec 11, 2013 at 3:41 PM, Ryan Petrello wrote:
> >
> > > 1. It looks like the Nova v3 API is composed *entirely* of
> > > extensions (including “core” API calls), and that extensions and
> > > their routes are discoverable and extensible via installed
> > > software that registers itself via stevedore. This seems to lead
> > > to an API that’s composed of installed software, which in my
> > > opinion, makes it fairly hard to map out the API (as opposed to
> > > how routes are manually defined in other WSGI frameworks). I
> > > assume at this time, this design decision has already been
> > > solidified for v3?
> >
> > Yeah, I brought this up at the summit. I am still having some
> > trouble understanding how we are going to express a stable core API
> > for compatibility testing if the behavior of the API can be varied
> > so significantly by deployment decisions. Will we just list each
> > "required" extension, and forbid any extras for a compliant cloud?
> >
> > Maybe the issue is caused by me misunderstanding the term
> > "extension," which (to me) implies an optional component but is
> > perhaps reflecting a technical implementation detail instead?
>
> After taking a close look at how the API is constructed, I
> actually think that the current approach of having the API be
> defined purely through extensions is flawed, for a few reasons:
>
> 1. The code is extremely difficult to read and follow, because the API
>structure is entirely built at runtime based upon what is
>installed, rather than expressed declaratively in code.
>
>
So I'm too close to the code to have an unbiased opinion, but I found
learning the V2 API code, where parts of the API (core) were defined one way
and extensions defined another, even more confusing, since they are
attempting to achieve the same thing.

> 2. As a company providing a public cloud based upon OpenStack, with a
>desire to express compatibility with the "OpenStack API," its
>difficult to document the "standard" baseline Nova API. I shouldn't
>have to say "it depends" in API documentation.
>
>
The standard baseline for the Nova V3 API will be fixed. I think it's a
decision to be made higher up as to what is considered "OpenStack", but I'd
be surprised if OpenStack-compliant clouds were not permitted to add extra
functionality and remain compliant.


> 3. Based upon my read, extensions are in no way "quarantined" from the
>the baseline/standard/required API. In fact, they seem to be able
>to pollute the standard API with additional parameters and
>functionality. I can not envision a world in which this is sane.
>
> In my opinion, a well-designed and architected API should have the
> core functionality declaratively defined in the code itself, so as to
> give a good, well-documented, standard, and unchanging baseline. Then,
> an "extension" capability should be layered on in such a way that it
> doesn't alter the core API or serialized data.
>

Extensions can definitely add additional parameters and functionality, and
that has been a pretty fundamental requirement for Nova development.
Perhaps this will change as Nova matures, but at the moment we'd either have
to stop new features being added, like being able to specify a
preserve_ephemeral flag in rebuild (just as one example happening in
Icehouse), or have a major API version bump every release.

Note that extensions are restricted to making only backwards-compatible
changes - e.g. behaviour has to stay the same if extra input parameters are
not present in a request, and they can only add extra output parameters;
they can't change the value of existing ones.
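
A tiny sketch of what that restriction means in practice, using the
preserve_ephemeral example above (illustrative code only, not the real
rebuild handler):

    def do_rebuild(server, image_ref, preserve_ephemeral):
        # Placeholder for the real rebuild logic.
        return {"server": server, "image": image_ref,
                "preserve_ephemeral": preserve_ephemeral}

    def rebuild(server, body):
        image_ref = body["rebuild"]["imageRef"]
        # Extension input is optional; omitting it keeps the pre-extension
        # behaviour, which is what keeps the change backwards compatible.
        preserve_ephemeral = body["rebuild"].get("preserve_ephemeral", False)
        return do_rebuild(server, image_ref, preserve_ephemeral)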


>
> Note: my opinion isn’t altered by the fact that some of the “core”
> API involves “required” extensions. The result is still difficult to
> read and document.
>
> That said, I don’t want to diminish or minimize the hard work that
> has been done on the V3 API thus far! Lots of thinking and heavy
> lifting has already been done, and its much appreciated. I am just
> concerned that we lost our way somewhere. Major API revisions only
> come along so often, and I’d prefer to raise my objections now
> rather than to hold them in and regret it!
>
>
So the idea of the V3 API originally came out of primarily wanting to clean
up many of the warts, inconsistencies and bits of brokenness in the V2 API
that we couldn't fix because of a requirement to keep backwards
compatibility.  Perhaps in the future Nova's maturity will reach the point
where we can start making guarantees like the core API never being in any
way affected by extensions, but I don't think we're there yet.

Chris
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Cinder] Cloning vs copying images

2013-12-12 Thread Andrew Woodward
All,

I'd like to ask the core reviewers to review both patch sets again, since
they are now basically the same. I'd like to see either merged so the bug
can be closed and Cinder (and Ceph) improved. We can simply determine
preference by votes.

Thanks

Andrew Woodward
Mirantis


On Fri, Dec 6, 2013 at 10:11 AM, Dmitry Borodaenko  wrote:

> Dear All,
>
> The consensus in comments to both patches seems to be that the
> decision to clone an image based on disk format should be made in each
> driver, instead of being imposed on all drivers by the flow. Edward
> has updated his patch to follow the same logic as my patch, and I have
> updated my patch to include additional unit test improvements and
> better log messages lifted from Edward's version. The only difference
> between the patches now is that my patch passes the whole image_meta
> dictionary into clone_image while Edward's patch only passes the
> image_format string.
>
> Please review the patches once again and provide feedback on which
> should be merged. I naturally favor my version, which came up first,
> is consistent with other driver methods which also pass image_meta
> dictionary around, and prevents further refactoring down the road if
> any driver comes up with a reason to consider other fields of
> image_meta (e.g. size) when deciding whether an image can be cloned.
>
> Thanks,
> Dmitry Borodaenko
>
> On Mon, Dec 2, 2013 at 11:29 AM, Dmitry Borodaenko
>  wrote:
> > Hi OpenStack, particularly Cinder backend developers,
> >
> > Please consider the following two competing fixes for the same problem:
> >
> > https://review.openstack.org/#/c/58870/
> > https://review.openstack.org/#/c/58893/
> >
> > The problem being fixed is that some backends, specifically Ceph RBD,
> > can only boot from volumes created from images in a certain format, in
> > RBD's case, RAW. When an image in a different format gets cloned into
> > a volume, it cannot be booted from. Obvious solution is to refuse
> > clone operation and copy/convert the image instead.
> >
> > And now the principal question: is it safe to assume that this
> > restriction applies to all backends? Should the fix enforce copy of
> > non-RAW images for all backends? Or should the decision whether to
> > clone or copy the image be made in each backend?
> >
> > The first fix puts this logic into the RBD backend, and makes changes
> > necessary for all other backends to have enough information to make a
> > similar decision if necessary. The problem with this approach is that
> > it's relatively intrusive, because driver clone_image() method
> > signature has to be changed.
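> >
> > As a rough sketch of that first approach (following the description above,
> > not necessarily the exact code in the patch), the RBD driver would decide
> > based on the image metadata and decline to clone anything that is not RAW:
> >
> >     class RBDDriver(object):
> >         def clone_image(self, volume, image_location, image_meta):
> >             if image_meta.get("disk_format") != "raw":
> >                 # Refuse to clone; the flow falls back to copy/convert.
> >                 return None, False
> >             # ... perform the actual RBD clone here ...
> >             return {"provider_location": image_location}, True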
> >
> > The second fix has significantly less code changes, but it does
> > prevent cloning non-RAW images for all backends. I am not sure if this
> > is a real problem or not.
> >
> > Can anyone point at a backend that can boot from a volume cloned from
> > a non-RAW image? I can think of one candidate: GPFS is a file-based
> > backend, and GPFS has a file clone operation. Is the GPFS backend able
> > to boot from, say, a QCOW2 volume?
> >
> > Thanks,
> >
> > --
> > Dmitry Borodaenko
>
>
>
> --
> Dmitry Borodaenko
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>



-- 
If google has done it, Google did it right!
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Openstack] Friday Dec 20th - Doc Bug Day

2013-12-12 Thread Tom Fifield
Reminder: Friday next week - Doc Bug Day.

Current number of doc bugs: 505

On 22/11/13 11:27, Tom Fifield wrote:
> All,
> 
> This month, docs reaches 500 bugs, making it the 2nd-largest project by
> bug count in all of OpenStack. Yes, it beats Cinder, Horizon, Swift,
> Keystone and Glance, and will soon surpass Neutron.
> 
> In order to start the new year in a slightly better state, we have
> arranged a bug squash day:
> 
> 
> Friday, December 20th
> 
> 
> https://wiki.openstack.org/wiki/Documentation/BugDay
> 
> 
> Join us in #openstack-doc whenever you get to your computer, and let's
> beat the bugs :)
> 
> 
> For those who are unfamiliar:
> Bug days are a day-long event where all the OpenStack community focuses
> exclusively on a task around bugs corresponding to the bug day topic.
> With so many community members available around the same task, these
> days are a great way to start joining the OpenStack community.
> 
> 
> Regards,
> 
> 
> Tom
> 
> ___
> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
> Post to : openst...@lists.openstack.org
> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [trove] configuration groups and datastores type/versions

2013-12-12 Thread McReynolds, Auston
Another Example:

  Datastore Type | Version
  ------------------------
  MySQL 5.5      | 5.5.35
  MySQL 5.5      | 5.5.20
  MySQL 5.6      | 5.6.15

A user creates a MySQL 5.5 configuration-group that merely consists
of a innodb_buffer_pool_size override. The innodb_buffer_pool_size
parameter is still featured in MySQL 5.6, so arguably the
configuration-group should work with MySQL 5.6 as well. If a
configuration-group can only be tied to a single datastore type
and/or a single datastore-version, this will not work.

To support all possible permutations, a "compatibility" list of sorts
has to be introduced.

Table: configuration_datastore_compatibility

  Name            | Description
  --------------------------------------------------
  id              | PrimaryKey, Generated UUID
  from_version_id | ForeignKey(datastore_version.id)
  to_version_id   | ForeignKey(datastore_version.id)

The cloud provider can then be responsible for updating the
compatibility table (via trove-manage) whenever a new version of a
datastore is introduced and has a strict superset of configuration
parameters as compared to previous versions.
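
An illustrative SQLAlchemy sketch of that table (the column names follow the
table above; everything else is an assumption):

    from sqlalchemy import Column, ForeignKey, String
    from sqlalchemy.ext.declarative import declarative_base

    Base = declarative_base()

    class ConfigurationDatastoreCompatibility(Base):
        __tablename__ = "configuration_datastore_compatibility"

        id = Column(String(36), primary_key=True)  # generated UUID
        from_version_id = Column(String(36),
                                 ForeignKey("datastore_version.id"))
        to_version_id = Column(String(36),
                               ForeignKey("datastore_version.id"))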

On a related note, it would probably behoove us to consider how to
handle datastore migrations in relation to configuration-groups.
A rough-draft blueprint/gist for datastore migrations is located at
https://gist.github.com/amcrn/dfd493200fcdfdb61a23.


Auston

---

From:  Craig Vyvial 
Reply-To:  "OpenStack Development Mailing List (not for usage questions)"

Date:  Wednesday, December 11, 2013 8:52 AM
To:  OpenStack Development Mailing List 
Subject:  [openstack-dev] [trove] configuration groups and
datastores  type/versions


Configuration groups are currently developed to associate the datastore
version with a configuration that is created. If a datastore version is
not presented, it will use the default, similar to the way instances are
created now. This looks like a way of associating the configuration with a
datastore, because an instance has this same association.

Depending on how you set up your datastore types and versions, this might
not be ideal.
Example:
Datastore Type | Version
------------------------
Mysql          | 5.1
Mysql          | 5.5
Percona        | 5.5
------------------------

Configuration      | datastore_version
---------------------------------------
mysql-5.5-config   | mysql 5.5
percona-5.5-config | percona 5.5
---------------------------------------

or

Datastore Type | Version
------------------------
Mysql 5.1      | 5.1.12
Mysql 5.1      | 5.1.13
Mysql          | 5.5.32
Percona        | 5.5.44
------------------------

Configuration      | datastore_version
---------------------------------------
mysql-5.1-config   | mysql 5.5
percona-5.5-config | percona 5.5
---------------------------------------



Notice that if you associate the configuration with a datastore version
then in the latter example you will not be able to use the same
configurations that you created with different minor versions of the
datastore. 

Something that we should consider is allowing a configuration to be
associated with just a datastore type (e.g. MySQL 5.1), so that any
version of 5.1 would allow the same configuration to be applied.

I do not view this as a change that needs to happen before the current
code is merged but more as an additive feature of configurations.


*snippet from Morris and me talking about this*
Given the nature of how the datastore / types code has been implemented, in
that it is highly configurable, I believe that we need to adjust the way in
which we are associating configuration groups with datastore types and
versions.  The main use case that I am considering here is that, as a user
of the API, I want to be able to associate configurations with a specific
datastore type so that I can easily return a list of the configurations that
are valid for that database type (example: get me a list of configurations
for MySQL 5.6).  We know that configurations will vary across types (MySQL
vs. Redis) as well as across major versions (MySQL 5.1 vs MySQL 5.6).
Presently, the code only keys off the datastore version, and consequently,
if I were to set up my datastore type as MySQL X.X and datastore versions as
X.X.X, then you would potentially be associating a configuration with a
specific minor version such as MySQL 5.1.63.  Given that, I am thinking that
it makes more sense to allow a configuration to be associated with both a
datastore type AND a datastore version, with precedence given to the
datastore type (where both attributes are either optional, or at least one
is required).  This would give the most flexibility to associate
configurations with either the type, the version, or both, and would allow
it to work across providers, given that they are likely to configure
types/versions differently.

[openstack-dev] [Swift] Release of Swift 1.11.0

2013-12-12 Thread John Dickinson
I'm happy to announce that we've released Swift 1.11.0. You can find
the high-level Launchpad details (including a link to the tarball) at
https://launchpad.net/swift/icehouse/1.11.0.

As always, you can upgrade to this release without any downtime to your
users.

Swift 1.11.0 is the work of 26 contributors, including the following 5
new contributors to Swift:

Rick Hawkins
Steven Lang
Gonéri Le Bouder
Zhenguo Niu
Aaron Rosen

This release includes some significant new features. I encourage you
to read the change log
(https://github.com/openstack/swift/blob/master/CHANGELOG), and I'll
highlight some of the more significant changes below.

* Discoverable capabilities: The Swift proxy server will now respond
  to /info requests with information about the particular cluster
  being queried. This will allow easy programmatic discovery of limits
  and features implemented in a particular Swift system. The first two
  obvious use cases are for cross-cluster clients (e.g. common client
  between Rackspace, HP, and a private deployment) and for deeper
  functional testing of all parts of the Swift API.
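
  For example, a client can query the endpoint directly (the proxy URL below
  is a placeholder; max_file_size is one of the advertised limits):

      import json
      import urllib2

      info = json.load(urllib2.urlopen("http://proxy.example.com:8080/info"))
      print(info["swift"]["max_file_size"])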

* Early quorum response: On writes, the Swift proxy server will not
  return success unless a quorum of the storage nodes indicate they
  have successfully written data to disk. Previously, the proxy waited
  for all storage nodes to respond, even if it had already heard from
  a quorum of servers. With this change, the proxy node will be able
  to respond to client requests as soon as a quorum of the storage
  nodes indicate a common response. This can help lower response times
  to clients and improve performance of the cluster.

* Retry reads: If a storage server fails during an object read
  request, the proxy will now continue the response stream to the
  client by making a request to a different replica of the data. For
  example, if a client requests a 3GB object and the particular object
  server serving the response fails during the request after 1.25GB,
  the proxy will make a range request to a different replica, asking
  for the data starting at 1.25GB into the file. In this way, Swift
  provides even higher availability to your data in the face of
  hardware failures.
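
  The proxy-side behaviour is roughly equivalent to issuing a ranged GET to
  another replica, picking up at the byte offset already streamed (1.25 GB in
  the example above); a sketch only, with placeholder hostname and path:

      import requests

      bytes_sent = int(1.25 * 1024 ** 3)  # bytes already sent to the client
      resp = requests.get("http://replica-2.example.com/obj",
                          headers={"Range": "bytes=%d-" % bytes_sent},
                          stream=True)
      assert resp.status_code == 206  # Partial Content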

* DiskFile API: The DiskFile abstraction for talking to data on disk
  has been refactored to allow alternate implementations to be
  developed. There is an example in-memory implementation included in
  the codebase. External implementations include one for Gluster and
  one for Seagate Kinetic drives. The DiskFile API is still a work in
  progress and is not yet finalized.

* Object replication ssync (an rsync alternative): A Swift storage
  node can now be configured to use Swift primitives for replication
  transport instead of rsync. Although still being tested at scale,
  this mechanism will allow for future development improving
  replication times and lowering both MTTD and MTTR of errors.

I'd like to publicly thank the Swift contributors and core developers
for their work on Swift. Their diverse experience and viewpoints make
Swift the mature project it is, capable of running the world's largest
storage clouds.

--John





___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [neutron] [policy] Policy-group relationship

2013-12-12 Thread Mohammad Banikazemi

Continuing the discussion we had earlier today during the Neutron Group
Policy weekly meeting [0], I would like to initiate a couple of email
threads and follow up on a couple of important issues we need to agree on
so we can move forward. In this email thread, I would like to discuss the
policy-group relationship.

I want to summarize the discussion we had during the meeting [1] and see if
we have reached an agreement:

There are two models for expressing the relationship between Groups and
Policies that were discussed:
1- Policies are defined for a source Group and a destination Group
2- Groups specify the Policies they "provide" and the Policies they
"consume"

As expressed during the IRC meeting, both models have strong support and we
decided to have a resource model that can be used to express both models.
The solution we came up with was rather simple:

Update the resource model (shown in the attribute tables and the taxonomy
in the google doc [2]) such that policy can refer to a "list" of source
Groups and a "list" of destination Groups.
This boils down to having two attributes, namely src_groups and
destination_groups (both lists of uuid-str type), replacing the current
attributes src_group and dest_group, respectively.

This change simply allows the support for both models. For supporting model
1, specify a single source Group and a single destination Group. For model
2, specify the producers of a Policy in the source Group list and specify
the consumers of the Policy in the destination Group list.
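
A sketch of how the two attributes might look in a Neutron-style attribute
map (the validators and defaults here are assumptions, not settled API):

    POLICY_GROUP_ATTRIBUTES = {
        'src_groups': {'allow_post': True, 'allow_put': True,
                       'validate': {'type:uuid_list': None},
                       'default': [], 'is_visible': True},
        'destination_groups': {'allow_post': True, 'allow_put': True,
                               'validate': {'type:uuid_list': None},
                               'default': [], 'is_visible': True},
    }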

If there is agreement, I will update the taxonomy and the attribute tables
in the doc.

Best,

Mohammad


[0] https://wiki.openstack.org/wiki/Meetings/Neutron_Group_Policy
[1]
http://eavesdrop.openstack.org/meetings/networking_policy/2013/networking_policy.2013-12-12-16.01.log.html
[2]
https://docs.google.com/document/d/1ZbOFxAoibZbJmDWx1oOrOsDcov6Cuom5aaBIrupCD9E/edit#heading=h.x1h06xqhlo1n
   (Note the new additions are at the end of the document.)
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Terminology

2013-12-12 Thread Lucas Alvares Gomes
Hi

2) UNDEPLOYED NODE - a node that has not been deployed with an instance
>
> Other suggestions included UNASSIGNED, UNMAPPED, FREE, and AVAILABLE.
>  Some people (I'm one of them)
> find AVAILABLE to be a bit of an overloaded term, as it can also be
> construed to mean that, say,
> a service instance is now running on a node and is now "available for
> use".  I'm in favor of an
> "UN" word, and it sounds like "UNDEPLOYED" was the most generally
> acceptable?
>

 Maybe unprovisioned would be a better description? Meaning the node has been
discovered but not yet configured.
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [keystone] domain admin role query

2013-12-12 Thread Henry Nash
Hi

So the idea wasn't that you create a domain with the id of 'admin_domain_id';
rather, you create the domain that you plan to use for your admin domain,
and then paste its (auto-generated) domain_id into the policy file.
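
One possible way to do that with python-keystoneclient (the endpoint and
token are placeholders): create the admin domain, then paste the generated
id into policy.v3cloudsample.json in place of 'admin_domain_id'.

    from keystoneclient.v3 import client

    ks = client.Client(token="ADMIN_TOKEN",
                       endpoint="http://keystone.example.com:35357/v3")
    admin_domain = ks.domains.create(name="admin_domain",
                                     description="cloud admin domain")
    print(admin_domain.id)  # substitute for admin_domain_id in the policy file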

Henry
On 12 Dec 2013, at 03:11, Paul Belanger  wrote:

> On 13-12-11 11:18 AM, Lyle, David wrote:
>> +1 on moving the domain admin role rules to the default policy.json
>> 
>> -David Lyle
>> 
>> From: Dolph Mathews [mailto:dolph.math...@gmail.com]
>> Sent: Wednesday, December 11, 2013 9:04 AM
>> To: OpenStack Development Mailing List (not for usage questions)
>> Subject: Re: [openstack-dev] [keystone] domain admin role query
>> 
>> 
>> On Tue, Dec 10, 2013 at 10:49 PM, Jamie Lennox wrote:
>> Using the default policies, it will simply check for the admin role and not
>> care about the domain that admin is limited to. This is partially a
>> leftover from the V2 API, when there weren't domains to worry about.
>> 
>> A better example of policies are in the file etc/policy.v3cloudsample.json. 
>> In there you will see the rule for create_project is:
>> 
>>   "identity:create_project": "rule:admin_required and 
>> domain_id:%(project.domain_id)s",
>> 
>> as opposed to (in policy.json):
>> 
>>   "identity:create_project": "rule:admin_required",
>> 
>> This is what you are looking for to scope the admin role to a domain.
>> 
>> We need to start moving the rules from policy.v3cloudsample.json to the 
>> default policy.json =)
>> 
>> 
>> Jamie
>> 
>> - Original Message -
>>> From: "Ravi Chunduru" 
>>> To: "OpenStack Development Mailing List" 
>>> Sent: Wednesday, 11 December, 2013 11:23:15 AM
>>> Subject: [openstack-dev] [keystone] domain admin role query
>>> 
>>> Hi,
>>> I am trying out Keystone V3 APIs and domains.
>>> I created a domain, created a project in that domain, and created a user in
>>> that domain and project.
>>> Next, gave an admin role for that user in that domain.
>>> 
>>> I am assuming that user is now admin to that domain.
>>> Now, I got a scoped token with that user, domain and project. With that
>>> token, I tried to create a new project in that domain. It worked.
>>> 
>>> But, using the same token, I could also create a new project in a 'default'
>>> domain too. I expected it should throw authentication error. Is it a bug?
>>> 
>>> Thanks,
>>> --
>>> Ravi
>>> 
> 
> One of the issues I had this week while using policy.v3cloudsample.json 
> was that I had no easy way of creating a domain with the id of 'admin_domain_id'.  
> I basically had to modify the SQL directly to do it.
> 
> Any chance we can create a 2nd domain using 'admin_domain_id' via 
> keystone-manage db_sync?
> 
> -- 
> Paul Belanger | PolyBeacon, Inc.
> Jabber: paul.belan...@polybeacon.com | IRC: pabelanger (Freenode)
> Github: https://github.com/pabelanger | Twitter: 
> https://twitter.com/pabelanger
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Incubation Request for Barbican

2013-12-12 Thread Morgan Fainberg
On December 12, 2013 at 14:32:36, Dolph Mathews (dolph.math...@gmail.com) wrote:

On Thu, Dec 12, 2013 at 2:58 PM, Adam Young  wrote:
On 12/04/2013 08:58 AM, Jarret Raim wrote:
While I am all for adding a new program, I think we should only add one
if we  
rule out all existing programs as a home. With that in mind why not add
this  
to the  keystone program? Perhaps that may require a tweak to keystones
mission  
statement, but that is doable. I saw a partial answer to this somewhere
but not a full one.
From our point of view, Barbican can certainly help solve some problems
related to identity like SSH key management and client certs. However,
there is a wide array of functionality that Barbican will handle that is
not related to identity.


Some examples, there is some additional detail in our application if you
want to dig deeper [1].


* Symmetric key management - These keys are used for encryption of data at
rest in various places including Swift, Nova, Cinder, etc. Keys are
resources that roll up to a project, much like servers or load balancers,
but they have no direct relationship to an identity.

* SSL / TLS certificates - The management of certificate authorities and
the issuance of keys for SSL / TLS. Again, these are resources rather than
anything attached to identity.

* SSH Key Management - These could certainly be managed through keystone
if we think that's the right way to go about it, but from Barbican's point
of view, these are just another type of a key to be generated and tracked
that roll up to an identity.


* Client certificates - These are most likely tied to an identity, but
again, just managed as resources from a Barbican point of view.

* Raw Secret Storage - This functionality is usually used by applications
residing on a Cloud. An app can use Barbican to store secrets such as
sensitive configuration files, encryption keys and the like. This data
belongs to the application rather than any particular user in Keystone.
For example, some Rackspace customers don't allow their application dev /
maintenance teams direct access to the Rackspace APIs.

* Boot Verification - This functionality is used as part of the trusted
boot functionality for transparent disk encryption on Nova.

* Randomness Source - Barbican manages HSMs which allow us to offer a
source of true randomness.



In short (ha), I would encourage everyone to think of keys / certificates
as resources managed by an API in much the same way we think of VMs being
managed by the Nova API. A consumer of Barbican (either as an OpenStack
service or a consumer of an OpenStack cloud) will have an API to create
and manage various types of secrets that are owned by their project.

My reason for keeping them separate is more practical:  the Keystone team is 
already somewhat overloaded.  I know that a couple of us have interest in 
contributing to Barbican, the question is time and prioritization. 

Unless there is some benefit to having both projects in the same program with 
essentially different teams, I think Barbican should proceed as is.  I 
personally plan on contributing to Barbican.

/me puts PTL hat on

++ I don't want Russell's job.

Keystone has a fairly narrow mission statement in my mind (come to think of it, 
I need to propose it to governance..), and that's basically to abstract away 
the problem of authenticating and authorizing the API users of other openstack 
services. Everything else, including identity management, key management, key 
distribution, quotas, etc, is just secondary fodder that we tend to help with 
along the way... but they should be first class problems in someone else's mind.

If we rolled everything together that kind of looks related to keystone under a 
big keystone program for the sake of organizational tidiness, I know I would be 
less effective as a "PTL" and that's a bit disheartening. That said, I'm always 
happy to help where I can.
 
The long and the short of it is that I can’t argue that Barbican couldn’t be 
considered a mechanism of “Identity” (in most everything keys end up being a 
form of Identity, and the management of that would fit nicely under the 
“Identity Program”).  That being said I also can’t argue that Barbican 
shouldn’t be its own top-level program.  It comes down to the best fit for 
OpenStack as a whole.

From a deployer standpoint, I don’t think it will make any real difference if 
Barbican is in Identity or its own program.  Basically, it’ll be a separate 
process to run in either case.  It will have its own rules and quirks.

From a developer standpoint, I don’t think it will make a significant 
difference (besides, perhaps where documentation lies).  The contributors to 
Keystone will contribute (or not) to Barbican and vice-versa based upon 
interest/time/needs.

From a community and communication standpoint (which is the important part 
here), I think it comes down to messaging and what Barbican is meant to be.  If 
we are happy messaging that it is a sepa

Re: [openstack-dev] [Neutron][ML2] Unit test coverage

2013-12-12 Thread Amir Sadoughi
Mathieu,

Here are my results for running the unit tests for the agents.

I ran `tox -e cover neutron.tests.unit.openvswitch.test_ovs_neutron_agent` at 
3b4233873539bad62d202025529678a5b0add412 with the following result:

Name                                                    Stmts   Miss  Branch  BrMiss  Cover
…
neutron/plugins/openvswitch/agent/ovs_neutron_agent       639    257     237     123    57%
…

and `tox -e cover neutron.tests.unit.linuxbridge.test_lb_neutron_agent` with 
the following result:

...
neutron/plugins/linuxbridge/agent/linuxbridge_neutron_agent   607    134     255      73    76%
...

Amir


On Dec 11, 2013, at 3:01 PM, Mathieu Rohon  wrote:

> the coverage is quite good on the ML2 plugin.
> it looks like the biggest effort should be done on the ovs and lb agents, no?
> 
> On Wed, Dec 11, 2013 at 9:00 PM, Amir Sadoughi
>  wrote:
>> From today’s ML2 meeting, I had an action item to produce coverage report
>> for ML2 unit tests.
>> 
>> Here is the command line output of the tests and report I produced:
>> 
>> http://paste.openstack.org/show/54845/
>> 
>> Amir Sadoughi
>> 
>> ___
>> OpenStack-dev mailing list
>> OpenStack-dev@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>> 
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Keystone] policy has no effect because of hard coded assert_admin?

2013-12-12 Thread Dolph Mathews
On Thu, Dec 12, 2013 at 12:40 PM, Morgan Fainberg  wrote:

> As Dolph stated, V3 is where the policy file protects.  This is one of the
> many reasons why I would encourage movement to using V3 Keystone over V2.
>
> The V2 API is officially deprecated in the Icehouse cycle, I think that
> moving the decorator potentially could cause more issues than not as stated
> for compatibility.  I would be very concerned about breaking compatibility
> with deployments and maintaining the security behavior with the
> encouragement to move from V2 to V3.  I am also not convinced passing the
> context down to the manager level is the right approach.  Making a move on
> where the protection occurs likely warrants a deeper discussion (perhaps in
> Atlanta?).
>

++ I *should* have written "could be moved to the manager layer." I don't
actually think they should, at least at the moment. With v2.0 gone, it
would be a more interesting, more approachable discussion.
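
For readers following the thread, a much-simplified sketch of the two
protection styles being contrasted (illustrative only, not the actual keystone
source):

    # v2-style controller: the admin check is hard-coded, so a custom
    # policy.json rule such as "admin_or_ops" is never consulted.
    class TenantV2Controller(object):
        def create_project(self, context, tenant):
            self.assert_admin(context)      # hard-coded admin check
            ...

    # v3-style controller: protection is driven by the policy file, so
    # "identity:create_project" can be remapped to any rule.
    from keystone.common import controller

    class ProjectV3Controller(controller.V3Controller):
        @controller.protected()             # looks up identity:create_project
        def create_project(self, context, project):
            ...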


>
> Cheers,
> Morgan Fainberg
>
> On December 12, 2013 at 10:32:40, Dolph Mathews 
> (dolph.math...@gmail.com)
> wrote:
>
> The policy file is protecting v3 API calls at the controller layer, but
> you're calling the v2 API. The policy decorators should be moved to the
> manager layer to protect both APIs equally... but we'd have to be very
> careful not to break deployments depending on the trivial "assert_admin"
> behavior (hence the reason we only wrapped v3 with the new policy
> decorators).
>
>
> On Thu, Dec 12, 2013 at 1:41 AM, Qiu Yu  wrote:
>
>>  Hi,
>>
>> I was trying to fine tune some keystone policy rules. Basically I want to
>> grant "create_project" action to user in "ops" role. And following are my
>> steps.
>>
>> 1. Adding a new user "usr1"
>> 2. Creating new role "ops"
>> 3. Granting this user a "ops" role in "service" tenant
>> 4. Adding new lines to keystone policy file
>>
>>  "ops_required": [["role:ops"]],
>> "admin_or_ops": [["rule:admin_required"], ["rule:ops_required"]],
>>
>> 5. Change
>>
>> "identity:create_project": [["rule:admin_required"]],
>> to
>> "identity:create_project": [["rule:admin_or_ops"]],
>>
>> 6. Restart keystone service
>>
>> keystone tenant-create with credential of user "usr1" still returns 403
>> Forbidden error.
>> “You are not authorized to perform the requested action, admin_required.
>> (HTTP 403)”
>>
>> After some quick scan, it seems that create_project function has a
>> hard-coded assert_admin call[1], which does not respect settings in the
>> policy file.
>>
>> Any ideas why? Is it a bug to fix? Thanks!
>> BTW, I'm running keystone havana release with V2 API.
>>
>> [1]
>> https://github.com/openstack/keystone/blob/master/keystone/identity/controllers.py#L105
>>
>> Thanks,
>> --
>> Qiu Yu
>>
>> ___
>> OpenStack-dev mailing list
>> OpenStack-dev@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>>
>
>
> --
>
> -Dolph
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>


-- 

-Dolph
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [bugs] definition of triaged

2013-12-12 Thread Dolph Mathews
On Thu, Dec 12, 2013 at 3:46 PM, Robert Collins
wrote:

> Hi, I'm trying to overhaul the bug triage process for nova (initially)
> to make it much lighter and more effective.
>
> I'll be sending a more comprehensive mail shortly but one thing that
> has been giving me pause is this:
> "
> Confirmed The bug was reproduced or confirmed as a genuine bug
> Triaged The bug comments contain a full analysis on how to properly
> fix the issue
> "
>
> From wiki.openstack.org/wiki/Bugs
>
> Putting aside the difficulty of complete reproduction sometimes, I
> don't understand the use of Triaged here.
>
> In LP they mean:
>
> Confirmed Verified by someone other than the reporter.
> Triaged Verified by the bug supervisor.
>
> So our meaning is very divergent. I'd like us to consolidate on the
> standard meaning - which is that the relative priority of having a
> doctor [developer] attack the problem has been assessed.
>
> Specifically:
>  - we should use Triaged to indicate that:
> - we have assigned a priority
> - we believe it's a genuine bug
> - we have routed[tagged] it to what is probably the right place
>

++ that's exactly how I use it, with some emphasis on "believe", which I use
to differentiate this from "Confirmed" ...


> [vendor driver/low-hanging-fruit etc]
>  - we should use Incomplete if we aren't sure that its a bug and need
> the reporter to tell us more to be sure
>  - triagers shouldn't ever set 'confirmed' - thats reserved solely for
> end users to tell us that more than one user is encountering the
> problem.
>

As a "triager", if I put my user hat on and am able to reproduce a bug,
I'll mark Confirmed, otherwise it's just Triaged.


>
> -Rob
>
>
>
> --
> Robert Collins 
> Distinguished Technologist
> HP Converged Cloud
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>



-- 

-Dolph
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Announcing Fuel

2013-12-12 Thread Andrew Woodward
Mike,

It's great that we are continuing to align fuel with TripleO. I've been
toying with the idea of using TripleO components and CEPH to enhance our CI
infra for a while now, and this announcement has encouraged me to write it
down. I've proposed a BP for "Super fast deploy CI" that would help us
perform CI better by "accelerat[ing] fuel CI testing by taking reasonable
shortcuts integrating with several existing components like TripleO and
CEPH".

Blueprint:
https://blueprints.launchpad.net/fuel/+spec/fuel-superfastdeploy-ci
Pad Spec: https://etherpad.openstack.org/p/bp-fuel-superfastdeploy-ci

--
Andrew Woodward
Mirantis


On Thu, Dec 12, 2013 at 6:31 AM, Mike Scherbakov
wrote:

> Folks,
>
> Most of you by now have heard of Fuel, which we’ve been working on as a
> related OpenStack project for a period of time -
> see https://launchpad.net/fuel and
> https://wiki.openstack.org/wiki/Fuel. The aim of the project is to
> provide a distribution agnostic and plug-in agnostic engine for preparing,
> configuring and ultimately deploying various “flavors” of OpenStack in
> production. We’ve also used Fuel in most of our customer engagements to
> stand up an OpenStack cloud.
>
>  At the same time, we’ve been actively involved with TripleO, which we
> believe to be a great effort in simplifying deployment, operations, scaling
> (and eventually upgrading) of OpenStack.
>
> Per our discussions with core TripleO team during the Icehouse summit,
> we’ve uncovered that while there are certain areas of collision, most of
> the functionality in TripleO and Fuel is complementary. In general, Fuel
> helps solve many problems around “step zero” of setting up an OpenStack
> environment, such as auto-discovery and inventory of bare metal hardware,
> pre-deployment & post-deployment environment  checks, and wizard-driven
> web-based configuration of OpenStack flavors. At the same time, TripleO has
> made great progress in deployment, scaling and operations (with Tuskar).
>
> We’d like to propose an effort for community consideration to bring the
> two initiatives closer together to eventually arrive at a distribution
> agnostic, community supported framework covering the entire spectrum of
> deployment, management and upgrades; from “step zero” to a fully functional
> and manageable production-grade OpenStack environment.
>
> To that effect, we propose the following high-level roadmap plans for this
> effort:
>
>
> - Keep and continue to evolve bare-metal discovery and inventory module
>   of Fuel, tightly integrating it with Ironic.
> - Keep and continue to evolve Fuel’s wizard-driven OpenStack flavor
>   configurator. In the near term we’ll work with the UX team to unify the
>   user experience across Fuel, TripleO and Tuskar. We are also thinking about
>   leveraging diskimagebuilder.
> - Continue to evolve Fuel’s pre-deployment (DHCP, L2 connectivity
>   checks) and post-deployment validation checks in collaboration with the
>   TripleO and Tempest teams.
> - Eventually replace Fuel’s current orchestration engine
>   https://github.com/stackforge/fuel-astute/ with Heat
>
>
> We’d love to open discussion on this and hear everybody’s thoughts on this
> direction.
>
> --
> Mike Scherbakov
> Fuel Engineering Lead
> #mihgen
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Incubation Request for Barbican

2013-12-12 Thread Dolph Mathews
On Thu, Dec 12, 2013 at 2:58 PM, Adam Young  wrote:

>  On 12/04/2013 08:58 AM, Jarret Raim wrote:
>
>  While I am all for adding a new program, I think we should only add one
> if we
> rule out all existing programs as a home. With that in mind why not add
> this
> to the  keystone program? Perhaps that may require a tweak to keystones
> mission
> statement, but that is doable. I saw a partial answer to this somewhere
> but not a full one.
>
>  From our point of view, Barbican can certainly help solve some problems
> related to identity like SSH key management and client certs. However,
> there is a wide array of functionality that Barbican will handle that is
> not related to identity.
>
>
> Some examples, there is some additional detail in our application if you
> want to dig deeper [1].
>
>
> * Symmetric key management - These keys are used for encryption of data at
> rest in various places including Swift, Nova, Cinder, etc. Keys are
> resources that roll up to a project, much like servers or load balancers,
> but they have no direct relationship to an identity.
>
> * SSL / TLS certificates - The management of certificate authorities and
> the issuance of keys for SSL / TLS. Again, these are resources rather than
> anything attached to identity.
>
> * SSH Key Management - These could certainly be managed through keystone
> if we think that's the right way to go about it, but from Barbican's point
> of view, these are just another type of a key to be generated and tracked
> that roll up to an identity.
>
>
> * Client certificates - These are most likely tied to an identity, but
> again, just managed as resources from a Barbican point of view.
>
> * Raw Secret Storage - This functionality is usually used by applications
> residing on a Cloud. An app can use Barbican to store secrets such as
> sensitive configuration files, encryption keys and the like. This data
> belongs to the application rather than any particular user in Keystone.
> For example, some Rackspace customers don't allow their application dev /
> maintenance teams direct access to the Rackspace APIs.
>
> * Boot Verification - This functionality is used as part of the trusted
> boot functionality for transparent disk encryption on Nova.
>
> * Randomness Source - Barbican manages HSMs which allow us to offer a
> source of true randomness.
>
>
>
> In short (ha), I would encourage everyone to think of keys / certificates
> as resources managed by an API in much the same way we think of VMs being
> managed by the Nova API. A consumer of Barbican (either as an OpenStack
> service or a consumer of an OpenStack cloud) will have an API to create
> and manage various types of secrets that are owned by their project.
>
>
> My reason for keeping them separate is more practical:  the Keystone team
> is already somewhat overloaded.  I know that a couple of us have interest
> in contributing to Barbican, the question is time and prioritization.
>
> Unless there is some benefit to having both projects in the same program
> with essentially different teams, I think Barbican should proceed as is.  I
> personally plan on contributing to Barbican.
>

/me puts PTL hat on

++ I don't want Russell's job.

Keystone has a fairly narrow mission statement in my mind (come to think of
it, I need to propose it to governance..), and that's basically to abstract
away the problem of authenticating and authorizing the API users of other
openstack services. Everything else, including identity management, key
management, key distribution, quotas, etc, is just secondary fodder that we
tend to help with along the way... but they should be first class problems
in someone else's mind.

If we rolled everything together that kind of looks related to keystone
under a big keystone program for the sake of organizational tidiness, I
know I would be less effective as a "PTL" and that's a bit disheartening.
That said, I'm always happy to help where I can.


>
>
>
>  Keystone plays a critical role for us (as it does with every service) in
> authenticating the user to a particular project and storing the roles that
> the user has for that project. Barbican then enforces these restrictions.
> However, keys / secrets are fundamentally divorced from identities in much
> the same way that databases in Trove are, they are owned by a project, not
> a particular user.
>
> Hopefully our thought process makes some sense, let me know if I can
> provide more detail.
>
>
>
> Jarret
>
>
>
>
>
> [1] https://wiki.openstack.org/wiki/Barbican/Incubation
>
>
>
> ___
> OpenStack-dev mailing 
> listOpenStack-dev@lists.openstack.orghttp://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>


-- 

-Dolph
___
OpenStack-dev mai

Re: [openstack-dev] [Horizon] Nominations to Horizon Core

2013-12-12 Thread Paul McMillan
> From: Russell Bryant 
> We can involve people in security reviews without having them on the
> core review team.  They are separate concerns.

As I noted in my original mail, this was my primary concern. I just didn't want 
"not core" to stand in the way of "is appropriate to provide security review 
for private patches on Launchpad". If that is the case, I want to be sure that 
there is someone on core who has the appropriate domain-specific knowledge to 
make sure the patch is thorough and correct.

I'll leave the rest of the argument about why this is important for after I 
finish filing the tickets and fixes are released so we can publicly talk about 
it.

-Paul
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-12 Thread Jay Dobies



On 12/12/2013 04:25 PM, Keith Basil wrote:

On Dec 12, 2013, at 4:05 PM, Jay Dobies wrote:


Maybe this is a valid use case?

Cloud operator has several core service nodes of differing configuration
types.

[node1]  <-- balanced mix of disk/cpu/ram for general core services
[node2]  <-- lots of disks for Ceilometer data storage
[node3]  <-- low-end "appliance like" box for a specialized/custom core service
 (SIEM box for example)

All nodes[1,2,3] are in the same deployment grouping ("core services").  As 
such,
this is a heterogenous deployment grouping.  Heterogeneity in this case defined 
by
differing roles and hardware configurations.

This is a real use case.

How do we handle this?


This is the sort of thing I had been concerned with, but I think this is just a 
variation on Robert's GPU example. Rather than butcher it by paraphrasing, I'll 
just include the relevant part:


"The basic stuff we're talking about so far is just about saying each
role can run on some set of undercloud flavors. If that new bit of kit
has the same coarse metadata as other kit, Nova can't tell it apart.
So the way to solve the problem is:
- a) teach Ironic about the specialness of the node (e.g. a tag 'GPU')
- b) teach Nova that there is a flavor that maps to the presence of
that specialness, and
   c) teach Nova that other flavors may not map to that specialness

then in Tuskar whatever Nova configuration is needed to use that GPU
is a special role ('GPU compute' for instance) and only that role
would be given that flavor to use. That special config is probably
being in a host aggregate, with an overcloud flavor that specifies
that aggregate, which means at the TripleO level we need to put the
aggregate in the config metadata for that role, and the admin does a
one-time setup in the Nova Horizon UI to configure their GPU compute
flavor."



Yes, the core services example is a variation on the above.  The idea
of _undercloud_ flavor assignment (flavor to role mapping) escaped me
when I read that earlier.

It appears to be very elegant and provides another attribute for Tuskar's
notion of resource classes.  So +1 here.



You mention three specific nodes, but what you're describing is more likely 
three concepts:
- Balanced Nodes
- High Disk I/O Nodes
- Low-End Appliance Nodes

They may have one node in each, but I think your example of three nodes is 
potentially *too* simplified to be considered as proper sample size. I'd guess 
there are more than three in play commonly, in which case the concepts 
breakdown starts to be more appealing.


Correct - definitely more than three, I just wanted to illustrate the use case.


I'm not sure I explained what I was getting at properly. I wasn't implying 
you thought it was limited to just three. I do the same thing, simplify 
down for discussion purposes (I've done so in my head about this very 
topic).


But I think this may be a rare case where simplifying actually masks the 
concept rather than exposes it. Manual feels a bit more desirable in 
small sample groups but when looking at larger sets of nodes, the flavor 
concept feels less odd than it does when defining a flavor for a single 
machine.


That's all. :) Maybe that was clear already, but I wanted to make sure I 
didn't come off as attacking your example. It certainly wasn't my 
intention. The balanced v. disk machine thing is the sort of thing I'd 
been thinking for a while but hadn't found a good way to make concrete.



I think the disk flavor in particular has quite a few use cases, especially until SSDs 
are ubiquitous. I'd want to flag those (in Jay terminology, "the disk hotness") 
as hosting the data-intensive portions, but where I had previously been viewing that as 
manual allocation, it sounds like the approach is to properly categorize them for what 
they are and teach Nova how to use them.

Robert - Please correct me if I misread any of what your intention was, I don't 
want to drive people down the wrong path if I'm misinterpretting anything.


-k


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Unified Guest Agent proposal

2013-12-12 Thread Steven Dake

On 12/12/2013 02:19 PM, Clint Byrum wrote:

Excerpts from Steven Dake's message of 2013-12-12 12:32:55 -0800:

On 12/12/2013 10:24 AM, Dmitry Mescheryakov wrote:

Clint, Kevin,

Thanks for reassuring me :-) I just wanted to make sure that having
direct access from VMs to a single facility is not a dead end in terms
of security and extensibility. And since it is not, I agree it is much
simpler (and hence better) than hypervisor-dependent design.


Then returning to two major suggestions made:
  * Salt
  * Custom solution specific to our needs

The custom solution could be made on top of oslo.messaging. That gives
us RPC working on different messaging systems. And that is what we
really need - an RPC into guest supporting various transports. What it
lacks at the moment is security - it has neither authentication nor ACL.

Salt also provides RPC service, but it has a couple of disadvantages:
it is tightly coupled with ZeroMQ and it needs a server process to
run. A single transport option (ZeroMQ) is a limitation we really want
to avoid. OpenStack could be deployed with various messaging
providers, and we can't limit the choice to a single option in the
guest agent. Though it could be changed in the future, it is an
obstacle to consider.

Running yet another server process within OpenStack, as it was already
pointed out, is expensive. It means another server to deploy and take
care of, +1 to overall OpenStack complexity. And it does not look it
could be fixed any time soon.

For given reasons I give favor to an agent based on oslo.messaging.


An agent based on oslo.messaging is a potential security attack vector
and a possible scalability problem.  We do not want the guest agents
communicating over the same RPC servers as the rest of OpenStack

I don't think we're talking about agents talking to the exact same
RabbitMQ/Qpid/etc. bus that things under the cloud are talking to. That
would definitely raise some eyebrows. No doubt it will be in the realm
of possibility if deployers decide to do that, but so is letting your
database server sit on the same flat network as your guests.


This is my concern.


I have a hard time seeing how using the same library is a security
risk though.
Yes, unless the deployer abuses it, use of the library is itself 
not a security risk.


Regards
-steve


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [bugs] definition of triaged

2013-12-12 Thread Robert Collins
Hi, I'm trying to overhaul the bug triage process for nova (initially)
to make it much lighter and more effective.

I'll be sending a more comprehensive mail shortly but one thing that
has been giving me pause is this:
"
Confirmed The bug was reproduced or confirmed as a genuine bug
Triaged The bug comments contain a full analysis on how to properly
fix the issue
"

From wiki.openstack.org/wiki/Bugs

Putting aside the difficulty of complete reproduction sometimes, I
don't understand the use of Triaged here.

In LP they mean:

Confirmed Verified by someone other than the reporter.
Triaged Verified by the bug supervisor.

So our meaning is very divergent. I'd like us to consolidate on the
standard meaning - which is that the relative priority of having a
doctor [developer] attack the problem has been assessed.

Specifically:
 - we should use Triaged to indicate that:
- we have assigned a priority
- we believe it's a genuine bug
- we have routed[tagged] it to what is probably the right place
[vendor driver/low-hanging-fruit etc]
 - we should use Incomplete if we aren't sure that it's a bug and need
the reporter to tell us more to be sure
 - triagers shouldn't ever set 'confirmed' - that's reserved solely for
end users to tell us that more than one user is encountering the
problem.

-Rob



-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Support for Pecan in Nova

2013-12-12 Thread Jonathan LaCour

On December 11, 2013 at 2:34:07 PM, Doug Hellmann (doug.hellm...@dreamhost.com) 
wrote:

> On Wed, Dec 11, 2013 at 3:41 PM, Ryan Petrello wrote:
> 
> > 1. It looks like the Nova v3 API is composed *entirely* of
> > extensions (including “core” API calls), and that extensions and
> > their routes are discoverable and extensible via installed
> > software that registers itself via stevedore. This seems to lead
> > to an API that’s composed of installed software, which in my
> > opinion, makes it fairly hard to map out the API (as opposed to
> > how routes are manually defined in other WSGI frameworks). I
> > assume at this time, this design decision has already been
> > solidified for v3?
> 
> Yeah, I brought this up at the summit. I am still having some
> trouble understanding how we are going to express a stable core API
> for compatibility testing if the behavior of the API can be varied
> so significantly by deployment decisions. Will we just list each
> "required" extension, and forbid any extras for a compliant cloud?
> 
> Maybe the issue is caused by me misunderstanding the term
> "extension," which (to me) implies an optional component but is
> perhaps reflecting a technical implementation detail instead?

After taking a close look at how the API is constructed, I
actually think that the current approach of having the API be
defined purely through extensions is flawed, for a few reasons:

1. The code is extremely difficult to read and follow, because the API
   structure is entirely built at runtime based upon what is
   installed, rather than expressed declaratively in code.

2. As a company providing a public cloud based upon OpenStack, with a
   desire to express compatibility with the "OpenStack API," its
   difficult to document the "standard" baseline Nova API. I shouldn't
   have to say "it depends" in API documentation.

3. Based upon my read, extensions are in no way "quarantined" from the
   the baseline/standard/required API. In fact, they seem to be able
   to pollute the standard API with additional parameters and
   functionality. I can not envision a world in which this is sane.

In my opinion, a well-designed and architected API should have the
core functionality declaratively defined in the code itself, so as to
give a good, well-documented, standard, and unchanging baseline. Then,
an "extension" capability should be layered on in such a way that it
doesn't alter the core API or serialized data.
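
For contrast, a toy sketch of what declaratively defined routing looks like in
Pecan (purely illustrative, not a proposal for the actual Nova controller
layout):

    # Object-dispatch routing in Pecan: the URL structure (/v3/servers) is
    # visible directly in the controller classes rather than assembled at
    # runtime from installed extensions.
    import pecan
    from pecan import expose

    class ServersController(object):
        @expose('json')
        def index(self):
            # GET /v3/servers
            return {'servers': []}

    class V3Controller(object):
        servers = ServersController()

    class RootController(object):
        v3 = V3Controller()

    application = pecan.make_app(RootController())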

Note: my opinion isn’t altered by the fact that some of the “core” 
API involves “required” extensions. The result is still difficult to
read and document.

That said, I don’t want to diminish or minimize the hard work that
has been done on the V3 API thus far! Lots of thinking and heavy
lifting has already been done, and its much appreciated. I am just
concerned that we lost our way somewhere. Major API revisions only
come along so often, and I’d prefer to raise my objections now 
rather than to hold them in and regret it!

-- 
Jonathan LaCour



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Solum] Using Zuul in the Git-pull blueprint

2013-12-12 Thread devdatta kulkarni
We followed up on the Zuul question in this week's git-integration working group 
meeting.

mordred has created an etherpad with a high-level description of Zuul and how 
it might
fit with Solum's git integration workflow:

https://etherpad.openstack.org/p/ZuulSolum

The working group seemed to be coming to the consensus that we want to use a 
single workflow
engine, as far as possible, for all of Solum's workflow needs.
This brought up the question of what Solum's workflow requirements really
are.

At a high-level, I think that Solum has three different kinds of workflows.

1) Workflow around getting user code into Solum
   - This is the git integration piece being worked out in the git-integration
 working group.

2) Workflow around creating language pack(s).
   - The main workflow requirement here involves the ability to run tests before 
creating a language pack.
 There was some discussion in the language-pack working group about this 
requirement.

3) Workflow around deploying created language pack(s) in order to instantiate 
an assembly.
   - The deployment may potentially contain several steps, some of which may be 
long running, such as
   populating a database. Further, there may be a need to checkpoint 
intermediate steps
   and retry the workflow from the failed point.


mordred mentioned that #1 can be achieved by Zuul (both, push-to-solum and 
pull-by-solum)
We want to know if #2 and #3 can also be achieved by Zuul.
If not, we want to know what are the available options.

mordred, thanks for the etherpad; looking forward to the diagram :)


thanks,
devkulkarni


-Original Message-
From: "Roshan Agrawal" 
Sent: Monday, December 9, 2013 10:57am
To: "OpenStack Development Mailing List (not for usage questions)" 

Subject: Re: [openstack-dev] [Solum] Using Zuul in the Git-pull blueprint


> -Original Message-
> From: Krishna Raman [mailto:kra...@gmail.com]
> Sent: Sunday, December 08, 2013 11:24 PM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: [openstack-dev] [Solum] Using Zuul in the Git-pull blueprint
> 
> Hi all,
> 
> We had a very good meeting last week around the git-pull blueprint. During
> the discussion, Monty suggested using Zuul to manage the git repository
> access and workflow.
> While he is working on sending the group a diagram and description of what
> he has in mind, I had a couple of other questions which I am hoping the
> extended group will be able to answer.
> 
> 1) Zuul is currently an infrastructure project.
>   - Is there anything that prevents us from using it in Solum?
>   - Does it need to be moved to a normal OpenStack project?
> 
> 2) Zuul provides a sort of workflow engine. This workflow engine could
> potentially be used to initiate and manage: API Post -> git flow -> lang pack
> flow.
>   - Have there been any discussion after the F2F where we have
> discussed using some other workflow engine?

There hasn't been further discussion since F2F.
Most of the processes in Solum will really be customizable workflows, and use 
of a generic workflow engine is definitely worth discussing. We may still want 
to leverage Zuul for the gerrit/git/checkin piece, but Solum will have workflow 
needs beyond that. 

>   - Is Zuul's engine generic enough to be used in Solum? (Hoping
> Monty can help with this one)
>   - Perhaps only use it to manage the API post -> git flow
> stages?
> 
> Thanks
> -Krishna
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Terminology

2013-12-12 Thread Tzu-Mainn Chen
Thanks for all the replies so far!  Let me try and distill the thread down to 
the points of
interest and contention:

1) NODE vs INSTANCE

This is a distinction that Robert brings up, and I think it's a good one that 
people agree
with.  The various ROLES can apply to either.

What isn't clear to me (and it's orthogonal to this particular thread) is 
whether the
wireframes are correct in classifying things by NODES, when it feels like we 
might want
to classify things by INSTANCE?

2) UNDEPLOYED NODE - a node that has not been deployed with an instance

Other suggestions included UNASSIGNED, UNMAPPED, FREE, and AVAILABLE.  Some 
people (I'm one of them)
find AVAILABLE to be a bit of an overloaded term, as it can also be construed 
to mean that, say,
a service instance is now running on a node and is now "available for use".  
I'm in favor of an
"UN" word, and it sounds like "UNDEPLOYED" was the most generally acceptable?

3) SIZE THE DEPLOYMENT - the act of deciding how many nodes will need to be 
assigned to each role in a deployment

Other suggestions included "SET NODE COUNT", "DESIGN THE DEPLOYMENT", "SIZE THE 
CLOUD".  SIZE THE DEPLOYMENT
sounds like the suggested option that used the most words from the other 
choices, so I picked it as a likely
candidate :)  I also like that it uses the word deployment, as that's what 
we're calling the end result.

4) SERVICE CLASS/RESOURCE CLASS/ROLE CONFIGURATION

So, an example of this was

> KVM Compute is a role configuration. KVM compute(GPU) might be another

I'm personally somewhat averse to "role configuration".  Although there are 
aspects of configuration to
this, "configuration" seems somewhat misleading to me when the purpose of this 
classification is something
more like - a subrole?

5) NODE PROFILE/FLAVOR

Flavor seemed generally acceptable, although there was some mention of it being 
an overloaded term.  Is there
a case for using a more "user-friendly" term in the UI (like node profile or 
hardware configuration or. . .)?
Or should we just expect users to be familiar with OpenStack terminology?


Please feel free to correct me if I've left anything out or misrepresented 
anything!

Mainn



- Original Message -
> Hi,
> 
> I'm trying to clarify the terminology being used for Tuskar, which may be
> helpful so that we're sure
> that we're all talking about the same thing :)  I'm copying responses from
> the requirements thread
> and combining them with current requirements to try and create a unified
> view.  Hopefully, we can come
> to a reasonably rapid consensus on any desired changes; once that's done, the
> requirements can be
> updated.
> 
> * NODE a physical general purpose machine capable of running in many roles.
> Some nodes may have hardware layout that is particularly
>useful for a given role.
> 
>  * REGISTRATION - the act of creating a node in Ironic
> 
>  * ROLE - a specific workload we want to map onto one or more nodes.
>  Examples include 'undercloud control plane', 'overcloud control
>plane', 'overcloud storage', 'overcloud compute' etc.
> 
>  * MANAGEMENT NODE - a node that has been mapped with an undercloud
>  role
>  * SERVICE NODE - a node that has been mapped with an overcloud role
> * COMPUTE NODE - a service node that has been mapped to an
> overcloud compute role
> * CONTROLLER NODE - a service node that has been mapped to an
> overcloud controller role
> * OBJECT STORAGE NODE - a service node that has been mapped to an
> overcloud object storage role
> * BLOCK STORAGE NODE - a service node that has been mapped to an
> overcloud block storage role
> 
>  * UNDEPLOYED NODE - a node that has not been mapped with a role
>   * another option - UNALLOCATED NODE - a node that has not been
>   allocated through nova scheduler (?)
>- (after reading lifeless's explanation, I
>agree that "allocation" may be a
>   misleading term under TripleO, so I
>   personally vote for UNDEPLOYED)
> 
>  * INSTANCE - A role deployed on a node - this is where work actually
>  happens.
> 
> * DEPLOYMENT
> 
>  * SIZE THE ROLES - the act of deciding how many nodes will need to be
>  assigned to each role
>* another option - DISTRIBUTE NODES (?)
>  - (I think the former is more accurate, but
>  perhaps there's a better way to say it?)
> 
>  * SCHEDULING - the process of deciding which role is deployed on which
>  node
> 
>  * SERVICE CLASS - a further categorization within a service role for a
>  particular deployment.
> 
>   * NODE PROFILE - a set of requirements that specify what attributes
>   a node mu

Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-12 Thread Keith Basil
On Dec 12, 2013, at 4:05 PM, Jay Dobies wrote:

>> Maybe this is a valid use case?
>> 
>> Cloud operator has several core service nodes of differing configuration
>> types.
>> 
>> [node1]  <-- balanced mix of disk/cpu/ram for general core services
>> [node2]  <-- lots of disks for Ceilometer data storage
>> [node3]  <-- low-end "appliance like" box for a specialized/custom core 
>> service
>>   (SIEM box for example)
>> 
>> All nodes[1,2,3] are in the same deployment grouping ("core services").  As 
>> such,
>> this is a heterogenous deployment grouping.  Heterogeneity in this case 
>> defined by
>> differing roles and hardware configurations.
>> 
>> This is a real use case.
>> 
>> How do we handle this?
> 
> This is the sort of thing I had been concerned with, but I think this is just 
> a variation on Robert's GPU example. Rather than butcher it by paraphrasing, 
> I'll just include the relevant part:
> 
> 
> "The basic stuff we're talking about so far is just about saying each
> role can run on some set of undercloud flavors. If that new bit of kit
> has the same coarse metadata as other kit, Nova can't tell it apart.
> So the way to solve the problem is:
> - a) teach Ironic about the specialness of the node (e.g. a tag 'GPU')
> - b) teach Nova that there is a flavor that maps to the presence of
> that specialness, and
>   c) teach Nova that other flavors may not map to that specialness
> 
> then in Tuskar whatever Nova configuration is needed to use that GPU
> is a special role ('GPU compute' for instance) and only that role
> would be given that flavor to use. That special config is probably
> being in a host aggregate, with an overcloud flavor that specifies
> that aggregate, which means at the TripleO level we need to put the
> aggregate in the config metadata for that role, and the admin does a
> one-time setup in the Nova Horizon UI to configure their GPU compute
> flavor."
> 

Yes, the core services example is a variation on the above.  The idea
of _undercloud_ flavor assignment (flavor to role mapping) escaped me
when I read that earlier.

It appears to be very elegant and provides another attribute for Tuskar's
notion of resource classes.  So +1 here.


> You mention three specific nodes, but what you're describing is more likely 
> three concepts:
> - Balanced Nodes
> - High Disk I/O Nodes
> - Low-End Appliance Nodes
> 
> They may have one node in each, but I think your example of three nodes is 
> potentially *too* simplified to be considered as proper sample size. I'd 
> guess there are more than three in play commonly, in which case the concepts 
> breakdown starts to be more appealing.

Correct - definitely more than three, I just wanted to illustrate the use case.

> I think the disk flavor in particular has quite a few use cases, especially 
> until SSDs are ubiquitous. I'd want to flag those (in Jay terminology, "the 
> disk hotness") as hosting the data-intensive portions, but where I had 
> previously been viewing that as manual allocation, it sounds like the 
> approach is to properly categorize them for what they are and teach Nova how 
> to use them.
> 
> Robert - Please correct me if I misread any of what your intention was, I 
> don't want to drive people down the wrong path if I'm misinterpretting 
> anything.

-k


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Unified Guest Agent proposal

2013-12-12 Thread Clint Byrum
Excerpts from Steven Dake's message of 2013-12-12 12:32:55 -0800:
> On 12/12/2013 10:24 AM, Dmitry Mescheryakov wrote:
> > Clint, Kevin,
> >
> > Thanks for reassuring me :-) I just wanted to make sure that having 
> > direct access from VMs to a single facility is not a dead end in terms 
> > of security and extensibility. And since it is not, I agree it is much 
> > simpler (and hence better) than hypervisor-dependent design.
> >
> >
> > Then returning to two major suggestions made:
> >  * Salt
> >  * Custom solution specific to our needs
> >
> > The custom solution could be made on top of oslo.messaging. That gives 
> > us RPC working on different messaging systems. And that is what we 
> > really need - an RPC into guest supporting various transports. What it 
> > lacks at the moment is security - it has neither authentication nor ACL.
> >
> > Salt also provides RPC service, but it has a couple of disadvantages: 
> > it is tightly coupled with ZeroMQ and it needs a server process to 
> > run. A single transport option (ZeroMQ) is a limitation we really want 
> > to avoid. OpenStack could be deployed with various messaging 
> > providers, and we can't limit the choice to a single option in the 
> > guest agent. Though it could be changed in the future, it is an 
> > obstacle to consider.
> >
> > Running yet another server process within OpenStack, as it was already 
> > pointed out, is expensive. It means another server to deploy and take 
> > care of, +1 to overall OpenStack complexity. And it does not look it 
> > could be fixed any time soon.
> >
> > For given reasons I give favor to an agent based on oslo.messaging.
> >
> 
> An agent based on oslo.messaging is a potential security attack vector 
> and a possible scalability problem.  We do not want the guest agents 
> communicating over the same RPC servers as the rest of OpenStack

I don't think we're talking about agents talking to the exact same
RabbitMQ/Qpid/etc. bus that things under the cloud are talking to. That
would definitely raise some eyebrows. No doubt it will be in the realm
of possibility if deployers decide to do that, but so is letting your
database server sit on the same flat network as your guests.

I have a hard time seeing how using the same library is a security
risk though.
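
To make the oslo.messaging-based agent idea concrete, here is a rough sketch of
what a guest-side RPC server might look like; the broker URL, topic and method
names are invented for illustration, and the transport would be a dedicated
guest-facing bus rather than the one the OpenStack services use:

    from oslo.config import cfg
    from oslo import messaging

    class AgentEndpoint(object):
        def execute(self, ctxt, command, args):
            # Dispatch to a service-specific plugin (e.g. trove, savanna)
            # and return its result over RPC.
            return {'command': command, 'status': 'done'}

    def main():
        transport = messaging.get_transport(
            cfg.CONF, url='rabbit://agent:secret@agent-broker:5672/')
        target = messaging.Target(topic='guest_agent', server='instance-0001')
        server = messaging.get_rpc_server(transport, target, [AgentEndpoint()],
                                          executor='blocking')
        server.start()
        server.wait()

    if __name__ == '__main__':
        main()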

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Support for Pecan in Nova

2013-12-12 Thread Christopher Yeoh
On Thu, Dec 12, 2013 at 3:17 PM, Mike Perez  wrote:

> On 10:06 Thu 12 Dec , Christopher Yeoh wrote:
> > On Thu, Dec 12, 2013 at 8:59 AM, Doug Hellmann
> > wrote:
> >
> > So for compatibility testing I think what will probably happen is that
> > we'll be defining a minimum set (API_V3_CORE_EXTENSIONS)
> > that must be implemented and clients can rely on that always being
> present
> > on a compliant cloud. But clients can also then query through /extensions
> > what other functionality (which is backwards compatible with respect to
> > core) may also be present on that specific cloud.
>
> I'm worried by the discrepancies this will create among the programs. You
> mentioned maintainability being a plus for this. I don't think it'll be
> great
> from the deployers perspective when you have one program that thinks
> everything
> is an extension and some of them have to be enabled that the deployer has
> to be
> mindful of, while the rest of the programs consider all extensions to be
>

Just to be clear, the deployer doesn't have to enable these; they are enabled
automatically, but it will complain if a deployer attempts to disable them.
Also, anything not core has an "os-" prefix, while core plugins don't, so it
is easy to distinguish between them. But yes, I'd agree that from a deployer's
point of view it's not useful to talk about extensions which have to be
enabled.

Chris
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Ironic] firmware security

2013-12-12 Thread Devananda van der Veen
On Thu, Dec 12, 2013 at 12:50 AM, Lu, Lianhao  wrote:

> Hi Ironic folks,
>
> I remembered once seeing that ironic was calling for firmware security.
> Can anyone elaborate with a little bit details about what Ironic needs for
> this "firmware security"? I'm wondering if there are some existing
> technologies(e.g. TPM, TXT, etc) that can be used for this purpose.
>
> Best Regards,
> -Lianhao
>

Hi Lianhao,

The topic of firmware support in Ironic has led to very interesting
discussions: questions about scope, multi-vendor support, and, invariably,
questions about how we might validate / ensure the integrity of existing
firmware or the firmware Ironic would be loading onto a machine. A proposal
was put forward at the last summit to add a generic mechanism for flashing
firmware, as part of a generic utility ramdisk. Other work is taking
priority this cycle, but here are the blueprints / discussion.
  https://blueprints.launchpad.net/ironic/+spec/firmware-update
  https://blueprints.launchpad.net/ironic/+spec/utility-ramdisk

To get back to your question about security, UEFI + hardware TPM is, as far
as I know, the commonly-acknowledged best approach today, even though it is
not necessarily available on all hardware. I believe Ironic will need to
support interacting with these both locally (eg, via CPU bus) and remotely
(eg, via vendor's OOB management controllers).

-Devananda
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Unified Guest Agent proposal

2013-12-12 Thread Fox, Kevin M
Yeah, I think the extra nic is unnecessary too. There already is a working 
route to 169.254.169.254, and a metadata proxy -> server running on it.

So... lets brainstorm for a minute and see if there are enough pieces already 
to do most of the work.

We already have:
  * An http channel out from private vm's, past network namespaces all the way 
to the node running the neutron-metadata-agent.

We need:
  * Some way to send a command, plus arguments to the vm to execute some action 
and get a response back.

OpenStack has focused on REST api's for most things and I think that is a great 
tradition to continue. This allows the custom agent plugins to be written in 
any language that can speak http (All of them?) on any platform.

A REST api running in the vm wouldn't be accessible from the outside though on 
a private network.

Random thought, can some glue "unified guest agent" be written to bridge the 
gap?

How about something like the following:

The "unified guest agent" starts up, makes an http request to 
169.254.169.254/unified-agent//connect
If at any time the connection returns, it will auto reconnect.
It will block as long as possible and the data returned will be an http 
request. The request will have a special header with a request id.
The http request will be forwarded to localhost: and 
the response will be posted to 
169.254.169.254/unified-agent/cnc_type/response/

The neutron-proxy-server would need to be modified slightly so that, if it sees 
a /unified-agent//* request it:
looks in its config file, unified-agent section, and finds the ip/port to 
contact for a given ', and forwards the request to that server, 
instead of the regular metadata one.

Once this is in place, Savanna or Trove can have their web API registered with 
the proxy as the server for the "savanna" or "trove" cnc_type. They will be 
contacted by the clients as they come up, and will be able to make web requests 
to them, and get responses back.
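
A rough sketch of the guest-side loop described above; the URLs, local port and
JSON envelope are examples only (the proposal forwards a raw HTTP request with
a request-id header, which is simplified to a JSON envelope here):

    import json
    import time

    import requests

    METADATA = 'http://169.254.169.254/unified-agent/trove'   # example cnc_type
    LOCAL_API = 'http://127.0.0.1:8778'                       # example plugin API

    def run_forever():
        while True:
            try:
                # Block until the control plane has a command for us.
                pending = requests.get(METADATA + '/connect', timeout=3600)
                pending.raise_for_status()
                envelope = pending.json()   # e.g. {'request_id', 'method', 'path', 'body'}

                local = requests.request(envelope['method'],
                                         LOCAL_API + envelope['path'],
                                         data=json.dumps(envelope.get('body')))

                requests.post(METADATA + '/response/' + envelope['request_id'],
                              data=local.content)
            except requests.RequestException:
                time.sleep(5)               # auto-reconnect on any failure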

What do you think?

Thanks,
Kevin


From: Ian Wells [ijw.ubu...@cack.org.uk]
Sent: Thursday, December 12, 2013 11:02 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] Unified Guest Agent proposal

On 12 December 2013 19:48, Clint Byrum 
mailto:cl...@fewbar.com>> wrote:
Excerpts from Jay Pipes's message of 2013-12-12 10:15:13 -0800:
> On 12/10/2013 03:49 PM, Ian Wells wrote:
> > On 10 December 2013 20:55, Clint Byrum 
> > mailto:cl...@fewbar.com>
> > >> wrote:
> I've read through this email thread with quite a bit of curiosity, and I
> have to say what Ian says above makes a lot of sense to me. If Neutron
> can handle the creation of a "management vNIC" that has some associated
> iptables rules governing it that provides a level of security for guest
> <-> host and guest <-> $OpenStackService, then the transport problem
> domain is essentially solved, and Neutron can be happily ignorant (as it
> should be) of any guest agent communication with anything else.
>

Indeed I think it could work, however I think the NIC is unnecessary.

Seems likely even with a second NIC that said address will be something
like 169.254.169.254 (or the ipv6 equivalent?).

There *is* no ipv6 equivalent, which is one standing problem.  Another is that 
(and admittedly you can quibble about this problem's significance) you need a 
router on a network to be able to get to 169.254.169.254 - I raise that because 
the obvious use case for multiple networks is to have a net which is *not* 
attached to the outside world so that you can layer e.g. a private DB service 
behind your app servers.

Neither of these are criticisms of your suggestion as much as they are standing 
issues with the current architecture.
--
Ian.


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-12 Thread Jay Dobies

Maybe this is a valid use case?

Cloud operator has several core service nodes of differing configuration
types.

[node1]  <-- balanced mix of disk/cpu/ram for general core services
[node2]  <-- lots of disks for Ceilometer data storage
[node3]  <-- low-end "appliance like" box for a specialized/custom core service
 (SIEM box for example)

All nodes [1,2,3] are in the same deployment grouping ("core services"). As 
such, this is a heterogeneous deployment grouping. Heterogeneity in this case 
is defined by differing roles and hardware configurations.

This is a real use case.

How do we handle this?


This is the sort of thing I had been concerned with, but I think this is 
just a variation on Robert's GPU example. Rather than butcher it by 
paraphrasing, I'll just include the relevant part:



"The basic stuff we're talking about so far is just about saying each
role can run on some set of undercloud flavors. If that new bit of kit
has the same coarse metadata as other kit, Nova can't tell it apart.
So the way to solve the problem is:
 - a) teach Ironic about the specialness of the node (e.g. a tag 'GPU')
 - b) teach Nova that there is a flavor that maps to the presence of
that specialness, and
   c) teach Nova that other flavors may not map to that specialness

then in Tuskar whatever Nova configuration is needed to use that GPU
is a special role ('GPU compute' for instance) and only that role
would be given that flavor to use. That special config is probably
being in a host aggregate, with an overcloud flavor that specifies
that aggregate, which means at the TripleO level we need to put the
aggregate in the config metadata for that role, and the admin does a
one-time setup in the Nova Horizon UI to configure their GPU compute
flavor."


You mention three specific nodes, but what you're describing is more 
likely three concepts:

- Balanced Nodes
- High Disk I/O Nodes
- Low-End Appliance Nodes

They may have one node in each, but I think your example of three nodes 
is potentially *too* simplified to be considered a proper sample size. 
I'd guess there are commonly more than three in play, in which case the 
breakdown into concepts starts to be more appealing.


I think the disk flavor in particular has quite a few use cases, 
especially until SSDs are ubiquitous. I'd want to flag those (in Jay 
terminology, "the disk hotness") as hosting the data-intensive portions, 
but where I had previously been viewing that as manual allocation, it 
sounds like the approach is to properly categorize them for what they 
are and teach Nova how to use them.
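
To make that concrete, below is a hedged sketch (python-novaclient) of the 
one-time aggregate/flavor plumbing Robert's recipe implies. The 'disk_hotness' 
key, the flavor sizing and the reliance on the aggregate extra-specs scheduler 
filter are illustrative assumptions, not Tuskar's actual configuration.

    # Hedged sketch only: tag the disk-heavy kit with a host aggregate and map
    # an overcloud flavor onto it via extra specs. Names and sizes are made up;
    # matching assumes the AggregateInstanceExtraSpecsFilter is enabled.
    from novaclient.v1_1 import client as nova_client

    nova = nova_client.Client('admin', 'PASSWORD', 'admin',
                              'http://undercloud.example.com:5000/v2.0')

    # One-time admin setup: group and tag the special hardware.
    agg = nova.aggregates.create('disk-heavy', None)
    nova.aggregates.set_metadata(agg, {'disk_hotness': 'true'})
    nova.aggregates.add_host(agg, 'node2')

    # A flavor that only lands on hosts carrying that tag.
    flavor = nova.flavors.create('ceilometer-storage', 16384, 8, 2000)
    flavor.set_keys({'aggregate_instance_extra_specs:disk_hotness': 'true'})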


Robert - please correct me if I misread any of what your intention was; 
I don't want to drive people down the wrong path if I'm misinterpreting 
anything.






-k


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev






Re: [openstack-dev] [Nova] [Neutron] How do we know a host is ready to have servers scheduled onto it?

2013-12-12 Thread Joshua Harlow
Maybe time to revive something like:

https://review.openstack.org/#/c/12759/


From experience, all sites (including those internal to Yahoo) provide a /status
(or equivalent) that is used for all sorts of things, from basic
load-balancing up/down checks to other things like actually introspecting the
state of the process (or getting basics about what the process is doing).
Typically this is not exposed to the public (it's why
http://www.yahoo.com/status works for me but not for you). It seems like
something like that could help (but of course not completely solve) the
type of response Jay mentioned.

-Josh

On 12/12/13 10:10 AM, "Jay Pipes"  wrote:

>On 12/12/2013 12:53 PM, Kyle Mestery wrote:
>> On Dec 12, 2013, at 11:44 AM, Jay Pipes  wrote:
>>> On 12/12/2013 12:36 PM, Clint Byrum wrote:
 Excerpts from Russell Bryant's message of 2013-12-12 09:09:04 -0800:
> On 12/12/2013 12:02 PM, Clint Byrum wrote:
>> I've been chasing quite a few bugs in the TripleO automated bring-up
>> lately that have to do with failures because either there are no
>>valid
>> hosts ready to have servers scheduled, or there are hosts listed and
>> enabled, but they can't bind to the network because for whatever
>>reason
>> the L2 agent has not checked in with Neutron yet.
>>
>> This is only a problem in the first few minutes of a nova-compute
>>host's
>> life. But it is critical for scaling up rapidly, so it is important
>>for
>> me to understand how this is supposed to work.
>>
>> So I'm asking, is there a standard way to determine whether or not a
>> nova-compute is definitely ready to have things scheduled on it?
>>This
>> can be via an API, or even by observing something on the
>>nova-compute
>> host itself. I just need a definitive signal that "the compute host
>>is
>> ready".
>
> If a nova compute host has registered itself to start having
>instances
> scheduled to it, it *should* be ready.  AFAIK, we're not doing any
> network sanity checks on startup, though.
>
> We already do some sanity checks on startup.  For example,
>nova-compute
> requires that it can talk to nova-conductor.  nova-compute will
>block on
> startup until nova-conductor is responding if they happened to be
> brought up at the same time.
>
> We could do something like this with a networking sanity check if
> someone could define what that check should look like.
>
 Could we ask Neutron if our compute host has an L2 agent yet? That
seems
 like a valid sanity check.
>>>
>>> ++
>>>
>> This makes sense to me as well. Although, not all Neutron plugins have
>> an L2 agent, so I think the check needs to be more generic than that.
>> For example, the OpenDaylight MechanismDriver we have developed
>> doesn't need an agent. I also believe the Nicira plugin is agent-less,
>> perhaps there are others as well.
>>
>> And I should note, does this sort of integration also happen with
>>cinder,
>> for example, when we're dealing with storage? Any other services which
>> have a requirement on startup around integration with nova as well?
>
>Right, it's more general than "is the L2 agent alive and running". It's
>more about having each service understand the relative dependencies it
>has on other supporting services.
>
>For instance, have each service implement a:
>
>GET /healthcheck
>
>that would return either a 200 OK or 409 Conflict with the body
>containing a list of service types that it is waiting to hear back from
>in order to provide a 200 OK for itself.
>
>Anyway, just some thoughts...
>
>-jay
>
>
>
>___
>OpenStack-dev mailing list
>OpenStack-dev@lists.openstack.org
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
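
As a rough illustration of the /healthcheck idea Jay sketches above, here is a 
minimal example using Flask purely for brevity. The endpoint name, the 409 
response shape and the dependency tracking are assumptions taken from the 
quoted proposal, not an existing OpenStack API.

    # Hedged sketch: 200 when every dependency has reported in, otherwise 409
    # with the list of service types still being waited on. The dependency set
    # is a stub standing in for real peer tracking.
    import json

    from flask import Flask

    app = Flask(__name__)

    PENDING_DEPENDENCIES = {'neutron-l2-agent'}   # illustrative placeholder


    @app.route('/healthcheck')
    def healthcheck():
        if not PENDING_DEPENDENCIES:
            return 'OK', 200
        body = json.dumps({'waiting_on': sorted(PENDING_DEPENDENCIES)})
        return body, 409, {'Content-Type': 'application/json'}


    if __name__ == '__main__':
        app.run(port=8080)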


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Incubation Request for Barbican

2013-12-12 Thread Adam Young

On 12/04/2013 08:58 AM, Jarret Raim wrote:

While I am all for adding a new program, I think we should only add one if we
rule out all existing programs as a home. With that in mind, why not add this
to the Keystone program? Perhaps that may require a tweak to Keystone's mission
statement, but that is doable. I saw a partial answer to this somewhere
but not a full one.

 From our point of view, Barbican can certainly help solve some problems
related to identity like SSH key management and client certs. However,
there is a wide array of functionality that Barbican will handle that is
not related to identity.


Some examples, there is some additional detail in our application if you
want to dig deeper [1].


* Symmetric key management - These keys are used for encryption of data at
rest in various places including Swift, Nova, Cinder, etc. Keys are
resources that roll up to a project, much like servers or load balancers,
but they have no direct relationship to an identity.

* SSL / TLS certificates - The management of certificate authorities and
the issuance of keys for SSL / TLS. Again, these are resources rather than
anything attached to identity.

* SSH Key Management - These could certainly be managed through Keystone
if we think that's the right way to go about it, but from Barbican's point
of view, these are just another type of key to be generated and tracked
that rolls up to an identity.


* Client certificates - These are most likely tied to an identity, but
again, just managed as resources from a Barbican point of view.

* Raw Secret Storage - This functionality is usually used by applications
residing on a cloud. An app can use Barbican to store secrets such as
sensitive configuration files, encryption keys and the like. This data
belongs to the application rather than any particular user in Keystone.
For example, some Rackspace customers don't allow their application dev /
maintenance teams direct access to the Rackspace APIs.

* Boot Verification - This functionality is used as part of the trusted
boot functionality for transparent disk encryption on Nova.

* Randomness Source - Barbican manages HSMs which allow us to offer a
source of true randomness.



In short (ha), I would encourage everyone to think of keys / certificates
as resources managed by an API in much the same way we think of VMs being
managed by the Nova API. A consumer of Barbican (either as an OpenStack
service or a consumer of an OpenStack cloud) will have an API to create
and manage various types of secrets that are owned by their project.
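
As a rough sketch of that consumer view, the snippet below stores and reads 
back a secret over a Barbican-style REST API using the requests library. The 
endpoint URL, token handling and payload fields are assumptions for 
illustration and may not match Barbican's exact contract.

    # Hedged sketch: create a project-owned secret, then fetch its payload.
    # URL, token and field names are illustrative assumptions only.
    import json

    import requests

    BARBICAN = 'http://barbican.example.com:9311/v1'   # assumed endpoint
    HEADERS = {'X-Auth-Token': 'KEYSTONE_TOKEN',        # project-scoped token
               'Content-Type': 'application/json'}

    resp = requests.post(BARBICAN + '/secrets', headers=HEADERS,
                         data=json.dumps({'name': 'app-db-password',
                                          'payload': 's3cr3t',
                                          'payload_content_type': 'text/plain'}))
    secret_ref = resp.json()['secret_ref']

    # Any user or service with the right role on the project can read it back.
    payload = requests.get(secret_ref,
                           headers={'X-Auth-Token': 'KEYSTONE_TOKEN',
                                    'Accept': 'text/plain'}).text
    print(payload)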


My reason for keeping them separate is more practical:  the Keystone 
team is already somewhat overloaded.  I know that a couple of us have 
interest in contributing to Barbican, the question is time and 
prioritization.


Unless there is some benefit to having both projects in the same program 
with essentially different teams, I think Barbican should proceed as 
is.  I personally plan on contributing to Barbican.






Keystone plays a critical role for us (as it does with every service) in
authenticating the user to a particular project and storing the roles that
the user has for that project. Barbican then enforces these restrictions.
However, keys / secrets are fundamentally divorced from identities in much
the same way that databases in Trove are, they are owned by a project, not
a particular user.

Hopefully our thought process makes some sense, let me know if I can
provide more detail.



Jarret





[1] https://wiki.openstack.org/wiki/Barbican/Incubation


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




Re: [openstack-dev] Unified Guest Agent proposal

2013-12-12 Thread Steven Dake

On 12/12/2013 10:24 AM, Dmitry Mescheryakov wrote:

Clint, Kevin,

Thanks for reassuring me :-) I just wanted to make sure that having 
direct access from VMs to a single facility is not a dead end in terms 
of security and extensibility. And since it is not, I agree it is much 
simpler (and hence better) than a hypervisor-dependent design.



Then returning to two major suggestions made:
 * Salt
 * Custom solution specific to our needs

The custom solution could be made on top of oslo.messaging. That gives 
us RPC working on different messaging systems. And that is what we 
really need - an RPC channel into the guest supporting various transports. 
What it lacks at the moment is security - it has neither authentication nor ACLs.
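
For illustration only, here is a minimal sketch of what such a guest-side RPC 
endpoint could look like on top of oslo.messaging. The transport URL, topic and 
method are assumptions, and the security gap mentioned above (no authentication 
or ACLs) is exactly what this sketch does not address.

    # Hedged sketch: expose one RPC method from inside a guest using
    # oslo.messaging's standard server API. Transport URL, topic, server name
    # and the method body are illustrative assumptions only.
    from oslo.config import cfg
    from oslo import messaging


    class GuestEndpoint(object):
        """Methods on this class become callable over RPC."""

        def restart_service(self, ctxt, name):
            # Placeholder: a real agent would act on the guest OS here.
            return {'restarted': name}


    def main():
        transport = messaging.get_transport(
            cfg.CONF, url='rabbit://guest:guest@192.0.2.10:5672/')
        target = messaging.Target(topic='guest_agent', server='instance-0001')
        server = messaging.get_rpc_server(transport, target, [GuestEndpoint()],
                                          executor='blocking')
        server.start()
        server.wait()


    if __name__ == '__main__':
        main()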


Salt also provides RPC service, but it has a couple of disadvantages: 
it is tightly coupled with ZeroMQ and it needs a server process to 
run. A single transport option (ZeroMQ) is a limitation we really want 
to avoid. OpenStack could be deployed with various messaging 
providers, and we can't limit the choice to a single option in the 
guest agent. Though it could be changed in the future, it is an 
obstacle to consider.


Running yet another server process within OpenStack, as was already 
pointed out, is expensive. It means another server to deploy and take 
care of, and a +1 to overall OpenStack complexity. And it does not look 
like it could be fixed any time soon.


For the given reasons, I favor an agent based on oslo.messaging.



An agent based on oslo.messaging is a potential security attack vector 
and a possible scalability problem.  We do not want the guest agents 
communicating over the same RPC servers as the rest of OpenStack.

Thanks,

Dmitry



2013/12/11 Fox, Kevin M <kevin@pnnl.gov>:

Yeah. It's likely that the metadata server stuff will get more
scalable/hardened over time. If it isn't enough now, let's fix it
rather than coming up with a new system to work around it.

I like the idea of using the network since all the hypervisors
have to support network drivers already. They also already have to
support talking to the metadata server. This keeps OpenStack out
of the hypervisor driver business.

Kevin


From: Clint Byrum [cl...@fewbar.com ]
Sent: Tuesday, December 10, 2013 1:02 PM
To: openstack-dev
Subject: Re: [openstack-dev] Unified Guest Agent proposal

Excerpts from Dmitry Mescheryakov's message of 2013-12-10 12:37:37
-0800:
> >> What is the exact scenario you're trying to avoid?
>
> It is DDoS attack on either transport (AMQP / ZeroMQ provider)
or server
> (Salt / Our own self-written server). Looking at the design, it
doesn't
> look like the attack could be somehow contained within a tenant
it is
> coming from.
>

We can push a tenant-specific route for the metadata server, and a
tenant
specific endpoint for in-agent things. Still simpler than
hypervisor-aware
guests. I haven't seen anybody ask for this yet, though I'm sure
if they
run into these problems it will be the next logical step.

> In the current OpenStack design I see only one similarly vulnerable
> component - metadata server. Keeping that in mind, maybe I just
> overestimate the threat?
>

Anything you expose to the users is "vulnerable". By using the
localized
hypervisor scheme you're now making the compute node itself
vulnerable.
Only now you're asking that an already complicated thing
(nova-compute)
add another job, rate limiting.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org

http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org

http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




Re: [openstack-dev] [Nova] [Neutron] How do we know a host is ready to have servers scheduled onto it?

2013-12-12 Thread Russell Bryant
On 12/12/2013 01:36 PM, Clint Byrum wrote:
> Excerpts from Kyle Mestery's message of 2013-12-12 09:53:57 -0800:
>> On Dec 12, 2013, at 11:44 AM, Jay Pipes  wrote:
>>> On 12/12/2013 12:36 PM, Clint Byrum wrote:
 Excerpts from Russell Bryant's message of 2013-12-12 09:09:04 -0800:
> On 12/12/2013 12:02 PM, Clint Byrum wrote:
>> I've been chasing quite a few bugs in the TripleO automated bring-up
>> lately that have to do with failures because either there are no valid
>> hosts ready to have servers scheduled, or there are hosts listed and
>> enabled, but they can't bind to the network because for whatever reason
>> the L2 agent has not checked in with Neutron yet.
>>
>> This is only a problem in the first few minutes of a nova-compute host's
>> life. But it is critical for scaling up rapidly, so it is important for
>> me to understand how this is supposed to work.
>>
>> So I'm asking, is there a standard way to determine whether or not a
>> nova-compute is definitely ready to have things scheduled on it? This
>> can be via an API, or even by observing something on the nova-compute
>> host itself. I just need a definitive signal that "the compute host is
>> ready".
>
> If a nova compute host has registered itself to start having instances
> scheduled to it, it *should* be ready.  AFAIK, we're not doing any
> network sanity checks on startup, though.
>
> We already do some sanity checks on startup.  For example, nova-compute
> requires that it can talk to nova-conductor.  nova-compute will block on
> startup until nova-conductor is responding if they happened to be
> brought up at the same time.
>
> We could do something like this with a networking sanity check if
> someone could define what that check should look like.
>
 Could we ask Neutron if our compute host has an L2 agent yet? That seems
 like a valid sanity check.
>>>
>>> ++
>>>
>> This makes sense to me as well. Although, not all Neutron plugins have
>> an L2 agent, so I think the check needs to be more generic than that.
>> For example, the OpenDaylight MechanismDriver we have developed
>> doesn't need an agent. I also believe the Nicira plugin is agent-less,
>> perhaps there are others as well.
>>
>> And I should note, does this sort of integration also happen with cinder,
>> for example, when we're dealing with storage? Any other services which
>> have a requirement on startup around integration with nova as well?
>>
> 
> Does cinder actually have per-compute-host concerns? I admit to being a
> bit cinder-stupid here.

No, it doesn't.

> Anyway, it seems to me that any service that is compute-host aware
> should be able to respond to the compute host whether or not it is a)
> aware of it, and b) ready to serve on it.
> 
> For agent-less drivers that is easy, you just always return True. And
> for drivers with agents, you return false unless you can find an agent
> for the host.
> 
> So something like:
> 
> GET /host/%(compute-host-name)
> 
> And then in the response include a "ready" attribute that would signal
> whether all networks that should work there, can work there.
> 
> As a first pass, just polling until that is "ready" before nova-compute
> enables itself would solve the problems I see (and that I think users
> would see as a cloud provider scales out compute nodes). Longer term
> we would also want to aim at having notifications available for this
> so that nova-compute could subscribe to that notification bus and then
> disable itself if its agent ever goes away.
> 
> I opened this bug to track the issue. I suspect there are duplicates of
> it already reported, but would like to start clean to make sure it is
> analyzed fully and then we can use those other bugs as test cases and
> confirmation:
> 
> https://bugs.launchpad.net/nova/+bug/1260440

Sounds good.  I'm happy to do this in Nova, but we'll have to get the
Neutron API bit sorted out first.

-- 
Russell Bryant

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] [Neutron] How do we know a host is ready to have servers scheduled onto it?

2013-12-12 Thread Russell Bryant
On 12/12/2013 12:35 PM, Clint Byrum wrote:
> Excerpts from Chris Friesen's message of 2013-12-12 09:19:42 -0800:
>> On 12/12/2013 11:02 AM, Clint Byrum wrote:
>>
>>> So I'm asking, is there a standard way to determine whether or not a
>>> nova-compute is definitely ready to have things scheduled on it? This
>>> can be via an API, or even by observing something on the nova-compute
>>> host itself. I just need a definitive signal that "the compute host is
>>> ready".
>>
>> Is it not sufficient that "nova service-list" shows the compute service 
>> as "up"?
>>
> 
> I could spin waiting for "at least one". Not a bad idea actually. However,
> I suspect that will only handle the situations I've gotten where the
> scheduler returns "NoValidHost".

Right, it solves this case

> I say that because I think if it shows there, it matches the all hosts
> filter and will have things scheduled on it. With one compute host I
> get failures after scheduling because neutron has no network segment to
> bind to. That is because the L2 agent on the host has not yet registered
> itself with Neutron.

but not this one.

-- 
Russell Bryant

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] New official bugtag 'Ironic' ?

2013-12-12 Thread Russell Bryant
On 12/12/2013 03:03 PM, Robert Collins wrote:
> We have official tags for most of the hypervisors, but not "ironic" as
> yet - any objections to adding one?

Nope, go ahead.  For reference, to add it we need to:

1) Make it an official tag in launchpad

2) Update https://wiki.openstack.org/wiki/BugTags

3) Update https://wiki.openstack.org/wiki/Nova/BugTriage

-- 
Russell Bryant

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [Nova] New official bugtag 'Ironic' ?

2013-12-12 Thread Robert Collins
We have official tags for most of the hypervisors, but not "ironic" as
yet - any objections to adding one?

-Rob

-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Unified Guest Agent proposal

2013-12-12 Thread Vladik Romanovsky
Dmitry,

I understand that :)
The only hypervisor dependency it has is how it communicates with the host, 
while this can be extended and turned into a binding, so people could connect 
to it in multiple ways.

The real value, as I see it, is which features this guest agent already 
implements and the fact that this is a mature code base.

Thanks,
Vladik 

- Original Message -
> From: "Dmitry Mescheryakov" 
> To: "OpenStack Development Mailing List (not for usage questions)" 
> 
> Sent: Thursday, 12 December, 2013 12:27:47 PM
> Subject: Re: [openstack-dev] Unified Guest Agent proposal
> 
> Vladik,
> 
> Thanks for the suggestion, but hypervisor-dependent solution is exactly what
> scares off people in the thread :-)
> 
> Thanks,
> 
> Dmitry
> 
> 
> 2013/12/11 Vladik Romanovsky < vladik.romanov...@enovance.com >
> 
> 
> 
> Maybe it will be useful to use Ovirt guest agent as a base.
> 
> http://www.ovirt.org/Guest_Agent
> https://github.com/oVirt/ovirt-guest-agent
> 
> It is already working well on linux and windows and has a lot of
> functionality.
> However, currently it is using virtio-serial for communication, but I think
> it can be extended for other bindings.
> 
> Vladik
> 
> - Original Message -
> > From: "Clint Byrum" < cl...@fewbar.com >
> > To: "openstack-dev" < openstack-dev@lists.openstack.org >
> > Sent: Tuesday, 10 December, 2013 4:02:41 PM
> > Subject: Re: [openstack-dev] Unified Guest Agent proposal
> > 
> > Excerpts from Dmitry Mescheryakov's message of 2013-12-10 12:37:37 -0800:
> > > >> What is the exact scenario you're trying to avoid?
> > > 
> > > It is DDoS attack on either transport (AMQP / ZeroMQ provider) or server
> > > (Salt / Our own self-written server). Looking at the design, it doesn't
> > > look like the attack could be somehow contained within a tenant it is
> > > coming from.
> > > 
> > 
> > We can push a tenant-specific route for the metadata server, and a tenant
> > specific endpoint for in-agent things. Still simpler than hypervisor-aware
> > guests. I haven't seen anybody ask for this yet, though I'm sure if they
> > run into these problems it will be the next logical step.
> > 
> > > In the current OpenStack design I see only one similarly vulnerable
> > > component - metadata server. Keeping that in mind, maybe I just
> > > overestimate the threat?
> > > 
> > 
> > Anything you expose to the users is "vulnerable". By using the localized
> > hypervisor scheme you're now making the compute node itself vulnerable.
> > Only now you're asking that an already complicated thing (nova-compute)
> > add another job, rate limiting.
> > 
> > ___
> > OpenStack-dev mailing list
> > OpenStack-dev@lists.openstack.org
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> > 
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [barbican] Meeting Today at 20:00 UTC (2 Central)

2013-12-12 Thread Jarret Raim
The barbican meeting today will cover our progress on the incubation
issues brought up by the TC, test planning and any other issues.


If you are interested in barbican and have questions, stop on by
#openstack-meeting-alt in 20 minutes.




Thanks,

--
Jarret Raim 
@jarretraim




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [Tempest][Production] Tempest / the gate / real world load

2013-12-12 Thread Robert Collins
A few times now we've run into patches for devstack-gate / devstack
that change default configuration to handle 'tempest load'.

For instance - https://review.openstack.org/61137 (Sorry Salvatore I'm
not picking on you really!)

So there appears to be a meme that the gate is particularly stressful
- a bad environment - and that real world situations have less load.

This could happen a few ways: (a) deployers might separate out
components more; (b) they might have faster machines; (c) they might
have less concurrent activity.

(a) - unlikely! Deployers will cram stuff together as much as they can
to save overheads. Big clouds will have components split out - yes,
but they will also have correspondingly more load to drive that split
out.

(b) Perhaps, but not orders of magnitude faster; the clouds we run on
are running on fairly recent hardware, and by using big instances we
don't get crammed in with that many other tenants.

(c) Almost certainly not. Tempest currently does a maximum of four
concurrent requests. A small business cloud could easily have 5 or 6
people making concurrent requests from time to time, and bigger but
not huge clouds will certainly have that. Their /average/ rate of API
requests may be much lower, but when they point service orchestration
tools at it -- particularly tools that walk their dependencies in
parallel - load is going to be much much higher than what we generate
with Tempest.

tl;dr : if we need to change a config file setting in devstack-gate or
devstack *other than* setting up the specific scenario, think thrice -
should it be a production default and set in the relevant project's
default config setting?

Cheers,
Rob
-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Neutron][IPv6] Agenda for the meeting today

2013-12-12 Thread Ian Wells
Can we go over the use cases for the multiple different address allocation
techniques, per my comment on the blueprint that suggests we expose
different dnsmasq modes?

And perhaps also what we're going to do with routers in terms of equivalent
behaviour for the current floating-ip versus no-floating-ip systems.  One
idea is that we would offer tenants a routeable address regardless of which
network they're on (something itself which is not what we do in v4 and
which doesn't really fit with current subnet-create) and rather than NAT we
have two default firewalling schemes in routers for an externally
accessible versus an inaccessible address, but I'd really like to hear what
other ideas there are out there.
-- 
Ian.


On 12 December 2013 18:22, Collins, Sean wrote:

> The agenda for today is pretty light - if there is anything that people
> would like to discuss, please feel free to add.
>
>
> https://wiki.openstack.org/wiki/Meetings/Neutron-IPv6-Subteam#Agenda_for_Dec_12._2013
>
> --
> Sean M. Collins
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Solum] Plan files and resources

2013-12-12 Thread Clayton Coleman


- Original Message -
> 
> > On Dec 11, 2013, at 4:45 PM, "Clayton Coleman"  wrote:
> > - Original Message -
> >> Devdatta,
> >> 
> >> On Dec 10, 2013, at 12:37 PM, devdatta kulkarni
> >>  wrote:
> >> 
> >>> Hi Adrian,
> >>> 
> >>> Thanks for creating https://etherpad.openstack.org/p/solum-demystified
> >>> 
> >>> I am really excited to see the examples. Especially cool is how
> >>> examples 2 and 3 demonstrate using a component ("solum_glance_id")
> >>> created
> >>> as part of example 1.
> >>> 
> >>> 
> >>> Some questions/comments:
> >>> 
> >>> 1) Summarizing the sequence of events just to make sure I understand them
> >>> correctly:
> >>>  a) User selects a language pack and specifies its id in the plan file
> >> 
> >> They could put the language pack reference into a Plan file, or we could
> >> generate a Plan file with a CLI command that feeds an auto-generated file
> >> to
> >> the API for the user. That might reduce the user complexity a bit for the
> >> general case.
> > 
> > It seems like the reasonable M1 and M2 scenarios are to get the bones of an
> > integration working that allow a flexible Plan to exist (but not
> > necessarily something an average user would edit).
> 
> To be clear, are you suggesting that we ask users to place stock plan files
> in their code repos as a first step? This would certainly minimize work for
> us to get to milestone-1.

Possibly - or that plan generation is as absolutely simple as possible (we 
either take a plan as input in the client, or pregenerate one with 1-2 
replacement variables).  

> 
> > M2 and M3 can focus on the support around making Plans that mere mortals
> > can throw together (whether generated or precreated by an operator), and a
> > lot of how that evolves depends on the other catalog work.
> 
> This would mean revisiting the simplicity of the plan file, documenting lots
> of examples of them so the are well understood. At that point we could
> demonstrate ways to tweak them to accommodate a variety of workload types
> with Solum, not just deploy simple web apps fitting a single system
> architecture.
> 
> > You could argue the resistance from some quarters to the current PaaS model
> > is that the "Plan" equivalent is hardcoded and non-flexible - what is
> > being done differently here is to offer the concepts necessary to allow
> > other types of plans and application models to coexist in a single system.
> 
> Agreed 100%.
> 
> >>>  b) User creates repo with the plan file in it.
> >> 
> >> We could scan the repo for a Plan file to override the auto-generation
> >> step,
> >> to allow a method for customization.
> >> 
> >>>  After this the flow could be:
> >>>  c.1) User uses solum cli to 'create' an application by giving reference
> >>>  to
> >>> the repo uri
> >> 
> >> I view this as the use of the cli "app create" command as the first step.
> >> They can optionally specify a Plan file to use for either the build
> >> sequence, or the app deployment sequence, or both (for a total of TWO Plan
> >> files). We could also allow plan files to be placed in the Git repo, and
> >> picked up there in the event that none are specified on the command line.
> >> 
> >> Note that they may also put a HOT file in their repo, and bypass HOT file
> >> generation/catalog-lookup and cause Solum to use the supplied template.
> >> This
> >> would be useful for power users who want the ability to further influence
> >> the arrangement of the Heat stack.
> >> 
> >>>  c.1.1) Solum creates a plan resource
> >>>  c.1.2) Solum model interpreter creates a Heat stack and does the
> >>>  rest
> >>>  of the
> >>>   things needed to create a assembly.
> >>>  (The created plan resource does not play any part in assembly
> >>>  creation as such.
> >>>   Its only role is being a 'trackback' to track the plan from which
> >>>   the assembly was created.)
> >> 
> >> It's also a way to find out what services the given requirements were
> >> mapped
> >> to. In a Plan file, the services array contains ServiceSpecfications (see
> >> the EX[1-3] YAML examples under the "services" node for an example of what
> >> those look like. In a Plan resource, the services array includes a list of
> >> service resources so you can see what Solum's model interpreter mapped
> >> your
> >> requirements to.
> >> 
> >>>  or,
> >>>  c.2) User uses solum cli to 'create/register' a plan by providing
> >>>  reference to the repo uri.
> >>>   c.2.1) Solum creates the plan resource.
> >>>  c.2) User uses solum cli to 'create' an application by specifying the
> >>>  created plan
> >>>   resource uri
> >>>   (In this flow, the plan is actively used).
> >> 
> >> Yes, this would be another option. I expect that this approach may be used
> >> by
> >> users who want to create multitudes of Assemblies from a given Plan
> >> resource.
> >> 
> >>> 2) Addition of new solum specific attributes in a plan specification is
> >>> interesting.
> >>>  I 

Re: [openstack-dev] Unified Guest Agent proposal

2013-12-12 Thread Ian Wells
On 12 December 2013 19:48, Clint Byrum  wrote:

> Excerpts from Jay Pipes's message of 2013-12-12 10:15:13 -0800:
> > On 12/10/2013 03:49 PM, Ian Wells wrote:
> > > On 10 December 2013 20:55, Clint Byrum wrote:
> > I've read through this email thread with quite a bit of curiosity, and I
> > have to say what Ian says above makes a lot of sense to me. If Neutron
> > can handle the creation of a "management vNIC" that has some associated
> > iptables rules governing it that provides a level of security for guest
> > <-> host and guest <-> $OpenStackService, then the transport problem
> > domain is essentially solved, and Neutron can be happily ignorant (as it
> > should be) of any guest agent communication with anything else.
> >
>
> Indeed I think it could work, however I think the NIC is unnecessary.
>
> Seems likely even with a second NIC that said address will be something
> like 169.254.169.254 (or the ipv6 equivalent?).
>

There *is* no ipv6 equivalent, which is one standing problem.  Another is
that (and admittedly you can quibble about this problem's significance) you
need a router on a network to be able to get to 169.254.169.254 - I raise
that because the obvious use case for multiple networks is to have a net
which is *not* attached to the outside world so that you can layer e.g. a
private DB service behind your app servers.

Neither of these are criticisms of your suggestion as much as they are
standing issues with the current architecture.
-- 
Ian.
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Stackalytics 0.4 released! [metrics]

2013-12-12 Thread Stefano Maffulli
On 12/12/2013 07:49 AM, Ilya Shakhat wrote:
> Stackalytics team is happy to announce the release of version 0.4.
[...]

Good job Ilya, congratulations on the release. I may not be able to join
the meeting (too early for me) so I leave here some feedback for you.

I like the new punchcards in the personal activity pages: good way to
see when people are mostly active during the day/week. I think it would
be good to have one comprehensive view of the activity of a person or
company in one page. Do you already have plans to merge this view
http://stackalytics.com/?user_id=&project_type=all&release=all&company=Red+Hat&metric=all
with http://stackalytics.com/report/companies/red%20hat?

Just yesterday I noticed another incident where people take all our
reported numbers as 'solid gold' while they are subject to
interpretation and verification. I think there should be a warning on
every page, especially the pages reporting companies activity (and a
link to how to fix the data if somebody finds a mistake would be good
too). I and Tom filed a bug for stackalytics and Activity Board:
https://bugs.launchpad.net/stackalytics/+bug/1260135

I think you're doing a very good job with the reporting. We may want to
start discussing how to improve the company-person affiliation problem
in the long term. There is a bug filed for this topic too:
https://bugs.launchpad.net/openstack-community/+bug/1260140

let's keep this going, we want to have good data available automatically
all the time :)

/stef

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] State of the Gate - Dec 12

2013-12-12 Thread Joe Gordon
On Thu, Dec 12, 2013 at 7:19 PM, Matt Riedemann
wrote:

>
>
> On 12/12/2013 7:20 AM, Sean Dague wrote:
>
>> Current Gate Length: 12hrs*, 41 deep
>>
>> (top of gate entered 12hrs ago)
>>
>> It's been an *exciting* week this week. For people not paying attention
>> we had 2 external events which made things terrible earlier in the week.
>>
>> ==
>> Event 1: sphinx 1.2 complete breakage - MOSTLY RESOLVED
>> ==
>>
>> It turns out sphinx 1.2 + distutils (which pbr's magic calls through) means
>> total sadness. The fix for this was a requirements pin to sphinx < 1.2,
>> and until a project has taken that, it will fail in the gate.
>>
>> It also turns out that tox installs pre-release software by default (a
>> terrible default behavior), so you also need a tox.ini change like this
>> - https://github.com/openstack/nova/blob/master/tox.ini#L9 - otherwise
>> local users will install things like sphinx 1.2b3. They will also break
>> in other ways.
>>
>> Not all projects have merged this. If you are a project that hasn't,
>> please don't send any other jobs to the gate until you do. A lot of
>> delay was added to the gate yesterday by Glance patches being pushed to
>> the gate before their doc jobs were done.
>>
>> ==
>> Event 2: apt.puppetlabs.com outage - RESOLVED
>> ==
>>
>> We use that apt repository to set up the devstack nodes in nodepool with
>> puppet. We were triggering an issue with grenade where its apt-get
>> calls were failing, because it does apt-get update once to make sure
>> life is good. This only triggered in grenade (not other devstack runs)
>> because we do set -o errexit aggressively.
>>
>> A fix in grenade to ignore these errors was merged yesterday afternoon
>> (the purple line - http://status.openstack.org/elastic-recheck/ you can
>> see where it showed up).
>>
>> ==
>> Top Gate Bugs
>> ==
>>
>> We normally do this as a list, and you can see the whole list here -
>> http://status.openstack.org/elastic-recheck/ (now sorted by number of
>> FAILURES in the last 2 weeks)
>>
>> That being said, our biggest race bug is currently this one -
>> https://bugs.launchpad.net/tempest/+bug/1253896 - and if you want to
>> merge patches, fixing that one bug will be huge.
>>
>> Basically, you can't ssh into guests that get created. That's sort of a
>> fundamental property of a cloud. It shows up more frequently on neutron
>> jobs, possibly due to actually testing the metadata server path. There
>> have been many attempts on retry logic on this, we actually retry for
>> 196 seconds to get in and only fail once we can't get in, so waiting
>> isn't helping. It doesn't seem like the env is under that much load.
>>
>> Until we resolve this, life will not be good in landing patches.
>>
>> -Sean
>>
>>
>>
>> ___
>> OpenStack-dev mailing list
>> OpenStack-dev@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>>
> There have been a few threads [1][2] on gate failures and the process
> around what happens when we go about identifying, tracking and fixing them.
>
> I couldn't find anything outside of the mailing list to keep a record of
> this so started a page here [3].
>

Thanks for writing down all that content! But we are trying to keep all the
elastic-recheck documentation in elastic-recheck, so a patch to the
elastic-recheck docs would be very welcome.

https://review.openstack.org/#/c/61300/1
https://review.openstack.org/#/c/61298/


The one big thing that wiki doesn't mention is one of the most important
parts: actually fixing the bugs from
http://status.openstack.org/elastic-recheck/.


>
> Feel free to contribute so we can point people to how they can easily help
> in working these faster.
>
> [1] http://lists.openstack.org/pipermail/openstack-dev/2013-
> November/020280.html
> [2] http://lists.openstack.org/pipermail/openstack-dev/2013-
> November/019931.html
> [3] https://wiki.openstack.org/wiki/ElasticRecheck
>
> --
>
> Thanks,
>
> Matt Riedemann
>
>
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] cross-project bug fixes

2013-12-12 Thread Matt Riedemann



On Thursday, December 12, 2013 10:29:11 AM, Russell Bryant wrote:

On 12/12/2013 11:21 AM, Hamilton, Peter A. wrote:

I am in the process of getting a bug fix approved for a bug found in 
openstack.common:

https://review.openstack.org/#/c/60500/

The bug is present in both nova and cinder. The above patch is under nova; do I 
need to submit a separate cinder patch covering the same fix, or does the 
shared nature of the openstack.common module allow for updates across projects 
without needing separate project patches?


The part under nova/openstack/common needs to be fixed in the
oslo-incubator git repository first.  From there you sync the fix into
nova and cinder.



Peter,

FYI: https://wiki.openstack.org/wiki/Oslo#Syncing_Code_from_Incubator

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Unified Guest Agent proposal

2013-12-12 Thread Clint Byrum
Excerpts from Jay Pipes's message of 2013-12-12 10:15:13 -0800:
> On 12/10/2013 03:49 PM, Ian Wells wrote:
> > On 10 December 2013 20:55, Clint Byrum wrote:
> >
> > If it is just a network API, it works the same for everybody. This
> > makes it simpler, and thus easier to scale out independently of compute
> > hosts. It is also something we already support and can very easily
> > expand
> > by just adding a tiny bit of functionality to neutron-metadata-agent.
> >
> > In fact we can even push routes via DHCP to send agent traffic through
> > a different neutron-metadata-agent, so I don't see any issue where we
> > are piling anything on top of an overstressed single resource. We can
> > have neutron route this traffic directly to the Heat API which hosts it,
> > and that can be load balanced and etc. etc. What is the exact scenario
> > you're trying to avoid?
> >
> >
> > You may be making even this harder than it needs to be.  You can create
> > multiple networks and attach machines to multiple networks.  Every point
> > so far has been 'why don't we use  as a backdoor into our VM
> > without affecting the VM in any other way' - why can't that just be one
> > more network interface set aside for whatever management  instructions
> > are appropriate?  And then what needs pushing into Neutron is nothing
> > more complex than strong port firewalling to prevent the slaves/minions
> > talking to each other.  If you absolutely must make the communication
> > come from a system agent and go to a VM, then that can be done by
> > attaching the system agent to the administrative network - from within
> > the system agent, which is the thing that needs this, rather than within
> > Neutron, which doesn't really care how you use its networks.  I prefer
> > solutions where other tools don't have to make you a special case.
> 
> I've read through this email thread with quite a bit of curiosity, and I 
> have to say what Ian says above makes a lot of sense to me. If Neutron 
> can handle the creation of a "management vNIC" that has some associated 
> iptables rules governing it that provides a level of security for guest 
> <-> host and guest <-> $OpenStackService, then the transport problem 
> domain is essentially solved, and Neutron can be happily ignorant (as it 
> should be) of any guest agent communication with anything else.
> 

Indeed I think it could work, however I think the NIC is unnecessary.

Seems likely even with a second NIC that said address will be something
like 169.254.169.254 (or the ipv6 equivalent?).

If we want to attach that network as a second NIC instead of pushing a
route to it via DHCP, that is fine. But I don't think it actually gains
much, and the current neutron-metadata-agent already facilitates the
conversation between private guests and 169.254.169.254. We just need to
make sure we can forward more than port 80 through that.
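
For illustration, here is a minimal sketch of the DHCP-pushed route mentioned 
earlier, using python-neutronclient to set a host route on a tenant subnet. 
The subnet ID, next hop and credentials are placeholders, and whether this is 
the right way to steer agent traffic is exactly the open question.

    # Hedged sketch: publish a host route for the metadata address so guests
    # reach a chosen metadata agent without a dedicated NIC. dnsmasq delivers
    # host_routes to guests via DHCP option 121 (classless static routes).
    from neutronclient.v2_0 import client as neutron_client

    neutron = neutron_client.Client(
        username='admin', password='PASSWORD', tenant_name='admin',
        auth_url='http://keystone.example.com:5000/v2.0')

    neutron.update_subnet('SUBNET_UUID', {
        'subnet': {
            'host_routes': [
                {'destination': '169.254.169.254/32', 'nexthop': '10.0.0.2'},
            ],
        },
    })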

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Performance Regression in Neutron/Havana compared to Quantum/Grizzly

2013-12-12 Thread Nathani, Sreedhar (APS)
Hello Salvatore,

Thanks for your feedback. Does the patch 
https://review.openstack.org/#/c/57420/, which you are working on for bug 
https://bugs.launchpad.net/neutron/+bug/1253993,
help to correct the OVS agent loop slowdown issue?
Does this patch address the DHCP agent updating the host file once a minute 
and finally sending SIGKILL to the dnsmasq process?

I have tested with Marun's patch https://review.openstack.org/#/c/61168/ 
regarding 'Send DHCP notifications regardless of agent status', but with this 
patch I also observed the same behavior.


Thanks & Regards,
Sreedhar Nathani

From: Salvatore Orlando [mailto:sorla...@nicira.com]
Sent: Thursday, December 12, 2013 6:21 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] Performance Regression in Neutron/Havana compared 
to Quantum/Grizzly


I believe your analysis is correct and in line with the findings reported in the 
bug concerning the OVS agent loop slowdown.

The issue has become even more prominent with the ML2 plugin due to an 
increased number of notifications sent.

Another issue which makes delays on the DHCP agent worse is that instances send 
a discover message once a minute.

Salvatore
On 11 Dec 2013 11:50, "Nathani, Sreedhar (APS)" 
<sreedhar.nath...@hp.com> wrote:
Hello Peter,

Here are the tests I have done. Already have 240 instances active across all 
the 16 compute nodes. To make the tests and data collection easy,
I have done the tests on single compute node

First Test -
*   240 instances already active,  16 instances on the compute node where I 
am going to do the tests
*   deploy 10 instances concurrently using nova boot command with 
num-instances option in single compute node
*   All the instances could get IP during the instance boot time.

-   Instances are created at  2013-12-10 13:41:01
-   From the compute host, DHCP requests are sent from 13:41:20 but those 
are not reaching the DHCP server
Reply from the DHCP server got at 13:43:08 (A delay of 108 seconds)
-   DHCP agent updated the host file from 13:41:06 till 13:42:54. Dnsmasq 
process got SIGHUP message every time the hosts file is updated
-   In compute node tap devices are created between 13:41:08 and 13:41:18
Security group rules are received between 13:41:45 and 13:42:56
IP table rules were updated between 13:41:50 and 13:43:04

Second Test -
*   Deleted the newly created 10 instances.
*   240 instances already active,  16 instances on the compute node where I 
am going to do the tests
*   Deploy 30 instances concurrently using nova boot command with 
num-instances option in single compute node
*   None  of the instances could get the IP during the instance boot.


-   Instances are created at  2013-12-10 14:13:50

-   From the compute host, DHCP Requests are sent from  14:14:14 but those 
are not reaching the DHCP Server
(don't see any DHCP requests are reaching the DHCP server 
from the tcpdump on the network node)

-   Reply from the DHCP server only got at 14:22:10 ( A delay of 636 
seconds)

-   From the strace of the DHCP agent process, it first updated the hosts 
file at 14:14:05, after this there is a gap of close to 60 min for
Updating next instance address, it repeated till 7th 
instance which was updated at 14:19:50.  30th instance updated at 14:20:00

-   During the 30 instance creation, dnsmasq process got SIGHUP after the 
host file is updated, but at 14:19:52 it got SIGKILL and new process
   created - 14:19:52.881088 +++ killed by 
SIGKILL +++

-   In the compute node, tap devices are created between 14:14:03 and 
14:14:38
From the strace of L2 agent log, can see security group related 
messages are received from 14:14:27 till 14:20:02
During this period in the L2 agent log see many rpc timeout messages 
like below
Timeout: Timeout while waiting on RPC response - topic: "q-plugin", RPC 
method: "security_group_rules_for_devices" info: ""

Due to security group related messages received by this compute 
node with delay, it's taking very long time to update the iptable rules
(Can see it was updated till 14:20) which is causing the DHCP 
packets to be dropped at compute node itself without reaching to DHCP server


Here is my understanding based on the tests.
Instances are created fast, and so are their TAP devices. But there is a considerable 
delay in updating the network port details in the dnsmasq host file and sending 
the security-group-related info to the compute nodes, due to which the compute nodes 
are not able to update the iptables rules fast enough, which is causing 
instances to not get an IP.

I have collected the tcpdump from the controller node and compute nodes, plus strace of 
the DHCP agent, dnsmasq, and OVS L2 agents, in case you are interested in looking at it.

Thanks & Regards,
Sreedhar Nathani


-Orig

Re: [openstack-dev] [Keystone] policy has no effect because of hard coded assert_admin?

2013-12-12 Thread Morgan Fainberg
As Dolph stated, V3 is where the policy file protects.  This is one of the many 
reasons why I would encourage movement to using V3 Keystone over V2.

The V2 API is officially deprecated in the Icehouse cycle; I think that moving 
the decorator could potentially cause more issues than it solves, as stated, for 
compatibility reasons.  I would be very concerned about breaking compatibility with 
deployments and maintaining the security behavior while encouraging the 
move from V2 to V3.  I am also not convinced passing the context down to the 
manager level is the right approach.  Making a move on where the protection 
occurs likely warrants a deeper discussion (perhaps in Atlanta?).

Cheers,
Morgan Fainberg

On December 12, 2013 at 10:32:40, Dolph Mathews (dolph.math...@gmail.com) wrote:

The policy file is protecting v3 API calls at the controller layer, but you're 
calling the v2 API. The policy decorators should be moved to the manager layer 
to protect both APIs equally... but we'd have to be very careful not to break 
deployments depending on the trivial "assert_admin" behavior (hence the reason 
we only wrapped v3 with the new policy decorators).


On Thu, Dec 12, 2013 at 1:41 AM, Qiu Yu  wrote:
Hi,

I was trying to fine tune some keystone policy rules. Basically I want to grant 
"create_project" action to user in "ops" role. And following are my steps.

1. Adding a new user "usr1"
2. Creating new role "ops"
3. Granting this user a "ops" role in "service" tenant
4. Adding new lines to keystone policy file

        "ops_required": [["role:ops"]],
        "admin_or_ops": [["rule:admin_required"], ["rule:ops_required"]],

5. Change

        "identity:create_project": [["rule:admin_required"]],
    to
        "identity:create_project": [["rule:admin_or_ops"]],

6. Restart keystone service

keystone tenant-create with credential of user "usr1" still returns 403 
Forbidden error.
“You are not authorized to perform the requested action, admin_required. (HTTP 
403)”

After some quick scan, it seems that create_project function has a hard-coded 
assert_admin call[1], which does not respect settings in the policy file.

Any ideas why? Is it a bug to fix? Thanks!
BTW, I'm running keystone havana release with V2 API.

[1] 
https://github.com/openstack/keystone/blob/master/keystone/identity/controllers.py#L105

Thanks,
--
Qiu Yu

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




--

-Dolph
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] [Neutron] How do we know a host is ready to have servers scheduled onto it?

2013-12-12 Thread Clint Byrum
Excerpts from Kyle Mestery's message of 2013-12-12 09:53:57 -0800:
> On Dec 12, 2013, at 11:44 AM, Jay Pipes  wrote:
> > On 12/12/2013 12:36 PM, Clint Byrum wrote:
> >> Excerpts from Russell Bryant's message of 2013-12-12 09:09:04 -0800:
> >>> On 12/12/2013 12:02 PM, Clint Byrum wrote:
>  I've been chasing quite a few bugs in the TripleO automated bring-up
>  lately that have to do with failures because either there are no valid
>  hosts ready to have servers scheduled, or there are hosts listed and
>  enabled, but they can't bind to the network because for whatever reason
>  the L2 agent has not checked in with Neutron yet.
>  
>  This is only a problem in the first few minutes of a nova-compute host's
>  life. But it is critical for scaling up rapidly, so it is important for
>  me to understand how this is supposed to work.
>  
>  So I'm asking, is there a standard way to determine whether or not a
>  nova-compute is definitely ready to have things scheduled on it? This
>  can be via an API, or even by observing something on the nova-compute
>  host itself. I just need a definitive signal that "the compute host is
>  ready".
> >>> 
> >>> If a nova compute host has registered itself to start having instances
> >>> scheduled to it, it *should* be ready.  AFAIK, we're not doing any
> >>> network sanity checks on startup, though.
> >>> 
> >>> We already do some sanity checks on startup.  For example, nova-compute
> >>> requires that it can talk to nova-conductor.  nova-compute will block on
> >>> startup until nova-conductor is responding if they happened to be
> >>> brought up at the same time.
> >>> 
> >>> We could do something like this with a networking sanity check if
> >>> someone could define what that check should look like.
> >>> 
> >> Could we ask Neutron if our compute host has an L2 agent yet? That seems
> >> like a valid sanity check.
> > 
> > ++
> > 
> This makes sense to me as well. Although, not all Neutron plugins have
> an L2 agent, so I think the check needs to be more generic than that.
> For example, the OpenDaylight MechanismDriver we have developed
> doesn't need an agent. I also believe the Nicira plugin is agent-less,
> perhaps there are others as well.
> 
> And I should note, does this sort of integration also happen with cinder,
> for example, when we're dealing with storage? Any other services which
> have a requirement on startup around integration with nova as well?
> 

Does cinder actually have per-compute-host concerns? I admit to being a
bit cinder-stupid here.

Anyway, it seems to me that any service that is compute-host aware
should be able to respond to the compute host whether or not it is a)
aware of it, and b) ready to serve on it.

For agent-less drivers that is easy, you just always return True. And
for drivers with agents, you return false unless you can find an agent
for the host.

So something like:

GET /host/%(compute-host-name)

And then in the response include a "ready" attribute that would signal
whether all networks that should work there, can work there.

As a first pass, just polling until that is "ready" before nova-compute
enables itself would solve the problems I see (and that I think users
would see as a cloud provider scales out compute nodes). Longer term
we would also want to aim at having notifications available for this
so that nova-compute could subscribe to that notification bus and then
disable itself if its agent ever goes away.
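
A rough, untested sketch of that first-pass polling (the /host resource,
the "ready" attribute and the endpoint are all hypothetical here, not an
existing Neutron API):

    # Poll a hypothetical "host readiness" resource until it reports ready,
    # before nova-compute enables itself.
    import time

    import requests


    def wait_until_ready(endpoint, token, host, interval=5, timeout=600):
        url = "%s/host/%s" % (endpoint, host)
        deadline = time.time() + timeout
        while time.time() < deadline:
            resp = requests.get(url, headers={'X-Auth-Token': token})
            if resp.status_code == 200 and resp.json().get('ready'):
                return True
            time.sleep(interval)
        return False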

I opened this bug to track the issue. I suspect there are duplicates of
it already reported, but would like to start clean to make sure it is
analyzed fully and then we can use those other bugs as test cases and
confirmation:

https://bugs.launchpad.net/nova/+bug/1260440

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] How to best make User Experience a priority in every project

2013-12-12 Thread Dolph Mathews
On Wed, Dec 11, 2013 at 4:25 PM, Stefano Maffulli wrote:

> On 12/06/2013 02:19 AM, Jaromir Coufal wrote:
> > We are growing. At the moment we are 4 core members and others are
> > coming in. But honestly, contributors are not coming to specific
> > projects - they go to reach UX community in a sense - OK this is awesome
> > effort, how can I help? What can I work on?
>
> It seems to me that from the comments in the thread, we may have these
> fresh energies directed at reviewing code from the UX perspective. Do
> you think that code reviews across all projects are something in scope
> for the UX team? If so, how do you think we can make it easier for the
> UX team to discover reviews that may require comments?
>

Unfortunately, that's probably most patches. However, I imagine most
commits with DocImpact also have very obvious UX impact - so I'd start by
directing attention to those patches before they merge.


>
>
> /stef
>
> --
> Ask and answer questions on https://ask.openstack.org
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>



-- 

-Dolph
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Keystone] policy has no effect because of hard coded assert_admin?

2013-12-12 Thread Dolph Mathews
The policy file is protecting v3 API calls at the controller layer, but
you're calling the v2 API. The policy decorators should be moved to the
manager layer to protect both APIs equally... but we'd have to be very
careful not to break deployments depending on the trivial "assert_admin"
behavior (hence the reason we only wrapped v3 with the new policy
decorators).
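
As a purely illustrative sketch (not Keystone's actual code; enforce_policy,
assert_admin and the tiny controllers below are stand-ins for Keystone
internals), the difference looks roughly like this:

    # Rough illustration only -- not Keystone's implementation.

    def enforce_policy(action, context):
        # stand-in for the policy engine evaluating policy.json rules
        print("policy check for %s" % action)

    def assert_admin(context):
        # stand-in for the hard-coded admin-role check in the v2 controller
        if 'admin' not in context.get('roles', []):
            raise Exception("admin required")

    def protected(action):
        # decorator that consults the policy engine for the named action
        def wrapper(method):
            def inner(self, context, *args, **kwargs):
                enforce_policy(action, context)
                return method(self, context, *args, **kwargs)
            return inner
        return wrapper

    class ProjectV2(object):
        def create_project(self, context, project):
            assert_admin(context)   # policy.json is never consulted here
            return project

    class ProjectV3(object):
        @protected('identity:create_project')   # the policy.json rule decides
        def create_project(self, context, project):
            return project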


On Thu, Dec 12, 2013 at 1:41 AM, Qiu Yu  wrote:

> Hi,
>
> I was trying to fine tune some keystone policy rules. Basically I want to
> grant "create_project" action to user in "ops" role. And following are my
> steps.
>
> 1. Adding a new user "usr1"
> 2. Creating new role "ops"
> 3. Granting this user a "ops" role in "service" tenant
> 4. Adding new lines to keystone policy file
>
> "ops_required": [["role:ops"]],
> "admin_or_ops": [["rule:admin_required"], ["rule:ops_required"]],
>
> 5. Change
>
> "identity:create_project": [["rule:admin_required"]],
> to
> "identity:create_project": [["rule:admin_or_ops"]],
>
> 6. Restart keystone service
>
> keystone tenant-create with credential of user "usr1" still returns 403
> Forbidden error.
> “You are not authorized to perform the requested action, admin_required.
> (HTTP 403)”
>
> After some quick scan, it seems that create_project function has a
> hard-coded assert_admin call[1], which does not respect settings in the
> policy file.
>
> Any ideas why? Is it a bug to fix? Thanks!
> BTW, I'm running keystone havana release with V2 API.
>
> [1]
> https://github.com/openstack/keystone/blob/master/keystone/identity/controllers.py#L105
>
> Thanks,
> --
> Qiu Yu
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>


-- 

-Dolph
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] State of the Gate - Dec 12

2013-12-12 Thread Matt Riedemann



On 12/12/2013 7:20 AM, Sean Dague wrote:

Current Gate Length: 12hrs*, 41 deep

(top of gate entered 12hrs ago)

It's been an *exciting* week this week. For people not paying attention
we had 2 external events which made things terrible earlier in the week.

==
Event 1: sphinx 1.2 complete breakage - MOSTLY RESOLVED
==

It turns out sphinx 1.2 + distutils (which pbr's magic calls through) means
total sadness. The fix for this was a requirements pin to sphinx < 1.2,
and until a project has taken that, it will fail in the gate.

It also turns out that tox installs pre-release software by default (a
terrible default behavior), so you also need a tox.ini change like this
- https://github.com/openstack/nova/blob/master/tox.ini#L9 - otherwise
local users will install things like sphinx 1.2b3. They will also break
in other ways.

Not all projects have merged this. If you are a project that hasn't,
please don't send any other jobs to the gate until you do. A lot of
delay was added to the gate yesterday by Glance patches being pushed to
the gate before their doc jobs were done.

==
Event 2: apt.puppetlabs.com outage - RESOLVED
==

We use that apt repository to set up the devstack nodes in nodepool with
puppet. We were triggering an issue with grenade where its apt-get
calls were failing, because it does apt-get update once to make sure
life is good. This only triggered in grenade (not other devstack runs)
because we do set -o errexit aggressively.

A fix in grenade to ignore these errors was merged yesterday afternoon
(the purple line at http://status.openstack.org/elastic-recheck/ - you can
see where it showed up).

==
Top Gate Bugs
==

We normally do this as a list, and you can see the whole list here -
http://status.openstack.org/elastic-recheck/ (now sorted by number of
FAILURES in the last 2 weeks)

That being said, our biggest race bug is currently this one -
https://bugs.launchpad.net/tempest/+bug/1253896 - and if you want to
merge patches, fixing that one bug will be huge.

Basically, you can't ssh into guests that get created. That's sort of a
fundamental property of a cloud. It shows up more frequently on neutron
jobs, possibly due to actually testing the metadata server path. There
have been many attempts at retry logic for this; we actually retry for
196 seconds to get in and only fail once we can't get in, so waiting
isn't helping. It doesn't seem like the env is under that much load.

Until we resolve this, life will not be good in landing patches.

-Sean



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



There have been a few threads [1][2] on gate failures and the process 
around what happens when we go about identifying, tracking and fixing them.


I couldn't find anything outside of the mailing list to keep a record of 
this, so I started a page here [3].


Feel free to contribute so we can point people to how they can easily 
help in working through these faster.


[1] 
http://lists.openstack.org/pipermail/openstack-dev/2013-November/020280.html
[2] 
http://lists.openstack.org/pipermail/openstack-dev/2013-November/019931.html

[3] https://wiki.openstack.org/wiki/ElasticRecheck

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Unified Guest Agent proposal

2013-12-12 Thread Jay Pipes

On 12/10/2013 03:49 PM, Ian Wells wrote:

On 10 December 2013 20:55, Clint Byrum mailto:cl...@fewbar.com>> wrote:

If it is just a network API, it works the same for everybody. This
makes it simpler, and thus easier to scale out independently of compute
hosts. It is also something we already support and can very easily
expand
by just adding a tiny bit of functionality to neutron-metadata-agent.

In fact we can even push routes via DHCP to send agent traffic through
a different neutron-metadata-agent, so I don't see any issue where we
are piling anything on top of an overstressed single resource. We can
have neutron route this traffic directly to the Heat API which hosts it,
and that can be load balanced and etc. etc. What is the exact scenario
you're trying to avoid?


You may be making even this harder than it needs to be.  You can create
multiple networks and attach machines to multiple networks.  Every point
so far has been 'why don't we use  as a backdoor into our VM
without affecting the VM in any other way' - why can't that just be one
more network interface set aside for whatever management  instructions
are appropriate?  And then what needs pushing into Neutron is nothing
more complex than strong port firewalling to prevent the slaves/minions
talking to each other.  If you absolutely must make the communication
come from a system agent and go to a VM, then that can be done by
attaching the system agent to the administrative network - from within
the system agent, which is the thing that needs this, rather than within
Neutron, which doesn't really care how you use its networks.  I prefer
solutions where other tools don't have to make you a special case.


I've read through this email thread with quite a bit of curiosity, and I 
have to say what Ian says above makes a lot of sense to me. If Neutron 
can handle the creation of a "management vNIC" that has some associated 
iptables rules governing it that provides a level of security for guest 
<-> host and guest <-> $OpenStackService, then the transport problem 
domain is essentially solved, and Neutron can be happily ignorant (as it 
should be) of any guest agent communication with anything else.


Best,
-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] [Neutron] How do we know a host is ready to have servers scheduled onto it?

2013-12-12 Thread Jay Pipes

On 12/12/2013 12:53 PM, Kyle Mestery wrote:

On Dec 12, 2013, at 11:44 AM, Jay Pipes  wrote:

On 12/12/2013 12:36 PM, Clint Byrum wrote:

Excerpts from Russell Bryant's message of 2013-12-12 09:09:04 -0800:

On 12/12/2013 12:02 PM, Clint Byrum wrote:

I've been chasing quite a few bugs in the TripleO automated bring-up
lately that have to do with failures because either there are no valid
hosts ready to have servers scheduled, or there are hosts listed and
enabled, but they can't bind to the network because for whatever reason
the L2 agent has not checked in with Neutron yet.

This is only a problem in the first few minutes of a nova-compute host's
life. But it is critical for scaling up rapidly, so it is important for
me to understand how this is supposed to work.

So I'm asking, is there a standard way to determine whether or not a
nova-compute is definitely ready to have things scheduled on it? This
can be via an API, or even by observing something on the nova-compute
host itself. I just need a definitive signal that "the compute host is
ready".


If a nova compute host has registered itself to start having instances
scheduled to it, it *should* be ready.  AFAIK, we're not doing any
network sanity checks on startup, though.

We already do some sanity checks on startup.  For example, nova-compute
requires that it can talk to nova-conductor.  nova-compute will block on
startup until nova-conductor is responding if they happened to be
brought up at the same time.

We could do something like this with a networking sanity check if
someone could define what that check should look like.


Could we ask Neutron if our compute host has an L2 agent yet? That seems
like a valid sanity check.


++


This makes sense to me as well. Although, not all Neutron plugins have
an L2 agent, so I think the check needs to be more generic than that.
For example, the OpenDaylight MechanismDriver we have developed
doesn't need an agent. I also believe the Nicira plugin is agent-less,
perhaps there are others as well.

And I should note, does this sort of integration also happen with cinder,
for example, when we're dealing with storage? Any other services which
have a requirement on startup around integration with nova as well?


Right, it's more general than "is the L2 agent alive and running". It's 
more about having each service understand the relative dependencies it 
has on other supporting services.


For instance, have each service implement a:

GET /healthcheck

that would return either a 200 OK or 409 Conflict with the body 
containing a list of service types that it is waiting to hear back from 
in order to provide a 200 OK for itself.
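
A minimal, illustrative sketch of such an endpoint (wsgiref and the exact
response format here are just placeholders for discussion):

    # Sketch of the proposed /healthcheck endpoint (illustrative only).
    import json
    from wsgiref.simple_server import make_server

    # Service types this service is still waiting on; empty means healthy.
    PENDING_DEPENDENCIES = ['network']

    def healthcheck_app(environ, start_response):
        if environ['PATH_INFO'] != '/healthcheck':
            start_response('404 Not Found', [('Content-Type', 'text/plain')])
            return [b'not found']
        if PENDING_DEPENDENCIES:
            body = json.dumps({'waiting_on': PENDING_DEPENDENCIES})
            start_response('409 Conflict',
                           [('Content-Type', 'application/json')])
        else:
            body = json.dumps({'status': 'ok'})
            start_response('200 OK', [('Content-Type', 'application/json')])
        return [body.encode('utf-8')]

    if __name__ == '__main__':
        make_server('0.0.0.0', 8080, healthcheck_app).serve_forever()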


Anyway, just some thoughts...

-jay



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Stackalytics 0.4 released!

2013-12-12 Thread chandan kumar
Hello ,

On Thu, Dec 12, 2013 at 10:11 PM, Monty Taylor  wrote:
>
>
> On 12/12/2013 04:49 PM, Ilya Shakhat wrote:
>> Hello everyone!
>>
>> Stackalytics team is happy to announce the release of version 0.4. This
>> release is completely dedicated to different types of reports. We added
>> the highly demanded top reviewers chart, acknowledged as an essential tool
>> for finding the most active reviewers
>> (ex. http://stackalytics.com/report/reviews/neutron-group/30), an open
>> reviews report to help core engineers track the backlog and reviews that
>> stay open for too long
>> (http://stackalytics.com/report/reviews/nova/open), and an activity report
>> that shows all the work done by an engineer, with a counterpart per company.
>> This report also includes a nice punch-card, and one can find that there
>> are truly world-wide, never-sleeping contributors
>> like http://stackalytics.com/report/companies/red%20hat :)
>
> Nice work. On the activity chart, it shows an activity graph of time and
> day. What timezone are those hours shown in?
>
>> In details, total changes are:
>>
>>   * Added review stats report
>>  that shows
>> top reviewers with breakdown by marks and disagreement ratio against
>> core's decision
>>   * Added open reviews report
>>  that shows top
>> longest reviews and backlog summary
>>   * Added activity report
>>  with engineer's
>> activity log and punch-card of usual online hours (in UTC). The same
>> report is available for companies
>>   * Fixed review stats calculation, now Approve marks are counted
>> separately
>>   * Fixed commit date calculation, now it is date of merge, not commit
>>   * Minor improvements in filter selectors
>>   * Incorporated 21 updates to user and company profiles in default data
>>
>> The next Stackalytics meeting will be on Monday, Dec 16 at 15:00 UTC in
>> #openstack-meeting. Come and join us, we have some more things for the
>> next release.
>>
>> Thanks,
>> Ilya
>>
>>
Thank you, Ilya, for bringing lots of changes to Stackalytics.
I would like to help with the development of Stackalytics. Last time I
missed out; this time I will not.

Thanks,
Chandan Kumar

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Introducing the new OpenStack service for Containers

2013-12-12 Thread Rick Harris
Hi all,

Was wondering if there's been any more work done on the proposed
Container-Service (Capsule?) API?

Haven't seen much on the ML on this, so just want to make sure the current
plan is still to have a draft of the Capsule API, compare the delta to the
existing Nova API, and determine whether a separate service still makes
sense for the current use-cases.

Thanks!

Rick


On Fri, Nov 22, 2013 at 2:35 PM, Russell Bryant  wrote:

> On 11/22/2013 02:29 PM, Krishna Raman wrote:
> >
> > On Nov 22, 2013, at 10:26 AM, Eric Windisch  > > wrote:
> >
> >> On Fri, Nov 22, 2013 at 11:49 AM, Krishna Raman  >> > wrote:
> >>> Reminder: We are meeting in about 15 minutes on the #openstack-meeting
> >>> channel.
> >>
> >> I wasn't able to make it. Was meeting-bot triggered? Is there a log of
> >> today's discussion?
> >
> > Yes. Logs are
> > here:
> http://eavesdrop.openstack.org/meetings/nova/2013/nova.2013-11-22-17.01.log.html
>
> Yep, I used the 'nova' meeting topic for this one.  If the meeting turns
> in to a regular thing, we should probably switch it to some sort of
> sub-team type name ... like nova-containers.
>
> --
> Russell Bryant
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] [Neutron] How do we know a host is ready to have servers scheduled onto it?

2013-12-12 Thread Kyle Mestery
On Dec 12, 2013, at 11:44 AM, Jay Pipes  wrote:
> On 12/12/2013 12:36 PM, Clint Byrum wrote:
>> Excerpts from Russell Bryant's message of 2013-12-12 09:09:04 -0800:
>>> On 12/12/2013 12:02 PM, Clint Byrum wrote:
 I've been chasing quite a few bugs in the TripleO automated bring-up
 lately that have to do with failures because either there are no valid
 hosts ready to have servers scheduled, or there are hosts listed and
 enabled, but they can't bind to the network because for whatever reason
 the L2 agent has not checked in with Neutron yet.
 
 This is only a problem in the first few minutes of a nova-compute host's
 life. But it is critical for scaling up rapidly, so it is important for
 me to understand how this is supposed to work.
 
 So I'm asking, is there a standard way to determine whether or not a
 nova-compute is definitely ready to have things scheduled on it? This
 can be via an API, or even by observing something on the nova-compute
 host itself. I just need a definitive signal that "the compute host is
 ready".
>>> 
>>> If a nova compute host has registered itself to start having instances
>>> scheduled to it, it *should* be ready.  AFAIK, we're not doing any
>>> network sanity checks on startup, though.
>>> 
>>> We already do some sanity checks on startup.  For example, nova-compute
>>> requires that it can talk to nova-conductor.  nova-compute will block on
>>> startup until nova-conductor is responding if they happened to be
>>> brought up at the same time.
>>> 
>>> We could do something like this with a networking sanity check if
>>> someone could define what that check should look like.
>>> 
>> Could we ask Neutron if our compute host has an L2 agent yet? That seems
>> like a valid sanity check.
> 
> ++
> 
This makes sense to me as well. Although, not all Neutron plugins have
an L2 agent, so I think the check needs to be more generic than that.
For example, the OpenDaylight MechanismDriver we have developed
doesn't need an agent. I also believe the Nicira plugin is agent-less,
perhaps there are others as well.

And I should note, does this sort of integration also happen with cinder,
for example, when we're dealing with storage? Any other services which
have a requirement on startup around integration with nova as well?

Thanks,
Kyle

> -jay
> 
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Unified Guest Agent proposal

2013-12-12 Thread Clint Byrum
Excerpts from Dmitry Mescheryakov's message of 2013-12-12 09:24:13 -0800:
> Clint, Kevin,
> 
> Thanks for reassuring me :-) I just wanted to make sure that having direct
> access from VMs to a single facility is not a dead end in terms of security
> and extensibility. And since it is not, I agree it is much simpler (and
> hence better) than hypervisor-dependent design.
> 
> 
> Then returning to two major suggestions made:
>  * Salt
>  * Custom solution specific to our needs
> 
> The custom solution could be made on top of oslo.messaging. That gives us
> RPC working on different messaging systems. And that is what we really need
> - an RPC into guest supporting various transports. What it lacks at the
> moment is security - it has neither authentication nor ACL.
> 

I bet salt would be super open to modularizing their RPC. Since
oslo.messaging includes ZeroMQ, and is a library now, I see no reason to
avoid opening that subject with our fine friends in the Salt community.
Perhaps a few of them are even paying attention right here. :)

The benefit there is that we get everything except the plugins we want
to write already done. And we could start now with the ZeroMQ-only
salt agent if we could at least get an agreement on principle that Salt
wouldn't mind using an abstraction layer for RPC.

That does make the "poke a hole out of private networks" conversation
_slightly_ more complex. It is one thing to just let ZeroMQ out, another
to let all of oslo.messaging's backends out. But I think in general
they'll all share the same thing: you want an address+port to be routed
intelligently out of the private network into something running under
the cloud.
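
For illustration, pushing such a route today could be as simple as adding a
host route to the tenant subnet (which the DHCP agent then hands to guests);
a rough, untested sketch with python-neutronclient, where the subnet ID,
addresses and credentials are all made up:

    # Add a host route so traffic for the agent endpoint leaves the tenant
    # network via a chosen gateway. All values below are placeholders.
    from neutronclient.v2_0 import client

    neutron = client.Client(username='admin', password='secret',
                            tenant_name='admin',
                            auth_url='http://keystone:5000/v2.0')

    neutron.update_subnet('SUBNET_ID', {
        'subnet': {
            'host_routes': [
                {'destination': '169.254.169.254/32', 'nexthop': '10.0.0.1'},
            ],
        },
    })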

Next steps (all can be done in parallel, as they are independent):

* Ask Salt if oslo.messaging is a path they'll walk with us
* Experiment with communicating with salt agents from an existing
  OpenStack service (Savanna, Trove, Heat, etc)
* Deep-dive into Salt to see if it is feasible

As I have no cycles for this, I can't promise to do any, but I will
try to offer assistance if I can.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] [Neutron] How do we know a host is ready to have servers scheduled onto it?

2013-12-12 Thread Jay Pipes

On 12/12/2013 12:36 PM, Clint Byrum wrote:

Excerpts from Russell Bryant's message of 2013-12-12 09:09:04 -0800:

On 12/12/2013 12:02 PM, Clint Byrum wrote:

I've been chasing quite a few bugs in the TripleO automated bring-up
lately that have to do with failures because either there are no valid
hosts ready to have servers scheduled, or there are hosts listed and
enabled, but they can't bind to the network because for whatever reason
the L2 agent has not checked in with Neutron yet.

This is only a problem in the first few minutes of a nova-compute host's
life. But it is critical for scaling up rapidly, so it is important for
me to understand how this is supposed to work.

So I'm asking, is there a standard way to determine whether or not a
nova-compute is definitely ready to have things scheduled on it? This
can be via an API, or even by observing something on the nova-compute
host itself. I just need a definitive signal that "the compute host is
ready".


If a nova compute host has registered itself to start having instances
scheduled to it, it *should* be ready.  AFAIK, we're not doing any
network sanity checks on startup, though.

We already do some sanity checks on startup.  For example, nova-compute
requires that it can talk to nova-conductor.  nova-compute will block on
startup until nova-conductor is responding if they happened to be
brought up at the same time.

We could do something like this with a networking sanity check if
someone could define what that check should look like.


Could we ask Neutron if our compute host has an L2 agent yet? That seems
like a valid sanity check.


++

-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [neutron] [third-party-testing] Meeting Minutes for our first meeting

2013-12-12 Thread Kyle Mestery
Hi everyone:

We had a meeting around Neutron Third-Party testing today on IRC.
The logs are available here [1]. We plan to host another meeting
next week, and it will be at 2200 UTC on Wednesday in the
#openstack-meeting-alt channel on IRC. Please attend and update
the etherpad [2] with any items relevant to you before then.

Thanks again!
Kyle

[1] http://eavesdrop.openstack.org/meetings/networking_third_party_testing/2013/
[2] https://etherpad.openstack.org/p/multi-node-neutron-tempest
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Support for Pecan in Nova

2013-12-12 Thread Jay Pipes

On 12/11/2013 11:47 PM, Mike Perez wrote:

On 10:06 Thu 12 Dec , Christopher Yeoh wrote:

On Thu, Dec 12, 2013 at 8:59 AM, Doug Hellmann <doug.hellm...@dreamhost.com> wrote:

On Wed, Dec 11, 2013 at 3:41 PM, Ryan Petrello <ryan.petre...@dreamhost.com> wrote:

Hello,

I’ve spent the past week experimenting with using Pecan for Nova’s API,
and have opened an experimental review:

https://review.openstack.org/#/c/61303/6

…which implements the `versions` v3 endpoint using pecan (and paves the
way for other extensions to use pecan).  This is a *potential* approach
I've considered for gradually moving the V3 API, but I’m open to other
suggestions (and feedback on this approach).  I’ve also got a few open
questions/general observations:

1.  It looks like the Nova v3 API is composed *entirely* of extensions
(including “core” API calls), and that extensions and their routes are
discoverable and extensible via installed software that registers itself
via stevedore.  This seems to lead to an API that’s composed of installed
software, which in my opinion, makes it fairly hard to map out the API
(as opposed to how routes are manually defined in other WSGI frameworks).
I assume at this time, this design decision has already been solidified
for v3?

Yeah, I brought this up at the summit. I am still having some trouble
understanding how we are going to express a stable core API for
compatibility testing if the behavior of the API can be varied so
significantly by deployment decisions. Will we just list each "required"
extension, and forbid any extras for a compliant cloud?




Maybe the issue is caused by me misunderstanding the term
"extension," which (to me) implies an optional component but is
perhaps reflecting a technical implementation detail instead?



Yes and no :-) As Ryan mentions, all API code is a plugin in the V3
API. However, some must be loaded or the V3 API refuses to start
up. In nova/api/openstack/__init__.py we have
API_V3_CORE_EXTENSIONS which hard codes which extensions must be
loaded and there is no config option to override this (blacklisting
a core plugin will result in the V3 API not starting up).

So for compatibility testing I think what will probably happen is
that we'll be defining a minimum set (API_V3_CORE_EXTENSIONS) that
must be implemented and clients can rely on that always being present
on a compliant cloud. But clients can also then query through
/extensions what other functionality (which is backwards compatible
with respect to core) may also be present on that specific cloud.
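
As an illustration, that discovery is just a GET against the API; a small,
untested sketch with python-requests, where the endpoint and token are
placeholders:

    # List the optional extensions a specific cloud exposes (illustrative).
    import requests

    COMPUTE_ENDPOINT = 'http://nova-api:8774/v3'   # placeholder
    TOKEN = 'TOKEN'                                # placeholder

    resp = requests.get(COMPUTE_ENDPOINT + '/extensions',
                        headers={'X-Auth-Token': TOKEN})
    for ext in resp.json().get('extensions', []):
        print(ext.get('alias'), ext.get('name'))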


This really seems similar to the idea of having a router class, some
controllers and you map them. From my observation at the summit,
calling everything an extension creates confusion. An extension
"extends" something. For example, Chrome has extensions, and they
extend the idea of the core features of a browser. If you want to do
more than back/forward, go to an address, stop, etc, that's an
extension. If you want it to play an audio clip "stop, hammer time"
after clicking the stop button, that's an example of an extension.

In OpenStack, we use extensions to extend core. Core are the
essential feature(s) of the project. In Cinder for example, core is
volume. In core you can create a volume, delete a volume, attach a
volume, detach a volume, etc. If you want to go beyond that, that's
an extension. If you want to do volume encryption, that's an example
of an extension.

I'm worried by the discrepancies this will create among the programs.
You mentioned maintainability being a plus for this. I don't think
it'll be great from the deployer's perspective when you have one
program that treats everything as an extension, some of which must be
enabled and which the deployer has to be mindful of, while the rest
of the programs consider all extensions to be optional.


+1. I agree with most of what Mike says above. The idea that there are 
core "extensions" in Nova's v3 API doesn't make a whole lot of sense to me.


Best,
-jay

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] [Neutron] How do we know a host is ready to have servers scheduled onto it?

2013-12-12 Thread Clint Byrum
Excerpts from Russell Bryant's message of 2013-12-12 09:09:04 -0800:
> On 12/12/2013 12:02 PM, Clint Byrum wrote:
> > I've been chasing quite a few bugs in the TripleO automated bring-up
> > lately that have to do with failures because either there are no valid
> > hosts ready to have servers scheduled, or there are hosts listed and
> > enabled, but they can't bind to the network because for whatever reason
> > the L2 agent has not checked in with Neutron yet.
> > 
> > This is only a problem in the first few minutes of a nova-compute host's
> > life. But it is critical for scaling up rapidly, so it is important for
> > me to understand how this is supposed to work.
> > 
> > So I'm asking, is there a standard way to determine whether or not a
> > nova-compute is definitely ready to have things scheduled on it? This
> > can be via an API, or even by observing something on the nova-compute
> > host itself. I just need a definitive signal that "the compute host is
> > ready".
> 
> If a nova compute host has registered itself to start having instances
> scheduled to it, it *should* be ready.  AFAIK, we're not doing any
> network sanity checks on startup, though.
> 
> We already do some sanity checks on startup.  For example, nova-compute
> requires that it can talk to nova-conductor.  nova-compute will block on
> startup until nova-conductor is responding if they happened to be
> brought up at the same time.
> 
> We could do something like this with a networking sanity check if
> someone could define what that check should look like.
> 

Could we ask Neutron if our compute host has an L2 agent yet? That seems
like a valid sanity check.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] [Neutron] How do we know a host is ready to have servers scheduled onto it?

2013-12-12 Thread Clint Byrum
Excerpts from Chris Friesen's message of 2013-12-12 09:19:42 -0800:
> On 12/12/2013 11:02 AM, Clint Byrum wrote:
> 
> > So I'm asking, is there a standard way to determine whether or not a
> > nova-compute is definitely ready to have things scheduled on it? This
> > can be via an API, or even by observing something on the nova-compute
> > host itself. I just need a definitive signal that "the compute host is
> > ready".
> 
> Is it not sufficient that "nova service-list" shows the compute service 
> as "up"?
> 

I could spin waiting for "at least one". Not a bad idea actually. However,
I suspect that will only handle the situations I've gotten where the
scheduler returns "NoValidHost".

I say that because I think if it shows there, it matches the all hosts
filter and will have things scheduled on it. With one compute host I
get failures after scheduling because neutron has no network segment to
bind to. That is because the L2 agent on the host has not yet registered
itself with Neutron.
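
For what it's worth, the spin-wait itself is trivial; a rough, untested
sketch with python-novaclient (credentials are placeholders), which of
course only proves the compute service is up, not that its L2 agent has
registered:

    # Wait until at least one nova-compute service reports "up" and enabled.
    import time

    from novaclient.v1_1 import client

    nova = client.Client('admin', 'secret', 'admin',
                         'http://keystone:5000/v2.0')   # placeholders

    def wait_for_compute(interval=5, timeout=600):
        deadline = time.time() + timeout
        while time.time() < deadline:
            services = nova.services.list(binary='nova-compute')
            if any(s.state == 'up' and s.status == 'enabled'
                   for s in services):
                return True
            time.sleep(interval)
        return False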

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Unified Guest Agent proposal

2013-12-12 Thread Dmitry Mescheryakov
Vladik,

Thanks for the suggestion, but hypervisor-dependent solution is exactly
what scares off people in the thread :-)

Thanks,

Dmitry



2013/12/11 Vladik Romanovsky 

>
> Maybe it will be useful to use Ovirt guest agent as a base.
>
> http://www.ovirt.org/Guest_Agent
> https://github.com/oVirt/ovirt-guest-agent
>
> It is already working well on linux and windows and has a lot of
> functionality.
> However, currently it is using virtio-serial for communication, but I
> think it can be extended for other bindings.
>
> Vladik
>
> - Original Message -
> > From: "Clint Byrum" 
> > To: "openstack-dev" 
> > Sent: Tuesday, 10 December, 2013 4:02:41 PM
> > Subject: Re: [openstack-dev] Unified Guest Agent proposal
> >
> > Excerpts from Dmitry Mescheryakov's message of 2013-12-10 12:37:37 -0800:
> > > >> What is the exact scenario you're trying to avoid?
> > >
> > > It is DDoS attack on either transport (AMQP / ZeroMQ provider) or
> server
> > > (Salt / Our own self-written server). Looking at the design, it doesn't
> > > look like the attack could be somehow contained within a tenant it is
> > > coming from.
> > >
> >
> > We can push a tenant-specific route for the metadata server, and a tenant
> > specific endpoint for in-agent things. Still simpler than
> hypervisor-aware
> > guests. I haven't seen anybody ask for this yet, though I'm sure if they
> > run into these problems it will be the next logical step.
> >
> > > In the current OpenStack design I see only one similarly vulnerable
> > > component - metadata server. Keeping that in mind, maybe I just
> > > overestimate the threat?
> > >
> >
> > Anything you expose to the users is "vulnerable". By using the localized
> > hypervisor scheme you're now making the compute node itself vulnerable.
> > Only now you're asking that an already complicated thing (nova-compute)
> > add another job, rate limiting.
> >
> > ___
> > OpenStack-dev mailing list
> > OpenStack-dev@lists.openstack.org
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [keystone] domain admin role query

2013-12-12 Thread Dolph Mathews
On Thu, Dec 12, 2013 at 8:50 AM, Adam Young  wrote:

> On 12/11/2013 10:11 PM, Paul Belanger wrote:
>
>> On 13-12-11 11:18 AM, Lyle, David wrote:
>>
>>> +1 on moving the domain admin role rules to the default policy.json
>>>
>>> -David Lyle
>>>
>>> From: Dolph Mathews [mailto:dolph.math...@gmail.com]
>>> Sent: Wednesday, December 11, 2013 9:04 AM
>>> To: OpenStack Development Mailing List (not for usage questions)
>>> Subject: Re: [openstack-dev] [keystone] domain admin role query
>>>
>>>
>>> On Tue, Dec 10, 2013 at 10:49 PM, Jamie Lennox 
>>> wrote:
>>> Using the default policies it will simply check for the admin role and
>>> not care about the domain that admin is limited to. This is partially a
>>> leftover from the V2 API, when there weren't domains to worry about.
>>>
>>> A better example of policies are in the file
>>> etc/policy.v3cloudsample.json. In there you will see the rule for
>>> create_project is:
>>>
>>>  "identity:create_project": "rule:admin_required and
>>> domain_id:%(project.domain_id)s",
>>>
>>> as opposed to (in policy.json):
>>>
>>>  "identity:create_project": "rule:admin_required",
>>>
>>> This is what you are looking for to scope the admin role to a domain.
>>>
>>> We need to start moving the rules from policy.v3cloudsample.json to the
>>> default policy.json =)
>>>
>>>
>>> Jamie
>>>
>>> - Original Message -
>>>
 From: "Ravi Chunduru" 
 To: "OpenStack Development Mailing List" >>> openstack.org>
 Sent: Wednesday, 11 December, 2013 11:23:15 AM
 Subject: [openstack-dev] [keystone] domain admin role query

 Hi,
 I am trying out Keystone V3 APIs and domains.
 I created an domain, created a project in that domain, created an user
 in
 that domain and project.
 Next, gave an admin role for that user in that domain.

 I am assuming that user is now admin to that domain.
 Now, I got a scoped token with that user, domain and project. With that
 token, I tried to create a new project in that domain. It worked.

 But, using the same token, I could also create a new project in a
 'default'
 domain too. I expected it should throw authentication error. Is it a
 bug?

 Thanks,
 --
 Ravi


>> One of the issues I had this week while using the
>> policy.v3cloudsample.json was I had no easy way of creating a domain with
>> the id of 'admin_domain_id'.  I basically had to modify the SQL directly to
>> do it.
>>
> You should not have to edit the SQL.  You should be able, at a minimum, to
> re-enable the ADMIN_TOKEN in the config file to create any object inside of
> Keystone.
>
>  open a bug for the problem, and describe what you did step by step?
>
>
>
>> Any chance we can create a 2nd domain using 'admin_domain_id' via
>> keystone-manage sync_db?
>>
>
I totally forgot about this piece -- this is just another incarnation of
this bug at the domain level which we should avoid furthering:

  https://bugs.launchpad.net/keystone/+bug/968696

But, to answer your question: no. It's intended to be a placeholder in the
policy file for an actual domain ID (modify the policy file, don't hack at
the SQL backend).


>
>>
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>



-- 

-Dolph
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-12 Thread Will Foster

On 12/12/13 09:42 +1300, Robert Collins wrote:

On 12 December 2013 01:17, Jaromir Coufal  wrote:

On 2013/10/12 23:09, Robert Collins wrote:



The 'easiest' way is to support bigger companies with huge deployments,
tailored infrastructure, everything connected properly.

But there are tons of companies/users who are running on old
heterogeneous
hardware. Very likely even more than the number of companies having
already
mentioned large deployments. And giving them only the way of 'setting up
rules' in order to get the service on the node - this type of user is not
gonna use our deployment system.



Thats speculation. We don't know if they will or will not because we
haven't given them a working system to test.


Some part of that is speculation, some part of that is feedback from people
who are doing deployments (of course its just very limited audience).
Anyway, it is not just pure theory.


Sure. Let be me more precise. There is a hypothesis that lack of
direct control will be a significant adoption blocker for a primary
group of users.

I think it's safe to say that some users in the group 'sysadmins
having to deploy an OpenStack cloud' will find it a bridge too far and
not use a system without direct control. Call this group A.

I think it's also safe to say that some users will not care in the
slightest, because their deployment is too small for them to be
particularly worried (e.g. about occasional downtime (but they would
worry a lot about data loss)). Call this group B.

I suspect we don't need to consider group C - folk who won't use a
system if it *has* manual control, but thats only a suspicion. It may
be that the side effect of adding direct control is to reduce
usability below the threshold some folk need...

To assess 'significant adoption blocker' we basically need to find the
% of users who will care sufficiently that they don't use TripleO.

How can we do that? We can do questionnaires, and get such folk to
come talk with use, but that suffers from selection bias - group B can
use the system with or without direct manual control, so have little
motivation to argue vigorously in any particular direction. Group A
however have to argue because they won't use the system at all without
that feature, and they may want to use the system for other reasons,
so that becomes a crucial aspect for them.

A much better way IMO is to test it - to get a bunch of volunteers and
see who responds positively to a demo *without* direct manual control.

To do that we need a demoable thing, which might just be mockups that
show a set of workflows (and include things like Jay's
shiny-new-hardware use case in the demo).

I rather suspect we're building that anyway as part of doing UX work,
so maybe what we do is put a tweet or blog post up asking for
sysadmins who a) have not yet deployed openstack, b) want to, and c)
are willing to spend 20-30 minutes with us, walk them through a demo
showing no manual control, and record what questions they ask, whether
they would like to have that product, and if not, then
(a) what use cases they can't address with the mockups and (b) what
other reasons they have for not using it.

This is a bunch of work though!

So, do we need to do that work?

*If* we can layer manual control on later, then we could defer this
testing until we are at the point where we can say 'the nova scheduled
version is ready, now lets decide if we add the manual control'.

OTOH, if we *cannot* layer manual control on later - if it has
tentacles through too much of the code base, then we need to decide
earlier, because it will be significantly harder to add later and that
may be too late of a ship date for vendors shipping on top of TripleO.

So with that as a prelude, my technical sense is that we can layer
manual scheduling on later: we provide an advanced screen, show the
list of N instances we're going to ask for and allow each instance to
be directly customised with a node id selected from either the current
node it's running on or an available node. It's significant work both
UI and plumbing, but it's not going to be made harder by the other
work we're doing AFAICT.

-> My proposal is that we shelve this discussion until we have the
nova/heat scheduled version in 'and now we polish' mode, and then pick
it back up and assess user needs.

An alternative argument is to say that group A is a majority of the
userbase and that doing an automatic version is entirely unnecessary.
Thats also possible, but I'm extremely skeptical, given the huge cost
of staff time, and the complete lack of interest my sysadmin friends
(and my former sysadmin self) have in doing automatable things by
hand.


I just wanted to add a few thoughts:

For some comparative information here "from the field" I work
extensively on deployments of large OpenStack implementations,
most recently with a ~220node/9rack deployment (scaling up to 
42racks / 1024 nodes soon).  My primary role is of a Devops/Sysadmin 
nature, and no

Re: [openstack-dev] Unified Guest Agent proposal

2013-12-12 Thread Dmitry Mescheryakov
Clint, Kevin,

Thanks for reassuring me :-) I just wanted to make sure that having direct
access from VMs to a single facility is not a dead end in terms of security
and extensibility. And since it is not, I agree it is much simpler (and
hence better) than hypervisor-dependent design.


Then returning to two major suggestions made:
 * Salt
 * Custom solution specific to our needs

The custom solution could be made on top of oslo.messaging. That gives us
RPC working on different messaging systems. And that is what we really need
- an RPC into guest supporting various transports. What it lacks at the
moment is security - it has neither authentication nor ACL.

Salt also provides RPC service, but it has a couple of disadvantages: it is
tightly coupled with ZeroMQ and it needs a server process to run. A single
transport option (ZeroMQ) is a limitation we really want to avoid.
OpenStack could be deployed with various messaging providers, and we can't
limit the choice to a single option in the guest agent. Though it could be
changed in the future, it is an obstacle to consider.

Running yet another server process within OpenStack, as it was already
pointed out, is expensive. It means another server to deploy and take care
of, +1 to overall OpenStack complexity. And it does not look it could be
fixed any time soon.

For the given reasons, I favor an agent based on oslo.messaging.
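
To make that concrete, a minimal, untested sketch of what an
oslo.messaging-based agent could look like (the transport URL, topic and
echo endpoint are made up, and note there is no authentication or ACL here
- exactly the gap mentioned above):

    # Minimal guest-agent RPC server sketch on top of oslo.messaging.
    from oslo.config import cfg
    from oslo import messaging

    class GuestEndpoint(object):
        def echo(self, ctxt, text):
            # a trivial RPC method a service like Savanna/Trove could call
            return text

    def main():
        transport = messaging.get_transport(
            cfg.CONF, url='rabbit://guest:guest@10.0.0.1/')  # placeholder
        target = messaging.Target(topic='guest_agent', server='instance-0001')
        server = messaging.get_rpc_server(transport, target, [GuestEndpoint()],
                                          executor='blocking')
        server.start()
        server.wait()

    if __name__ == '__main__':
        main()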

Thanks,

Dmitry



2013/12/11 Fox, Kevin M 

> Yeah. Its likely that the metadata server stuff will get more
> scalable/hardened over time. If it isn't enough now, let's fix it rather
> than coming up with a new system to work around it.
>
> I like the idea of using the network since all the hypervisors have to
> support network drivers already. They also already have to support talking
> to the metadata server. This keeps OpenStack out of the hypervisor driver
> business.
>
> Kevin
>
> 
> From: Clint Byrum [cl...@fewbar.com]
> Sent: Tuesday, December 10, 2013 1:02 PM
> To: openstack-dev
> Subject: Re: [openstack-dev] Unified Guest Agent proposal
>
> Excerpts from Dmitry Mescheryakov's message of 2013-12-10 12:37:37 -0800:
> > >> What is the exact scenario you're trying to avoid?
> >
> > It is DDoS attack on either transport (AMQP / ZeroMQ provider) or server
> > (Salt / Our own self-written server). Looking at the design, it doesn't
> > look like the attack could be somehow contained within a tenant it is
> > coming from.
> >
>
> We can push a tenant-specific route for the metadata server, and a tenant
> specific endpoint for in-agent things. Still simpler than hypervisor-aware
> guests. I haven't seen anybody ask for this yet, though I'm sure if they
> run into these problems it will be the next logical step.
>
> > In the current OpenStack design I see only one similarly vulnerable
> > component - metadata server. Keeping that in mind, maybe I just
> > overestimate the threat?
> >
>
> Anything you expose to the users is "vulnerable". By using the localized
> hypervisor scheme you're now making the compute node itself vulnerable.
> Only now you're asking that an already complicated thing (nova-compute)
> add another job, rate limiting.
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] [Neutron] How do we know a host is ready to have servers scheduled onto it?

2013-12-12 Thread Stephen Gran

On 12/12/13 17:19, Chris Friesen wrote:

On 12/12/2013 11:02 AM, Clint Byrum wrote:


So I'm asking, is there a standard way to determine whether or not a
nova-compute is definitely ready to have things scheduled on it? This
can be via an API, or even by observing something on the nova-compute
host itself. I just need a definitive signal that "the compute host is
ready".


Is it not sufficient that "nova service-list" shows the compute service
as "up"?

If not, then maybe we should call that a bug in nova...


The nova-compute service does not, currently, know about the health of, 
say, the neutron openvswitch agent running on the same hardware, 
although that being in good shape is necessary to be able to start 
instances and have them be useful.  This kind of cross-project state 
coordination doesn't exist right now, AFAIK.


Cheers,
--
Stephen Gran
Senior Systems Integrator - theguardian.com


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [Neutron][IPv6] Agenda for the meeting today

2013-12-12 Thread Collins, Sean
The agenda for today is pretty light - if there is anything that people
would like to discuss, please feel free to add.

https://wiki.openstack.org/wiki/Meetings/Neutron-IPv6-Subteam#Agenda_for_Dec_12._2013

-- 
Sean M. Collins
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Incubation Request for Barbican

2013-12-12 Thread Clark, Robert Graham
From: Bryan D. Payne [mailto:bdpa...@acm.org] 
Sent: 12 December 2013 16:12
To: OpenStack Development Mailing List (not for usage questions)
Cc: openstack...@lists.openstack.org; cloudkeep@googlegroups. com;
barbi...@lists.rackspace.com
Subject: Re: [openstack-dev] Incubation Request for Barbican

 

>$ git shortlog -s -e | sort -n -r
>   172 John Wood 
>   150 jfwood 
>65 Douglas Mendizabal 
>39 Jarret Raim 
>17 Malini K. Bhandaru 
>10 Paul Kehrer 
>10 Jenkins 
> 8 jqxin2006 
> 7 Arash Ghoreyshi 
> 5 Chad Lung 
> 3 Dolph Mathews 
> 2 John Vrbanac 
> 1 Steven Gonzales 
> 1 Russell Bryant 
> 1 Bryan D. Payne 
>
>It appears to be an effort done by a group, and not an individual. Most
>commits by far are from Rackspace, but there is at least one non-trivial
>contributor (Malini) from another company (Intel), so I think this is OK.

There has been some interest from some quarters (RedHat, HP and others) in
additional support. I hope that the incubation process will help
accelerate external contributions.

 

For what it's worth, I just wanted to express my intent to get more
involved in Barbican in the near future.  I plan to be helping out with
both reviews and coding on a variety of pieces.  So that will help (a
little) with the diversification situation.  I would also mention that
there has been great interest in Barbican among the OSSG crowd and it
wouldn't surprise me to see more people from that group getting involved
in the future.

 

Cheers,

-bryan

 

Just adding a +1 here from HP.

 

We're very excited about some of the capabilities that Barbican will
bring and will be engaging with the development of the project.

 

Cheers,

-Rob



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] [Neutron] How do we know a host is ready to have servers scheduled onto it?

2013-12-12 Thread Chris Friesen

On 12/12/2013 11:02 AM, Clint Byrum wrote:


So I'm asking, is there a standard way to determine whether or not a
nova-compute is definitely ready to have things scheduled on it? This
can be via an API, or even by observing something on the nova-compute
host itself. I just need a definitive signal that "the compute host is
ready".


Is it not sufficient that "nova service-list" shows the compute service 
as "up"?


If not, then maybe we should call that a bug in nova...

Chris


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-12 Thread Keith Basil
On Dec 11, 2013, at 3:42 PM, Robert Collins wrote:

> On 12 December 2013 01:17, Jaromir Coufal  wrote:
>> On 2013/10/12 23:09, Robert Collins wrote:
> 
 The 'easiest' way is to support bigger companies with huge deployments,
 tailored infrastructure, everything connected properly.
 
 But there are tons of companies/users who are running on old
 heterogeneous
 hardware. Very likely even more than the number of companies having
 already
 mentioned large deployments. And giving them only the way of 'setting up
 rules' in order to get the service on the node - this type of user is not
 gonna use our deployment system.
>>> 
>>> 
>>> Thats speculation. We don't know if they will or will not because we
>>> haven't given them a working system to test.
>> 
>> Some part of that is speculation, some part of that is feedback from people
>> who are doing deployments (of course its just very limited audience).
>> Anyway, it is not just pure theory.
> 
> Sure. Let be me more precise. There is a hypothesis that lack of
> direct control will be a significant adoption blocker for a primary
> group of users.
> 
> I think it's safe to say that some users in the group 'sysadmins
> having to deploy an OpenStack cloud' will find it a bridge too far and
> not use a system without direct control. Call this group A.
> 
> I think it's also safe to say that some users will not care in the
> slightest, because their deployment is too small for them to be
> particularly worried (e.g. about occasional downtime (but they would
> worry a lot about data loss)). Call this group B.
> 
> I suspect we don't need to consider group C - folk who won't use a
> system if it *has* manual control, but thats only a suspicion. It may
> be that the side effect of adding direct control is to reduce
> usability below the threshold some folk need...
> 
> To assess 'significant adoption blocker' we basically need to find the
> % of users who will care sufficiently that they don't use TripleO.
> 
> How can we do that? We can do questionnaires, and get such folk to
> come talk with use, but that suffers from selection bias - group B can
> use the system with or without direct manual control, so have little
> motivation to argue vigorously in any particular direction. Group A
> however have to argue because they won't use the system at all without
> that feature, and they may want to use the system for other reasons,
> so that becomes a crucial aspect for them.
> 
> A much better way IMO is to test it - to get a bunch of volunteers and
> see who responds positively to a demo *without* direct manual control.
> 
> To do that we need a demoable thing, which might just be mockups that
> show a set of workflows (and include things like Jay's
> shiny-new-hardware use case in the demo).
> 
> I rather suspect we're building that anyway as part of doing UX work,
> so maybe what we do is put a tweet or blog post up asking for
> sysadmins who a) have not yet deployed openstack, b) want to, and c)
> are willing to spend 20-30 minutes with us, walk them through a demo
> showing no manual control, and record what questions they ask, whether
> they would like to have that product, and if not, then
> (a) what use cases they can't address with the mockups and (b) what
> other reasons they have for not using it.
> 
> This is a bunch of work though!
> 
> So, do we need to do that work?
> 
> *If* we can layer manual control on later, then we could defer this
> testing until we are at the point where we can say 'the nova scheduled
> version is ready, now lets decide if we add the manual control'.
> 
> OTOH, if we *cannot* layer manual control on later - if it has
> tentacles through too much of the code base, then we need to decide
> earlier, because it will be significantly harder to add later and that
> may be too late of a ship date for vendors shipping on top of TripleO.
> 
> So with that as a prelude, my technical sense is that we can layer
> manual scheduling on later: we provide an advanced screen, show the
> list of N instances we're going to ask for and allow each instance to
> be directly customised with a node id selected from either the current
> node it's running on or an available node. It's significant work both
> UI and plumbing, but it's not going to be made harder by the other
> work we're doing AFAICT.
> 
> -> My proposal is that we shelve this discussion until we have the
> nova/heat scheduled version in 'and now we polish' mode, and then pick
> it back up and assess user needs.
> 
> An alternative argument is to say that group A is a majority of the
> userbase and that doing an automatic version is entirely unnecessary.
> Thats also possible, but I'm extremely skeptical, given the huge cost
> of staff time, and the complete lack of interest my sysadmin friends
> (and my former sysadmin self) have in doing automatable things by
> hand.
> 
>>> Lets break the concern into two halves:
>>> A) Users who could have t

Re: [openstack-dev] [Nova] [Neutron] How do we know a host is ready to have servers scheduled onto it?

2013-12-12 Thread Russell Bryant
On 12/12/2013 12:02 PM, Clint Byrum wrote:
> I've been chasing quite a few bugs in the TripleO automated bring-up
> lately that have to do with failures because either there are no valid
> hosts ready to have servers scheduled, or there are hosts listed and
> enabled, but they can't bind to the network because for whatever reason
> the L2 agent has not checked in with Neutron yet.
> 
> This is only a problem in the first few minutes of a nova-compute host's
> life. But it is critical for scaling up rapidly, so it is important for
> me to understand how this is supposed to work.
> 
> So I'm asking, is there a standard way to determine whether or not a
> nova-compute is definitely ready to have things scheduled on it? This
> can be via an API, or even by observing something on the nova-compute
> host itself. I just need a definitive signal that "the compute host is
> ready".

If a nova compute host has registered itself to start having instances
scheduled to it, it *should* be ready.  AFAIK, we're not doing any
network sanity checks on startup, though.

We already do some sanity checks on startup.  For example, nova-compute
requires that it can talk to nova-conductor.  nova-compute will block on
startup until nova-conductor is responding if they happened to be
brought up at the same time.

We could do something like this with a networking sanity check if
someone could define what that check should look like.
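
Purely as a sketch of what such a check might look like (the function name,
auth plumbing and timeout below are invented for the example, and which agent
types should count as "ready" is exactly the open question), something along
these lines could poll Neutron until an alive agent has checked in for the
host before nova-compute declares itself ready:

    import socket
    import time

    from neutronclient.v2_0 import client as neutron_client

    def wait_for_neutron_agent(auth_kwargs, host=None, timeout=60, interval=2):
        """Block until an alive Neutron agent has checked in for this host."""
        host = host or socket.gethostname()
        neutron = neutron_client.Client(**auth_kwargs)
        deadline = time.time() + timeout
        while time.time() < deadline:
            agents = neutron.list_agents().get('agents', [])
            # For the sketch, any alive agent bound to this host counts; a
            # real check would likely look for the specific L2 agent type.
            if any(a.get('alive') for a in agents if a.get('host') == host):
                return True
            time.sleep(interval)
        raise RuntimeError('no live Neutron agent registered for %s' % host)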

-- 
Russell Bryant

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [Nova] [Neutron] How do we know a host is ready to have servers scheduled onto it?

2013-12-12 Thread Clint Byrum
I've been chasing quite a few bugs in the TripleO automated bring-up
lately that have to do with failures because either there are no valid
hosts ready to have servers scheduled, or there are hosts listed and
enabled, but they can't bind to the network because for whatever reason
the L2 agent has not checked in with Neutron yet.

This is only a problem in the first few minutes of a nova-compute host's
life. But it is critical for scaling up rapidly, so it is important for
me to understand how this is supposed to work.

So I'm asking, is there a standard way to determine whether or not a
nova-compute is definitely ready to have things scheduled on it? This
can be via an API, or even by observing something on the nova-compute
host itself. I just need a definitive signal that "the compute host is
ready".

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Generic question: Any tips for 'keeping up' with the mailing lists?

2013-12-12 Thread Clint Byrum
Excerpts from Justin Hammond's message of 2013-12-12 08:23:24 -0800:
> I am a developer who is currently having troubles keeping up with the
> mailing list due to volume, and my inability to organize it in my client.
> I am nearly forced to use Outlook 2011 for Mac and I have read and
> attempted to implement
> https://wiki.openstack.org/wiki/MailingListEtiquette but it is still a lot
> to deal with. I read once a topic or wiki page on using X-Topics but I
> have no idea how to set that in outlook (google has told me that the
> feature was removed).

Justin I'm sorry that the volume is catching up with you. I have a highly
optimized email-list-reading work-flow using sup-mail and a few filters,
and I still spend 2 hours a day sifting through all of the lists I am on
(not just openstack lists). It is worth it to keep aware and to avoid
duplicating efforts, even if it means I have to hit the "kill this thread"
button a lot.

Whoever is forcing you to use this broken client, I suggest that you
explain to them your situation. It is the reason for your problems. Note
that you can just subscribe to the list from a different address than
you post from, and configure a good e-mail client like Thunderbird to set
your From: address so that you still are representing your organization
the way you'd like to. So if it is just a mail server thing, that is
one way around it.

Also the setup I use makes use of offlineimap, which can filter things
for you, so if you have IMAP access to your inbox, you can use that and
then just configure your client for local access (I believe Thunderbird
even supports a local Maildir mode).
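
For example, a minimal ~/.offlineimaprc fragment along these lines (account,
host and folder names are placeholders; folderfilter takes a Python
expression) keeps the sync limited to the folders you actually read:

    [general]
    accounts = Work

    [Account Work]
    localrepository = WorkLocal
    remoterepository = WorkRemote

    [Repository WorkLocal]
    type = Maildir
    localfolders = ~/Maildir

    [Repository WorkRemote]
    type = IMAP
    remotehost = imap.example.com
    remoteuser = justin
    # Only pull down the folders worth reading locally.
    folderfilter = lambda name: name in ['INBOX', 'openstack-dev']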

Anyway, you _MUST_ have a threaded email client that quotes well for
replies. If not, I'm afraid it will remain unnecessarily difficult to
participate on this list.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Generic question: Any tips for 'keeping up' with the mailing lists?

2013-12-12 Thread Thierry Carrez
Russell Bryant wrote:
> On 12/12/2013 11:23 AM, Justin Hammond wrote:
>> I am a developer who is currently having troubles keeping up with the
>> mailing list due to volume, and my inability to organize it in my client.
>> I am nearly forced to use Outlook 2011 for Mac and I have read and
>> attempted to implement
>> https://wiki.openstack.org/wiki/MailingListEtiquette but it is still a lot
>> to deal with. I read once a topic or wiki page on using X-Topics but I
>> have no idea how to set that in outlook (google has told me that the
>> feature was removed).
>>
>> I'm not sure if this is a valid place for this question, but I *am* having
>> difficulty as a developer.
>>
>> Thank you for anyone who takes the time to read this.
> 
> The trick is defining what "keeping up" means for you.  I doubt anyone
> reads everything.  I certainly don't.
> 
> First, I filter all of openstack-dev into its own folder.  I'm sure
> others filter more aggressively based on topic, but I don't since I know
> I may be interested in threads in any of the topics.  Figure out what
> filtering works for you.
> 
> I scan subjects for the threads I'd probably be most interested in.
> While I'm scanning, I'm first looking for topic tags, like [Nova], then
> I read the subject and decide whether I want to dive in and read the
> rest.  It happens very quickly, but that's roughly my thought process.
> 
> With whatever is left over: mark all as read.  :-)

I used to have headaches keeping up with openstack-dev, but now I follow
something very similar to what Russell describes. In addition I use
starring to mark threads I want to follow more closely, for quick retrieval.

The most useful tip I can give you: accept that you can't be reading
everything, and that there are things that may happen in OpenStack that
you can't control. I've been involved with OpenStack since the
beginning, and part of my job was to be aware of everything. With the
explosive growth of the project, that doesn't scale that well. Since I
started ignoring stuff (and "marking thread read" and "marking folder
read" as necessary) I end up being able to start doing some useful work
mid-morning (rather than mid-afternoon).

-- 
Thierry Carrez (ttx)

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Stackalytics 0.4 released!

2013-12-12 Thread Monty Taylor


On 12/12/2013 04:49 PM, Ilya Shakhat wrote:
> Hello everyone!
> 
> Stackalytics team is happy to announce the release of version 0.4. This
> release is completely dedicated to different types of reports. We added
> the highly demanded top reviewers chart, acknowledged as an essential tool
> for finding the most active reviewers
> (ex. http://stackalytics.com/report/reviews/neutron-group/30). Open
> reviews report to help core engineers with tracking the backlog and
> reviews that stay for too long
> (http://stackalytics.com/report/reviews/nova/open). And an activity report,
> one that shows all the work done by an engineer, and another by company.
> This report also includes a nice punch-card, from which one can see that
> there are some truly world-wide, never-sleeping contributors
> like http://stackalytics.com/report/companies/red%20hat :)

Nice work. On the activity chart, it shows an activity graph of time and
day. What timezone are those hours shown in?

> In details, total changes are:
> 
>   * Added review stats report that shows top reviewers with breakdown by
>     marks and disagreement ratio against core's decision
>   * Added open reviews report that shows top longest reviews and backlog
>     summary
>   * Added activity report with engineer's activity log and punch-card of
>     usual online hours (in UTC). The same report is available for companies
>   * Fixed review stats calculation, now Approve marks are counted
> separately
>   * Fixed commit date calculation, now it is date of merge, not commit
>   * Minor improvements in filter selectors
>   * Incorporated 21 updates to user and company profiles in default data 
> 
> The next Stackalytics meeting will be on Monday, Dec 16 at 15:00 UTC in
> #openstack-meeting. Come and join us, we have some more things for the
> next release.
> 
> Thanks,
> Ilya
> 
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-12 Thread Keith Basil
On Dec 10, 2013, at 5:09 PM, Robert Collins wrote:

> On 11 December 2013 05:42, Jaromir Coufal  wrote:
>> On 2013/09/12 23:38, Tzu-Mainn Chen wrote:
>>> The disagreement comes from whether we need manual node assignment or not.
>>> I would argue that we
>>> need to step back and take a look at the real use case: heterogeneous
>>> nodes.  If there are literally
>>> no characteristics that differentiate nodes A and B, then why do we care
>>> which gets used for what?  Why
>>> do we need to manually assign one?
>> 
>> 
>> Ideally, we don't. But with this approach we would take away the user's
>> ability to change or decide things.
> 
> So, I think this is where the confusion is. Using the nova scheduler
> doesn't prevent change or control. It just ensures the change and
> control happen in the right place: the Nova scheduler has had years of
> work, of features and facilities being added to support HPC, HA and
> other such use cases. It should have everything we need [1], without
> going down to manual placement. For clarity: manual placement is when
> any of the user, Tuskar, or Heat queries Ironic, selects a node, and then
> uses a scheduler hint to bypass the scheduler.
> 
>> The 'easiest' way is to support bigger companies with huge deployments,
>> tailored infrastructure, everything connected properly.
>> 
>> But there are tons of companies/users who are running on old heterogeneous
>> hardware. Very likely even more than the number of companies having already
>> mentioned large deployments. And giving them only the way of 'setting up
>> rules' in order to get the service on the node - this type of user is not
>> gonna use our deployment system.
> 
> That's speculation. We don't know if they will or will not because we
> haven't given them a working system to test.
> 
> Lets break the concern into two halves:
> A) Users who could have their needs met, but won't use TripleO because
> meeting their needs in this way is too hard/complex/painful.
> 
> B) Users who have a need we cannot meet with the current approach.
> 
> For category B users, their needs might be specific HA things - like
> the oft discussed failure domains angle, where we need to split up HA
> clusters across power bars, aircon, switches etc. Clearly long term we
> want to support them, and the undercloud Nova scheduler is entirely
> capable of being informed about this, and we can evolve to a holistic
> statement over time. Lets get a concrete list of the cases we can
> think of today that won't be well supported initially, and we can
> figure out where to do the work to support them properly.
> 
> For category A users, I think that we should get concrete examples,
> and evolve our design (architecture and UX) to make meeting those
> needs pleasant.
> 
> What we shouldn't do is plan complex work without concrete examples
> that people actually need. Jay's example of some shiny new compute
> servers with special parts that need to be carved out was a great one
> - we can put that in category A, and figure out if it's easy enough,
> or obvious enough - and think about whether we document it or make it
> a guided workflow or $whatever.
> 
>> Somebody might argue - why do we care? If user doesn't like TripleO
>> paradigm, he shouldn't use the UI and should use another tool. But the UI is
>> not only about TripleO. Yes, it is underlying concept, but we are working on
>> future *official* OpenStack deployment tool. We should care to enable people
>> to deploy OpenStack - large/small scale, homo/heterogeneous hardware,
>> typical or a bit more specific use-cases.
> 
> The difficulty I'm having is that the discussion seems to assume that
> 'heterogeneous implies manual', but I don't agree that that
> implication is necessary!
> 
>> As an underlying paradigm of how to install cloud - awesome idea, awesome
>> concept, it works. But user doesn't care about how it is being deployed for
>> him. He cares about getting what he wants/needs. And we shouldn't go that
>> far that we violently force him to treat his infrastructure as cloud. I
>> believe that possibility to change/control - if needed - is very important
>> and we should care.
> 
> I propose that we make concrete use cases: 'Fred cannot use TripleO
> without manual assignment because XYZ'. Then we can assess how
> important XYZ is to our early adopters and go from there.
> 
>> And what is key for us is to *enable* users - not to prevent them from using
>> our deployment tool, because it doesn't work for their requirements.
> 
> Totally agreed :)
> 
>>> If we can agree on that, then I think it would be sufficient to say that
>>> we want a mechanism to allow
>>> UI users to deal with heterogeneous nodes, and that mechanism must use
>>> nova-scheduler.  In my mind,
>>> that's what resource classes and node profiles are intended for.
>> 
>> 
>> Not arguing on this point. Though that mechanism should support also cases,
>> where user specifies a role for a node / removes node fro

Re: [openstack-dev] Generic question: Any tips for 'keeping up' with the mailing lists?

2013-12-12 Thread Russell Bryant
On 12/12/2013 11:23 AM, Justin Hammond wrote:
> I am a developer who is currently having troubles keeping up with the
> mailing list due to volume, and my inability to organize it in my client.
> I am nearly forced to use Outlook 2011 for Mac and I have read and
> attempted to implement
> https://wiki.openstack.org/wiki/MailingListEtiquette but it is still a lot
> to deal with. I read once a topic or wiki page on using X-Topics but I
> have no idea how to set that in outlook (google has told me that the
> feature was removed).
> 
> I'm not sure if this is a valid place for this question, but I *am* having
> difficulty as a developer.
> 
> Thank you for anyone who takes the time to read this.

The trick is defining what "keeping up" means for you.  I doubt anyone
reads everything.  I certainly don't.

First, I filter all of openstack-dev into its own folder.  I'm sure
others filter more aggressively based on topic, but I don't since I know
I may be interested in threads in any of the topics.  Figure out what
filtering works for you.

I scan subjects for the threads I'd probably be most interested in.
While I'm scanning, I'm first looking for topic tags, like [Nova], then
I read the subject and decide whether I want to dive in and read the
rest.  It happens very quickly, but that's roughly my thought process.

With whatever is left over: mark all as read.  :-)

-- 
Russell Bryant

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] cross-project bug fixes

2013-12-12 Thread Russell Bryant
On 12/12/2013 11:21 AM, Hamilton, Peter A. wrote:
> I am in the process of getting a bug fix approved for a bug found in 
> openstack.common:
> 
> https://review.openstack.org/#/c/60500/
> 
> The bug is present in both nova and cinder. The above patch is under nova; do 
> I need to submit a separate cinder patch covering the same fix, or does the 
> shared nature of the openstack.common module allow for updates across 
> projects without needing separate project patches?

The part under nova/openstack/common needs to be fixed in the
oslo-incubator git repository first.  From there you sync the fix into
nova and cinder.
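
For reference, the flow is roughly the following (see the oslo-incubator
README and each project's openstack-common.conf for the exact module list
and options; the paths below are just examples):

    # 1. Land the fix in openstack/oslo-incubator first.
    # 2. From an oslo-incubator checkout, copy the updated modules into
    #    each consuming project and propose those as normal reviews:
    python update.py ../nova
    python update.py ../cinder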

-- 
Russell Bryant

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] Generic question: Any tips for 'keeping up' with the mailing lists?

2013-12-12 Thread Justin Hammond
I am a developer who is currently having troubles keeping up with the
mailing list due to volume, and my inability to organize it in my client.
I am nearly forced to use Outlook 2011 for Mac and I have read and
attempted to implement
https://wiki.openstack.org/wiki/MailingListEtiquette but it is still a lot
to deal with. I read once a topic or wiki page on using X-Topics but I
have no idea how to set that in outlook (google has told me that the
feature was removed).

I'm not sure if this is a valid place for this question, but I *am* having
difficulty as a developer.

Thank you for anyone who takes the time to read this.


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [Nova] cross-project bug fixes

2013-12-12 Thread Hamilton, Peter A.
I am in the process of getting a bug fix approved for a bug found in 
openstack.common:

https://review.openstack.org/#/c/60500/

The bug is present in both nova and cinder. The above patch is under nova; do I 
need to submit a separate cinder patch covering the same fix, or does the 
shared nature of the openstack.common module allow for updates across projects 
without needing separate project patches?

Thanks,
Peter Hamilton


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Horizon] Nominations to Horizon Core

2013-12-12 Thread Bryan D. Payne
I just wanted to close the loop here.  I understand the position that
others are taking and it appears that I'm outnumbered :-)  While I disagree
with this approach, it sounds like that's where we are at today.  Even with
this decision, I would encourage the horizon dev team to utilize Paul as a
security resource.

Perhaps the best way to flag something as needing a security review in
gerrit is to tag your PRs by writing "SecurityImpact" in the commit
message.  This will trigger a message to the openstack-security mailing
list.  Which should (hopefully!) result in some additional eyes on the code.
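
A made-up example of what that looks like (the subject and body are invented;
the only significant part is the bare tag, conventionally placed on its own
line near the end of the commit message):

    Escape user-supplied metadata in the instance details panel

    Render the metadata values through the template escaping helpers
    instead of marking them safe.

    SecurityImpact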

Cheers,
-bryan
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] Tuskar CLI after architecture changes

2013-12-12 Thread Mark McLoughlin
On Wed, 2013-12-11 at 13:33 +0100, Jiří Stránský wrote:
> Hi all,
> 
> TL;DR: I believe that "As an infrastructure administrator, Anna wants a 
> CLI for managing the deployment providing the same fundamental features 
> as UI." With the planned architecture changes (making tuskar-api thinner 
> and getting rid of proxying to other services), there's not an obvious 
> way to achieve that. We need to figure this out. I present a few options 
and look forward to feedback.
..

> 1) Make a thicker python-tuskarclient and put the business logic there. 
> Make it consume other python-*clients. (This is an unusual approach 
> though, I'm not aware of any python-*client that would consume and 
> integrate other python-*clients.)
> 
> 2) Make a thicker tuskar-api and put the business logic there. (This is 
> the original approach with consuming other services from tuskar-api. The 
> feedback on this approach was mostly negative though.)

FWIW, I think these are the two most plausible options right now.

My instinct is that tuskar could be a stateless service which merely
contains the business logic between the UI/CLI and the various OpenStack
services.

That would be a first (i.e. an OpenStack service which doesn't have a
DB) and it is somewhat hard to justify. I'd be up for us pushing tuskar
as a purely client-side library used by the UI/CLI (i.e. 1) as far as it
can go until we hit actual cases where we need (2).

One example worth thinking through though - clicking "deploy my
overcloud" will generate a Heat template and send it to the Heat API.

The Heat template will be fairly closely tied to the overcloud images
(i.e. the actual image contents) we're deploying - e.g. the template
will have metadata which is specific to what's in the images.

With the UI, you can see that working fine - the user is just using a UI
that was deployed with the undercloud.

With the CLI, it is probably not running on undercloud machines. Perhaps
your undercloud was deployed a while ago and you've just installed the
latest TripleO client-side CLI from PyPI. With other OpenStack clients
we say that newer versions of the CLI should support all/most older
versions of the REST APIs.

Having the template generation behind a (stateless) REST API could allow
us to define an API which expresses "deploy my overcloud" and not have
the client so tied to a specific undercloud version.
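
Purely to illustrate the shape of that decoupling - none of these URLs or
fields are a real or proposed Tuskar API, and the token is assumed to come
from Keystone beforehand - the client side could then stay as thin as:

    import json
    import requests

    # Hypothetical undercloud endpoint; the server owns template generation,
    # so the client only states intent and high-level counts.
    resp = requests.post(
        'http://undercloud.example.com:8585/v1/overclouds',
        headers={'X-Auth-Token': token,
                 'Content-Type': 'application/json'},
        data=json.dumps({'name': 'overcloud',
                         'counts': {'control': 1, 'compute': 10}}))
    resp.raise_for_status()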

Mark.


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Incubation Request for Barbican

2013-12-12 Thread Bryan D. Payne
>
> >$ git shortlog -s -e | sort -n -r
> >   172 John Wood 
> >   150 jfwood 
> >65 Douglas Mendizabal 
> >39 Jarret Raim 
> >17 Malini K. Bhandaru 
> >10 Paul Kehrer 
> >10 Jenkins 
> > 8 jqxin2006 
> > 7 Arash Ghoreyshi 
> > 5 Chad Lung 
> > 3 Dolph Mathews 
> > 2 John Vrbanac 
> > 1 Steven Gonzales 
> > 1 Russell Bryant 
> > 1 Bryan D. Payne 
> >
> >It appears to be an effort done by a group, and not an individual.  Most
> >commits by far are from Rackspace, but there is at least one non-trivial
> >contributor (Malini) from another company (Intel), so I think this is OK.
>
> There has been some interest from some quarters (RedHat, HP and others) in
> additional support. I hope that the incubation process will help
> accelerate external contributions.
>

For what it's worth, I just wanted to express my intent to get more
involved in Barbican in the near future.  I plan to be helping out with
both reviews and coding on a variety of pieces.  So that will help (a
little) with the diversification situation.  I would also mention that
there has been great interest in Barbican among the OSSG crowd and it
wouldn't surprise me to see more people from that group getting involved in
the future.

Cheers,
-bryan
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Support for Pecan in Nova

2013-12-12 Thread Doug Hellmann
On Wed, Dec 11, 2013 at 6:29 PM, Christopher Yeoh  wrote:

> On Thu, Dec 12, 2013 at 7:11 AM, Ryan Petrello <
> ryan.petre...@dreamhost.com> wrote:
>
>> Hello,
>>
>> I’ve spent the past week experimenting with using Pecan for Nova’s API,
>> and have opened an experimental review:
>>
>> https://review.openstack.org/#/c/61303/6
>>
>> …which implements the `versions` v3 endpoint using pecan (and paves the
>> way for other extensions to use pecan).  This is a *potential* approach
>> I've considered for gradually moving the V3 API, but I’m open to other
>> suggestions (and feedback on this approach).  I’ve also got a few open
>> questions/general observations:
>>
>> 1.  It looks like the Nova v3 API is composed *entirely* of extensions
>> (including “core” API calls), and that extensions and their routes are
>> discoverable and extensible via installed software that registers itself
>> via stevedore.  This seems to lead to an API that’s composed of installed
>> software, which in my opinion, makes it fairly hard to map out the API (as
>> opposed to how routes are manually defined in other WSGI frameworks).  I
>> assume at this time, this design decision has already been solidified for
>> v3?
>>
>
> Yes, from an implementation view everything is an "extension", even core
> functionality. One issue with the V2 API is that because core is hard coded
> and separate from the plugin framework there were things you could do in
> core API code that you couldn't do in extensions and other things which you
> could do in both, but had to do in different ways. Which is bad from a
> maintainability/readability point of view. And inevitably we ended up with
> extension specific code sitting in what should have been only core code. So
> we ended up deciding to make everything a plugin, to keep how API code is
> written consistent and also to ensure that the framework didn't treat core
> API code in any special way.
>

OK, I can completely see how that problem would lead to this solution.
I'll try to keep that in mind, and start thinking of "extension" in the way
it is actually used. :-)
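
(For anyone following along at home: the discovery mechanism being discussed
is ordinary setuptools entry points loaded through stevedore, roughly like
the sketch below - the namespace string and the printed attributes are
illustrative rather than Nova's exact ones.)

    from stevedore import extension

    # Every installed package that registers an entry point in the shared
    # namespace contributes a plugin; the manager finds and instantiates them.
    mgr = extension.ExtensionManager(
        namespace='nova.api.v3.extensions',   # illustrative namespace
        invoke_on_load=True)
    for ext in mgr:
        print(ext.name, ext.obj)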

Doug



>
>
>>
>> 2.  The approach in my review would allow us to translate extensions to
>> pecan piecemeal.  To me, this seems like a more desirable and manageable
>> approach than moving everything to pecan at once, given the scale of Nova’s
>> API.  Do others agree/disagree?  Until all v3 extensions are translated,
>> this means the v3 API is composed of two separate WSGI apps.
>>
>>
> Yes, I think this is the way to go. Attempting to get a big-bang patch
> merged would be rather challenging.
>
>
>
>> 3.  Can somebody explain the purpose of the wsgi.deserializer decorator?
>>  It’s something I’ve not accounted for yet in my pecan implementation.  Is
>> the goal to deserialize the request *body* from e.g., XML into a usable
>> data structure?  Is there an equivalent for JSON handling?  How does this
>> relate to the schema validation that’s being done in v3?
>>
>>
> Yes, the deserializer decorator specifies an XML deserializer which is
> used when the default one is not suitable. Schema validation is done on the
> deserialized output, so it essentially covers both JSON and XML (e.g. XML is
> not directly validated, but what we end up interpreting in the api code is).
>
> Chris
>
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] Stackalytics 0.4 released!

2013-12-12 Thread Ilya Shakhat
Hello everyone!

Stackalytics team is happy to announce the release of version 0.4. This
release is completely dedicated to different types of reports. We added
the highly demanded top reviewers chart, acknowledged as an essential tool for
finding the most active reviewers (ex.
http://stackalytics.com/report/reviews/neutron-group/30). Open reviews
report to help core engineers with tracking the backlog and reviews that
stay for too long (http://stackalytics.com/report/reviews/nova/open). And
activity report, the one to show all work done by engineer and another by
company. Also this report includes nice punch-card and the one can find
that there are really world-wide never-sleeping contributors like
http://stackalytics.com/report/companies/red%20hat :)

In detail, the changes are:

   - Added review stats report that shows top reviewers with breakdown by
   marks and disagreement ratio against core's decision
   - Added open reviews report that shows top longest reviews and backlog
   summary
   - Added activity report with engineer's activity log and punch-card of
   usual online hours (in UTC). The same report is available for companies
   - Fixed review stats calculation, now Approve marks are counted
   separately
   - Fixed commit date calculation, now it is date of merge, not commit
   - Minor improvements in filter selectors
   - Incorporated 21 updates to user and company profiles in default data

The next Stackalytics meeting will be on Monday, Dec 16 at 15:00 UTC in
#openstack-meeting. Come and join us, we have some more things for the next
release.

Thanks,
Ilya
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Ceilometer] [Rally] Does Ceilometer affect instance creation?

2013-12-12 Thread Doug Hellmann
Ok, that sounds like it would do what you want. Thanks for clarifying. :-)

Doug


On Thu, Dec 12, 2013 at 4:06 AM, Nadya Privalova wrote:

> Doug,
>
> Sorry for confusing you with 'local' term. I meant that collector is up on
> the node which is one of the Galera-nodes. Data will be replicated and all
> the Galera nodes will be synced.
>
> Nadya
>
>
> On Thu, Dec 12, 2013 at 2:01 AM, Doug Hellmann <
> doug.hellm...@dreamhost.com> wrote:
>
>>
>>
>>
>> On Tue, Dec 10, 2013 at 9:47 AM, Nadya Privalova > > wrote:
>>
>>> Julien,
>>>
>>> Yes, I use the same SQL for Nova and Ceilometer. Thanks for pointing
>>> this out. My bad, I didn't take it into account. So if we want to use
>>> Ceilometer + MySQL in production (in theory :) ) we need to use separate
>>> controllers with Ceilometer's MySQL only. And each controller may run its
>>> own collector which will write data into "local" MySQL. Am I right that
>>> only one instance of central-agent may be started (WIP
>>> https://wiki.openstack.org/wiki/Ceilometer/blueprints/tasks-distribution)?
>>> Please Julien correct me if I'm wrong. And maybe Ceilometer has
>>> recommendations for production deployment and I just missed it?
>>>
>>
>> You will want all of the ceilometer collectors writing to the same
>> database, rather than having a local database for each one. Otherwise when
>> you query the ceilometer API you won't see all of the results.
>>
>> Doug
>>
>>
>>
>>>
>>>
>>> On Tue, Dec 10, 2013 at 4:25 PM, Boris Pavlovic 
>>> wrote:
>>>
 Nadya, Julien,


 We are working on a profiling system based on logs. This will allow
 us to detect bottlenecks.
 We should make a couple of small patches in each project to support
 profiling.

 As we are going to be well integrated with OpenStack infrastructure we
 are going to use Ceilometer as a log collector.
 And we already made patch for this in Ceilometer:
 https://review.openstack.org/#/c/60262/
 So it will be nice to get it reviewed/merged.


 Thanks.


 Best regards,
 Boris Pavlovic






 On Tue, Dec 10, 2013 at 3:19 PM, Julien Danjou wrote:

> On Tue, Dec 10 2013, Nadya Privalova wrote:
>
> Hi Nadya,
>
> > Guys, if you have any questions or comments you are welcome! I think
> that
> > 2x difference between avg time in "empty lab" and "5 sec polling"
> scenario
> > is not a bad result. But 100 instances that were being monitored
> during the
> > test is not a real load for the lab. What do you think? Should I
> repeat the
> > test with 1000 instances?
>
> You didn't mention where you were storing the metrics. If you store them
> in the same MySQL DB that's used by Nova for example, it's likely that
> the problem is that the load Ceilometer puts on the MySQL cluster slows
> Nova down.
>
> I don't think it's worth running the test with more instances for now.
> The results are clear, the next action should be to see why things are
> slowed down that much. It shouldn't happen.
>
> --
> Julien Danjou
> # Free Software hacker # independent consultant
> # http://julien.danjou.info
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>

>>>
>>> ___
>>> OpenStack-dev mailing list
>>> OpenStack-dev@lists.openstack.org
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>
>>>
>>
>> ___
>> OpenStack-dev mailing list
>> OpenStack-dev@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>>
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] State of the Gate - Dec 12

2013-12-12 Thread Anita Kuno
On 12/12/2013 09:39 AM, Anita Kuno wrote:
> On 12/12/2013 08:20 AM, Sean Dague wrote:
>> Current Gate Length: 12hrs*, 41 deep
> 
>> (top of gate entered 12hrs ago)
> 
>> It's been an *exciting* week this week. For people not paying 
>> attention we had 2 external events which made things terrible 
>> earlier in the week.
> 
>> == Event 1: sphinx 1.2 complete breakage
>> - MOSTLY RESOLVED ==
> 
>> It turns out sphinx 1.2 + distutils (which pbr magic call
>> through) means total sadness. The fix for this was a requirements
>> pin to sphinx < 1.2, and until a project has taken that they will
>> fail in the gate.
> 
>> It also turns out that tox installs pre-released software by 
>> default (a terrible default behavior), so you also need a
>> tox.ini change like this - 
>> https://github.com/openstack/nova/blob/master/tox.ini#L9
>> otherwise local users will install things like sphinx 1.2b3. They
>> will also break in other ways.
> 
>> Not all projects have merged this. If you are a project that 
>> hasn't, please don't send any other jobs to the gate until you
>> do. A lot of delay was added to the gate yesterday by Glance
>> patches being pushed to the gate before their doc jobs were
>> done.
> 
>> == Event 2: apt.puppetlabs.com outage - 
>> RESOLVED ==
> 
>> We use that apt repository to setup the devstack nodes in
>> nodepool with puppet. We were triggering an issue with grenade
>> where its apt-get calls were failing, because it does apt-get
>> update once to make sure life is good. This only triggered in
>> grenade (not other devstack runs) because we do set -o errexit
>> aggressively.
> 
>> A fix in grenade to ignore these errors was merged yesterday 
>> afternoon (the purple line - 
>> http://status.openstack.org/elastic-recheck/ you can see where
>> it showed up).
> 
>> == Top Gate Bugs 
>> ==
> 
>> We normally do this as a list, and you can see the whole list
>> here - http://status.openstack.org/elastic-recheck/ (now sorted
>> by number of FAILURES in the last 2 weeks)
> 
>> That being said, our biggest race bug is currently this one bug - 
>> https://bugs.launchpad.net/tempest/+bug/1253896 - and if you
>> want to merge patches, fixing that one bug will be huge.
> 
>> Basically, you can't ssh into guests that get created. That's
>> sort of a fundamental property of a cloud. It shows up more
>> frequently on neutron jobs, possibly due to actually testing the
>> metadata server path. There have been many attempts on retry
>> logic on this, we actually retry for 196 seconds to get in and
>> only fail once we can't get in, so waiting isn't helping. It
>> doesn't seem like the env is under that much load.
> 
>> Until we resolve this, life will not be good in landing patches.
> 
>> -Sean
> 
> 
> 
>> ___ OpenStack-dev 
>> mailing list OpenStack-dev@lists.openstack.org 
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>> 
> Thanks Sean:
> 
> This is a terrific summary which really makes my task of
> confirming and following up much more manageable.
> 
> Just by way of preempting the "it's neutron's fault" pile-on, just
> in case anyone is tempted, a few facts:
> 
> We were paying attention, as it happens to the sphinx pin. Patches
> to neutron and neutronclient have merged: 
> http://git.openstack.org/cgit/openstack/neutron/tree/test-requirements.txt#n9
>
> 
http://git.openstack.org/cgit/openstack/python-neutronclient/tree/test-requirements.txt#n9
> 
A post to the mailing list isn't complete it appears unless I make a
mistake in it.
Here is our patch for neutronclient:
https://review.openstack.org/#/c/61508/

> The addition of the -U flag for pip install in tox.ini for 
> neutronclient https://review.openstack.org/#/c/60825/4 is in the
> check queue, it tripped on the sphinx pin for neutronclient. Here
> is the one for neutron: https://review.openstack.org/#/c/60825/4
> I've just alerted all neutron-core to refrain from +A'ing until
> these are merged.
> 
> We had been tracking 1253896 quite closely and I at least was
> working with the belief we had done the work we needed to do for
> that bug. Since it now comes to light that I am in error with
> regards to neutron's responsibility to 1253896, I welcome all
> interested parties to #openstack-neutron so that we can again work
> together to submit a patch that addresses this issue.
> 
> Thanks Sean, Anita.
> 
> 
> 


Re: [openstack-dev] State of the Gate - Dec 12

2013-12-12 Thread Steven Hardy
On Thu, Dec 12, 2013 at 08:20:09AM -0500, Sean Dague wrote:
> Current Gate Length: 12hrs*, 41 deep
> 
> (top of gate entered 12hrs ago)
> 
> It's been an *exciting* week this week. For people not paying attention
> we had 2 external events which made things terrible earlier in the week.
> 
> ==
> Event 1: sphinx 1.2 complete breakage - MOSTLY RESOLVED
> ==
> 
> It turns out sphinx 1.2 + distutils (which pbr magic call through) means
> total sadness. The fix for this was a requirements pin to sphinx < 1.2,
> and until a project has taken that they will fail in the gate.

Additional info, bug reference is below (took me a while to find it..):

https://bugs.launchpad.net/openstack-ci/+bug/1259511

> It also turns out that tox installs pre-released software by default (a
> terrible default behavior), so you also need a tox.ini change like this
> - https://github.com/openstack/nova/blob/master/tox.ini#L9 otherwise
> local users will install things like sphinx 1.2b3. They will also break
> in other ways.

What's the bug # for this one?

> Not all projects have merged this. If you are a project that hasn't,
> please don't send any other jobs to the gate until you do. A lot of
> delay was added to the gate yesterday by Glance patches being pushed to
> the gate before their doc jobs were done.

Note to Heat devs, we are impacted by this, we need to nurse this through
the gate, so please don't approve anything until it lands:

https://review.openstack.org/#/c/61276/
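
For anyone still carrying the change locally, the two pieces look roughly
like this (the exact Sphinx bounds come from the global requirements pin, so
copy them from there rather than from this note):

    # test-requirements.txt
    Sphinx>=1.1.2,<1.2

    # tox.ini, [testenv] section
    install_command = pip install -U {opts} {packages}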

Steve

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [keystone] domain admin role query

2013-12-12 Thread Adam Young

On 12/11/2013 10:11 PM, Paul Belanger wrote:

On 13-12-11 11:18 AM, Lyle, David wrote:

+1 on moving the domain admin role rules to the default policy.json

-David Lyle

From: Dolph Mathews [mailto:dolph.math...@gmail.com]
Sent: Wednesday, December 11, 2013 9:04 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [keystone] domain admin role query


On Tue, Dec 10, 2013 at 10:49 PM, Jamie Lennox 
 wrote:
Using the default policies it will simply check for the admin role
and not care about the domain that admin is limited to. This is
partially a leftover from the V2 api when there weren't domains to
worry about.


A better example of policies are in the file 
etc/policy.v3cloudsample.json. In there you will see the rule for 
create_project is:


 "identity:create_project": "rule:admin_required and 
domain_id:%(project.domain_id)s",


as opposed to (in policy.json):

 "identity:create_project": "rule:admin_required",

This is what you are looking for to scope the admin role to a domain.

We need to start moving the rules from policy.v3cloudsample.json to 
the default policy.json =)



Jamie

- Original Message -

From: "Ravi Chunduru" 
To: "OpenStack Development Mailing List" 


Sent: Wednesday, 11 December, 2013 11:23:15 AM
Subject: [openstack-dev] [keystone] domain admin role query

Hi,
I am trying out Keystone V3 APIs and domains.
I created an domain, created a project in that domain, created an 
user in

that domain and project.
Next, gave an admin role for that user in that domain.

I am assuming that user is now admin to that domain.
Now, I got a scoped token with that user, domain and project. With that
token, I tried to create a new project in that domain. It worked.

But, using the same token, I could also create a new project in a 
'default'
domain too. I expected it should throw authentication error. Is it a 
bug?


Thanks,
--
Ravi



One of the issues I had this week while using the 
policy.v3cloudsample.json was I had no easy way of creating a domain 
with the id of 'admin_domain_id'.  I basically had to modify the SQL 
directly to do it.
You should not have to edit the SQL.  You should be able, at a minimum, 
to re-enable the ADMIN_TOKEN in the config file to create any object 
inside of Keystone.
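
That is, something along these lines in keystone.conf (the token value is
only an example, and the option should be disabled again once the bootstrap
is done), then pass the same value as X-Auth-Token against the v3 API to
create the domain:

    [DEFAULT]
    admin_token = ADMIN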


Could you open a bug for the problem, and describe what you did step by step?



Any chance we can create a 2nd domain using 'admin_domain_id' via 
keystone-manage sync_db?





___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

