Re: [openstack-dev] [Heat] Convergence proof-of-concept showdown

2014-12-15 Thread Zane Bitter

On 15/12/14 10:15, Anant Patil wrote:

On 13-Dec-14 05:42, Zane Bitter wrote:

On 12/12/14 05:29, Murugan, Visnusaran wrote:




-Original Message-
From: Zane Bitter [mailto:zbit...@redhat.com]
Sent: Friday, December 12, 2014 6:37 AM
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [Heat] Convergence proof-of-concept
showdown

On 11/12/14 08:26, Murugan, Visnusaran wrote:

[Murugan, Visnusaran]
In case of rollback where we have to cleanup earlier version of
resources,

we could get the order from old template. We'd prefer not to have a
graph table.

In theory you could get it by keeping old templates around. But that
means keeping a lot of templates, and it will be hard to keep track
of when you want to delete them. It also means that when starting an
update you'll need to load every existing previous version of the
template in order to calculate the dependencies. It also leaves the
dependencies in an ambiguous state when a resource fails, and
although that can be worked around it will be a giant pain to implement.



Agree that looking at all templates for a delete is not good. But
barring complexity, we feel we could achieve it by way of having an
update stream and a delete stream for a stack update operation. I will
elaborate in detail in the etherpad sometime tomorrow :)


I agree that I'd prefer not to have a graph table. After trying a
couple of different things I decided to store the dependencies in the
Resource table, where we can read or write them virtually for free
because it turns out that we are always reading or updating the
Resource itself at exactly the same time anyway.



Not sure how this will work in an update scenario when a resource does
not change and its dependencies do.


We'll always update the requirements, even when the properties don't
change.
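
To make that concrete, here is a minimal sketch of the idea (the column
and helper names are made up for illustration, not Heat's actual schema):
the requirements live in a column on the resource table and get rewritten
in the same write that touches the resource row anyway.

import json
import sqlalchemy as sa
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Resource(Base):
    __tablename__ = 'resource'
    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String(255))
    properties_data = sa.Column(sa.Text)
    requires = sa.Column(sa.Text)  # JSON-encoded list of required resource IDs

def store_resource(session, res, required_ids):
    # The requirements are refreshed on every update, even when the
    # properties themselves have not changed - the extra write is
    # essentially free because we were writing this row anyway.
    res.requires = json.dumps(sorted(required_ids))
    session.add(res)
    session.commit()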



Can you elaborate a bit on rollback?


I didn't do anything special to handle rollback. It's possible that we
need to - obviously the difference in the UpdateReplace + rollback case
is that the replaced resource is now the one we want to keep, and yet
the replaced_by/replaces dependency will force the newer (replacement)
resource to be checked for deletion first, which is an inversion of the
usual order.



This is where the version is so handy! For UpdateReplaced ones, there is


*sometimes*


an older version to go back to. This version could just be the template ID,
as I mentioned in another e-mail. All resources are at the current
template ID if they are found in the current template, even if there is
no need to update them. Otherwise, they need to be cleaned up in the
order given in the previous templates.

I think the template ID is used as the version, as far as I can see in Zane's
PoC. If the resource template key doesn't match the current template
key, the resource is deleted. "Version" is a misnomer here, but that
field (template id) is used as though we had versions of resources.


Correct. Because if we had wanted to keep it, we would have updated it 
to the new template version already.
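
In code form, the rule being described is roughly this (a sketch with
illustrative names, not the PoC's actual code):

def resources_to_clean_up(resources, current_template_id):
    # Anything still pointing at an older template key was not carried
    # forward into the current version, so it is left over and must be
    # cleaned up, in the order given by the previous templates.
    return [res for res in resources
            if res.template_id != current_template_id]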



However, I tried to think of a scenario where that would cause problems
and I couldn't come up with one. Provided we know the actual, real-world
dependencies of each resource I don't think the ordering of those two
checks matters.

In fact, I currently can't think of a case where the dependency order
between replacement and replaced resources matters at all. It matters in
the current Heat implementation because resources are artificially
segmented into the current and backup stacks, but with a holistic view
of dependencies that may well not be required. I tried taking that line
out of the simulator code and all the tests still passed. If anybody can
think of a scenario in which it would make a difference, I would be very
interested to hear it.

In any event though, it should be no problem to reverse the direction of
that one edge in these particular circumstances if it does turn out to
be a problem.


We had an approach with depends_on
and needed_by columns in the Resource table, but dropped it when we figured
out we had too many DB operations for update.


Yeah, I initially ran into this problem too - you have a bunch of nodes
that are waiting on the current node, and now you have to go look them
all up in the database to see what else they're waiting on in order to
tell if they're ready to be triggered.

It turns out the answer is to distribute the writes but centralise the
reads. So at the start of the update, we read all of the Resources,
obtain their dependencies and build one central graph[1]. We then make
that graph available to each resource (either by passing it as a
notification parameter, or storing it somewhere central in the DB that
they will all have to read anyway, i.e. the Stack). But when we update a
dependency we don't update the central graph, we update the individual
Resource so there's no global lock required.

[1]
https://github.co
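
For illustration, the pattern looks roughly like this (a simplified
sketch with made-up names, not the actual PoC code):

def build_graph(resources):
    """Centralised read: load every resource once and build one graph."""
    # Maps each resource key to the set of keys it requires.
    return {res.key: set(res.requires) for res in resources}

def ready_to_check(graph, satisfied):
    """Keys whose requirements are all satisfied and can be triggered."""
    return [key for key, reqs in graph.items()
            if key not in satisfied and reqs <= satisfied]

def on_resource_complete(graph, satisfied, key, write_resource):
    """Distributed write: update only this resource's own DB row.

    The shared graph is never mutated here, so no global lock is needed.
    """
    write_resource(key)          # per-resource UPDATE
    satisfied.add(key)
    return ready_to_check(graph, satisfied)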

[openstack-dev] [Heat] No meeting this week

2014-04-28 Thread Zane Bitter
Since this is the designated "off" week, I'm going to cancel this week's 
IRC meeting. By happy coincidence, that will give us until after Summit 
to decide on a new alternate meeting time :)


The next meeting will be on the 7th of May, at the regular time (2000 UTC).

cheers,
Zane.



Re: [openstack-dev] [Nova] [Heat] Custom Nova Flavor creation through Heat (pt.2)

2014-05-06 Thread Zane Bitter

On 05/05/14 13:40, Solly Ross wrote:

One thing that I was discussing with @jaypipes and @dansmith over
on IRC was the possibility of breaking flavors down into separate
components -- i.e. have a disk flavor, a CPU flavor, and a RAM flavor.
This way, you still get the control of the size of your building blocks
(e.g. you could restrict RAM to only 2GB, 4GB, or 16GB), but you avoid
exponential flavor explosion by separating out the axes.


Dimitry and I have discussed this on IRC already (no-one changed their 
mind about anything as a result), but I just wanted to note here that I 
think even this idea is crazy.


VMs are not allocated out of a vast global pool of resources. They're 
allocated on actual machines that have physical hardware costing real 
money in fixed ratios.


Here's a (very contrived) example. Say your standard compute node can 
support 16 VCPUs and 64GB of RAM. You can sell a bunch of flavours: 
maybe 1 VCPU + 4GB, 2 VCPU + 8GB, 4 VCPU + 16GB... &c. But if (as an 
extreme example) you sell a server with 1 VCPU and 64GB of RAM you have 
a big problem: 15 VCPUs that nobody has paid for and you can't sell. 
(Disks add a new dimension of wrongness to the problem.)


The insight of flavours, which is fundamental to the whole concept of 
IaaS, is that users must pay the *opportunity cost* of their resource 
usage. If you allow users to opt, at their own convenience, to pay only 
the actual cost of the resources they use regardless of the opportunity 
cost to you, then your incentives are no longer aligned with your 
customers. You'll initially be very popular with the kind of customers 
who are taking advantage of you, but you'll have to hike prices across 
the board to make up the cost leading to a sort of dead-sea effect. A 
Gresham's Law of the cloud, if you will, where bad customers drive out 
good customers.


Simply put, a cloud allowing users to define their own flavours *loses* 
to one with predefined flavours 10 times out of 10.


In the above example, you just tell the customer: bad luck, you want 
64GB of RAM, you buy 16 VCPUs whether you want them or not. It can't 
actually hurt to get _more_ than you wanted, even though you'd rather 
not pay for it (provided, of course, that everyone else *is* paying for 
it, and cross-subsidising you... which they won't).


Now, it's not the OpenStack project's job to prevent operators from 
going bankrupt. But I think at the point where we are adding significant 
complexity to the project just to enable people to confirm the 
effectiveness of a very obviously infallible strategy for losing large 
amounts of money, it's time to draw a line.



By the way, the whole theory behind this idea seems to be that this:

  nova server-create --cpu-flavor=4 --ram-flavour=16G --disk-flavor=200G

minimises the cognitive load on the user, whereas this:

  nova server-create --flavor=4-64G-200G

will cause the user's brain to explode from its combinatorial 
complexity. I find this theory absurd.


In other words, if you really want to lose some money, it's perfectly 
feasible with the existing flavour implementation. The operator is only 
ever 3 for-loops away from setting up every combination of flavours 
possible from combining the CPU, RAM and disk options, and can even 
apply whatever constraints they desire.
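
For example, something an operator could script themselves (the exact
flavor-create argument order - name, id, RAM in MB, disk in GB, VCPUs -
should be checked against the client in use; the loops and the
constraint are the point here):

import itertools

vcpus_options = [1, 2, 4, 8, 16]
ram_gb_options = [2, 4, 8, 16, 32, 64]
disk_gb_options = [20, 40, 80, 200]

for vcpus, ram, disk in itertools.product(vcpus_options, ram_gb_options,
                                          disk_gb_options):
    # Example constraint: keep the RAM/VCPU ratio at exactly 4GB per VCPU.
    if ram != vcpus * 4:
        continue
    name = 'c%d-r%dG-d%dG' % (vcpus, ram, disk)
    print('nova flavor-create %s auto %d %d %d' % (name, ram * 1024, disk,
                                                   vcpus))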



All that said, Heat will expose any API that Nova implements. Choose wisely.

cheers,
Zane.



[openstack-dev] [Mistral][Heat] Feedback on the Mistral DSL

2014-05-07 Thread Zane Bitter

Hi Mistral folks,
Congrats on getting the 0.0.2 release out. I had a look at Renat's 
screencast and the examples, and I wanted to share some feedback based 
on my experience with Heat. Y'all will have to judge for yourselves to 
what extent this experience is applicable to Mistral. (Assume that 
everything I know about it was covered in the screencast and you won't 
be far wrong.)


The first thing that struck me looking at 
https://github.com/stackforge/mistral-extra/tree/master/examples/create_vm 
is that I have to teach Mistral how to talk to Nova. I can't overstate 
how surprising this is as a user, because Mistral is supposed to want to 
become a part of OpenStack. It should know how to talk to Nova! There is 
actually an existing DSL for interacting with OpenStack[1], and here's 
what the equivalent operation looks like:


os server create $server_name --image $image_id --flavor $flavor_id 
--nic net-id=$network_id


Note that this is approximately exactly 96.875% shorter (or 3200% 
shorter, if you're in advertising).


This approach reminds me a bit of TOSCA, in the way that it requires you 
to define every node type before you use it. (Even TOSCA is moving away 
from this by developing a Simple Profile that includes the most common 
ones in the box - an approach I assume/hope you're considering also.) 
The stated reason for this is that they want TOSCA templates to run on 
any cloud regardless of its underlying features (rather than take a 
lowest-common-denominator approach, as other attempts at hybrid clouds 
have done). Contrast that with Heat, which is unapologetically an 
orchestration system *for OpenStack*.


I note from the screencast that Mistral's stated mission is to:

  Provide a mechanism to define and execute
  tasks and workflows *in OpenStack clouds*

(My emphasis.) IMO the design doesn't reflect the mission. You need to 
decide whether you are trying to build the OpenStack workflow DSL or the 
workflow DSL to end all workflow DSLs.



That problem could be solved by including built-in definitions for core 
OpenStack service in a similar way to std.* (i.e. take the TOSCA Simple 
Profile approach), but I'm actually not sure that goes far enough. The 
lesson of Heat is that we do best when we orchestrate *only* OpenStack APIs.


For example, when we started working on Heat, there was no autoscaling 
in OpenStack so we implemented it ourselves inside Heat. Two years 
later, there's still no autoscaling in OpenStack other than what we 
implemented, and we've been struggling for a year to try to split Heat's 
implementation out into a separate API so that everyone can use it.


Looking at things like std.email, I feel a similar way about them. 
OpenStack is missing something equivalent to SNS, where a message on a 
queue can trigger an email or another type of notification, and a lot of 
projects are going to eventually need something like that. It would be 
really unfortunate if all of them went out and invented it 
independently. It's much better to implement such things as their own 
building blocks that can be combined together in complex ways rather 
than adding that complexity to a bunch of services.


Such a notification service could even be extended to do std.http-like 
ReST calls, although personally the whole idea of OpenStack services 
calling out to arbitrary HTTP APIs makes me extremely uncomfortable. 
Much better IMO to just post messages to queues and let the receiver 
(long) poll for it.


So I would favour a DSL that is *much* simpler, and replaces all of 
std.* with functions that call OpenStack APIs, and only OpenStack APIs, 
including the API for posting messages to Marconi queues, which would be 
the method of communication to the outside world. (If the latter part 
sounds a bit like SWF, it's for a good reason, but the fact that it 
would allow access directly to all of the OpenStack APIs before 
resorting to an SDK makes it much more powerful, as well as providing a 
solid justification for why this should be part of OpenStack.)
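
To illustrate the shape of such a DSL, here is a deliberately tiny sketch
in which a task is nothing more than a dotted path to an OpenStack client
call. Everything here is hypothetical; a real implementation would put
actual python-*client objects in the clients mapping, and the stub below
only exists to show usage.

def run_task(clients, task):
    # task example: {'action': 'compute.servers.create', 'params': {...}}
    service, *attrs = task['action'].split('.')
    target = clients[service]
    for attr in attrs:
        target = getattr(target, attr)
    return target(**task['params'])

# Tiny stub standing in for a real client object, just to show usage:
class _StubServers(object):
    def create(self, name, image, flavor):
        return {'name': name, 'image': image, 'flavor': flavor}

class _StubCompute(object):
    servers = _StubServers()

print(run_task({'compute': _StubCompute()},
               {'action': 'compute.servers.create',
                'params': {'name': 'vm1', 'image': 'fedora',
                           'flavor': 'm1.small'}}))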


The ideal way to get support for all of the possible OpenStack APIs 
would be to do it by introspection on python-openstackclient. That means 
you'd only have to do the work once and it will stay up to date. This 
would avoid the problem we have in Heat, where we have to implement each 
resource type separately. (This is the source of a great deal of Heat's 
value to users - the existence of tested resource plugins - but also the 
thing that stops us from iterating the code quicker.)
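
A hedged sketch of what that introspection might look like, using
stevedore to walk entry points. The namespace string is an assumption
about how python-openstackclient registers its commands, so treat it
as illustrative rather than definitive:

from stevedore import extension

def discover_actions(namespace='openstack.compute.v2'):
    mgr = extension.ExtensionManager(namespace=namespace,
                                     invoke_on_load=False)
    # Each entry point name (e.g. 'server_create') could become a
    # workflow action; the plugin class tells us what arguments it takes.
    return {ext.name: ext.plugin for ext in mgr}

if __name__ == '__main__':
    for name in sorted(discover_actions()):
        print(name)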



I'm also unsure that it's a good idea for things like timers to be set 
up inside the DSL. I would prefer that the DSL just define workflows and 
export entry points to them. Then have various ways to trigger them: 
from the API manually, from a message to a Marconi queue, from a timer, 
&c. The latter two you'd set up through the Mistral API. If a user 
wanted a single document that set up one or more workflows and their 
triggers, a Heat template woul

Re: [openstack-dev] [heat] How to cross-reference resources within OS::Heat::ResourceGroup

2014-05-07 Thread Zane Bitter

On 06/05/14 16:07, Janczuk, Tomasz wrote:

Could this be accomplished with 3 resource groups instead of one? The
first would create the ports, the second floating IPs, and the last the
VMs? In that case, would there be a way to construct a reference to a
particular instance of, say, a port, when creating an instance of a
floating IP?


No, and that wouldn't make any sense. The scaling unit is the 3 
resources together, so you should group them into a template and scale that.


Also, please post usage questions to ask.openstack.org, not to the 
development list.


thanks,
Zane.



On 5/6/14, 12:41 PM, "Randall Burt"  wrote:


A resource group's definition contains only one resource and you seem to
want groups of multiple resources. You would need to use a nested stack
or provider template to do what you're proposing.

On May 6, 2014, at 2:23 PM, "Janczuk, Tomasz" 
wrote:


I am trying to create an OS::Heat::ResourceGroup of VMs and assign each
VM a floating IP. As far as I know this requires cross-referencing the
VM, port, and floating IP resources. How can I do that within a
OS::Heat::ResourceGroup definition?

The `port: { get_resource: vm_cluster.vm_port.0 }` below is rejected by
Heat.

Any help appreciated.

Thanks,
Tomasz

  vm_cluster:
type: OS::Heat::ResourceGroup
properties:
  count: { get_param: num_instances }
  resource_def:
vm:
  type: OS::Nova::Server
  properties:
key_name: { get_param: key_name }
flavor: { get_param: flavor }
image: { get_param: image }
networks:
  - port: { get_resource: vm_cluster.vm_port.0 }
vm_port:
  type: OS::Neutron::Port
  properties:
network_id: { get_param: private_net_id }
fixed_ips:
  - subnet_id: { get_param: private_subnet_id }
security_groups: [{ get_resource: rabbit_security_group }]
vm_floating_ip:
  type: OS::Neutron::FloatingIP
  properties:
floating_network_id: { get_param: public_net_id }
port_id: { get_resource: vm_cluster.vm_port.0 }



Re: [openstack-dev] [Nova] [Heat] Custom Nova Flavor creation through Heat (pt.2)

2014-05-20 Thread Zane Bitter

On 20/05/14 12:17, Jay Pipes wrote:

Hi Zane, sorry for the delayed response. Comments inline.

On 05/06/2014 09:09 PM, Zane Bitter wrote:

On 05/05/14 13:40, Solly Ross wrote:

One thing that I was discussing with @jaypipes and @dansmith over
on IRC was the possibility of breaking flavors down into separate
components -- i.e. have a disk flavor, a CPU flavor, and a RAM flavor.
This way, you still get the control of the size of your building blocks
(e.g. you could restrict RAM to only 2GB, 4GB, or 16GB), but you avoid
exponential flavor explosion by separating out the axes.


Dimitry and I have discussed this on IRC already (no-one changed their
mind about anything as a result), but I just wanted to note here that I
think even this idea is crazy.

VMs are not allocated out of a vast global pool of resources. They're
allocated on actual machines that have physical hardware costing real
money in fixed ratios.

Here's a (very contrived) example. Say your standard compute node can
support 16 VCPUs and 64GB of RAM. You can sell a bunch of flavours:
maybe 1 VCPU + 4GB, 2 VCPU + 8GB, 4 VCPU + 16GB... &c. But if (as an
extreme example) you sell a server with 1 VCPU and 64GB of RAM you have
a big problem: 15 VCPUs that nobody has paid for and you can't sell.
(Disks add a new dimension of wrongness to the problem.)


You are assuming a public cloud provider use case above. As much as I
tend to focus on the utility cloud model, where the incentives are
around maximizing the usage of physical hardware by packing as many
paying tenants as possible into a fixed resource, this is only one domain for
OpenStack.


I was assuming the use case advanced in this thread, which sounded like 
a semi-public cloud model.


However, I'm actually trying to argue from a higher level of abstraction 
here. In any situation where there are limited resources, optimal 
allocation of those resources will occur when the incentives of the 
suppliers and consumers of said resources are aligned, independently of 
whose definition of "optimal" you use. This applies equally to public 
clouds, private clouds, lemonade stands, and the proverbial two guys 
stranded on a desert island. In other words, it's an immutable property 
of economies, not anything specific to one use case.



There are, for good or bad, IT shops and telcos that frankly are willing
to dump money into an inordinate amount of hardware -- and see that
hardware be inefficiently used -- in order to appease the demands of
their application customer tenants. The impulse of onboarding teams for
these private cloud systems is to "just say yes", with utter disregard
to the overall cost efficiency of the proposed customer use cases.


Fine, but what I'm saying is that you can just give the customer _more_ 
than they really wanted (i.e. round up to the nearest flavour). You can 
charge them the same if you want - you can even decouple pricing from 
the flavour altogether if you want. But what you can't do is assume 
that, just because you gave the customer exactly what they needed and 
not one kilobyte more, you still get to use/sell the excess capacity you 
didn't allocate to them. Because you may not.



If there was a simple switching mechanism that allowed a deployer to
turn on or off this ability to allow tenants to construct specialized
instance type configurations, then who really loses here? Public or
utility cloud providers would simply leave the switch to its default of
"off" and folks who wanted to provide this functionality to their users
could provide it. Of course, there are clear caveats around lack of
portability to other clouds -- but let's face it, cross-cloud
portability has other challenges beyond this particular point ;)


The insight of flavours, which is fundamental to the whole concept of
IaaS, is that users must pay the *opportunity cost* of their resource
usage. If you allow users to opt, at their own convenience, to pay only
the actual cost of the resources they use regardless of the opportunity
cost to you, then your incentives are no longer aligned with your
customers.


Again, the above assumes a utility cloud model. Sadly, that isn't the
only cloud model.


The only assumption is that resources are not (effectively) unlimited.


You'll initially be very popular with the kind of customers
who are taking advantage of you, but you'll have to hike prices across
the board to make up the cost leading to a sort of dead-sea effect. A
Gresham's Law of the cloud, if you will, where bad customers drive out
good customers.

Simply put, a cloud allowing users to define their own flavours *loses*
to one with predefined flavours 10 times out of 10.

In the above example, you just tell the customer: bad luck, you want
64GB of RAM, you buy 16 VCPUs whether you want them or not. It can't
actually hurt to get _more_ than you wanted, even though you'd rather
not pay for it (pro

[openstack-dev] [Heat] Meeting times for Juno

2014-05-21 Thread Zane Bitter
I know people are very interested in an update on this, so here's the 
plan such as it is.


I'd like to reassure everyone that we will definitely be keeping our 
original meeting time of Wednesdays at 2000 UTC, every second week 
(including this week).


I want to try out a new time for the alternate meetings, so let's bring 
it forward by 12 hours to Wednesdays at 1200 UTC. Unfortunately we'll 
lose the west coast of the US, but participation from there was not high 
anyway due to bad timing, and we'll gain folks in Europe. I'm also 
hoping that it will be at least as good or better for folks in Asia. The 
first meeting at this time will be next week.


I've reserved #openstack-meeting for this purpose at 
https://wiki.openstack.org/wiki/Meetings but from past experience we 
know that people sometimes don't book it yet still expect not to be 
kicked out, so we'll have to see how it goes ;) If you get lost, look in 
#heat.


We'll see how this works over the next few meetings at that time and 
re-evaluate.


If in doubt, check https://wiki.openstack.org/wiki/Meetings/HeatAgenda 
for times; I try to keep it up to date despite the provocation of a 
ridiculously short timeout on the auth cookies.


Today's meeting is at 2000 UTC - see y'all there :)

cheers,
Zane.



Re: [openstack-dev] [Openstack-docs] [Heat][Documentation] Heat template documentation

2014-05-27 Thread Zane Bitter

On 23/05/14 06:38, Andreas Jaeger wrote:

On 05/23/2014 12:13 PM, Steven Hardy wrote:

[...]
I'll hold my hand up as one developer who tried to contribute but ran away
screaming due to all the XML-java-ness of the current process.

I don't think markup complexity is a major barrier to contribution. Needing
to use a closed source editor and download unfathomably huge amounts of
java to build locally definitely are though IMO/IME.


You do not need a closed source editor for XML - I'm using emacs and
others in the team use vi for it.


I mostly agree with this. DocBook is actually not too bad to write (I 
say this as a non-fan of XML), and it is by far the most expressive 
markup language. For the kinds of use cases typical of documentation (to 
take one example, marking which parts of example commands are 
substitutable) you can't really beat it. I've tried a lot of markup 
languages, and even written one, but they can only be simpler than 
DocBook when they're adapted to support only particular use cases that 
are less complex than the ones we have.


I may be misremembering, but I seem to recall that the docs produced 
with the proprietary tool that I forget the name of have some 
idiosyncratic formatting (in terms of the locations of line breaks &c.) 
that is very difficult to match by hand. People are going to refer to 
the existing source as a guide to what to do, and if it looks really 
hard to duplicate that is one barrier to entry.



Yes, it downloads a lot of Java once. We also now build the documents as
part of the gate, so you can also check changes by clicking the
"checkbuild" target; it will show you the converted books.


This is definitely a great improvement. It's only part of the solution 
though - if developers were using the gate to run PEP8 instead of 
running it locally, I would tell them to stop wasting my time, since 
everybody on the review list gets 3 new emails each time they upload a 
patchset to fix some whitespace problem. What you need when you're 
editing a complex markup language manually is a fast feedback loop, and 
uploading a new patchset to Gerrit is not that.


Last time I looked at this stuff, it involved spending several hours 
trying to install the Java tools, and culminated in me giving up and 
just shipping the patch and hoping someone else would notice if they 
didn't work (IIRC this was actually creating the WADLs for the Heat API, 
and miraculously they did work).


IMHO this remains a huge barrier to anyone getting started who is not 
exceptionally motivated (by which I mean contributing to OpenStack docs 
is more or less their full time job). Ideally we would have something as 
simple as, or preferably simpler than, running unit tests with tox to 
rapidly build the docs without performing a huge local installation. (I 
don't know what the solution here looks like though... maybe a doc 
building service?)


cheers,
Zane.



Re: [openstack-dev] Designate Incubation Request

2014-05-28 Thread Zane Bitter

On 28/05/14 14:52, Sean Dague wrote:

I would agree this doesn't make sense in Neutron.

I do wonder if it makes sense in the Network program. I'm getting
suspicious of the programs for projects model if every new project
incubating in seems to need a new program. Which isn't really a
reflection on designate, but possibly on our program structure.


I agree, the whole program/project thing is confusing (as we've 
discovered this week, programs actually have code names which are... 
identical to the name of a project in the program) and IMHO unnecessary. 
Programs share a common core team, so it is inevitable that most 
incubated projects (including, I would think, Designate) will make sense 
in a separate program. (TripleO incorporating Tuskar is the only 
counterexample I can think of.)


Let's just get rid of the 'project' terminology. Let programs organise 
whatever repos they have in whatever way they see fit, with the proviso 
that they consult with the TC on any change in scope.


cheers,
Zane.



Re: [openstack-dev] Designate Incubation Request

2014-05-29 Thread Zane Bitter

On 29/05/14 05:26, Thierry Carrez wrote:

Sean Dague wrote:

I honestly just think we might want to also use it as a time to rethink
our program concept. Because all our programs that include projects that
are part of the integrated release are 1 big source tree, and maybe a
couple of little trees that orbit it (client and now specs repos). If we
always expect that to be the case, I'm not really sure why we built this
intermediate grouping.


Programs were established to solve two problems. First one is the
confusion around project types. We used to have project types[1] that
were trying to reflect and include all code repositories that we wanted
to make "official". That kept on changing, was very confusing, and did
not allow flexibility for each team in how they preferred to organize
their code repositories. The second problem that solved was to recognize
non-integrated-project efforts which were still essential to the
production of OpenStack, like Infra or Docs.

[1] https://wiki.openstack.org/wiki/ProjectTypes

"Programs" just let us bless goals and teams and let them organize code
however they want, with contribution to any code repo under that
umbrella being considered "official" and ATC-status-granting.


This is definitely how it *should* work.

I think the problem is that we still have elements of the 'project' 
terminology around from the bad old days of the pointless 
core/core-but-don't-call-it-core/library/gating/supporting project 
taxonomy, where project == repository. The result is that every time a 
new project gets incubated, the reaction is always "Oh man, you want a 
new *program* too? That sounds really *heavyweight*." If people treated 
the terms 'program' and 'project' as interchangeable and just referred 
to repositories by another name ('repositories', perhaps?) then this 
wouldn't keep coming up.


(IMHO the quickest way to effect this change in mindset would be to drop 
the term 'program' and call the programs projects. In what meaningful 
sense is e.g. Infra or Docs not a "project"?)



I would be
a bit reluctant to come back to the projecttypes mess and create
categories of programs (integrated projects on one side, and "others").


I agree, but why do we need different categories? Is anybody at all 
confused about this? Are there people out there installing our custom 
version of Gerrit and wondering why it won't boot VMs?


The categories existed largely because of the aforementioned strange 
definition of 'project' and the need to tightly control the membership 
of the TC. Now that the latter is no longer an issue, we could eliminate 
the distinction between programs and projects without bringing the 
categories back.



Back to the topic, the tension here is because DNS is seen as a
"network" thing and therefore it sounds like it makes sense under
"Networking". But "programs" are not categories or themes. They are
teams aligned on a mission statement. If the teams are different
(Neutron and Designate) then it doesn't make sense to artificially merge
them just because you think of "networking" as a theme. If the teams
converge, yes it makes sense. If they don't, we should just create a new
program. They are cheap and should reflect how we work, not the other
way around.


+1

cheers,
Zane.



Re: [openstack-dev] [TripleO] [Ironic] [Heat] Mid-cycle collaborative meetup

2014-05-29 Thread Zane Bitter

On 29/05/14 13:33, Mike Spreitzer wrote:

Devananda van der Veen  wrote on 05/29/2014
01:26:12 PM:

 > Hi Jaromir,
 >
 > I agree that the midcycle meetup with TripleO and Ironic was very
 > beneficial last cycle, but this cycle, Ironic is co-locating its
 > sprint with Nova. Our focus needs to be working with them to merge
 > the nova.virt.ironic driver. Details will be forthcoming as we work
 > out the exact details with Nova. That said, I'll try to make the
 > TripleO sprint as well -- assuming the dates don't overlap.
 >
 > Cheers,
 > Devananda
 >

 > On Wed, May 28, 2014 at 4:05 AM, Jaromir Coufal 
wrote:
 > Hi to all,
 >
 > after previous TripleO & Ironic mid-cycle meetup, which I believe
 > was beneficial for all, I would like to suggest that we meet again
 > in the middle of Juno cycle to discuss current progress, blockers,
 > next steps and of course get some beer all together :)
 >
 > Last time, TripleO and Ironic merged their meetings together and I
 > think it was great idea. This time I would like to invite also Heat
 > team if they want to join. Our cooperation is increasing and I think
 > it would be great, if we can discuss all issues together.
 >
 > Red Hat offered to host this event, so I am very happy to invite you
 > all and I would like to ask, who would come if there was a mid-cycle
 > meetup in following dates and place:
 >
 > * July 28 - Aug 1
 > * Red Hat office, Raleigh, North Carolina
 >
 > If you are intending to join, please, fill yourselves into this etherpad:
 > https://etherpad.openstack.org/p/juno-midcycle-meetup
 >
 > Cheers
 > -- Jarda

Given the organizers, I assume this will be strongly focused on TripleO
and Ironic.
Would this be a good venue for all the mid-cycle discussion that will be
relevant to Heat?
Is anyone planning a distinct Heat-focused mid-cycle meetup?


We haven't had one in the past, but the project is getting bigger so, 
given our need to sync with the TripleO folks anyway, this may be a good 
opportunity to try. Certainly it's unlikely that any Heat developers 
attending will spend the _whole_ week working with the TripleO team, so 
there should be time to do something like what you're suggesting. I 
think we just need to see who is willing & able to attend, and work out 
an agenda on that basis.


For my part, I will certainly be there for the whole week if it's July 
28 - Aug 1. If it's the week before I may not be able to make it at all.


BTW one timing option I haven't seen mentioned is to follow Pycon-AU's 
model of running e.g. Friday-Tuesday (July 25-29). I know nobody wants 
to be stuck in Raleigh, NC on a weekend (I've lived there, I understand 
;), but for folks who have a long ways to travel it's one weekend lost 
instead of two.


cheers,
Zane.



Re: [openstack-dev] [Glance] [Heat] Glance Metadata Catalog for Capabilities and Tags

2014-05-30 Thread Zane Bitter

On 29/05/14 18:42, Tripp, Travis S wrote:

Hello everyone!

At the summit in Atlanta we demonstrated the “Graffiti” project
concepts.  We received very positive feedback from members of multiple
dev projects as well as numerous operators.  We were specifically asked
multiple times about getting the Graffiti metadata catalog concepts into
Glance so that we can start to officially support the ideas we
demonstrated in Horizon.

After a number of additional meetings at the summit and working through
ideas the past week, we’ve created the initial proposal for adding a
Metadata Catalog to Glance for capabilities and tags.  This is distinct
from the “Artifact Catalog”, but we do see that capability and tag
catalog can be used with the artifact catalog.

We’ve detailed our initial proposal in the following Google Doc.  Mark
Washenberger agreed that this was a good place to capture the initial
proposal and we can later move it over to the Glance spec repo which
will be integrated with Launchpad blueprints soon.

https://docs.google.com/document/d/1cS2tJZrj748ZsttAabdHJDzkbU9nML5S4oFktFNNd68

Please take a look and let’s discuss!

Also, the following video is a brief recap of what was demo’ d at the
summit.  It should help to set a lot of understanding behind the ideas
in the proposal.

https://www.youtube.com/watch?v=Dhrthnq1bnw

Thank you!

Travis Tripp (HP)

Murali Sundar (Intel)

*A Few Related Blueprints *

https://blueprints.launchpad.net/horizon/+spec/instance-launch-using-capability-filtering

https://blueprints.launchpad.net/horizon/+spec/tagging

https://blueprints.launchpad.net/horizon/+spec/faceted-search

https://blueprints.launchpad.net/horizon/+spec/host-aggregate-update-metadata

https://blueprints.launchpad.net/python-cinderclient/+spec/support-volume-image-metadata


+1, this is something that will be increasingly important to 
orchestration. The folks working on the TOSCA (and others) -> HOT 
translator project might be able to comment in more detail, but 
basically as people start wanting to write templates that run on 
multiple clouds (potentially even non-OpenStack clouds) some sort of 
catalog for capabilities will become crucial.


cheers,
Zane.



Re: [openstack-dev] [Heat] Short term scaling strategies for large Heat stacks

2014-05-30 Thread Zane Bitter

On 29/05/14 19:52, Clint Byrum wrote:

Multiple Stacks
===

We could break the stack up between controllers, and compute nodes. The
controller will be less likely to fail because it will probably be 3 nodes
for a reasonably sized cloud. The compute nodes would then live in their
own stack of (n) nodes. We could further break that up into chunks of
compute nodes, which would further mitigate failure. If a small chunk of
compute nodes fails, we can just migrate off of them. One challenge here
is that compute nodes need to know about all of the other compute nodes
to support live migration. We would have to do a second stack update after
creation to share data between all of these stacks to make this work.

Pros: * Exists today

Cons: * Complicates host awareness
   * Still vulnerable to stack failure (just reduces probability and
 impact).


Separating the controllers and compute nodes is something you should do 
anyway (although moving to autoscaling, which will be even better when 
it is possible, would actually have the same effect). Splitting the 
compute nodes into smaller groups would certainly reduce the cost of 
failure. If we were to use an OS::Heat::Stack resource that calls 
python-heatclient instead of creating a nested stack in the same engine, 
then these child stacks would get split across a multi-engine deployment 
automagically. There's a possible implementation already at 
https://review.openstack.org/53313
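
For the curious, the rough shape of such a resource plugin might be
something like the sketch below. The plugin API details here are
simplified from memory, and the actual review linked above differs in
specifics, so treat this purely as an illustration.

from heat.engine import properties
from heat.engine import resource


class RemoteStack(resource.Resource):
    """Sketch: create the child stack via python-heatclient."""

    properties_schema = {
        'template': properties.Schema(properties.Schema.STRING,
                                      required=True),
        'parameters': properties.Schema(properties.Schema.MAP, default={}),
    }

    def handle_create(self):
        # Assumes a heat client plugin is available to the resource.
        heat = self.client('heat')
        stack = heat.stacks.create(
            stack_name=self.physical_resource_name(),
            template=self.properties['template'],
            parameters=self.properties['parameters'])['stack']
        self.resource_id_set(stack['id'])
        return stack['id']

    def check_create_complete(self, stack_id):
        stack = self.client('heat').stacks.get(stack_id)
        return stack.stack_status == 'CREATE_COMPLETE'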



update-failure-recovery
===

This is a blueprint I believe Zane is working on to land in Juno. It will
allow us to retry a failed create or update action. Combined with the
separate controller/compute node strategy, this may be our best option,
but it is unclear whether that code will be available soon or not. The
chunking is definitely required, because with 500 compute nodes, if
node #250 fails, the remaining 249 nodes that are IN_PROGRESS will be
cancelled, which makes the impact of a transient failure quite extreme.
Also without chunking, we'll suffer from some of the performance
problems we've seen where a single engine process will have to do all of
the work to bring up a stack.

Pros: * Uses blessed strategy

Cons: * Implementation is not complete
  * Still suffers from heavy impact of failure
  * Requires chunking to be feasible


I've already started working on this and I'm expecting to have this 
ready some time between the j-1 and j-2 milestones.


I think these two strategies combined could probably get you a long way 
in the short term, though obviously they are not a replacement for the 
convergence strategy in the long term.



BTW You missed off another strategy that we have discussed in the past, 
and which I think Steve Baker might(?) be working on: retrying failed 
calls at the client level.


cheers,
Zane.



[openstack-dev] [Heat] Blueprint process (heat-specs repo)

2014-05-30 Thread Zane Bitter
Last week we agreed[1] to follow that other project in setting up a 
specs repo.[2] (Many thanks to Ying, Monty and Clint for getting this up 
and running.)


I'm still figuring out myself how this is going to work, but the basic 
idea seems to be this:


- All new blueprints should be submitted as Gerrit reviews to the specs 
repo. Do NOT submit new blueprints to Launchpad.

- Existing blueprints in Launchpad are fine, there's no need to touch them.
- If you need to add design information to an existing blueprint, please 
do so by submitting a Gerrit review to the specs repo and linking to it 
from Launchpad, instead of using a wiki page.


A script will create Launchpad blueprints from approved specs and 
heat-drivers (i.e. the core team) will target them to milestones. Once 
this system is up and running, anything not targeted to a milestone will 
be subject to getting bumped from the series goal by another script. 
(That's why you don't want to create new bps in Launchpad.)


If anybody has questions, I am happy to make up answers.

Let's continue to keep things lightweight. Remember, this is not a tool 
to enforce a process, it's a better medium for the communication that's 
already happening. As a guide:


- the more ambitious/crazy/weird your idea is, the more detail you need 
to communicate.
- the harder it would be to throw part or all of the work away, the 
earlier you need to communicate it.
- as always, do whatever you judge best, for whatever definition of 
'best' you judge best.


cheers,
Zane.


[1] 
http://eavesdrop.openstack.org/meetings/heat/2014/heat.2014-05-21-20.00.html

[2] http://git.openstack.org/cgit/openstack/heat-specs



Re: [openstack-dev] [heat] Resource action API

2014-06-04 Thread Zane Bitter

On 04/06/14 03:01, yang zhang wrote:

Hi all,
Now heat only supports suspending/resuming a whole stack, all the
resources of the stack will be suspended/resumed,
but sometimes we just want to suspend or resume only some of the resources


Any reason you wouldn't put that subset of resources into a nested stack 
and suspend/resume that?



in the stack, so I think adding a resource-action API to Heat is
necessary. This API will be helpful to solve 2 problems:


I'm sceptical of this idea because the whole justification for having 
suspend/resume in Heat is that it's something that needs to follow the 
same dependency tree as stack delete/create.


Are you suggesting that if you suspend an individual resource, all of 
the resources dependent on it will also be suspended?



 - If we want to suspend/resume the resources of the stack, you need
to get the phy_id first and then call the APIs of other services, and
this won't update the status of the resource in Heat, which often
causes unexpected problems.


This is true, except for stack resources, which obviously _do_ store the 
state.



 - this API could offer a turn on/off function for some native
resources, e.g. we can turn on/off the autoscaling group or a single
policy with the API; this is like the suspend/resume services
feature[1] in AWS.


Which, I notice, is not exposed in CloudFormation.


  I registered a bp for it, and you are welcome to discuss it.
https://blueprints.launchpad.net/heat/+spec/resource-action-api


Please propose blueprints to the heat-specs repo:
http://lists.openstack.org/pipermail/openstack-dev/2014-May/036432.html

thanks,
Zane.



Re: [openstack-dev] [Heat]Heat template parameters encryption

2014-06-04 Thread Zane Bitter

On 04/06/14 15:58, Vijendar Komalla wrote:

Hi Devs,
I have submitted an WIP review (https://review.openstack.org/#/c/97900/)
for Heat parameters encryption blueprint
https://blueprints.launchpad.net/heat/+spec/encrypt-hidden-parameters
This quick and dirty implementation encrypts all the parameters on
Stack 'store' and decrypts them on Stack 'load'.
Following are couple of improvements I am thinking about;
1. Instead of encrypting individual parameters, on Stack 'store' encrypt
all the parameters together as a dictionary  [something like
crypt.encrypt(json.dumps(param_dictionary))]


Yeah, definitely don't encrypt them individually.
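
(For illustration, the first suggestion quoted above might look like this
minimal sketch, using Fernet from the cryptography package as a stand-in
for whatever crypt helper Heat actually ends up using:)

import json
from cryptography.fernet import Fernet

key = Fernet.generate_key()          # in practice this comes from config
crypt = Fernet(key)

def encrypt_params(params):
    # Serialise the whole parameter dict and encrypt it as a single blob.
    return crypt.encrypt(json.dumps(params).encode('utf-8'))

def decrypt_params(blob):
    return json.loads(crypt.decrypt(blob).decode('utf-8'))

stored = encrypt_params({'db_password': 's3cret', 'flavor': 'm1.small'})
assert decrypt_params(stored) == {'db_password': 's3cret',
                                  'flavor': 'm1.small'}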


2. Just encrypt parameters that were marked as 'hidden', instead of
encrypting all parameters

I would like to hear your feedback/suggestions.


Just as a heads-up, we will soon need to store the properties of 
resources too, at which point parameters become the least of our 
problems. (In fact, in theory we wouldn't even need to store 
parameters... and probably by the time convergence is completely 
implemented, we won't.) Which is to say that there's almost certainly no 
point in discriminating between hidden and non-hidden parameters.


I'll refrain from commenting on whether the extra security this affords 
is worth the giant pain it causes in debugging, except to say that IMO 
there should be a config option to disable the feature (and if it's 
enabled by default, it should probably be disabled by default in e.g. 
devstack).


cheers,
Zane.



Re: [openstack-dev] [heat] Resource action API

2014-06-05 Thread Zane Bitter

On 05/06/14 03:32, yang zhang wrote:


Thanks so much for your comments.

 > Date: Wed, 4 Jun 2014 14:39:30 -0400
 > From: zbit...@redhat.com
 > To: openstack-dev@lists.openstack.org
 > Subject: Re: [openstack-dev] [heat] Resource action API
 >
 > On 04/06/14 03:01, yang zhang wrote:
 > > Hi all,
 > > Now heat only supports suspending/resuming a whole stack, all the
 > > resources of the stack will be suspended/resumed,
 > > but sometime we just want to suspend or resume only a part of resources
 >
 > Any reason you wouldn't put that subset of resources into a nested stack
 > and suspend/resume that?

I think that using a nested stack is a little complicated, and we
can't build a nested stack for each resource; I hope this bp could
make it easier.
 >
 > > in the stack, so I think adding resource-action API for heat is
 > > necessary. this API will be helpful to solve 2 problems:
 >
 > I'm sceptical of this idea because the whole justification for having
 > suspend/resume in Heat is that it's something that needs to follow the
 > same dependency tree as stack delete/create.
 >
 > Are you suggesting that if you suspend an individual resource, all of
 > the resources dependent on it will also be suspended?

 I thought about this, and I think just suspending an individual
resource without its dependents is OK. Right now the resources that
can be suspended are very few, and almost all of them (Server, alarm,
user, etc.) could be suspended individually.


Then just suspend them individually using their own APIs. If there's no 
orchestration involved then it doesn't belong in Heat.



 > > - If we want to suspend/resume the resources of the stack, you need
 > > to get the phy_id first and then call the API of other services, and
 > > this won't update the status
 > > of the resource in heat, which often cause some unexpected problem.
 >
 > This is true, except for stack resources, which obviously _do_ store the
 > state.
 >
 > > - this API could offer a turn on/off function for some native
 > > resources, e.g., we can turn on/off the autoscalinggroup or a single
 > > policy with
 > > the API, this is like the suspend/resume services feature[1] in AWS.
 >
 > Which, I notice, is not exposed in CloudFormation.

  I found it on the AWS web site. It seems to be an autoscaling group
feature; it may not be exposed in CloudFormation, but I think it's
really a good idea.


Sure, but the solution here is to have a separate Autoscaling API (this 
is a long-term goal for us already) that exposes this feature.


cheers,
Zane.



Re: [openstack-dev] [Marconi] Adopt Spec

2014-06-05 Thread Zane Bitter

On 05/06/14 12:03, Kurt Griffiths wrote:

I just learned that some projects are thinking about having the specs
process be the channel for submitting new feature ideas, rather than
registering blueprints. I must admit, that would be kind of nice because
it would provide some much-needed structure around the triaging process.

I wonder if we can get some benefit out of the spec process while still
keeping it light? The temptation will be to start documenting everything
in excruciating detail, but we can mitigate that by codifying some
guidelines on our wiki and baking it into the team culture.

What does everyone think?


FWIW we just adopted a specs repo in Heat, and all of us feel exactly 
the same way as you do:


http://lists.openstack.org/pipermail/openstack-dev/2014-May/036432.html

I can't speak for every project, but you are far from the only ones 
wanting to use this as lightweight process. Hopefully we'll all figure 
out together how to make that happen :)


cheers,
Zane.


From: Kurt Griffiths <kurt.griffi...@rackspace.com>
Date: Tuesday, June 3, 2014 at 9:34 AM
To: OpenStack Dev <openstack-dev@lists.openstack.org>
Subject: Re: [openstack-dev] [Marconi] Adopt Spec

I think it becomes more useful the larger your team. With a smaller team
it is easier to keep everyone on the same page just through the mailing
list and IRC. As for where to document design decisions, the trick there
is more one of being diligent about capturing and recording the why of
every decision made in discussions and such; gerrit review history can
help with that, but it isn’t free.

If we’d like to give the specs process a try, I think we could do an
experiment in j-2 with a single bp. Depending on how that goes, we may
do more in the K cycle. What does everyone think?

From: Malini Kamalambal <malini.kamalam...@rackspace.com>
Reply-To: OpenStack Dev <openstack-dev@lists.openstack.org>
Date: Monday, June 2, 2014 at 2:45 PM
To: OpenStack Dev <openstack-dev@lists.openstack.org>
Subject: Re: [openstack-dev] [Marconi] Adopt Spec

+1 – Requiring specs for every blueprint is going to make the
development process very cumbersome, and will take us back to waterfall
days.
I like how the Marconi team operates now, with design decisions being
made in IRC/ team meetings.
So Spec might become more of an overhead than add value, given how our
team functions.

_'If'_ we agree to use Specs, we should use that only for the
blueprints that make sense.
For example, the unit test decoupling that we are working on now – this
one will be a good candidate to use specs, since there is a lot of back
and forth going on how to do this.
On the other hand something like Tempest Integration for Marconi will
not warrant a spec, since it is pretty straightforward what needs to be
done.
In the past we have had discussions around where to document certain
design decisions (e.g. Which endpoint/verb is the best fit for pop
operation?)
Maybe spec is the place for these?

We should leave it to the implementor to decide, if the bp warrants a
spec or not & what should be in the spec.


From: Kurt Griffiths <kurt.griffi...@rackspace.com>
Reply-To: "OpenStack Development Mailing List (not for usage questions)"
<openstack-dev@lists.openstack.org>
Date: Monday, June 2, 2014 1:33 PM
To: "OpenStack Development Mailing List (not for usage questions)"
<openstack-dev@lists.openstack.org>
Subject: Re: [openstack-dev] [Marconi] Adopt Spec

I’ve been in roles where enormous amounts of time were spent on writing
specs, and in roles where specs where non-existent. Like most things,
I’ve become convinced that success lies in moderation between the two
extremes.

I think it would make sense for big specs, but I want to be careful we
use it judiciously so that we don’t simply apply more process for the
sake of more process. It is tempting to spend too much time recording
every little detail in a spec, when that time could be better spent in
regular communication between team members and with customers, and on
iterating the code (/short/ iterations between demo/testing, so you
ensure you are on staying on track and can address design problems
early, often).

IMO, specs are best used more as summaries, containing useful
big-picture ideas, diagrams, and specific “memory pegs” to help us
remember what was discussed and decided, and calling out specific
“promises” for future conversations where certain design points are TBD.

From: Malini Kamalambal <malini.kamalam...@rackspace.com>
Reply-To: OpenStack Dev <openstack-dev@lists.openstack.org>
Date: Monday, June 2, 2014 at 9:51 AM
To: OpenStack Dev <openstack-dev@lists.openstack.org>
Subject: [openstack-dev] [Marconi] Adopt Spec

Hello all,

We are seeing more & more design questions in #openstack-marconi.
It will be a good idea to formalize our design process a bit more
& start using spec.
We are kind of late to the party, so we already have a lot of precedent

[openstack-dev] [Heat] Reminder: meeting at alternate time this week

2014-06-10 Thread Zane Bitter
A lot of people forgot last time, so this is your reminder that this 
week the Heat IRC meeting will be held at the alternate time:


Wednesday at 1200 UTC in #openstack-meeting

or in your local time zone:

http://www.timeanddate.com/worldclock/fixedtime.html?iso=20140611T12

cheers,
Zane.



[openstack-dev] [Heat][TripleO] Heat mid-cycle meetup

2014-06-11 Thread Zane Bitter
There has been a lot of interest in holding a Heat meetup during the 
Juno cycle. We have the opportunity to piggy-back on the TripleO meetup 
- and I expect at least some Heat developers to attend that - but we 
also know that some key people from both Heat and TripleO will be unable 
to attend (Tomas, Clint, Clint's team leads in Bangalore), and for 
others (me) it would be very inconvenient. So I've put together an 
Etherpad to gauge interest in Clint's suggestion of a separate Heat 
meetup in August. If we held it, the venue would also be Red Hat's 
office in Raleigh, NC (same as the TripleO meetup).


https://etherpad.openstack.org/p/heat-juno-midcycle-meetup

If you're interested in attending, please indicate your availability in 
the Etherpad. If you can attend both, sign up for both. If you can 
attend either but not both, sign up for both but indicate that they're 
mutually exclusive. Since dates are not confirmed yet, please include 
any constraints on your availability.


thanks,
Zane.



Re: [openstack-dev] [Trove] Heat integration

2014-06-16 Thread Zane Bitter

On 16/06/14 13:56, Nikhil Manchanda wrote:


Denis Makogon writes:

Because Trove should not do what it does now (cloud service orchestration
is not the part of the OS Database Program). Trove should delegate all
tasks to Cloud Orchestration Service (Heat).



Agreed that Trove should delegate provisioning, and orchestration tasks
to Heat. Tasks like taking backups, configuring an instance, promoting
it to master, etc are database specific tasks, and I'm not convinced
that Heat is the route to take for them.


I don't think anyone disagrees with you on this, although in the future 
Mistral might be helpful for some of the task running aspects (as we 
discussed at the design summit).



Here comes the third (mixed) manager called “MIGRATION”. It allows working
with previously provisioned instances through the NATIVES engine (resizes,
migration, deletion), but new instances provisioned in the future will be
provisioned within stacks through ORCHESTRATOR.

So, there are three valid options:

- use NATIVES if there's no available Heat;
- use ORCHESTRATOR to work with Heat only;
- use MIGRATION to work with mixed manager;


This does seem a bit overly complex. Steve mentioned the idea of stack
adopt (Thanks!), and I think that would be quite a bit simpler. I think
it behooves us to investigate that as a mechanism for creating a stack
from existing resources, rather than having something like a mixed
migration manager that has been proposed.


+1 for the stack adopt, this is an ideal use case for it IMO.


[...]

- implement instance resize; Done
- implement volume resize; Done





IIRC we did have an open issue and were trying to work with heat devs to
expose a callback to trove in the case of the VERIFY_RESIZE during
instance resize. Is that now done?


No, this remains an outstanding issue. There are, however, plans to 
address it:


https://blueprints.launchpad.net/heat/+spec/stack-lifecycle-plugpoint

cheers,
Zane.



[openstack-dev] [Heat] Mid-cycle Heat meetup - confirmed dates

2014-06-20 Thread Zane Bitter
I am pleased to announce that I have booked the facilities required for 
the Heat mid-cycle meetup for Juno, as discussed at the Heat IRC meeting 
this week.[1] Therefore I can confirm that the meetup will be held:


Monday 18th - Wednesday 20th August
@ Red Hat Tower in Raleigh, North Carolina

If you plan to attend, please sign up on the Etherpad here:

https://etherpad.openstack.org/p/heat-juno-midcycle-meetup

We may be able to get a block booking on a downtown Hotel; I will look 
into that next week when we have a firm idea of numbers.


It goes without saying that we welcome all Heat developers. The focus 
will be on getting work done, not just planning, but of course we 
welcome developers from other projects who are interested in working 
with the Heat team too.


cheers,
Zane.


[1] 
http://eavesdrop.openstack.org/meetings/heat/2014/heat.2014-06-18-20.00.html

(Note that the dates were accidentally recorded incorrectly in the minutes.)

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [heat] Sergey Kraynev for heat-core

2014-06-26 Thread Zane Bitter

On 26/06/14 18:08, Steve Baker wrote:

I'd like to nominate Sergey Kraynev for heat-core. His reviews are
valuable and prolific, and his commits have shown a sound understanding
of heat internals.

http://stackalytics.com/report/contribution/heat-group/60


+1


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Ceilometer] [Heat] Ceilometer aware people, please advise us on processing notifications..

2014-06-26 Thread Zane Bitter

On 23/06/14 19:25, Clint Byrum wrote:

Hello! I would like to turn your attention to this specification draft
that I've written:

https://review.openstack.org/#/c/100012/1/specs/convergence-continuous-observer.rst

Angus has suggested that perhaps Ceilometer is a better place to handle
this. Can you please comment on the review, or can we have a brief
mailing list discussion about how best to filter notifications?

Basically in Heat when a user boots an instance, we would like to act as
soon as it is active, and not have to poll the nova API to know when
that is. Angus has suggested that perhaps we can just tell ceilometer to
hit Heat with a web hook when that happens.


I'm all in favour of having Ceilometer filter the firehose for us if we 
can :)


Webhooks would seem to add a lot of overhead though (set up + tear down 
a connection for every notification), which could perhaps be avoided by 
using a message bus? Given that both setting up and receiving these 
notifications would be admin-only operations, is there any benefit to 
handling them through a webhook API rather than through oslo.messaging?
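
(For illustration, here is a minimal sketch, not Heat code, of what consuming
such notifications directly from the message bus might look like with
oslo.messaging. The topic, event-type name and payload keys are assumptions
made purely for the example.)

    import oslo_messaging
    from oslo_config import cfg


    class InstanceStateEndpoint(object):
        """Filter the notification firehose down to the events we care about."""

        def info(self, ctxt, publisher_id, event_type, payload, metadata):
            # Hypothetical filter: react when an instance reports ACTIVE.
            if (event_type == 'compute.instance.update' and
                    payload.get('state') == 'active'):
                print('Instance %s is active' % payload.get('instance_id'))


    def main():
        transport = oslo_messaging.get_transport(cfg.CONF)
        targets = [oslo_messaging.Target(topic='notifications')]
        listener = oslo_messaging.get_notification_listener(
            transport, targets, [InstanceStateEndpoint()])
        listener.start()
        listener.wait()


    if __name__ == '__main__':
        main()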


cheers,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Convergence proof-of-concept showdown

2014-12-17 Thread Zane Bitter

On 17/12/14 13:05, Gurjar, Unmesh wrote:

I'm storing a tuple of its name and database ID. The data structure is
resource.GraphKey. I was originally using the name for something, but I
suspect I could probably drop it now and just store the database ID, but I
haven't tried it yet. (Having the name in there definitely makes debugging
more pleasant though ;)



I agree, having the name might come in handy while debugging!


When I build the traversal graph each node is a tuple of the GraphKey and a
boolean to indicate whether it corresponds to an update or a cleanup
operation (both can appear for a single resource in the same graph).
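
(A rough, illustrative sketch of the node structure described above; the
names GraphKey, UPDATE, CLEANUP and traversal_nodes are assumptions for the
example, not the prototype's actual code.)

    import collections

    GraphKey = collections.namedtuple('GraphKey', ['name', 'id'])

    UPDATE = True     # node represents updating/creating the resource
    CLEANUP = False   # node represents cleaning up an old version

    def traversal_nodes(current, existing):
        """Yield one node per operation required by the traversal.

        A resource being replaced appears twice in the same graph: once as
        an update node for the new version and once as a cleanup node for
        the old version.
        """
        for res in current:       # resources in the new template
            yield (GraphKey(res.name, res.id), UPDATE)
        for res in existing:      # old versions that need cleaning up
            yield (GraphKey(res.name, res.id), CLEANUP)

    # Example with stand-in resource records:
    Resource = collections.namedtuple('Resource', ['name', 'id'])
    print(list(traversal_nodes([Resource('server', 2)],
                               [Resource('server', 1)])))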


Just to confirm my understanding, the cleanup operation takes care of both:
1. resources which are deleted as part of the update, and
2. the previous version of a resource which was updated by replacing it with
a new resource (the UpdateReplace scenario)


Yes, correct. Also:

3. resource versions which failed to delete for whatever reason on a 
previous traversal



Also, the cleanup operation is performed after the update completes 
successfully.


NO! They are not separate things!

https://github.com/openstack/heat/blob/stable/juno/heat/engine/update.py#L177-L198


If I am correct, you are updating all resources on update regardless of
whether they have changed, which will be inefficient if the stack contains a
million resources.


I'm calling update() on all resources regardless of change, but update() will
only call handle_update() if something has changed (unless the plugin has
overridden Resource._needs_update()).

There's no way to know whether a resource needs to be updated before
you're ready to update it, so I don't think of this as 'inefficient', just 
'correct'.
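
(A simplified sketch of the pattern just described; this is illustrative
structure only, not the real heat.engine.resource.Resource class.)

    class Resource(object):
        def __init__(self, definition):
            self.definition = definition

        def _needs_update(self, new_definition):
            # Plugins may override this; the default just compares definitions.
            return new_definition != self.definition

        def handle_update(self, new_definition):
            # A real plugin would call out to the underlying API here.
            print('updating to %r' % (new_definition,))

        def update(self, new_definition):
            # update() is always called, so the requirements/template version
            # are always recorded...
            if self._needs_update(new_definition):
                # ...but the physical resource is only touched when something
                # actually changed.
                self.handle_update(new_definition)
            self.definition = new_definition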


We have similar questions regarding other areas in your implementation,
which we believe will be answered once we understand the outline of your
implementation. It is difficult to get a hold on your approach just by
looking at the code. Docstrings / an Etherpad will help.



About streams: yes, in a million-resource stack the data will be huge, but
less than the template.

No way, it's O(n^3) (cubed!) in the worst case to store streams for each
resource.


Also, this stream is stored only in IN_PROGRESS resources.


Now I'm really confused. Where does it come from if the resource doesn't
get it until it's already in progress? And how will that information help it?



When an operation on a stack is initiated, the stream will be identified.


OK, this may be one of the things I was getting confused about - I 
thought a 'stream' belonged to one particular resource and just contained 
all of the paths to reaching that resource. But here it seems like 
you're saying that a 'stream' is a representation of the entire graph? 
So it's essentially just a gratuitously bloated NIH serialisation of the 
Dependencies graph?



To begin
the operation, the action is initiated on the leaf (or root) resource(s) and the
stream is stored (only) in this/these IN_PROGRESS resource(s).


How does that work? Does it get deleted again when the resource moves to 
COMPLETE?



The stream should then keep getting passed to the next/previous level of 
resource(s) as
and when the dependencies for the next/previous level of resource(s) are met.


That sounds... identical to the way it's implemented in my prototype 
(passing a serialisation of the graph down through the notification 
triggers), except for the part about storing it in the Resource table. 
Why would we persist to the database data that we only need for the 
duration that we already have it in memory anyway?


If we're going to persist it we should do so once, in the Stack table, 
at the time that we're preparing to start the traversal.



The reason to have the entire dependency list is to reduce DB queries during
a stack update.

But we never need to know that. We only need to know what just happened
and what to do next.



As mentioned earlier, each level of resources in a graph passes on the
dependency list/stream to the next/previous level of resources. This
information should then be used to determine what is to be done next and
drive the operation to completion.


Why would we store *and* forward?


When you have a single set of dependencies on each resource, similar to your
implementation, then we will end up loading Dependencies one at a time and
altering almost every resource's dependencies regardless of whether they
have changed.


Regarding a 2-template approach for delete, it is not actually 2 different
templates. It's just that we have a delete stream to be taken up post-update.

That would be a regression from Heat's current behaviour, where we start
cleaning up resources as soon as they have nothing depending on them.
There's not even a reason to make it worse than what we already have,
because it's actually a lot _easier_ to treat update and clean up as the same
kind of operation and throw both into the same big graph. The dual
implementations and all of the edge cases go away and you can just trust in
the graph traversal to do the Right Thing in th

Re: [openstack-dev] [Heat] Convergence proof-of-concept showdown

2014-12-18 Thread Zane Bitter

On 15/12/14 07:47, Murugan, Visnusaran wrote:

We have similar questions regarding other areas in your implementation,
which we believe will be answered once we understand the outline of your
implementation. It is difficult to get a hold on your approach just by
looking at the code. Docstrings / an Etherpad will help.


I added a bunch of extra docstrings and comments:

https://github.com/zaneb/heat-convergence-prototype/commit/5d79e009196dc224bd588e19edef5f0939b04607

I also implemented a --pdb option that will automatically set 
breakpoints at the beginning of all of the asynchronous events, so that 
you'll be dropped into the debugger and can single-step through the 
code, look at variables and so on:


https://github.com/zaneb/heat-convergence-prototype/commit/2a7a56dde21cad979fae25acc9fb01c6b4d9c6f7

I hope that helps. If you have more questions, please feel free to ask.

cheers,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Fw: [Heat] Multiple_Routers_Topoloy

2014-12-22 Thread Zane Bitter
The -dev mailing list is not for usage questions. Please post your 
question to ask.openstack.org and include the text of the error message 
you get when trying to add a RouterInterface.


cheers,
Zane.

On 22/12/14 04:18, Rao Shweta wrote:



Hi All

I am working on OpenStack Heat and I wanted to create the topology below
using a Heat template:



For this I am using the following template:

AWSTemplateFormatVersion: '2010-09-09'
Description: Sample Heat template that spins up multiple instances and a
  private network (JSON)
Resources:
  heat_network_01:
    Properties: {name: heat-network-01}
    Type: OS::Neutron::Net
  heat_network_02:
    Properties: {name: heat-network-02}
    Type: OS::Neutron::Net
  heat_router_01:
    Properties: {admin_state_up: 'True', name: heat-router-01}
    Type: OS::Neutron::Router
  heat_router_02:
    Properties: {admin_state_up: 'True', name: heat-router-02}
    Type: OS::Neutron::Router
  heat_router_int0:
    Properties:
      router_id: {Ref: heat_router_01}
      subnet_id: {Ref: heat_subnet_01}
    Type: OS::Neutron::RouterInterface
  heat_router_int1:
    Properties:
      router_id: {Ref: heat_router_02}
      subnet_id: {Ref: heat_subnet_02}
    Type: OS::Neutron::RouterInterface
  heat_subnet_01:
    Properties:
      cidr: 10.10.10.0/24
      dns_nameservers: [172.16.1.11, 172.16.1.6]
      enable_dhcp: 'True'
      gateway_ip: 10.10.10.254
      name: heat-subnet-01
      network_id: {Ref: heat_network_01}
    Type: OS::Neutron::Subnet
  heat_subnet_02:
    Properties:
      cidr: 10.10.11.0/24
      dns_nameservers: [172.16.1.11, 172.16.1.6]
      enable_dhcp: 'True'
      gateway_ip: 10.10.11.254
      name: heat-subnet-02
      network_id: {Ref: heat_network_02}
    Type: OS::Neutron::Subnet
  instance0:
    Properties:
      flavor: m1.nano
      image: cirros-0.3.2-x86_64-uec
      name: heat-instance-01
      networks:
      - port: {Ref: instance0_port0}
    Type: OS::Nova::Server
  instance0_port0:
    Properties:
      admin_state_up: 'True'
      network_id: {Ref: heat_network_01}
    Type: OS::Neutron::Port
  instance1:
    Properties:
      flavor: m1.nano
      image: cirros-0.3.2-x86_64-uec
      name: heat-instance-02
      networks:
      - port: {Ref: instance1_port0}
    Type: OS::Nova::Server
  instance1_port0:
    Properties:
      admin_state_up: 'True'
      network_id: {Ref: heat_network_01}
    Type: OS::Neutron::Port
  instance11:
    Properties:
      flavor: m1.nano
      image: cirros-0.3.2-x86_64-uec
      name: heat-instance11-01
      networks:
      - port: {Ref: instance11_port0}
    Type: OS::Nova::Server
  instance11_port0:
    Properties:
      admin_state_up: 'True'
      network_id: {Ref: heat_network_02}
    Type: OS::Neutron::Port
  instance12:
    Properties:
      flavor: m1.nano
      image: cirros-0.3.2-x86_64-uec
      name: heat-instance12-02
      networks:
      - port: {Ref: instance12_port0}
    Type: OS::Nova::Server
  instance12_port0:
    Properties:
      admin_state_up: 'True'
      network_id: {Ref: heat_network_02}
    Type: OS::Neutron::Port

I am able to create the topology using the template, but I am not able to
connect the two routers. Nor can I find an example template on the internet
showing how to connect two routers. Can you please help me with:

1.) Can we connect two routers? I tried making an interface on router 1 and
connecting it to subnet 2, which shows an error.

  heat_router_int0:
    Properties:
      router_id: {Ref: heat_router_01}
      subnet_id: {Ref: heat_subnet_02}

Can you please guide me on how we can connect routers, or have a link between
routers, using a template.

2.) Can you please forward a link or an example template which I can refer
to in order to implement the required topology using a Heat template.

Waiting for a response



Thankyou

Regards
Shweta Rao
Mailto: rao.shw...@tcs.com
Website: http://www.tcs.com





___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [heat] Application level HA via Heat

2014-12-22 Thread Zane Bitter

On 22/12/14 13:21, Steven Hardy wrote:

Hi all,

So, lately I've been having various discussions around $subject, and I know
it's something several folks in our community are interested in, so I
wanted to get some ideas I've been pondering out there for discussion.

I'll start with a proposal of how we might replace HARestarter with
AutoScaling group, then give some initial ideas of how we might evolve that
into something capable of a sort-of active/active failover.

1. HARestarter replacement.

My position on HARestarter has long been that equivalent functionality
should be available via AutoScalingGroups of size 1.  Turns out that
shouldn't be too hard to do:

  resources:
    server_group:
      type: OS::Heat::AutoScalingGroup
      properties:
        min_size: 1
        max_size: 1
        resource:
          type: ha_server.yaml

    server_replacement_policy:
      type: OS::Heat::ScalingPolicy
      properties:
        # FIXME: this adjustment_type doesn't exist yet
        adjustment_type: replace_oldest
        auto_scaling_group_id: {get_resource: server_group}
        scaling_adjustment: 1


One potential issue with this is that it is a little bit _too_ 
equivalent to HARestarter - it will replace your whole scaled unit 
(ha_server.yaml in this case) rather than just the failed resource inside.



So, currently our ScalingPolicy resource can only support three adjustment
types, all of which change the group capacity.  AutoScalingGroup already
supports batched replacements for rolling updates, so if we modify the
interface to allow a signal to trigger replacement of a group member, then
the snippet above should be logically equivalent to HARestarter AFAICT.

The steps to do this should be:

  - Standardize the ScalingPolicy-AutoScaling group interface, so
asynchronous adjustments (e.g. signals) between the two resources don't use
the "adjust" method.

  - Add an option to replace a member to the signal interface of
AutoScalingGroup

  - Add the new "replace" adjustment type to ScalingPolicy


I think I am broadly in favour of this.


I posted a patch which implements the first step, and the second will be
required for TripleO, e.g we should be doing it soon.

https://review.openstack.org/#/c/143496/
https://review.openstack.org/#/c/140781/

2. A possible next step towards active/active HA failover

The next part is the ability to notify before replacement that a scaling
action is about to happen (just like we do for LoadBalancer resources
already) and orchestrate some or all of the following:

- Attempt to quiesce the currently active node (may be impossible if it's
   in a bad state)

- Detach resources (e.g volumes primarily?) from the current active node,
   and attach them to the new active node

- Run some config action to activate the new node (e.g run some config
   script to fsck and mount a volume, then start some application).

The first step is possible by putting a SoftwareConfig/SoftwareDeployment
resource inside ha_server.yaml (using NO_SIGNAL so we don't fail if the
node is too bricked to respond and specifying DELETE action so it only runs
when we replace the resource).

The third step is possible either via a script inside the box which polls
for the volume attachment, or possibly via an update-only software config.

The second step is the missing piece AFAICS.

I've been wondering if we can do something inside a new heat resource,
which knows what the current "active" member of an ASG is, and gets
triggered on a "replace" signal to orchestrate e.g deleting and creating a
VolumeAttachment resource to move a volume between servers.

Something like:

  resources:
    server_group:
      type: OS::Heat::AutoScalingGroup
      properties:
        min_size: 2
        max_size: 2
        resource:
          type: ha_server.yaml

    server_failover_policy:
      type: OS::Heat::FailoverPolicy
      properties:
        auto_scaling_group_id: {get_resource: server_group}
        resource:
          type: OS::Cinder::VolumeAttachment
          properties:
            # FIXME: "refs" is a ResourceGroup interface not currently
            # available in AutoScalingGroup
            instance_uuid: {get_attr: [server_group, refs, 1]}

    server_replacement_policy:
      type: OS::Heat::ScalingPolicy
      properties:
        # FIXME: this adjustment_type doesn't exist yet
        adjustment_type: replace_oldest
        auto_scaling_policy_id: {get_resource: server_failover_policy}
        scaling_adjustment: 1


This actually fails because a VolumeAttachment needs to be updated in 
place; if you try to switch servers but keep the same Volume when 
replacing the attachment you'll get an error.


TBH {get_attr: [server_group, refs, 1]} is doing most of the heavy 
lifting here, so in theory you could just have an 
OS::Cinder::VolumeAttachment instead of the FailoverPolicy and then all 
you need is a way of triggering a stack update with the same template & 
params. I know Ton added a PATCH method to update in Juno so that you 

Re: [openstack-dev] [heat] Application level HA via Heat

2015-01-02 Thread Zane Bitter

On 24/12/14 05:17, Steven Hardy wrote:

On Mon, Dec 22, 2014 at 03:42:37PM -0500, Zane Bitter wrote:

On 22/12/14 13:21, Steven Hardy wrote:

Hi all,

So, lately I've been having various discussions around $subject, and I know
it's something several folks in our community are interested in, so I
wanted to get some ideas I've been pondering out there for discussion.

I'll start with a proposal of how we might replace HARestarter with
AutoScaling group, then give some initial ideas of how we might evolve that
into something capable of a sort-of active/active failover.

1. HARestarter replacement.

My position on HARestarter has long been that equivalent functionality
should be available via AutoScalingGroups of size 1.  Turns out that
shouldn't be too hard to do:

  resources:
    server_group:
      type: OS::Heat::AutoScalingGroup
      properties:
        min_size: 1
        max_size: 1
        resource:
          type: ha_server.yaml

    server_replacement_policy:
      type: OS::Heat::ScalingPolicy
      properties:
        # FIXME: this adjustment_type doesn't exist yet
        adjustment_type: replace_oldest
        auto_scaling_group_id: {get_resource: server_group}
        scaling_adjustment: 1


One potential issue with this is that it is a little bit _too_ equivalent to
HARestarter - it will replace your whole scaled unit (ha_server.yaml in this
case) rather than just the failed resource inside.


Personally I don't see that as a problem, because the interface makes that
explicit - if you put a resource in an AutoScalingGroup, you expect it to
get created/deleted on group adjustment, so anything you don't want
replaced stays outside the group.


I guess I was thinking about having the same mechanism work when the 
size of the scaling group is not fixed at 1.



Happy to consider other alternatives which do less destructive replacement,
but to me this seems like the simplest possible way to replace HARestarter
with something we can actually support long term.


Yeah, I just get uneasy about features that don't compose. Here you have 
to decide between the replacement policy feature and the feature of 
being able to scale out arbitrary stacks. The two uses are so different 
that they almost don't make sense as the same resource. The result will 
be a lot of people implementing scaling groups inside scaling groups in 
order to take advantage of both sets of behaviour.



Even if "just replace failed resource" is somehow made available later,
we'll still want to support AutoScalingGroup, and "replace_oldest" is
likely to be useful in other situations, not just this use-case.

Do you have specific ideas of how the just-replace-failed-resource feature
might be implemented?  A way for a signal to declare a resource failed so
convergence auto-healing does a less destructive replacement?


So, currently our ScalingPolicy resource can only support three adjustment
types, all of which change the group capacity.  AutoScalingGroup already
supports batched replacements for rolling updates, so if we modify the
interface to allow a signal to trigger replacement of a group member, then
the snippet above should be logically equivalent to HARestarter AFAICT.

The steps to do this should be:

  - Standardize the ScalingPolicy-AutoScaling group interface, so
asynchronous adjustments (e.g. signals) between the two resources don't use
the "adjust" method.

  - Add an option to replace a member to the signal interface of
AutoScalingGroup

  - Add the new "replace" adjustment type to ScalingPolicy


I think I am broadly in favour of this.


Ok, great - I think we'll probably want replace_oldest, replace_newest, and
replace_specific, such that both alarm and operator driven replacement have
flexibility over what member is replaced.


We probably want to allow users to specify the replacement policy (e.g. 
oldest first vs. newest first) for the scaling group itself to use when 
scaling down or during rolling updates. If we had that, we'd probably 
only need a single "replace" adjustment type - if a particular member is 
specified in the message then it would replace that specific one, 
otherwise the scaling group would choose which to replace based on the 
specified policy.



I posted a patch which implements the first step, and the second will be
required for TripleO, e.g we should be doing it soon.

https://review.openstack.org/#/c/143496/
https://review.openstack.org/#/c/140781/

2. A possible next step towards active/active HA failover

The next part is the ability to notify before replacement that a scaling
action is about to happen (just like we do for LoadBalancer resources
already) and orchestrate some or all of the following:

- Attempt to quiesce the currently active node (may be impossible if it's
   in a bad state)

- Detach resources (e.g volumes primarily?) from the current active node,
   and attach t

Re: [openstack-dev] [Heat] Precursor to Phase 1 Convergence

2015-01-09 Thread Zane Bitter

On 09/01/15 01:07, Angus Salkeld wrote:

I am not in favor of the --continue as an API. I'd suggest responding to
resource timeouts and if there is no response from the task, then
re-start (continue)
the task.


Yeah, I am not in favour of a new API either. In fact, I believe we 
already have this functionality: if you do another update with the same 
template and parameters then it will break the lock and continue the 
update if the engine running the previous update has failed. And when we 
switch over to convergence it will still do the Right Thing without any 
extra implementation effort.


There is one improvement we can make to the API though: in Juno, Ton 
added a PATCH method to stack update such that you can reuse the 
existing parameters without specifying them again. We should extend this 
to the template also, so you wouldn't have to supply any data to get 
Heat to start another update with the same template and parameters.


I'm not sure if there is a blueprint for this already; co-ordinate with 
Ton if you are planning to work on it.


cheers,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Convergence proof-of-concept showdown

2015-01-09 Thread Zane Bitter

On 08/01/15 05:39, Anant Patil wrote:

1. The stack was failing when there were single disjoint resources or
just one resource in template. The graph did not include this resource
due to a minor bug in dependency_names(). I have added a test case and
fix here:
https://github.com/anantpatil/heat-convergence-prototype/commit/b58abd77cf596475ecf3f19ed38adf8ad3bb6b3b


Thanks, sorry about that! I will push a patch to fix it up.


2. The resource graph is created with keys in both forward order
traversal and reverse order traversal and the update will finish the
forward order and attempt the reverse order. If this is the case, then
the update-replaced resources will be deleted before the update is
complete and if the update fails, the old resource is not available for
roll-back; a new resource has to be created then. I have added a test
case at the above mentioned location.

In our PoC, the updates (concurrent updates) won't remove an
update-replaced resource until all the resources are updated and the
resource clean-up phase is started.


Hmmm, this is a really interesting question actually. That's certainly 
not how Heat works at the moment; we've always assumed that rollback is 
"best-effort" at recovering the exact resources you had before. It would 
be great to have users weigh in on how they expect this to behave. I'm 
curious now what CloudFormation does.


I'm reluctant to change it though because I'm pretty sure this is 
definitely *not* how you would want e.g. a rolling update of an 
autoscaling group to happen.



It is unacceptable to remove the old
resource to be rolled-back to since it may have changes which the user
doesn't want to lose;


If they didn't want to lose it they shouldn't have tried an update that 
would replace it. If an update causes a replacement or an interruption 
to service then I consider the same fair game for the rollback - the 
user has already given us permission for that kind of change. (Whether 
the user's consent was informed is a separate question, addressed by 
Ryan's update-preview work.)



and that's why probably they use the roll-back
flag.


I don't think there's any basis for saying that. People use the rollback 
flag because they want the stack left in a consistent state even if an 
error occurs.


cheers,
Zane.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Where to keep data about stack breakpoints?

2015-01-12 Thread Zane Bitter

On 12/01/15 10:49, Ryan Brown wrote:

On 01/12/2015 10:29 AM, Tomas Sedovic wrote:

Hey folks,

I did a quick proof of concept for a part of the Stack Breakpoint
spec[1] and I put the "does this resource have a breakpoint" flag into
the metadata of the resource:

https://review.openstack.org/#/c/146123/

I'm not sure where this info really belongs, though. It does sound like
metadata to me (plus we don't have to change the database schema that
way), but can we use it for breakpoints etc., too? Or is metadata
strictly for Heat users and not for engine-specific stuff?


I'd rather not store it in metadata so we don't mix user metadata with
implementation-specific-and-also-subject-to-change runtime metadata. I
think this is a big enough feature to warrant a schema update (and I
can't think of another place I'd want to put the breakpoint info).


+1

I'm actually not convinced it should be in the template at all. Steve's 
suggestion of putting it the environment might be a good one, or maybe 
it should even just be an extra parameter to the stack create/update 
APIs (like e.g. the timeout is)?



I also had a chat with Steve Hardy and he suggested adding a STOPPED
state to the stack (this isn't in the spec). While not strictly
necessary to implement the spec, this would help people figure out that
the stack has reached a breakpoint instead of just waiting on a resource
that takes a long time to finish (the heat-engine log and event-list
still show that a breakpoint was reached but I'd like to have it in
stack-list and resource-list, too).

It makes more sense to me to call it PAUSED (we're not completely
stopping the stack creation after all, just pausing it for a bit), I'll
let Steve explain why that's not the right choice :-).


+1 to PAUSED. To me, STOPPED implies an end state (which a breakpoint is
not).


I agree we need an easy way for the user to see why nothing is 
happening, but adding additional states to the stack is a pretty 
dangerous change that risks creating regressions all over the place. If 
we can find _any_ other way to surface the information, it would be 
preferable IMHO.


cheers,
Zane.


For sublime end user confusion, we could use BROKEN. ;)


Tomas

[1]:
http://specs.openstack.org/openstack/heat-specs/specs/juno/stack-breakpoint.html


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev





__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Where to keep data about stack breakpoints?

2015-01-12 Thread Zane Bitter

On 12/01/15 13:05, Steven Hardy wrote:

>I also had a chat with Steve Hardy and he suggested adding a STOPPED state
>to the stack (this isn't in the spec). While not strictly necessary to
>implement the spec, this would help people figure out that the stack has
>reached a breakpoint instead of just waiting on a resource that takes a long
>time to finish (the heat-engine log and event-list still show that a
>breakpoint was reached but I'd like to have it in stack-list and
>resource-list, too).
>
>It makes more sense to me to call it PAUSED (we're not completely stopping
>the stack creation after all, just pausing it for a bit), I'll let Steve
>explain why that's not the right choice :-).

So, I've not got strong opinions on the name, it's more the workflow:

1. User triggers a stack create/update
2. Heat walks the graph, hits a breakpoint and stops.
3. Heat explicitly triggers continuation of the create/update


Did you mean the user rather than Heat for (3)?


My argument is that (3) is always a stack update, either a PUT or PATCH
update, e.g. we _are_ completely stopping stack creation, then a user can
choose to re-start it (either with the same or a different definition).


Hmmm, ok that's interesting. I have not been thinking of it that way. 
I've always thought of it like this:


http://docs.aws.amazon.com/AutoScaling/latest/DeveloperGuide/adding-lifecycle-hooks.html

(Incidentally, this suggests an implementation where the lifecycle hook 
is actually a resource - with its own API, naturally.)


So, if it's requested, before each operation we send out a notification 
(hopefully via Zaqar), and if a breakpoint is set that operation is not 
carried out until the user makes an API call acknowledging it.
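
(A minimal sketch, under assumed names, of that acknowledge-to-continue flow;
the notification transport and the API plumbing are deliberately stubbed out.)

    import threading

    class Breakpoint(object):
        def __init__(self):
            self._ack = threading.Event()

        def acknowledge(self):
            # Called from the (hypothetical) API when the user says "continue".
            self._ack.set()

        def wait(self):
            self._ack.wait()

    def run_operation(resource_name, operation, breakpoints, notify):
        notify('pre-op', resource_name)     # e.g. a Zaqar message
        bp = breakpoints.get(resource_name)
        if bp is not None:
            bp.wait()                       # blocked until acknowledged
        operation()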



So, it _is_ really an end state, as a user might never choose to update
from the stopped state, in which case *_STOPPED makes more sense.


That makes a bit more sense now.

I think this is going to be really hard to implement though. Because 
while one branch of the graph stops, other branches have to continue as 
far as they can. At what point do you change the state of the stack?



Paused implies the same action as the PATCH update, only we trigger
continuation of the operation from the point we reached via some sort of
user signal.

If we actually pause an in-progress action via the scheduler, we'd have to
start worrying about stuff like token expiry, hitting timeouts, resilience
to engine restarts, etc, etc.  So forcing an explicit update seems simpler
to me.


Yes, token expiry and stack timeouts are annoying things we'd have to 
deal with. (Resilience to engine restarts is not affected though.) 
However, I'm not sure your model is simpler, and in particular it sounds 
much harder to implement in the convergence architecture.


cheers,
Zane.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Where to keep data about stack breakpoints?

2015-01-13 Thread Zane Bitter

On 13/01/15 11:58, Tomas Sedovic wrote:

I also had a chat with Steve Hardy and he suggested adding a STOPPED
state to the stack (this isn't in the spec). While not strictly
necessary to implement the spec, this would help people figure out that
the stack has reached a breakpoint instead of just waiting on a
resource
that takes a long time to finish (the heat-engine log and event-list
still show that a breakpoint was reached but I'd like to have it in
stack-list and resource-list, too).

It makes more sense to me to call it PAUSED (we're not completely
stopping the stack creation after all, just pausing it for a bit), I'll
let Steve explain why that's not the right choice :-).


+1 to PAUSED. To me, STOPPED implies an end state (which a breakpoint is
not).


I agree we need an easy way for the user to see why nothing is
happening, but adding additional states to the stack is a pretty
dangerous change that risks creating regressions all over the place. If
we can find _any_ other way to surface the information, it would be
preferable IMHO.


Would adding a new state to resources be similarly tricky, or could we
do that instead? That way you'd see what's going on in `resource-list`
which should be good enough.


Yeah, that would be considerably less risky I think.

- ZB

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tc][python-clients] More freedom for all python clients

2015-01-13 Thread Zane Bitter

On 13/01/15 10:01, Jeremy Stanley wrote:

On 2015-01-13 07:46:38 -0500 (-0500), Sean Dague wrote:

Why doesn't rally just remove itself from projects.txt, then there
would be no restrictions on what it adds.


I second this recommendation. If Rally wants to depend on things
which are not part of OpenStack and not required to run or test
OpenStack in an official capacity (and since it's not an official
OpenStack project it's entirely within its rights to want to do
that), then forcing it to comply with the list of requirements for
official OpenStack projects is an unnecessarily restrictive choice.


FWIW I don't really agree with this advice. The purpose of 
openstack/requirements is to ensure that it's possible to create a 
distribution of OpenStack without conflicting version requirements, not 
to force every project to use every dependency listed. As such, some 
co-ordination around client libraries for related projects seems like it 
ought to be an uncontroversial addition. (Obviously it's easy to imagine 
potential additions that would legitimately be highly controversial, but 
IMHO this isn't one of them.) Saying that we require people to use our 
tools to get into the club but that our tools are not going to 
accommodate them in any way until they are in sounds a bit too close to 
"Go away" to my ears.



That said, I'd like to suggest another possible workaround: in Heat we 
keep resource plugins for non-official projects in the /contrib tree, 
and each plugin has its own setup.cfg and requirements.txt. For example:


http://git.openstack.org/cgit/openstack/heat/tree/contrib/heat_barbican

So the user has the option to install each plugin, and it comes with its 
own requirements, independent of the main Heat installation. Perhaps 
Rally could consider something similar.


cheers,
Zane.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [Heat] Final steps toward a Convergence design

2015-01-19 Thread Zane Bitter

Hi folks,
I'd like to come to agreement on the last major questions of the 
convergence design. I well aware that I am the current bottleneck as I 
have been struggling to find enough time to make progress on it, but I 
think we are now actually very close.


I believe the last remaining issue to be addressed is the question of 
what to do when we want to update a resource that is still IN_PROGRESS 
as the result of a previous (now cancelled, obviously) update.


There are, of course, a couple of trivial and wrong ways to handle it:

1) Throw UpdateReplace and make a new one
 - This is obviously a terrible solution for the user

2) Poll the DB in a loop until the previous update finishes
 - This is obviously horribly inefficient

So the preferred solution here needs to involve retriggering the 
resource's task in the current update once the one from the previous 
update is complete.



I've implemented some changes in the simulator - although note that 
unlike stuff I implemented previously, this is extremely poorly tested 
(if at all) since the simulator runs the tasks serially and therefore 
never hits this case. So code review would be appreciated. I committed 
the changes on a new branch, "resumable":


https://github.com/zaneb/heat-convergence-prototype/commits/resumable

Here is a brief summary:
- The SyncPoints are now:
  * created for every resource, regardless of how many dependencies it has.
  * created at the beginning of an update and deleted before beginning 
another update.
  * contain only the list of satisfied dependencies (and their RefId 
and attribute values).
- The graph is now stored in the Stack table, rather than passed through 
the chain of trigger notifications.
- We'll use locks in the Resource table to ensure that only one action 
at a time can happen on a Resource.
- When a trigger is received for a resource that is locked (i.e. status 
is IN_PROGRESS and the engine owning it is still alive), the trigger is 
ignored.
- When processing of a resource completes, a failure to find any of the 
sync points that are to be notified (every resource has at least one, 
since the last resource in each chain must notify the stack that it is 
complete) indicates that the current update has been cancelled and 
triggers a new check on the resource with the data for the current 
update (retrieved from the Stack table) if it is ready (as indicated by 
its SyncPoint entry).
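
(A condensed, self-contained sketch of that completion logic; all the names
here (SyncPoint, resource_complete, check_resource, ...) are assumptions
chosen to illustrate the control flow, not the prototype's real code.)

    class SyncPoint(object):
        def __init__(self, required):
            self.required = set(required)   # graph keys we are waiting on
            self.satisfied = {}             # graph key -> (refid, attrs)

        def satisfy(self, key, refid, attrs):
            self.satisfied[key] = (refid, attrs)

        def ready(self):
            return self.required <= set(self.satisfied)

    def resource_complete(key, refid, attrs, successors, sync_points,
                          current_graph, check_resource):
        """Called when work on one resource finishes for a given traversal."""
        for succ in successors:
            point = sync_points.get(succ)
            if point is None:
                # The traversal we were working for has been cancelled and
                # its sync points deleted: retrigger this resource in the
                # current traversal (data from the Stack table), provided it
                # is ready there.
                if current_graph.ready(key):
                    check_resource(key)
                return
            # Record the satisfied dependency (with RefId and attribute
            # values) and trigger the successor if it is now ready.
            point.satisfy(key, refid, attrs)
            if point.ready():
                check_resource(succ)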


I'm not 100% happy with the amount of extra load this puts on the 
database, but I can't see a way to do significantly better and still 
solve this locking issue. Suggestions are welcome. At least the common 
case is considerably better than the worst case.


There are two races here that we need to satisfy ourselves we have 
answers for (I think we do):
1) Since old SyncPoints are deleted before a new transition begins and 
we only look for them after unlocking the resource being processed, I 
don't believe that both the previous and the new update can fail to 
trigger the check on the resource in the new update's traversal. (If 
there are any DB experts out there, I'd be interested in their input on 
this one.)
2) When both the previous and the new update end up triggering a check 
on the resource in the new update's traversal, we'll only perform one 
because one will succeed in locking the resource and the other will just 
be ignored after it fails to acquire the lock. (This one is watertight, 
since both processes are acting on the same lock.)



I believe that this model is very close to what Anant and his team are 
proposing. Arguably this means I've been wasting everyone's time, but a 
happier way to look at it is that two mostly independent design efforts 
converging on a similar solution is something we can take a lot of 
confidence from ;)


My next task is to start breaking this down into blueprints that folks 
can start implementing. In the meantime, it would be great if we could 
identify any remaining discrepancies between the two designs and 
completely close those last gaps.


cheers,
Zane.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [Heat] Convergence Phase 1 implementation plan

2015-01-23 Thread Zane Bitter
I've mentioned this in passing a few times, but I want to lay it out 
here in a bit more detail for comment. Basically we're implementing 
convergence at a time when we still have a lot of 'unit' tests that are 
really integration tests, and we don't want to have to rewrite them to 
anticipate this new architecture, nor wait until they have all been 
converted into functional tests. And of course it goes without saying 
that we have to land all of these changes without breaking anything for 
users.


To those ends, my proposal is that we (temporarily) support two code 
paths: the existing, legacy in-memory path and the new, distributed 
convergence path. Stacks will contain a field indicating which code path 
they were created with, and each stack will be operated on only by that 
same code path throughout its lifecycle (i.e. a stack created in legacy 
mode will always use the legacy code). We'll add a config option, off by 
default, to enable the new code path. That way users can switch over at 
a time of their choosing. When we're satisfied that it's stable enough 
we can flip the default (note: IMHO this would have to happen before 
kilo-3 in order to make it for the Kilo release).


Based on this plan, I had a go at breaking the work down into discrete 
tasks, and because it turned out to be really long I put it in an 
etherpad rather than include it here:


https://etherpad.openstack.org/p/heat-convergence-tasks

If anyone has additions/changes then I suggest adding them to that 
etherpad and replying to this message to flag your changes.


To be clear, it's unlikely I will have time to do the implementation 
work on any of these tasks myself (although I will be trying to review 
as many of them as possible). So the goal here is to get as many 
contributors involved in doing stuff in parallel as we can.


There are obviously dependencies between many of these tasks, so my plan 
is to raise each one as a blueprint so we can see the magic picture that 
Launchpad shows. I want to get feedback first though, because there are 
18 of them so far, and rejigging everything in response to feedback 
would be a bit of a pain.


I'm also prepared to propose specs for all of these _if_ people think 
that would be helpful. I see three options here:

 - Propose 18 fairly minimal specs (maybe in a single review?)
 - Propose 1 large spec (essentially the contents of that etherpad)
 - Just discuss in the etherpad rather than Gerrit

Obviously that's in decreasing order of the amount of work required, but 
I'll do whatever folks think best for discussion.


cheers,
Zane.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [heat][hot]

2015-01-26 Thread Zane Bitter

On 25/01/15 10:41, Dmitry wrote:

Hello,
I need to receive instance id as part of the instance installation script.
Something like:
params:
   $current_id: {get_param: $this.id }


I have no idea what this is supposed to mean, sorry.


Is it possible?


The get_resource function will return the server UUID for a server 
resource, but you can't use it from within that resource itself (it 
would be a circular reference).


The UUID of a server is provided to the server through the Nova 
metadata; you should retrieve it from there in your user_data script.
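
(For example, a minimal user_data script along these lines would work; the
metadata URL is the standard OpenStack endpoint, and error handling is
omitted.)

    #!/usr/bin/env python
    # Fetch this server's own UUID from the Nova metadata service.
    import json
    import urllib2

    METADATA_URL = 'http://169.254.169.254/openstack/latest/meta_data.json'

    metadata = json.load(urllib2.urlopen(METADATA_URL))
    print('My instance UUID is %s' % metadata['uuid'])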


cheers,
Zane.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] core team changes

2015-01-28 Thread Zane Bitter

On 27/01/15 20:36, Angus Salkeld wrote:

Hi all

After having a look at the stats:
http://stackalytics.com/report/contribution/heat-group/90
http://stackalytics.com/?module=heat-group&metric=person-day

I'd like to propose the following changes to the Heat core team:

Add:
Qiming Teng
Huang Tianhua

Remove:
Bartosz Górski (Bartosz has indicated that he is happy to be removed and
doesn't have the time to work on heat ATM).

Core team please respond with +/- 1.


+1

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [heat] multicloud support for ec2

2015-01-28 Thread Zane Bitter

On 25/01/15 00:03, Hongbin Lu wrote:

Hi Heat team,

I am looking for a solution to bridge between OpenStack and EC2.
According to the documentation, it seems that Heat has multicloud support but
the remote cloud(s) must be OpenStack.


It actually doesn't, although it is planned. (We have multi-region 
support, but that implies a shared Keystone for both regions.)



I wonder if Heat supports
multicloud in the context of supporting remote EC2 cloud. For example,
does Heat support a remote stack that contains resources from EC2 cloud?


No, and that's not planned either.


As a result, creating a stack will provision local OpenStack resources
along with remote EC2 resources.

If this feature is not supported, will the dev team accept blueprint
and/or contributions for that?


I think the most accurate short answer here is "no". Of course, I can't 
claim to speak for everyone.


There are some contributions I think we would be willing to accept, 
though. For example, the biggest obstacle to writing AWS plugins is to 
find a way to provide the user's AWS credentials to the plugin securely. 
If we had a solution to the credential problem it would also be helpful 
for multi-cloud in Heat across two OpenStack clouds that lack Keystone 
federation, so I think that would definitely be valuable.


You're definitely not the only ones wanting to drive AWS from Heat, so 
maybe you could set up a separate StackForge project or the like to 
develop a set of AWS plugins.


cheers,
Zane.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [Heat][Keystone] Native keystone resources in Heat

2015-01-29 Thread Zane Bitter
I got a question today about creating keystone users/roles/tenants in 
Heat templates. We currently support creating users via the 
AWS::IAM::User resource, but we don't have a native equivalent.


IIUC keystone now allows you to add users to a domain that is otherwise 
backed by a read-only backend (i.e. LDAP). If this means that it's now 
possible to configure a cloud so that one need not be an admin to create 
users then I think it would be a really useful thing to expose in Heat. 
Does anyone know if that's the case?


I think roles and tenants are likely to remain admin-only, but we have 
precedent for including resources like that in /contrib... this seems 
like it would be comparably useful.


Thoughts?

cheers,
Zane.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat][Keystone] Native keystone resources in Heat

2015-01-29 Thread Zane Bitter

On 29/01/15 12:03, Steven Hardy wrote:

On Thu, Jan 29, 2015 at 11:41:36AM -0500, Zane Bitter wrote:

>IIUC keystone now allows you to add users to a domain that is otherwise
>backed by a read-only backend (i.e. LDAP). If this means that it's now
>possible to configure a cloud so that one need not be an admin to create
>users then I think it would be a really useful thing to expose in Heat. Does
>anyone know if that's the case?


I've not heard of that feature, but it's definitely now possible to
configure per-domain backends, so for example you could have the "heat"
domain backed by SQL and other domains containing real human users backed
by a read-only directory.


http://adam.younglogic.com/2014/08/getting-service-users-out-of-ldap/


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat][Keystone] Native keystone resources in Heat

2015-01-30 Thread Zane Bitter

On 30/01/15 05:20, Steven Hardy wrote:

On Thu, Jan 29, 2015 at 12:31:17PM -0500, Zane Bitter wrote:

On 29/01/15 12:03, Steven Hardy wrote:

On Thu, Jan 29, 2015 at 11:41:36AM -0500, Zane Bitter wrote:

IIUC keystone now allows you to add users to a domain that is otherwise
backed by a read-only backend (i.e. LDAP). If this means that it's now
possible to configure a cloud so that one need not be an admin to create
users then I think it would be a really useful thing to expose in Heat. Does
anyone know if that's the case?


I've not heard of that feature, but it's definitely now possible to
configure per-domain backends, so for example you could have the "heat"
domain backed by SQL and other domains containing real human users backed
by a read-only directory.


http://adam.younglogic.com/2014/08/getting-service-users-out-of-ldap/


Perhaps we need to seek clarification from Adam/Henry, but my understanding
of that feature is not that it enables you to add users to domains backed
by a read-only directory, but rather that multiple backends are possible,
such that one domain can be backed by a read-only backend, and another
(different) domain can be backed by a different read/write one.

E.g in the example above, you might have the "freeipa" domain backed by
read-only LDAP which contains your directory of human users, and you might
also have a different domain e.g "services" or "heat" backed by a
read/write backend e.g Sql.


Ah, you're right, I've been misinterpreting that post this whole time. 
Thanks!


- ZB


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat][Keystone] Native keystone resources in Heat

2015-02-02 Thread Zane Bitter

On 30/01/15 02:19, Thomas Spatzier wrote:

From: Zane Bitter 
To: openstack Development Mailing List



Date: 29/01/2015 17:47
Subject: [openstack-dev] [Heat][Keystone] Native keystone resources in

Heat


I got a question today about creating keystone users/roles/tenants in
Heat templates. We currently support creating users via the
AWS::IAM::User resource, but we don't have a native equivalent.

IIUC keystone now allows you to add users to a domain that is otherwise
backed by a read-only backend (i.e. LDAP). If this means that it's now
possible to configure a cloud so that one need not be an admin to create
users then I think it would be a really useful thing to expose in Heat.
Does anyone know if that's the case?

I think roles and tenants are likely to remain admin-only, but we have
precedent for including resources like that in /contrib... this seems
like it would be comparably useful.

Thoughts?


I am really not a keystone expert, so don't know what the security
implications would be, but I have heard the requirement or wish to be able
to create users, roles etc. from a template many times. I've talked to
people who want to explore this for onboarding use cases, e.g. for
onboarding of lines of business in a company, or for onboarding customers
in a public cloud case. They would like to be able to have templates that
lay out the overall structure for authentication stuff, and then
parameterize it for each onboarding process.
If this is something to be enabled, that would be interesting to explore.


Thanks for the input everyone. I raised a spec + blueprint here:

https://review.openstack.org/152309

I don't have any immediate plans to work on this, so if anybody wants to 
grab it they'd be more than welcome :)


cheers,
Zane.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Convergence Phase 1 implementation plan

2015-02-02 Thread Zane Bitter

On 26/01/15 19:04, Angus Salkeld wrote:

On Sat, Jan 24, 2015 at 7:00 AM, Zane Bitter <zbit...@redhat.com> wrote:
I'm also prepared to propose specs for all of these _if_ people
think that would be helpful. I see three options here:
  - Propose 18 fairly minimal specs (maybe in a single review?)


This sounds fine, but if possible group them a bit; 18 sounds like a lot
and many of these look like small jobs.
I am also open to using bugs for smaller items. Basically this is just
the red tape, so whatever is the least effort
and makes things easier to divide the work up.


OK, here are the specs:

https://review.openstack.org/#/q/status:open+project:openstack/heat-specs+branch:master+topic:convergence,n,z

Let's get reviewing (and implementing!) :)

cheers,
Zane.


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [heat] operators vs users for choosing convergence engine

2015-02-03 Thread Zane Bitter

On 02/02/15 19:52, Steve Baker wrote:

A spec has been raised to add a config option to allow operators to
choose whether to use the new convergence engine for stack operations.
For some context you should read the spec first [1]

Rather than doing this, I would like to propose the following:


I am strongly, strongly opposed to making this part of the API.


* Users can (optionally) choose which engine to use by specifying an
engine parameter on stack-create (choice of classic or convergence)
* Operators can set a config option which determines which engine to use
if the user makes no explicit choice
* Heat developers will set the default config option from classic to
convergence when convergence is deemed sufficiently mature


We'd also need a way for operators to prevent users from enabling 
convergence if they're not ready to support it.



I realize it is not ideal to expose this kind of internal implementation
detail to the user, but choosing convergence _will_ result in different
stack behaviour (such as multiple concurrent update operations) so there
is an argument for giving the user the choice. Given enough supporting
documentation they can choose whether convergence might be worth trying
for a given stack (for example, a large stack which receives frequent
updates)


It's supposed to be a strict improvement; we don't need to ask 
permission. We have made major changes of this type in practically every 
Heat release. When we switched from creating resources serially to 
creating them in parallel in Havana we didn't ask permission. We just 
did it. When we started allowing users to recover from a failed 
operation in Juno we didn't ask permission. We just did it. We don't 
need to ask permission to allow concurrent updates. We can just do it.


The only difference here is that we are being a bit smarter and 
uncoupling our development schedule from the release cycle. There are 15 
other blueprints, essentially all of which have to be complete before 
convergence is usable at all. It won't do *anything at all* until we are 
at least 12 blueprints in. The config option buys us time to land them 
without the risk of something half-finished appearing in the release 
(trunk-chasers will also thank us). It has no other legitimate purpose IMO.
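(For the record, the whole thing amounts to a single engine flag. Something 
roughly like the sketch below; the exact option name and default are whatever 
the spec settles on, so treat the names here as illustrative only.)

    # Illustrative sketch only; the real name/default are up to the spec.
    from oslo_config import cfg

    engine_opts = [
        cfg.BoolOpt('convergence_engine',
                    default=False,
                    help='Use the convergence engine for new stack '
                         'operations instead of the classic engine.'),
    ]

    cfg.CONF.register_opts(engine_opts)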


The goal is IN NO WAY to maintain separate code paths in the long term. 
The config option is simply a development strategy to allow us to land 
code without screwing up a release and while maintaining as much test 
coverage as possible.



Operators likely won't feel they have enough knowledge to make the call
that a heat install should be switched to using all convergence, and
users will never be able to try it until the operators do (or the
default switches).


Hardly anyone should have to make a call. We should flip the default as 
soon as all of the blueprints have landed (i.e. as soon as it works at 
all), provided that a release is not imminent. (Realistically, at this 
point I think we have to say the target is to do it as early in 
Lizard as we can.) That means for those chasing trunk they get it as 
soon as it works at all, and for those using stable releases they get it 
at the next release, just like every other feature we have ever added.


As a bonus, trunk-chasing operators who need to can temporarily delay 
enabling of convergence until a point of their choosing in the release 
cycle by overriding the default. Anybody in that position likely has 
enough knowledge to make the right call for them.


So I believe that all of our stakeholders are catered to by the config 
option: operators & users who want a stable, tested release; 
operator/users who want to experiment on the bleeding edge; and 
operators who chase trunk but whose users require stability.


The only group that benefits from enshrining the choice in the API - 
users who want to experiment with the bleeding edge, but who don't 
control their own OpenStack deployment - doesn't actually exist, and if 
it did then this would be among the least of their problems.



Finally, there are also some benefits to heat developers. Creating a
whole new gate job to test convergence-enabled heat will consume its
share of CI resource. I'm hoping to make it possible for some of our
functional tests to run against a number of scenarios/environments.
Being able to run tests under classic and convergence scenarios in one
test run will be a great help (for performance profiling too).


I think this is the strongest argument in favour. However, I'd like to 
think it would be possible to run the functional tests twice in the 
gate, changing the config and restarting the engine in between.


But if the worst comes to the worst, then although I think it's 
preferable to use one VM for twice as long vs. two VMs for the same 
length of time, I don't think the impact on resource utilisation in the 
gate of choosing one over the other is likely to be huge. And I don't 
see this situation persisting for a long time.

Re: [openstack-dev] [Heat] Add extraroutes support to neutron routers

2015-02-05 Thread Zane Bitter

On 05/02/15 12:23, James Denton wrote:

Hello all,

Regarding
https://blueprints.launchpad.net/heat/+spec/router-properties-object

Does anyone know if there are plans to implement this functionality in
an upcoming release?


Unlikely - unfortunately the Neutron API for extra routes makes it 
impossible to implement correctly. That said, Kevin's code did land in 
/contrib, so the plugin is there if you need it:


http://git.openstack.org/cgit/openstack/heat/tree/contrib/extraroute

cheers,
Zane.


Our use case meets the one described by Kevin, but
rather than trying to route traffic to an outside resource, we are
routing to another instance off the router.

Thanks!

—
James Denton






__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [heat] operators vs users for choosing convergence engine

2015-02-06 Thread Zane Bitter

On 03/02/15 14:12, Clint Byrum wrote:

The visible change in making things parallel was minimal. In talking
about convergence, it's become clear that users can and should expect
something radically different when they issue stack updates. I'd love to
say that it can be done to just bind convergence into the old ways, but
doing so would also remove the benefit of having it.

Also allowing resume wasn't a new behavior, it was fixing a bug really
(that state was lost on failed operations). Convergence is a pretty
different beast from the current model,


That's not actually the case for Phase 1; really nothing much should 
change from the user point of view, except that if you issue an update 
before a previous one is finished then you won't get an error back any more.



In any event, I think Angus's comment on the review is correct, we 
actually have two different problems here. One is how to land the code, 
and a config option is indisputably the right choice here: until many, 
many blueprints have landed, the convergence code path will do 
literally nothing at all. There is no conceivable advantage to users in 
opting in to that.


The second question, which we can continue to discuss, is whether to 
allow individual users to opt in/out once operators have enabled the 
convergence flow path. I'm not convinced that there is anything 
particularly special about this feature that warrants such a choice more 
than any other feature that we have developed in the past. However, I 
don't think we need to decide until around the time that we're preparing 
to flip the default on. By that time we should have better information 
about the level of stability we're dealing with, and we can get input 
from operators on what kind of additional steps we should take to 
maintain stability in the face of possible regressions.


cheers,
Zane.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Mistral] Changing "expression" delimiters in Mistral DSL

2015-02-18 Thread Zane Bitter

On 16/02/15 16:06, Dmitri Zimine wrote:

2) Use functions, like Heat HOT or TOSCA:

HOT templates and TOSCA don’t seem to have a concept of typed
variables to borrow from (please correct me if I missed it). But they
have functions: function: { function_name: {foo: [parameter1, parameter
2], bar:"xxx”}}. Applied to Mistral, it would look like:

 publish:
  - bool_var: { yaql: “1+1+$.my.var < 100” }

Not bad, but currently rejected as it reads worse than delimiter-based
syntax, especially in simplified one-line action invocation.


Note that you don't actually need the quotes there, so this would be 
equivalent:


publish:
 - bool_var: {yaql: 1+1+$.my.var < 100}

FWIW I am partial to this or to Renat's p7 suggestion:

publish:
 - bool_var: yaql{1+1+$.my.var < 100}

Both offer the flexibility to introduce new syntax in the future without 
breaking backwards compatibility.


cheers,
Zane.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [stable][all] Revisiting the 6 month release cycle

2015-02-24 Thread Zane Bitter

On 23/02/15 19:14, Joe Gordon wrote:

Was:
http://lists.openstack.org/pipermail/openstack-dev/2015-February/057578.html

There has been frustration with our current 6 month development cadence.
This is an attempt to explain those frustrations and propose a very
rough outline of a possible alternative.


Currently we follow a 6 month release cadence, with 3 intermediate
milestones (6 weeks apart), as explained here:
https://wiki.openstack.org/wiki/Kilo_Release_Schedule


Current issues

  * 3 weeks of feature freeze for all projects at the end of each cycle
(3 out of the 26 weeks feature blocked)
  * 3 weeks of release candidates. Once a candidate is cut development
is open for next release. While this is good in theory, not much
work actually starts on next release.
  * some projects have non-priority feature freezes at Milestone 2
(so 9 out of 26 weeks restricted in those projects)
  * vicious development circle
  o vicious circle:
  + big push to land lots of features right before the release
  + a lot of effort is spent getting the release ready
  + after the release people are a little burnt out and take it
easy until the next summit
  + Blueprints have to be re-discussed and re-approved for the
next cycle


I'm sure there's a good reason for this one that I'm not aware of, but 
it certainly sounds a lot like declaring a war of attrition against a 
project's own contributors.


If core teams think that a design that they previously approved needs to 
change because of other changes _they_ approved in the previous release 
cycle or a decision _they_ made during the design summit, the onus 
should be on _them_ to actively make the change (even if that's as 
simple as proposing a patch deleting *a* previously-approved spec - not 
the whole lot en masse). To delete everything by default and force the 
contributor to get it reapproved is tantamount to handing them a shovel 
and forcing them to shift the goalposts themselves. It's flirting with 
the line between poor policy and outright evil.



  + makes it hard to land blueprints early in the cycle causing
the bug rush at the end of the cycle, see step 1
  o Makes it hard to plan things across a release
  o This actually destabilizes things right as we go into the
stabilization period (We actually have great data on this too)
  o Means postponing blueprints that miss a deadline several months


I'm yet to be convinced by the solution offered here, but I do think 
this is a fairly accurate description of the problem. I always tell 
folks that if you want big features to land in the next release, you 
have to start working on them as soon as you can finish up on any 
release blockers and well before Summit. Mostly that month feels like 
dead time, though.


Sometimes I daydream about what it would be like if we had the design 
summit a few weeks _before_ the release instead of after. Usually I snap 
out of it when I remember the downsides, like having developers under 
pressure to fix bugs at the same time as the design summit, or not 
having a shiny new release to trumpet at the conference, or everyone 
getting excited about new feature discussions and not working on urgent 
bugs when they get back. Still, I keep thinking about it...


cheers,
Zane.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Openstack-operators] Resources owned by a project/tenant are not cleaned up after that project is deleted from keystone

2015-02-25 Thread Zane Bitter

On 25/02/15 15:37, Joe Gordon wrote:



On Sat, Feb 21, 2015 at 5:03 AM, Tim Bell <tim.b...@cern.ch> wrote:


A few inline comments and a general point

How do we handle scenarios like volumes when we have a per-component
janitor rather than a single co-ordinator ?

To be clean,

1. nova should shutdown the instance
2. nova should then ask the volume to be detached
3. cinder could then perform the 'project deletion' action as
configured by the operator (such as shelve or backup)
4. nova could then perform the 'project deletion' action as
configured by the operator (such as VM delete or shelve)

If we have both cinder and nova responding to a single message,
cinder would do 3. Immediately and nova would be doing the shutdown
which is likely to lead to a volume which could not be shelved cleanly.

The problem I see with messages is that co-ordination of the actions
may require ordering between the components.  The disable/enable
cases would show this in a worse scenario.


You raise two good points.

* How to clean something up may be different for different clouds
* Some cleanup operations have to happen in a specific order

Not sure what the best way to address those two points is.  Perhaps the
best way forward is a openstack-specs spec to hash out these details.


For completeness, if nothing else, it should be noted that another 
option is for Keystone to refuse to delete the project until all 
resources within it have been removed by a user.


It's hard to know at this point which would be more painful. Both sound 
horrific in their own way :D


cheers,
Zane.



Tim

 > -Original Message-
 > From: Ian Cordasco [mailto:ian.corda...@rackspace.com]
 > Sent: 19 February 2015 17:49
 > To: OpenStack Development Mailing List (not for usage questions);
Joe Gordon
 > Cc: openstack-operat...@lists.openstack.org

 > Subject: Re: [Openstack-operators] [openstack-dev] Resources
owned by a
 > project/tenant are not cleaned up after that project is deleted
from keystone
 >
 >
 >
 > On 2/2/15, 15:41, "Morgan Fainberg" <morgan.fainb...@gmail.com> wrote:
 >
 > >
 > >On February 2, 2015 at 1:31:14 PM, Joe Gordon
 > >(joe.gord...@gmail.com)
 > >wrote:
 > >
 > >
 > >
 > >On Mon, Feb 2, 2015 at 10:28 AM, Morgan Fainberg
 > ><morgan.fainb...@gmail.com> wrote:
 > >
 > >I think the simple answer is "yes". We (keystone) should emit
 > >notifications. And yes other projects should listen.
 > >
 > >The only thing really in discussion should be:
 > >
 > >1: soft delete or hard delete? Does the service mark it as
orphaned, or
 > >just delete (leave this to nova, cinder, etc to discuss)
 > >
 > >2: how to cleanup when an event is missed (e.g rabbit bus goes
out to
 > >lunch).
 > >
 > >
 > >
 > >
 > >
 > >
 > >I disagree slightly, I don't think projects should directly
listen to
 > >the Keystone notifications I would rather have the API be something
 > >from a keystone owned library, say keystonemiddleware. So
something like
 > this:
 > >
 > >
 > >from keystonemiddleware import janitor
 > >
 > >
 > >keystone_janitor = janitor.Janitor()
 > >keystone_janitor.register_callback(nova.tenant_cleanup)
 > >
 > >
 > >keystone_janitor.spawn_greenthread()
 > >
 > >
 > >That way each project doesn't have to include a lot of boilerplate
 > >code, and keystone can easily modify/improve/upgrade the
notification
 > mechanism.
 > >
 > >


I assume janitor functions can be used for

- enable/disable project
- enable/disable user

> >
> >
> >
> >
> >
> >
> >
> >
> >
> >Sure. I’d place this into an implementation detail of where that
> >actually lives. I’d be fine with that being a part of Keystone
> >Middleware Package (probably something separate from auth_token).
> >
> >
> >—Morgan
> >
>
> I think my only concern is what should other projects do and how much do 
we
> want to allow operators to configure this? I can imagine it being 
preferable to
> have safe (without losing much data) policies for this as a default and 
to allow
> operators to configure more destructive policies as part of deploying 
certain
> services.
>

Depending on the cloud, an operator could want different semantics
for delete project's impact, between delete or 'shelve' style or
maybe disable.

 >
 > >
 > >
 > >
 > >
 > >
 > >--Morgan
 > >
 > >Sent via mobile
 > >
 > >> On Feb 2, 2015, at 10:16, Matthew Treinish
mailto:mtrein...@kortar.o

Re: [openstack-dev] [Openstack-operators] Resources owned by a project/tenant are not cleaned up after that project is deleted from keystone

2015-02-25 Thread Zane Bitter

On 25/02/15 19:15, Dolph Mathews wrote:


On Wed, Feb 25, 2015 at 5:42 PM, Zane Bitter <zbit...@redhat.com> wrote:

On 25/02/15 15:37, Joe Gordon wrote:



On Sat, Feb 21, 2015 at 5:03 AM, Tim Bell <tim.b...@cern.ch> wrote:


 A few inline comments and a general point

 How do we handle scenarios like volumes when we have a
per-component
 janitor rather than a single co-ordinator ?

 To be clean,

 1. nova should shutdown the instance
 2. nova should then ask the volume to be detached
 3. cinder could then perform the 'project deletion' action as
 configured by the operator (such as shelve or backup)
 4. nova could then perform the 'project deletion' action as
 configured by the operator (such as VM delete or shelve)

 If we have both cinder and nova responding to a single message,
 cinder would do 3. Immediately and nova would be doing the
shutdown
 which is likely to lead to a volume which could not be
shelved cleanly.

 The problem I see with messages is that co-ordination of
the actions
 may require ordering between the components.  The
disable/enable
 cases would show this in a worse scenario.


You raise two good points.

* How to clean something up may be different for different clouds
* Some cleanup operations have to happen in a specific order

Not sure what the best way to address those two points is.
Perhaps the
best way forward is a openstack-specs spec to hash out these
details.


For completeness, if nothing else, it should be noted that another
option is for Keystone to refuse to delete the project until all
resources within it have been removed by a user.


Keystone has no knowledge of the tenant-owned resources in OpenStack
(nor is it a client of the other services), so that's not really feasible.


As pointed out above, Keystone doesn't have any knowledge of how to 
orchestrate the deletion of the tenant-owned resources either (and in 
large part neither do the other services - except Heat, and then only 
for the ones it created), so by that logic neither option is feasible.


Choose your poison ;)



It's hard to know at this point which would be more painful. Both
sound horrific in their own way :D

cheers,
Zane.



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Multi region support for Heat

2013-07-25 Thread Zane Bitter

On 25/07/13 17:08, Bartosz Górski wrote:

First of all sorry for the late reply. I needed some time to understand
your vision and check a few things.

On 07/24/2013 08:40 AM, Clint Byrum wrote:

Excerpts from Adrian Otto's message of 2013-07-23 21:22:14 -0700:

Clint,

On Jul 23, 2013, at 10:03 AM, Clint Byrum wrote:


Excerpts from Steve Baker's message of 2013-07-22 21:43:05 -0700:

On 07/23/2013 10:46 AM, Angus Salkeld wrote:

On 22/07/13 16:52 +0200, Bartosz Górski wrote:

Hi folks,

I would like to start a discussion about the blueprint I raised
about
multi region support.
I would like to get feedback from you. If something is not clear or
you have questions do not hesitate to ask.
Please let me know what you think.

Blueprint:
https://blueprints.launchpad.net/heat/+spec/multi-region-support

Wikipage:
https://wiki.openstack.org/wiki/Heat/Blueprints/Multi_Region_Support_for_Heat



What immediately looks odd to me is that you have a MultiCloud Heat
talking to other Heats in each region. This seems like unnecessary
complexity to me.
I would have expected one Heat to do this job.

Yes. You are right. I'm seeing it now. One heat talking to another heat
service would be overkill and unnecessary complexity.
A better solution is to use one heat which talks directly to the
services (nova, glance, ...).


+1, and this is reasonably easy for Heat to do, simply by asking for a 
different region's service catalog from keystone.


Also, the solution with two heat services (one for a single region and one for
multi-region) has a lot of common parts.
For example, single-region heat needs to create a dependency graph
where each node is a resource, and multi-region heat a graph where each node
is a template.


I'm not convinced about the need for this though.

I looked at your example on the wiki, and it just contains a bunch of 
East resources that reference each other and a bunch of West resources 
that reference each other and never the twain shall meet. And that seems 
inevitable - you can't, e.g. connect a Cinder volume in one region to a 
Nova server in another region. So I'm wondering why we would ever want 
to mix regions in a single template, with a single dependency graph, 
when it's not really meaningful to have dependencies between resources 
in different regions. There's no actual orchestration to do at that level.


It seems to me your example would have been better as two templates (or, 
even better, the same template launched in two different regions, since 
I couldn't detect any differences between East vs. West).


Note that there are plans in the works to make passing multiple files to 
Heat a more pleasant experience.


I think creating an OS::Heat::Stack resource with a Region property 
solves 95%+ of the problem, without adding or modifying any other resources.


cheers,
Zane.


So in fact it is better to have only one but more powerful heat service.

It should be possible to achieve this with a single Heat
installation -
that would make the architecture much simpler.


Agreed that it would be simpler and is definitely possible.

However, consider that having a Heat in each region means Heat is more
resilient to failure. So focusing on a way to make multiple Heat's
collaborate, rather than on a way to make one Heat talk to two regions
may be a more productive exercise.

I agree with Angus, Steve Baker, and Randall on this one. We should
aim for simplicity where practical. Having Heat services interacting
with other Heat services seems like a whole category of complexity
that's difficult to justify. If it were implemented as Steve Baker
described, and the local Heat service were unavailable, the client
may still have the option to use a Heat service in another region and
still successfully orchestrate. That seems to me like a failure mode
that's easier for users to anticipate and plan for.

Steve, I really like your concept of the context as a resource. How do
you think we should proceed with it to make it happen?

What looks weird to me is the concept that an orchestration service
deployed in one region can orchestrate other regions.
My understanding of regions was that they are separated and do not know
about each other. So the heat service which is
responsible for orchestrating multiple regions should not be deployed in any
of those regions but somewhere else.

Right now I also do not see a point for having separate heat service in
each region.
One heat service with multi region support not deployed in any of the
existing regions (logically not physically) looks fine for me.




I'm all for keeping the solution simple. However, I am not for making
it simpler than it needs to be to actually achieve its stated goals.


Can you further explain your perspective? What sort of failures would
you expect a network of coordinated Heat services to be more
effective with? Is there any way this would be more simple or more
elegant than other options?

I expect partitions across regions to be common. Regions should

Re: [openstack-dev] [Heat] Multi region support for Heat

2013-07-26 Thread Zane Bitter

On 25/07/13 19:07, Bartosz Górski wrote:

We want to start from something simple. At the beginning we are assuming
no dependencies between resources from different region. Our first use
case (the one on the wikipage) uses this assumption. So this is why it
can be easily split into two separate single-region templates.

Our goal is to support dependencies between resources from different
regions. Our second use case (I will add it with more details to the
wikipage soon) is similar to deploying two instances (app server + db
server) wordpress in two different regions (app server in the first
region and db server in the second). Regions will be connected to each
other via a VPN connection. In this case the configuration of the app server
depends on the db server. We need to know the IP address of the created DB
server to properly configure the App server. This forces us to wait to create
the app server until the db server has been created.


That's still a fairly simple case that could be handled by a pair of 
OS::Heat::Stack resources (one provides a DBServerIP output, which is passed 
as a parameter to the stack in the other region using {'Fn::GetAtt': 
['FirstRegionStack', 'Outputs.DBServerIP']}). But it's possible to 
imagine circumstances where that approach is at least suboptimal (e.g. 
when creating the actual DB server is comparatively quick, but we have 
to wait for the entire template, which might be slow).
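Spelled out very roughly (a sketch only: OS::Heat::Stack doesn't exist yet, 
and the Region/TemplateURL/Parameters property names are just borrowed from 
the AWS::CloudFormation::Stack conventions for illustration), the wiring 
would be the Python/JSON form of something like:

    resources = {
        'FirstRegionStack': {
            'Type': 'OS::Heat::Stack',
            'Properties': {
                'Region': 'region-east',
                'TemplateURL': 'http://example.com/db.template',
            },
        },
        'SecondRegionStack': {
            'Type': 'OS::Heat::Stack',
            'Properties': {
                'Region': 'region-west',
                'TemplateURL': 'http://example.com/app.template',
                'Parameters': {
                    # Feed the first region's output into the second.
                    'DBServerIP': {'Fn::GetAtt': ['FirstRegionStack',
                                                  'Outputs.DBServerIP']},
                },
            },
        },
    }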



More complicated use cases with load balancers and more regions are also
on our minds.


Good to know, thanks. I'll look forward to reading more about it on the 
wiki.


What I'd like to avoid is a situation where anything _appears_ to be 
possible (Nova server and Cinder volume in different regions? Sure! 
Connect 'em together? Sure!), and the user only finds out later that it 
doesn't work. It would be much better to structure the templates in such 
a way that only things that are legitimate are expressible. That's not 
an achievable goal, but IMO we want to be much closer to the latter than 
the former.


cheers,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Multi region support for Heat

2013-07-29 Thread Zane Bitter

On 29/07/13 02:04, Angus Salkeld wrote:

On 26/07/13 09:43 -0700, Clint Byrum wrote:

Excerpts from Zane Bitter's message of 2013-07-26 06:37:09 -0700:

On 25/07/13 19:07, Bartosz Górski wrote:
> We want to start from something simple. At the beginning we are
assuming
> no dependencies between resources from different region. Our first use
> case (the one on the wikipage) uses this assumptions. So this is
why it
> can be easily split on two separate single region templates.
>
> Our goal is to support dependencies between resources from different
> regions. Our second use case (I will add it with more details to the
> wikipage soon) is similar to deploying two instances (app server + db
> server) wordpress in two different regions (app server in the first
> region and db server in the second). Regions will be connected to each
> other via VPN connection . In this case configuration of app server
> depends on db server. We need to know IP address of created DB
server to
> properly configure App server. It forces us to wait with creating app
> server until db server will be created.

That's still a fairly simple case that could be handled by a pair of
OS::Heat::Stack resources (one provides a DBServerIP output it is passed
as a parameter to the other region using {'Fn::GetAtt':
['FirstRegionStack', 'Outputs.DBServerIP']}. But it's possible to
imagine circumstances where that approach is at least suboptimal (e.g.
when creating the actual DB server is comparatively quick, but we have
to wait for the entire template, which might be slow).



How about we add an actual heat resource?

So you could aggregate stacks.

We kinda have one with "OS::Heat::Stack", but it doesn't use


(aside: this doesn't actually exist yet, we only have 
AWS::CloudFormation::Stack at present.)



python-heatclient. We could solve this by adding an "endpoint"
  property to the "OS::Heat::Stack" resource. Then if it is not
local then it uses python-heatclient to create the nested stack
remotely.


Yes, that's what I was trying (and failing) to suggest.
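Roughly like this, as a sketch only (the class and helper names below are 
made up; the python-heatclient calls are the only real API in it):

    from heatclient import client as heat_client

    class RemoteStack(object):
        """Sketch of an OS::Heat::Stack-style resource with an 'endpoint'
        property: local nested stacks are created in-process as they are
        today, remote ones by driving the other region's Heat API."""

        def __init__(self, context, name, properties):
            self.context = context        # carries the auth token
            self.name = name              # physical resource name to use
            self.properties = properties  # 'endpoint', 'template', 'parameters'

        def handle_create(self):
            endpoint = self.properties.get('endpoint')
            if endpoint is None:
                # Local case: the in-process nested stack we create today.
                return self._create_local_nested_stack()

            # Remote case: talk to the Heat API in the other region instead.
            hc = heat_client.Client('1', endpoint,
                                    token=self.context.auth_token)
            created = hc.stacks.create(
                stack_name=self.name,
                template=self.properties['template'],
                parameters=self.properties.get('parameters', {}))
            return created['stack']['id']

        def _create_local_nested_stack(self):
            raise NotImplementedError  # what AWS::CloudFormation::Stack does now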



Just a thought.

-Angus



If you break that stack up into two stacks, db and "other slow stuff"
then you can get the Output of the db stack earlier, so that is a
solvable problem.


+1


> More complicated use case with load balancers and more regions are
also
> in ours minds.

Good to know, thanks. I'll look forward to reading more about it on the
wiki.

What I'd like to avoid is a situation where anything _appears_ to be
possible (Nova server and Cinder volume in different regions? Sure!
Connect 'em together? Sure!), and the user only finds out later that it
doesn't work. It would be much better to structure the templates in such
a way that only things that are legitimate are expressible. That's not
an achievable goal, but IMO we want to be much closer to the latter than
the former.



These are all predictable limitations and can be handled at the parsing
level.  You will know as soon as you have template + params whether or
not that cinder volume in region A can be attached to the nova server
in region B.


That's true, but IMO it's even better if it's obvious at the time you 
are writing the template. e.g. if (as is currently the case) there is no 
mechanism within a template to select a region for each resource, then 
it's obvious you have to write separate templates for each region (and 
combine them somehow).



I'm still convinced that none of this matters if you rely on a single
Heat
in one of the regions. The whole point of multi region is to eliminate
a SPOF.


So the idea here would be that you spin up a master template in one 
region, and this would contain OS::Heat::Stack resources that use 
python-heatclient to connect to Heat APIs in other regions to spin up 
the constituent stacks in each region. If a region goes down, even if it 
is the one with your master template, that's no problem because you can 
still interact with the constituent stacks directly in whatever 
region(s) you _can_ reach.


So it solves the non-obviousness problem and the single-point-of-failure 
problem in one fell swoop. The question for me is whether there are 
legitimate use cases that this would shut out.


cheers,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Multi region support for Heat

2013-07-29 Thread Zane Bitter

On 29/07/13 17:16, Bartosz Górski wrote:

I want to be sure that we are on the same page. By master template you
mean a template that has only nested stacks as resources? Or also other
types of resources (like single server) which will be created in the
region where heat engine is located to which the master template was
sent? I think it would be great if the master template consisted only of
nested stacks as resources.


I don't think there would be a special case in the code, it should just 
be an ordinary template. I think having only nested stacks as resources 
in that template would be a good design principle for the template 
author, but not something we should enforce.



For each nested stack a context is specified
(tenant/project, endpoint/region, etc.) and heat uses python-client to
create all of them (even for those from the same region where heat
engine is located). In this situation it will be possible to try to delete
the created multi-region stack using a different heat engine (located in a
different region).


It's possible to delete a nested stack independently now, even though 
they're created internally and not with python-heatclient.


cheers,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Multi region support for Heat

2013-07-29 Thread Zane Bitter

On 29/07/13 17:40, Clint Byrum wrote:

Excerpts from Zane Bitter's message of 2013-07-29 00:51:38 -0700:

>On 29/07/13 02:04, Angus Salkeld wrote:

> >On 26/07/13 09:43 -0700, Clint Byrum wrote:

> >>
> >>These are all predictable limitations and can be handled at the parsing
> >>level.  You will know as soon as you have template + params whether or
> >>not that cinder volume in region A can be attached to the nova server
> >>in region B.

>
>That's true, but IMO it's even better if it's obvious at the time you
>are writing the template. e.g. if (as is currently the case) there is no
>mechanism within a template to select a region for each resource, then
>it's obvious you have to write separate templates for each region (and
>combine them somehow).
>

The language doesn't need to be training wheels for template writers. The


Sure it does ;)

But seriously, the language should reflect the underlying model.


language parser should just make it obvious when you've done something
patently impossible. Because cinderclient will only work with a region
at a time, we can make that presumption at that resource level, and flag
the problem during creation before any resources have been allocated.

But it would be quite presumptuous of Heat, at a language layer, to
say all things are single region.


Well, CloudFormation has made that presumption. And while I would 
*never* suggest that we limit ourselves to features that CloudFormation 
supports, it behooves us to pause and consider why that might be. (I'm 
thinking here of a great many things in Heat, not just this particular 
one.) Because if we do just charge ahead with every idea, the complexity 
of the resulting system will be baroque.



There's an entirely good use case
for cross-region volumes and if it is ever implemented that way, and
cinderclient got some multi-region features, we wouldn't want users
to be faced with a big rewrite of their templates just to make use of
them. We should just be able to do the cross region thing now, because
we have that capability.


This is interesting, because I think of Availability Zones within a 
Region as "things that are separated somewhat, but still make sense to 
be used together sometimes" and Regions as "things that don't make sense 
to be used together". So we have this three-tier system already (local, 
different AZ, different Region) that cloud providers can use to specify 
the capabilities of resources, yet we propose to make two of those tiers 
effectively equivalent? That doesn't sound like we are effectively 
modelling the problem domain.


If we accept that distinction between AZs and Regions, then your example 
is, by definition, not going to happen. Presumably you don't accept that 
definition though, so I'd be curious to hear what everybody else thinks 
it means when a cloud provider creates a separate Region.



I dislike the language defining everything and putting users in too
tight of a box. How many times have you been using a DSL and shaken your
fists in the air, "get out of my way!"? I would suggest that cross region
support is handled resource by resource.


Probably only because I haven't used a lot of DSLs but, honestly, I 
genuinely can't ever recall that happening. However, I wish I had a 
dollar for every time I've tried desperately to get some combination of 
things that worked individually to work together and ended up having to 
read the source code to figure out that it wasn't supported.



> >>I'm still convinced that none of this matters if you rely on a single
> >>Heat
> >>in one of the regions. The whole point of multi region is to eliminate
> >>a SPOF.

>
>So the idea here would be that you spin up a master template in one
>region, and this would contain OS::Heat::Stack resources that use
>python-heatclient to connect to Heat APIs in other regions to spin up
>the constituent stacks in each region. If a region goes down, even if it
>is the one with your master template, that's no problem because you can
>still interact with the constituent stacks directly in whatever
>region(s) you_can_  reach.
>

Interacting with a nested stack directly would be a very uncomfortable
proposition. How would this affect changes from the master? Would the
master just overwrite any changes I make or refuse to operate?


Heat resources (with the exception of a few hacks that are implemented 
internally *cough*AutoScaling*cough*) are stateless. We only really 
store the uuid of the target resource.


So if you updated the template for the nested stack directly, the Heat 
engine with the master stack wouldn't notice or care. If you deleted it 
then you should probably delete it from the master stack before you try 
to do an update that modifies it, but that should go smoothly because 
Heat ignores NotFound errors when trying to delete resources. We can and 
probably should improve on that by checking that resources still exist 
during an update, and recreating them if not.


Note that this is completely

Re: [openstack-dev] [Heat] Unique vs. non-unique stack names

2013-07-31 Thread Zane Bitter

On 30/07/13 21:34, Jason Dunsmore wrote:

Hello Heat devs,

I've started doing some testing to find multi-engine bugs, and I
discovered that it's possible to create two stacks with the same name
when only a single heat-engine is running.

Here are the results of my tests:
http://dunsmor.com/heat/multi-engine.html#sec-1-1


Well spotted, thanks. Eventlet magic strikes again :/


Before I create a bug report about this, is it even necessary to enforce
unique stack names within a tenant?  Why don't we just key off of the
stack ID when possible and throw an error when it's not possible to look
up the stack by name?  This is how novaclient does it.


My 2c: if names don't uniquely name things, they're not particularly 
useful. i.e. if any other user in your tenant (or even you yourself) can 
[accidentally?] create something with the same name as your thing, then 
you have no choice but to remember the UUID of everything you create, 
lest you be unable to find it again.


I can't speak for everyone, but I find names easier to remember than UUIDs.

- ZB


$ nova image-show F17_test
ERROR: Multiple image matches found for 'F17_test', use an ID to be more
specific.

Regards,
Jason



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Multi region support for Heat

2013-07-31 Thread Zane Bitter

On 31/07/13 18:37, Bartosz Górski wrote:


Right now I have a problem with the way how the template is passed to
the nested stack.
 From what I see the only way right now is providing url to it and it is
really annoying.
Is there any other way to do that?


Yes, that is the only way right now and yes, it is really annoying.

The plan for fixing that goes something like this: 
http://lists.openstack.org/pipermail/openstack-dev/2013-May/009551.html


- ZB

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] as-update-policy implementation details

2013-08-08 Thread Zane Bitter

On 08/08/13 06:59, Chan, Winson C wrote:

Proposal for the implementation is documented at 
https://wiki.openstack.org/wiki/Heat/Blueprints/as-update-policy.  Please 
review and provide feedback.


Looks pretty good to me. I think there's a simpler way to trigger 
updates to the instances based on changes to the launch config though. 
Because we don't have a real autoscaling API and the launch config does 
not represent any actual physical resource, it just returns its 
(logical) name as the Ref value (that's why renaming it is the only way 
to trigger an update at the moment). It ought to be returning the 
_physical_ resource name, which is (now) unique for each incarnation. So 
if you change something in the launch config that causes it to be 
replaced during update, it will get a new Ref id, and that will trigger 
updates to the autoscaling group also.


Other resource types shouldn't need this, because for the most part they 
are returning the uuid of the underlying resource. The exceptions are the 
ones where there is no underlying resource (i.e. resource_id is None).


So the "Modify LaunchConfiguration class" section can be changed to just 
"Return self.physical_resource_name() from FnGetRefId().", and the 
"Modify StackUpdate class" and "Modify Resource class" sections can be 
removed. The rest looks good.
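i.e. something along these lines (sketch):

    from heat.engine import resource

    class LaunchConfiguration(resource.Resource):

        def FnGetRefId(self):
            # Return the unique physical name rather than the logical name,
            # so that a replaced launch configuration gets a new Ref id and
            # thereby triggers an update of any group that references it.
            return unicode(self.physical_resource_name())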


cheers,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] as-update-policy implementation details

2013-08-09 Thread Zane Bitter

On 09/08/13 08:11, Chan, Winson C wrote:

I tested FnGetRefId() in LaunchConfiguration to return 
unicode(self.physical_resource_name()).  It doesn't work so well.  When 
InstanceGroup resolve runtime data and resolve resource refs, the physical 
resource name (I.e.test_stack-JobServerConfig-jnzjnpomcxv5) replaced { 'Ref': 
'JobServerConfig' }.  So when InstanceGroup tries to look up the 
LaunchConfiguration resource during handle_create, it throws a KeyError on the 
physical resource name.  The LaunchConfiguration is using key 'JobServerConfig' 
in the stack's resources dictionary.  Any I missing something here?


Yeah, if you change the meaning of the input to the LaunchConfiguration 
property then obviously you need to change what you look up with that 
data at the same time. The hacky way to do this is to search through the 
stack looking for a resource with that physical resource name. (We used 
to have a wrongly-implemented physical_resource_name_find() function for 
this purpose, but it has mercifully been removed.) This approach shares 
with the existing one the problem of not being able to refer to 
LaunchConfigurations outside the same stack. We really need to be able 
to orchestrate anything in Heat without having the Resource object that 
corresponds to it (in fact, this shouldn't even need to exist - users 
should be able to define the resource outside of Heat and pass 
references in).


A better way is probably to store the configuration details in the 
ResourceData table in the database, and use the id of that row (which 
should be a uuid, but isn't) for the resource_id (and Ref id) of the 
LaunchConfiguration. There will be some tricky security stuff to get 
this right though.


The long-term solution to this is a separate Autoscaling API (or, at 
least, database tables) so that we can use the physical name or uuid to 
look up a particular launch configuration.


cheers,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] How the autoscale API should control scaling in Heat

2013-08-16 Thread Zane Bitter

On 16/08/13 00:50, Christopher Armstrong wrote:

*Introduction and Requirements*

So there's kind of a perfect storm happening around autoscaling in Heat
right now. It's making it really hard to figure out how I should compose
this email. There are a lot of different requirements, a lot of
different cool ideas, and a lot of projects that want to take advantage
of autoscaling in one way or another: Trove, OpenShift, TripleO, just to
name a few...

I'll try to list the requirements from various people/projects that may
be relevant to autoscaling or scaling in general.

1. Some users want a service like Amazon's Auto Scaling or Rackspace's
Otter -- a simple API that doesn't really involve orchestration.
2. If such a API exists, it makes sense for Heat to take advantage of
its functionality instead of reimplementing it.


+1, obviously. But the other half of the story is that the API is likely to 
be implemented using Heat on the back end, amongst other reasons because 
that implementation already exists. (As you know, since you wrote it ;)


So, just as we will have an RDS resource in Heat that calls Trove, and 
Trove will use Heat for orchestration:


  user => [Heat =>] Trove => Heat => Nova

there will be a similar workflow for Autoscaling:

  user => [Heat =>] Autoscaling -> Heat => Nova

where the first, optional, Heat stack contains the RDS/Autoscaling 
resource and the backend Heat stack contains the actual Nova instance(s).


One difference might be that the Autoscaling -> Heat step need not 
happen via the public ReST API. Since both are part of the Heat project, 
I think it would also be OK to do this over RPC only.



3. If Heat integrates with that separate API, however, that API will
need two ways to do its work:


Wut?


1. native instance-launching functionality, for the "simple" use


This is just the simplest possible case of 3.2. Why would we maintain a 
completely different implementation?



2. a way to talk back to Heat to perform orchestration-aware scaling
operations.


[IRC discussions clarified this to mean scaling arbitrary resource 
types, rather than just Nova servers.]



4. There may be things that are different than AWS::EC2::Instance that
we would want to scale (I have personally been playing around with the
concept of a ResourceGroup, which would maintain a nested stack of
resources based on an arbitrary template snippet).
5. Some people would like to be able to perform manual operations on an
instance group -- such as Clint Byrum's recent example of "remove
instance 4 from resource group A".

Please chime in with your additional requirements if you have any! Trove
and TripleO people, I'm looking at you :-)


*TL;DR*

Point 3.2. above is the main point of this email: exactly how should the
autoscaling API talk back to Heat to tell it to add more instances? I
included the other points so that we keep them in mind while considering
a solution.

*Possible Solutions*

I have heard at least three possibilities so far:

1. the autoscaling API should maintain a full template of all the nodes
in the autoscaled nested stack, manipulate it locally when it wants to
add or remove instances, and post an update-stack to the nested-stack
associated with the InstanceGroup.


This is what I had been thinking.


Pros: It doesn't require any changes to Heat.

Cons: It puts a lot of burden of state management on the autoscale API,


All other APIs need to manage state too, I don't really have a problem 
with that. It already has to handle e.g. the cooldown state; your 
scaling strategy (uh, for the service) will be determined by that.



and it arguably spreads out the responsibility of "orchestration" to the
autoscale API.


Another line of argument would be that this is not true by definition ;)


Also arguable is that automated agents outside of Heat
shouldn't be managing an "internal" template, which are typically
developed by devops people and kept in version control.

2. There should be a new custom-built API for doing exactly what the
autoscaling service needs on an InstanceGroup, named something
unashamedly specific -- like "instance-group-adjust".


+1 to having a custom (RPC-only) API if it means forcing some state out 
of the autoscaling service.


-1 for it talking to an InstanceGroup - that just brings back all our 
old problems about having "resources" that don't have their own separate 
state and APIs, but just exist inside of Heat plugins. Those are the 
cause of all of the biggest design problems in Heat. They're the thing I 
want the Autoscaling API to get rid of. (Also, see below.)



Pros: It'll do exactly what it needs to do for this use case; very
little state management in autoscale API; it lets Heat do all the
orchestration and only give very specific delegation to the external
autoscale API.

Cons: The API grows an additional method for a specific use case.

3. the autoscaling API should update the "Size" Property of the
InstanceGroup resource in the stack that it i

Re: [openstack-dev] [Heat] as-update-policy implementation details

2013-08-16 Thread Zane Bitter

On 15/08/13 19:14, Chan, Winson C wrote:

I updated the implementation section of 
https://wiki.openstack.org/wiki/Heat/Blueprints/as-update-policy on instance 
naming to support UpdatePolicy where in the case of the LaunchConfiguration 
change, all the instances need to be replaced and to support 
MinInstancesInService, the handle_update should create new instances first 
before deleting old ones in a batch per MaxBatchSize (i.e., group capacity of 2 
with MaxBatchSize=2 and MinInstancesInService=2).  Please review as I may not 
understand the original motivation for the existing scheme in instance naming.  
Thanks.


Yeah, I don't think the naming is that important any more. Note that 
physical_resource_name() (i.e. the name used in Nova) now includes a 
randomised component on the end (stackname-resourcename-uniqueid).


So they'll probably look a bit like:

MyStack-MyASGroup--MyASGroup-1-

because the instances are now resources inside a nested stack (whose 
name is of the same form).


If we still were subclassing Instance in the autoscaling code to 
override other stuff, I'd suggest overriding physical-resource-name to 
return something like:


MyStack-MyASGroup-

(i.e. forget about numbering instances at all), but we're not 
subclassing any more, so I'm not sure if it's worth it.


cheers,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [heat] Propose Liang Chen for heat-core

2013-08-22 Thread Zane Bitter

On 22/08/13 17:57, Steven Hardy wrote:

Hi,

I'd like to propose that we add Liang Chen to the heat-core team[1]

Liang has been doing some great work recently, consistently providing good
review feedback[2][3], and also sending us some nice patches[4][5], implementing
several features and fixes for Havana.

Please respond with +1/-1.


+1

I have found Liang's reviews and patches to be of very high quality.



Thanks!

[1] https://wiki.openstack.org/wiki/Heat/CoreTeam
[2] http://russellbryant.net/openstack-stats/heat-reviewers-90.txt
[3] https://review.openstack.org/#/q/reviewer:cbjc...@linux.vnet.ibm.com,n,z
[4] 
https://github.com/openstack/heat/graphs/contributors?from=2013-04-18&to=2013-08-18&type=c
[5] https://review.openstack.org/#/q/owner:cbjc...@linux.vnet.ibm.com,n,z





___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [heat] Heat mission statement

2013-08-27 Thread Zane Bitter

On 27/08/13 23:13, Robert Collins wrote:

I think there is some confusion about implementation vs intent here
:). Or at least I hope so. I wouldn't expect Nova's mission statement
to talk about 'modelling virtual machines' : modelling is internal
jargon, not a mission!


So, I don't really agree with either of those points. Nova, at its core, 
deals with virtual machines, while Heat deals with abstract 
representations of resources. Talking about models in the Heat mission 
statement seems about as out of place as talking about VMs would be in 
the Nova one. And "model" is not a term we use anywhere internally. It's 
not intended to be internal jargon (which would be one level of 
abstraction below Heat-the-service), it's intended to be at one level of 
abstraction _above_ Heat-the-service.


That said...


"Create a human and machine accessible service for managing the entire
lifecycle of infrastructure and applications within OpenStack clouds."


Sounds good enough.

- ZB

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Instance naming in IG/ASG and problems related to UpdatePolicy

2013-08-30 Thread Zane Bitter

Context: https://review.openstack.org/#/c/43571/

On 30/08/13 16:23, Chan, Winson C wrote:

Regarding the last set of comments on the UpdatePolicy, I want to bring your 
attention to a few items.  I already submitted a new patch set and didn't want 
to reply on the old patch set so that's why I emailed.

As you are aware, IG/ASG currently create instances by appending group name and 
#.  On resize, it identifies the newest instances to remove by sorting on the 
name string and removing from the end of the list.

Based on your comments, in the new patch set I have changed the naming of the 
instances to just a # without prefixing the group name (or self.name).  I also 
remove the name ranges stuff.  But we still have the following problems…

   1.  On a decrease in size where the oldest instances should be removed…  
Since the naming is still number based, this means we'll have to remove 
instances starting from 0 (since 0 is the oldest).  This leaves a gap in the 
beginning of the list.  So on the next resize to increase, where to increase?  
Continue the numbering from the end?


Yes, although I don't see it as a requirement for this patch that we 
remove the oldest first. (In fact, it would probably be better to 
minimise the changes in this patch, and switch from killing the newest 
first to the oldest first in some future patch.) I mentioned it because 
it's a likely future requirement, and therefore worth considering at the 
design stage.



   2.  On replace, I let the UpdateReplace handle the batch replacement.  
However, for the use case where we need to support MinInstancesInService (min 
instances in service = 2, batch size = 2, current size = 2), this means we need 
to create the new instances first before deleting the old ones instead of 
letting the instance update to handle it.  Also, with the naming restriction, 
this means I will have to create the 2 new replacements as '2' and '3'.  After 
I delete the original '0' and '1', there's a gap in the numbering of the 
instances…  Then this leads to the same question as above.  What happen on a 
resize after?


Urg, good point. I hadn't thought about that case :(

The new update algorithm will ensure that there are always two instances 
in service, because it won't delete the old instance until its replacement 
has been created. The problem here is what we discussed the other day - 
we would need to update the Load Balancer at various points in the 
middle of the stack update.


As we discussed then, one way to do this would be to override the 
Instance class (as we used to do, see 
https://github.com/openstack/heat/blob/stable/grizzly/heat/engine/resources/autoscaling.py#L136) 
and insert the lb update by overriding some convenient trigger point. At 
the end of Resource.create() would cover every case except where we're 
swapping in an old resource during a rollback (here - 
https://github.com/openstack/heat/blob/46ae6848896a24dece79771037b86cc6f4b53292/heat/engine/update.py#L130). 
I *think* that doing it at the beginning of Resource.delete() as well 
would ensure that one gets covered.


Alternatively, we could pass a callback to Stack.update() and let the 
update algorithm notify us when something relevant is happening.


Unfortunately, both of these options involve fairly tight coupling 
between the autoscaling group and the update algorithm, to the point it 
would preclude us moving the autoscaling implementation to a separate 
service and having it interact with the instances template only through 
the ReST API. So unless anybody else has a bright idea, this is probably 
best avoided.



The ideal I think is to just use some random short id for the name of the 
instances and then store a creation timestamp somewhere with the resource and 
use the timestamp to determine the age of the instances for removal.  Thoughts?


I really like this idea. We already store the creation time in each 
resource, which should be pretty much exactly what you want to sort by here.


Making the (logical) resource names unique through randomness alone will 
require them to be quite long though. It would be better to use the 
lowest currently-unused integers... which is probably pretty close to 
what you're already doing that I said seemed unnecessary.
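For illustration only, the two pieces amount to something like:

    def members_to_remove(resources, count):
        # Scale down oldest-first, using the creation time we already
        # store for every resource.
        return sorted(resources, key=lambda r: r.created_time)[:count]

    def next_member_name(existing_names):
        # Name new members with the lowest currently-unused integer.
        taken = set(int(name) for name in existing_names if name.isdigit())
        i = 0
        while i in taken:
            i += 1
        return str(i)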


So, in conclusion, uh... ignore me and just do the simplest thing you 
think will work? ;)


cheers,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Proposal for Raksha, a Data Protection As a Service project

2013-09-02 Thread Zane Bitter

On 01/09/13 23:11, Alex Rudenko wrote:

Hello everyone,

I would like to ask a question. But, first of all, I would like to say
that I'm new to OpenStack so the question might be irrelevant. From what
I've understood, the idea is to back up an entire stack including VMs,
volumes, networks etc. Let's call the information about how these pieces
are interconnected - a topology. This topology also has to be backed up
along with VMs, volumes, networks, right? And then this topology can be
used to restore the entire stack. As for me, it looks very similar to
what the Heat project does. Am I right? So maybe it's possible to use
the Heat project for this kind of backup/restore functionality?

Best regards,
Alex


That's actually an excellent question.

One of the things that's new in Heat for the Havana release is 
Suspend/Resume operations on stacks. Basically this involves going 
through the stack in (reverse) dependency order and calling 
suspend/resume APIs for each resource where that makes sense. Steve 
Hardy has written the code for this in such a way as to be pretty 
generic and allow us to add more operations quite easily in the future.


So to the extent that you just require something to go through every 
resource in a stack in dependency order and call an *existing* backup 
API, then Heat could fit the bill. If you require co-ordination between 
e.g. Nova and Cinder then Heat is probably not a good vehicle for 
implementing that.
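
(To be clear about what "going through every resource in dependency order" 
amounts to, here is a sketch only - 'deps' is assumed to yield resources in 
dependency order and 'action' wraps whatever existing per-resource API gets 
called; Heat's real traversal is parallelised and does error handling:)

def run_stack_operation(deps, action, forward=True):
    # 'action' could call an existing backup or suspend API; operations
    # like suspend run in reverse dependency order.
    order = list(deps)
    if not forward:
        order.reverse()
    for res in order:
        action(res)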


cheers,
Zane.



On Sun, Sep 1, 2013 at 10:23 PM, Giri Basava mailto:giri.bas...@triliodata.com>> wrote:

Dear All,

This is a great discussion. If I understand this correctly, this is
a proposal for data protection as a whole for the OpenStack cloud,
however this is not yet an official "incubation" request. We are
having a good discussion on how we can better serve the adoption of
OpenStack.

Having said that, the proposal will reuse the existing API and
contributions by the community that are already in place. For
example, Catlin's point is very valid... the Cinder storage vendor
knows the best way to implement snapshots for their storage
platforms. No doubt, Raksha should be leveraging that IP. Similarly
Raksha will be leveraging Nova, Swift as well as Glance. Don't
forget Neutron networking is a very critical part of data
protection for any VM or set of VMs.

No one project has one single answer for a comprehensive data
protection. The capabilities for backup and recovery exist in silos
in various projects...

1. Images are backed-up by Nova
2. Volumes are backed-up by Cinder
3. I am not aware of a solution that can backup network configuration.
4. Not sure if we have something that can backup the resources of a
VM ( vCPUs, Memory Configuration etc.)
5. One can't schedule and automate the above very easily.

Ronen's point about consistency groups is right on the mark. We need
to treat an application as an unit that may span multiple VMs, one
or more images and one or more volumes.

Just to reiterate, some form of these capabilities exist in the
current projects, however as a user of OpenStack, I may not have a
simple one click solution.

With this proposal, the ask is to engage in a discussion to address
the above use cases. IMHO we are on the right track with these
discussions. We will be submitting the code in a few days and looking
forward to your feedback. I would also suggest a design summit
where we can flesh out more details.

Regards,
Giri

-Original Message-
From: Avishay Traeger [mailto:avis...@il.ibm.com
]
Sent: Saturday, August 31, 2013 10:53 PM
To: OpenStack Development Mailing List
Subject: Re: [openstack-dev] Proposal for Raksha, a Data Protection
As a Service project

+1



From:   Ronen Kat/Haifa/IBM@IBMIL
To: OpenStack Development Mailing List
 mailto:openstack-dev@lists.openstack.org>>,
Date:   09/01/2013 08:02 AM
Subject:Re: [openstack-dev] Proposal for Raksha, a Data
Protection As a
 Service project



Hi Murali,

Thanks for answering. I think the issues you raised indeed make
sense, and important ones.

We need to provide backup both for:
1. Volumes
2. VM instances (VM image, VM metadata, and attached volumes)

While the Cinder-backup handles (1), and is a very mature service,
it does not provide (2), and for Cinder-backup it does not make
sense to handle (2) as well.
Backup of VMs (as a package) is beyond the scope of Cinder, which
implies that indeed something beyond Cinder should take this task.
I think this can be done by having Nova orchestrate or assist the
backup, either of volumes or VMs.

I think that from a backup perspective, there is also a need for
"consistency groups" - the set of entities (volumes) that 

Re: [openstack-dev] [Heat] How the autoscale API should control scaling in Heat

2013-09-11 Thread Zane Bitter

On 11/09/13 05:51, Adrian Otto wrote:

I have a different point of view. First I will offer some assertions:


It's not clear to me what you actually have an issue with? (Top-posting 
is not helping in this respect.)



A-1) We need to keep it simple.
A-1.1) Systems that are hard to comprehend are hard to debug, and 
that's bad.


Absolutely, and systems with higher entropy are harder to comprehend.


A-1.2) Complex systems tend to be much more brittle than simple ones.


"The Zen of Python" has it right here:

Simple is better than complex.
Complex is better than complicated.

Complicated systems have a lot of entropy. Complex systems (that is to 
say, systems composed of multiple simpler systems) are actually a tool 
for _reducing_ entropy.



A-2) Scale-up operations need to be as-fast-as-possible.
A-2.1) Auto-Scaling only works right if your new capacity is added 
quickly when your controller detects that you need more. If you spend a bunch 
of time goofing around before actually adding a new resource to a pool when it's 
under strain.
A-2.2) The fewer network round trips between "add-more-resources-now" and 
"resources-added" the better. Fewer = less brittle.


I submit that the difference between a packet round-trip time within a 
single datacenter and the time to boot a Nova server is at least 3 
orders of magnitude.



A-3) The control logic for scaling different applications varies.
A-3.1) What metrics are watched may differ between various use cases.
A-3.2) The data types that represent sensor data may vary.
A-3.3) The policy that's applied to the metrics (such as max, min, and 
cooldown period) varies between applications. Not only do the values vary, but 
so does the logic itself.
A-3.4) A scaling policy may not just be a handful of simple parameters. 
Ideally it allows configurable logic that the end-user can control to some 
extent.

A-4) Auto-scale operations are usually not orchestrations. They are usually 
simple linear workflows.


Well, one of the things Chris wants to do with this is to scale whole 
templates instead of just Nova servers.



A-4.1) The Taskflow project[1] offers a simple way to do workflows and 
stable state management that can be integrated directly into Autoscale.
A-4.2) A task flow (workflow) can trigger a Heat orchestration if 
needed.


If you're re-proposing Chris's original thought of having two different 
ways to do autoscaling depending on whether it's for individual 
instances or whole templates, then I fail to see how that is in any 
sense simpler than having only one way that handles everything.



Now a mental tool to think about control policies:

Auto-scaling is like steering a car. The control policy says that you want to 
drive equally between the two lane lines, and that if you drift off center, you 
gradually correct back toward center again. If the road bends, you try to 
remain in your lane as the lane lines curve. You try not to weave around in 
your lane, and you try not to drift out of the lane.


OK, in the sense that both are proportional control systems, sure. 
(Though in autoscaling, unlike the car, both the feedback loop and the 
response have significant non-linearities.)



If your controller notices that you are about to drift out of your lane because 
the road is starting to bend, and you are distracted, or your hands slip off 
the wheel, you might drift out of your lane into nearby traffic. That's why you 
don't want a Rube Goldberg Machine[2] between you and the steering wheel. See 
assertions A-1 and A-2.


But you probably do want a power steering device between the wheel and 
the steering rack. I think this metaphor is ready for the scrapheap ;)


There was (IMHO) a Rube Goldberg-like device proposed in this thread, 
but not by me :D



If you are driving an 18-wheel tractor/trailer truck, steering is different 
than if you are driving a Fiat. You need to wait longer and steer toward the 
outside of curves so your trailer does not lag behind on the inside of the 
curve behind you as you correct for a bend in the road. When you are driving 
the Fiat, you may want to aim for the middle of the lane at all times, possibly 
even apexing bends to reduce your driving distance, which is actually the 
opposite of what truck drivers need to do. Control policies apply to other 
parts of driving too. I want a different policy for braking than I use for 
steering. On some vehicles I go through a gear shifting workflow, and on others 
I don't. See assertion A-3.


Right, PID control systems are more general.

The idea of allowing the user to substitute their own scaling policy 
engine has always been on the road map since you and others raised it at 
Summit, though, and it's orthogonal to the parts of the design you're 
questioning below. So I'm not really sure what you're, uh, driving at 
(no pun intended).



So, I don't intend to argue the technical minutia of each design

Re: [openstack-dev] [Heat] Questions about plans for heat wadls moving forward

2013-09-13 Thread Zane Bitter

On 13/09/13 05:41, Monty Taylor wrote:



On 09/12/2013 04:33 PM, Steve Baker wrote:

On 09/13/2013 08:28 AM, Mike Asthalter wrote:

Hello,

Can someone please explain the plans for our 2 wadls moving forward:

   * wadl in original heat repo:
 https://github.com/openstack/heat/blob/master/doc/docbkx/api-ref/src/wadls/heat-api/src/heat-api-1.0.wadl
   * wadl in api-site repo:
 https://github.com/openstack/api-site/blob/master/api-ref/src/wadls/orchestration-api/src/v1/orchestration-api.wadl


The original intention was to delete the heat wadl when the api-site one
became merged.


+1


1. Is there a need to maintain 2 wadls moving forward, with the wadl
in the original heat repo containing calls that may not be
implemented, and the wadl in the api-site repo containing implemented
calls only?

 Anne Gentle advises as follows in regard to these 2 wadls:

 "I'd like the WADL in api-site repo to be user-facing. The other
 WADL can be truth if it needs to be a specification that's not yet
 implemented. If the WADL in api-site repo is true and implemented,
 please just maintain one going forward."


2. If we maintain 2 wadls, what are the consequences (gerrit reviews,
docs out of sync, etc.)?

3. If we maintain only the 1 orchestration wadl, how do we want to
pull in the wadl content to the api-ref doc
(https://github.com/openstack/heat/blob/master/doc/docbkx/api-ref/src/docbkx/api-ref.xml)
from the orchestration wadl in the api-site repo: subtree merge, other?



These are good questions, and could apply equally to other out-of-tree
docs as features get added during the development cycle.

I still think that our wadl should live only in api-site.  If api-site
has no branching policy to maintain separate Havana and Icehouse
versions then maybe Icehouse changes should be posted as WIP reviews
until they can be merged.


I believe there is no branching in api-site because it's describing API
and there is no such thing as a havana or icehouse version of an API -
there are the API versions and they are orthogonal to server release
versions. At least in theory. :)


Yes and no. When new API versions arrive, they always arrive with a 
particular release. So we do need some way to ensure the docs go live at 
the right time, but I think Steve's suggestion for handling that is fine.


cheers,
Zane.


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [heat] cross-stack references

2013-09-23 Thread Zane Bitter

On 18/09/13 19:34, Mike Spreitzer wrote:

When we get into things like affinity concerns or managing network
bandwidth, we see the need for cross-stack relationships.  You may want
to place parts of a new stack near parts of an existing one, for
example.  I see that in CFN you can make cross-references between
different parts of a single stack using the resource names that appear
in the original template.  Is there a way to refer to something that did
not come from the same original template?  If not, won't we need such a
thing to be introduced?  Any thoughts on how that would be done?


Yes, you can do this now. In fact, nothing in a template should be 
making cross-references using only the resource name. Instead you should 
use {"Ref": "resource_name"}. The value returned by this varies, but in 
most cases it is just the UUID of the resource. You can return this 
value in an output, and you can use an input of the same form (i.e. a 
UUID in most cases) in place of a Ref to a local resource in cases where 
you want to refer to an existing resource instead of one managed by the 
template. A similar story applies to attributes (which are just strings) 
and Fn::GetAtt.
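
A minimal illustration (the resource and parameter names are made up; any 
resource type works the same way):

# Template A exposes the ID of one of its resources:
Resources:
  SharedNet:
    Type: OS::Neutron::Net
Outputs:
  SharedNetId:
    Value: {Ref: SharedNet}

# Template B accepts that value as a parameter and uses it exactly where
# it would otherwise Ref a local resource:
Parameters:
  SharedNetId:
    Type: String
Resources:
  Port:
    Type: OS::Neutron::Port
    Properties:
      network_id: {Ref: SharedNetId}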


There's no magic in a template, it's just functions that manipulate strings.

cheers,
Zane.


Fine Print:
We violate this rule in a number of places in Heat, mostly where there's 
no real underlying OpenStack API for a given resource type so we've had 
to implement the resource within Heat. CloudFormation does not make this 
mistake, and I hope to clean up as many of these as possible over time. 
However, none of these are likely relevant to the use case you have in mind.


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [heat] [scheduler] Bringing things together for Icehouse

2013-09-23 Thread Zane Bitter

On 15/09/13 09:19, Mike Spreitzer wrote:

But first I must admit that I am still a newbie to OpenStack, and still
am missing some important clues.  One thing that mystifies me is this: I
see essentially the same thing, which I have generally taken to calling
holistic scheduling, discussed in two mostly separate contexts: (1) the
(nova) scheduler context, and (2) the ambitions for heat.  What am I
missing?


I think what you're missing is that the only person discussing this in 
the context of Heat is you. Beyond exposing the scheduling parameters in 
other APIs to the user, there's nothing here for Heat to do.


So if you take [heat] out of the subject line then it will be discussed 
in only one context, and you will be mystified no longer. Hope that helps :)


cheers,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [heat] [scheduler] Bringing things together for Icehouse

2013-09-24 Thread Zane Bitter

On 24/09/13 05:31, Mike Spreitzer wrote:

I was not trying to raise issues of geographic dispersion and other
higher level structures, I think the issues I am trying to raise are
relevant even without them.  This is not to deny the importance, or
relevance, of higher levels of structure.  But I would like to first
respond to the discussion that I think is relevant even without them.

I think it is valuable for OpenStack to have a place for holistic
infrastructure scheduling.  I am not the only one to argue for this, but
I will give some use cases.  Consider Hadoop, which stresses the path
between Compute and Block Storage.  In the usual way of deploying and
configuring Hadoop, you want each data node to be using directly
attached storage.  You could address this by scheduling one of those two
services first, and then the second with constraints from the first ---
but the decisions made by the first could paint the second into a
corner.  It is better to be able to schedule both jointly.  Also
consider another approach to Hadoop, in which the block storage is
provided by a bank of storage appliances that is equidistant (in
networking terms) from all the Compute.  In this case the Storage and
Compute scheduling decisions have no strong interaction --- but the
Compute scheduling can interact with the network (you do not want to
place Compute in a way that overloads part of the network).


Thanks for writing this up, it's very helpful for figuring out what you 
mean by a 'holistic' scheduler.


I don't yet see how this could be considered in-scope for the 
Orchestration program, which uses only the public APIs of other services.


To take the first example, wouldn't your holistic scheduler effectively 
have to reserve a compute instance and some directly attached block 
storage prior to actually creating them? Have you considered Climate 
rather than Heat as an integration point?



Once a holistic infrastructure scheduler has made its decisions, there
is then a need for infrastructure orchestration.  The infrastructure
orchestration function is logically downstream from holistic scheduling.


I agree that it's necessarily 'downstream' (in the sense of happening 
afterwards). I'd hesitate to use the word 'logically', since I think by 
its very nature a holistic scheduler introduces dependencies between 
services that were intended to be _logically_ independent.



  I do not favor creating a new and alternate way of doing
infrastructure orchestration in this position.  Rather I think it makes
sense to use essentially today's heat engine.

Today Heat is the only thing that takes a holistic view of
patterns/topologies/templates, and there are various pressures to expand
the mission of Heat.  A marquee expansion is to take on software
orchestration.  I think holistic infrastructure scheduling should be
downstream from the preparatory stage of software orchestration (the
other stage of software orchestration is the run-time action in and
supporting the resources themselves).  There are other pressures to
expand the mission of Heat too.  This leads to conflicting usages for
the word "heat": it can mean the infrastructure orchestration function
that is the main job of today's heat engine, or it can mean the full
expanded mission (whatever you think that should be).  I have been
mainly using "heat" in that latter sense, but I do not really want to
argue over naming of bits and assemblies of functionality.  Call them
whatever you want.  I am more interested in getting a useful arrangement
of functionality.  I have updated my picture at
https://docs.google.com/drawings/d/1Y_yyIpql5_cdC8116XrBHzn6GfP_g0NHTTG_W4o0R9U ---
do you agree that the arrangement of functionality makes sense?


Candidly, no.

As proposed, the software configs contain directives like 'hosted_on: 
server_name'. (I don't know that I'm a huge fan of this design, but I 
don't think the exact details are relevant in this context.) There's no 
non-trivial processing in the preparatory stage of software 
orchestration that would require it to be performed before scheduling 
could occur.


Let's make sure we distinguish between doing holistic scheduling, which 
requires a priori knowledge of the resources to be created, and 
automatic scheduling, which requires psychic knowledge of the user's 
mind. (Did the user want to optimise for performance or availability? 
How would you infer that from the template?) There's nothing that 
happens while preparing the software configurations that's necessary for 
the former nor sufficient for the latter.


cheers,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Help us reduce gate resets by being specific about bugs being found

2013-09-27 Thread Zane Bitter

On 26/09/13 16:49, Sean Dague wrote:

As many folks know, gerrit takes comments of either the form

recheck bug #X
or
recheck no bug

To kick off the check queue jobs again to handle flakey tests.

The problem is that we're getting a lot more "no bug" than bugs at this
point. If a failure happens in the OpenStack gate, it's usually an
actual OpenStack race somewhere. Figuring out what the top races are is
*really* important to actually fixing those races, as it gives us focus
on what the top issues are in OpenStack that we need to fix. That makes
the gate good for everyone, and means less time babysitting your patches
through merge.

Now that Matt, Joe, and Clark have built the elastic-recheck bot, you
will often be given a hint in your review about the most probable race
that was found. Please confirm the bug looks right before rechecking
with it, but it should help expedite finding the right issue.
http://status.openstack.org/rechecks/ is also helpful in seeing what's
most recently been causing issues.


Can I suggest including this link in Jenkins's comments on Gerrit? As 
things stand it is two clicks away and the link is buried in the middle 
of some very dense text. I know that for me personally this renders 
"recheck no bug" more tempting than it should be ;)


cheers,
Zane.


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [heat] [scheduler] Bringing things together for Icehouse (now featuring software orchestration)

2013-09-27 Thread Zane Bitter

On 25/09/13 07:03, Mike Spreitzer wrote:


Zane wrote:
 > To take the first example, wouldn't your holistic scheduler
effectively have
 > to reserve a compute instance and some directly attached block
storage prior
 > to actually creating them? Have you considered Climate rather than
Heat as
 > an integration point?

I had not considered Climate.  Based on recent ML traffic, I see that
Climate is about scheduling into the future, whereas I am only trying to
talk about scheduling for the present.  OTOH, perhaps you are concerned
about concurrency issues.  I am too.  Doing a better job on that is a
big part of the revision my group is working on now.  I think it can be
done.  I plan to post a pointer to some details soon.


Your diagrams clearly show scheduling happening in a separate stage to 
(infrastructure) orchestration, which is to say that at the point where 
resources are scheduled, their actual creation is in the *future*.


I am not a Climate expert, but it seems to me that they have a 
near-identical problem to solve: how do they integrate with Heat such 
that somebody who has reserved resources in the past can actually create 
them (a) as part of a Heat stack or (b) as standalone resources, at the 
user's option. IMO OpenStack should solve this problem only once.



Perhaps the concern is about competition between two managers trying to
manage the same resources.  I think that is (a) something that can not
be completely avoided and (b) impossible to do well.  My preference is
to focus on one manager, and make sure it tolerates surprises in a way
that is not terrible.  Even without competing managers, bugs and other
unexpected failures will cause nasty surprises.

Zane later wrote:
 > As proposed, the software configs contain directives like 'hosted_on:
 > server_name'. (I don't know that I'm a huge fan of this design, but I
don't
 > think the exact details are relevant in this context.) There's no
 > non-trivial processing in the preparatory stage of software orchestration
 > that would require it to be performed before scheduling could occur.

I hope I have addressed that with my remarks above about software
orchestration.


If I understood your remarks correctly, we agree that there is no 
(known) reason that the scheduling has to occur in the middle of 
orchestration (which would have implied that it needed to be 
incorporated in some sense into Heat).



Zane also wrote:
 > Let's make sure we distinguish between doing holistic scheduling, which
 > requires a priori knowledge of the resources to be created, and automatic
 > scheduling, which requires psychic knowledge of the user's mind. (Did the
 > user want to optimise for performance or availability? How would you
infer
 > that from the template?)

One reason I favor holistic infrastructure scheduling is that I want its
input to be richer than today's CFN templates.  Like Debo, I think the
input can contain the kind of information that would otherwise require
mind-reading.  My group has been working examples involving multiple
levels of anti-co-location statements, network reachability and
proximity statements, disk exclusivity statements, and statements about
the presence of licensed products.


Right, so what I'm saying is that if all those things are _stated_ in 
the input then there's no need to run the orchestration engine to find 
out what they'll be; they're already stated.


cheers,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [scheduler] [heat] Policy specifics

2013-09-27 Thread Zane Bitter

On 27/09/13 08:58, Mike Spreitzer wrote:

I have begun to draft some specifics about the sorts of policies that
might be added to infrastructure to inform a smart unified placement
engine.  These are cast as an extension to Heat templates.  See
https://wiki.openstack.org/wiki/Heat/PolicyExtension.  Comments solicited.


Mike,
These are not the kinds of specifics that are of any help at all in 
figuring out how (or, indeed, whether) to incorporate holistic 
scheduling into OpenStack.


- What would a holistic scheduling service look like? A standalone 
service? Part of heat-engine?
- How will the scheduling service reserve slots for resources in advance 
of them being created? How will those reservations be accounted for and 
billed?
- In the event that slots are reserved but those reservations are not 
taken up, what will happen?
- Once scheduled, how will resources be created in their proper slots as 
part of a Heat template?
- What about when the user calls the APIs directly? (i.e. does their own 
orchestration - either hand-rolled or using their own standalone Heat.)
- How and from where will the scheduling service obtain the utilisation 
data needed to perform the scheduling? What mechanism will segregate 
this information from the end user?
- How will users communicate their scheduling constraints to OpenStack? 
(Through which API and in what form?)
- What value does this provide (over and above non-holistic scheduler 
hints passed in individual API calls) to end users? Public cloud 
operators? Private cloud operators? How might the value be shared 
between users and operators, and how would that be accounted for?
- Does this fit within the scope of an existing OpenStack program? Which 
one? Why?
- What changes are required to existing services to accommodate this 
functionality?


Answer these questions first. *Then* you can talk about symmetric dyadic 
primitive policies as much as you like to anybody who will listen.


cheers,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [scheduler] [heat] Policy specifics

2013-09-30 Thread Zane Bitter

On 27/09/13 17:58, Clint Byrum wrote:

Excerpts from Zane Bitter's message of 2013-09-27 06:58:40 -0700:

On 27/09/13 08:58, Mike Spreitzer wrote:

I have begun to draft some specifics about the sorts of policies that
might be added to infrastructure to inform a smart unified placement
engine.  These are cast as an extension to Heat templates.  See
https://wiki.openstack.org/wiki/Heat/PolicyExtension.  Comments solicited.


Mike,
These are not the kinds of specifics that are of any help at all in
figuring out how (or, indeed, whether) to incorporate holistic
scheduling into OpenStack.


I agree that the things in that page are a wet dream of logical deployment
fun. However, I think one can target just a few of the basic ones,
and see a real achievable case forming. I think I grasp Mike's ideas,
so I'll respond to your concerns with what I think. Note that it is
highly likely I've gotten some of this wrong.


Thanks for having a crack at this Clint. However, I think your example 
is not apposite, because it doesn't actually require any holistic 
scheduling. You can easily do anti-colocation of a bunch of servers just 
using scheduler hints to the Nova API (stick one in each zone until you 
run out of zones). This just requires Heat to expose the scheduler hints 
portion of the Nova API. To my mind this stuff is so basic that it falls 
squarely in the category of what you said in a previous thread:



There is
definitely a need for Heat to be able to communicate to the API's any
placement details that can be communicated. However, Heat should not
actually be "scheduling" anything.


But in any event, most of your answers appear to be predicated on this 
very simple case, not on a holistic scheduler. I think you are vastly 
underestimating the complexity of the problem.
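
(For reference, the simple case really does need nothing beyond per-server 
properties - a sketch only, assuming Heat exposes the relevant Nova parameters 
on OS::Nova::Server:)

Resources:
  Server0:
    Type: OS::Nova::Server
    Properties:
      image: my-image
      flavor: m1.small
      availability_zone: zone-A
  Server1:
    Type: OS::Nova::Server
    Properties:
      image: my-image
      flavor: m1.small
      availability_zone: zone-B
      # plus any scheduler_hints the deployment calls for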


What Mike is proposing is something more sophisticated, whereby you can 
solve for the optimal scheduling of resources of different types across 
different APIs. There may be a case for including this in Heat, but it 
needs to be made, and IMO it needs to be made by answering these kinds 
of questions at a similar level of detail to the symmetric dyadic 
primitives wiki page.


BTW there is one more question I should add:

- Who will implement and maintain this service/feature, and the 
associated changes to existing services?



- What would a holistic scheduling service look like? A standalone
service? Part of heat-engine?


I see it as a preprocessor of sorts for the current infrastructure engine.
It would take the logical expression of the cluster and either turn
it into actual deployment instructions or respond to the user that it
cannot succeed. Ideally it would just extend the same Heat API.


- How will the scheduling service reserve slots for resources in advance
of them being created? How will those reservations be accounted for and
billed?
- In the event that slots are reserved but those reservations are not
taken up, what will happen?


I don't see the word "reserve" in Mike's proposal, and I don't think this
is necessary for the more basic models like Collocation and Anti-Collocation.


Right, but we're not talking about only the basic models. Reservations 
are very much needed according to my understanding of the proposal, 
because the whole point is to co-ordinate across multiple services in a 
way that is impossible to do atomically.



Reservations would of course make the scheduling decisions more likely to
succeed, but it isn't necessary if we do things optimistically. If the
stack create or update fails, we can retry with better parameters.


- Once scheduled, how will resources be created in their proper slots as
part of a Heat template?


In goes a Heat template (sorry for not using HOT.. still learning it. ;)

Resources:
  ServerTemplate:
    Type: Some::Defined::ProviderType
  HAThing1:
    Type: OS::Heat::HACluster
    Properties:
      ClusterSize: 3
      MaxPerAZ: 1
      PlacementStrategy: anti-collocation
      Resources: [ ServerTemplate ]

And if we have at least 2 AZ's available, it feeds to the heat engine:

Resources:
  HAThing1-0:
    Type: Some::Defined::ProviderType
    Parameters:
      availability-zone: zone-A
  HAThing1-1:
    Type: Some::Defined::ProviderType
    Parameters:
      availability-zone: zone-B
  HAThing1-2:
    Type: Some::Defined::ProviderType
    Parameters:
      availability-zone: zone-A

If not, holistic scheduler says back "I don't have enough AZ's to
satisfy MaxPerAZ".

Now, if Nova grows anti-affinity under the covers that it can manage
directly, a later version can just spit out:

Resources:
  HAThing1-0:
    Type: Some::Defined::ProviderType
    Parameters:
      instance-group: 0
      affinity-type: anti
  HAThing1-1:
    Type: Some::Defined::ProviderType
    Parameters:
      instance-group: 1
      affinity-type: anti
  HAThing1-2:
    Type: Some::Defined::ProviderType
    Parameters:
      instance-group: 0

Re: [openstack-dev] [scheduler] [heat] Policy specifics

2013-09-30 Thread Zane Bitter

On 27/09/13 20:59, Mike Spreitzer wrote:

Zane also raised an important point about value.  Any scheduler is
serving one master most directly, the cloud provider.  Any sane cloud
provider has some interest in serving the interests of the cloud users,
as well as having some concerns of its own.  The way my group has
resolved this is in the translation from the incoming requests to the
underlying optimization problem that is solved for placement; in that
translation we fold in the cloud provider's interests as well as the
cloud user's.  We currently have a fixed opinion of the cloud provider's
interests; generalizing that is a possible direction for future progress.


It's good that you've considered this. I guess the gist of my question 
was: do you think that public cloud providers in particular would feel 
the need to bill for some aspect of this service if they provided it? 
(And, if so, how?)


The benefits to at least some private cloud providers (particularly the 
ones using OpenStack for enterprisey pets-not-cattle workloads) seem 
pretty obvious, but particularly if we're talking about incorporating 
holistic scheduling into an existing service then we need to make sure 
this is something that benefits the whole OpenStack community.


cheers,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [scheduler] [heat] Policy specifics

2013-09-30 Thread Zane Bitter

On 30/09/13 17:33, Clint Byrum wrote:

You are painting cloud providers as uncaring slum lords. Of course there
will be slum lords in any ecosystem, but there will also be high quality
service providers and private cloud operators with high expectations that
can use this type of feature as something to benefit only the users at a
high cost.


Heh, sorry, that wasn't my intention ;) There is a middle ground between 
"slum lord" and "radical transparency" that I expect 100% of public 
cloud operators would want to occupy. (Many private cloud operators may 
well be fine with radical transparency.)


- ZB

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] TC candidacy

2013-10-04 Thread Zane Bitter

I would like to propose my candidacy for the OpenStack Technical Committee.

I have been involved with OpenStack since we started the Heat project in 
early 2012. I'm still a member of the Heat Core team and - according to 
the semi-official statistics from Bitergia 
(http://bitergia.com/public/reports/openstack/2013_04_grizzly/), at 
least - among the more prolific patch contributors to OpenStack.


Over the last year I have often worked closely with the TC, beginning 
with helping to shepherd Heat through the incubation process. That 
process involved developing a consensus on the scope of the OpenStack 
project as a whole, and new procedures and definitions for incubating 
projects in OpenStack. These changes paved the way for projects like 
Trove, Marconi and Savanna to be incubated. I hope and expect that more 
projects will continue to follow in these footsteps.


I remain a reasonably frequent, if irregular, attendee at TC meetings - 
occasionally as a proxy for the Heat PTL, but more often just because I 
feel I can contribute. At this stage of its evolution, I think the main 
responsibility of the TC is to grow the OpenStack project in a 
responsible, sustainable way, so I take particular interest in 
discussions around incubation and graduation of new projects. Many new 
projects also have potential integration points with Heat, so having 
folks from the Heat core team involved seems valuable.


I also think that the TC could do more to communicate its inner workings 
(which are public but, I suspect, not widely read). While most decisions 
eventually come down to a vote and the results are reported, the most 
important work of the committee is not in voting but in building 
consensus. I believe the community would benefit from more insight into 
that process, and to that end I have started blogging about important 
decisions of the TC - not only the outcomes, but the reasons behind them 
and the issues that were considered along the way:


http://www.zerobanana.com/archive/2013/09/25#savanna-incubation
http://www.zerobanana.com/archive/2013/09/04#icehouse-incubation
http://www.zerobanana.com/archive/2013/08/07#non-relational-trove

These posts appear on Planet OpenStack and are regularly featured in the 
Community Newsletter, so I like to think that this is helping to bring 
the workings of the TC in front of an audience who might not otherwise 
be aware of them.


If elected, I'd like to act as an informal point of contact for projects 
that are already in incubation or are considering it, to help explain 
the incubation process and the committee's expectations around it.


I consider myself fortunate that my employer permits me to spend 
substantially all of my time on OpenStack, and that my colleagues and I 
have a clear mandate to do what we consider best for the _entire_ 
OpenStack community, because we know that we succeed only when everyone 
succeeds.


cheers,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Plugin packaging

2013-10-15 Thread Zane Bitter

On 15/10/13 02:18, Sam Alba wrote:

Hello,

I am working on a Heat plugin that makes a new resource available in a
template. It's working great and I will opensource it this week if I
can get the packaging right...


Awesome! :)


Right now, I am linking my module.py file in /usr/lib/heat to get it
loaded when heat-engine starts. But according to the doc, I am
supposed to be able to make the plugin discoverable by heat-engine if
the module appears in the package "heat.engine.plugins"[1]


I think the documentation may be leading you astray here; it's the other 
way around: once the plugin is discovered by the engine, it will appear 
in the package heat.engine.plugins. So you should be able to do:


>>> import heat.engine.resources
>>> heat.engine.resources.initialise()
>>> from heat.engine.plugins import module
>>> print module.resource_mapping()

(FWIW this is working for me on latest master of Heat.)

As far as making the plugin discoverable is concerned, all you should 
have to do is install the module in /usr/lib/heat/.
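
For reference, a minimal plugin module dropped into /usr/lib/heat/ looks 
something like this sketch (the resource type name is made up, and the 
handle_* bodies are left empty):

# /usr/lib/heat/my_plugin.py
from heat.engine import resource


class MyResource(resource.Resource):
    # Skeleton resource; real handlers would talk to the backing service.
    properties_schema = {}

    def handle_create(self):
        pass

    def handle_delete(self):
        pass


def resource_mapping():
    # The plugin loader calls this to map template type names to classes.
    return {'MyOrg::Custom::MyResource': MyResource}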



I looked into the plugin_loader module in the Heat source code and it
looks like it should work. However I was unable to get a proper Python
package.

Has anyone been able to make this packaging right for an external Heat plugin?


I've never tried to do this with a Python package, the mechanism is 
really designed more for either dropping the module in there manually, 
or installing it from a Debian or RPM package.


It sounds like what you're doing is trying to install the package in 
/usr/lib/python2.7/site-packages/ (or in a venv) in the package 
heat.engine.plugins and get the engine to recognise it that way? I don't 
think there's a safe way to make that work, because the plugin loader 
creates its own heat.engine.plugins package that will replace anything 
that exists on that path (see line 41 of plugin_loader.py).


Heat (in fact, all of OpenStack) is designed as a system service, so the 
normal rules of Python _application_ packaging don't quite fit. e.g. If 
you want to use a plugin locally (for a single user) rather than install 
it globally, the way to do it is to specify a local plugin directory 
when running heat-engine, rather than have the plugin installed in a venv.


Hope that helps.

cheers,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Mistral] Announcing a new task scheduling and orchestration service for OpenStack

2013-10-15 Thread Zane Bitter

On 14/10/13 21:40, Renat Akhmerov wrote:

Hi OpenStackers,

I am proud to announce the official launch of the Mistral project. At
Mirantis we have a team to start contributing to the project right away.
We invite anybody interested in task service & state management to join
the initiative.

Mistral is a new OpenStack service designed for task flow control,
scheduling, and execution. The project will implement Convection
proposal (https://wiki.openstack.org/wiki/Convection) and provide an API
and domain-specific language that enables users to manage tasks and
their dependencies, and to define workflows, triggers, and events. The
service will provide the ability to schedule tasks, as well as to define
and manage external sources of events to act as task execution triggers.


Cool, I think Convection was one of the most interesting idea to come 
out of the last summit, and it's great to see folks start to implement it.


That said, can we please, please, please not invent a *third* meaning of 
"orchestration"? The proposal, as I understand it, is Workflow as a 
Service, so let's call it that. The fact that orchestration uses a 
workflow does not make them the same thing. OpenStack already has an 
Orchestration program, calling Mistral a "task scheduling and 
orchestration service" just adds confusion.


cheers,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] HOT Software configuration proposal

2013-10-16 Thread Zane Bitter

On 16/10/13 06:56, Mike Spreitzer wrote:

What is the difference between what today's heat engine does and a
workflow?  I am interested to hear what you experts think, I hope it
will be clarifying.  I presume the answers will touch on things like
error handling, state tracking, and updates.


(Disclaimer: I'm not an expert, I just work on this stuff ;)

First off, to be clear, it was my understanding from this thread that 
the original proposal to add workflow syntax to HOT is effectively dead. 
(If I'm mistaken, add a giant -1 from me.) Mirantis have since 
announced, I assume not coincidentally, that they will start 
implementing a workflow service (Mistral, based on the original 
Convection proposal from Keith Bray at the Havana summit) for OpenStack, 
backed by the taskflow library. So bringing workflows back in to this 
discussion is confusing the issue.


(FWIW I think that having a workflow service will be a great thing for 
other reasons, but I also hope that all of Stan's original example will 
be possible in Heat *without* resorting to requiring users to define an 
explicit workflow.)


It must be acknowledged that the Heat engine does run a workflow. The 
workflow part can in principle, and probably should, be delegated to the 
taskflow library, and I would be surprised if we did not eventually end 
up doing this (though I'm not looking forward to actually implementing it).


To answer your question, the key thing that Heat does is take in two 
declarative models and generate a workflow to transform one into the 
other. (The general case of this is a stack update, where the two models 
are defined in the previous and new templates. Stack create and delete 
are special cases where one or the other of the models is empty.)
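
As a toy illustration of that idea only (nothing like Heat's actual 
implementation, which also has to order the actions by dependency, run them, 
and handle failures):

def plan(old, new):
    # old and new are {resource_name: definition} models
    actions = [('create', n) for n in new if n not in old]
    actions += [('update', n) for n in new if n in old and new[n] != old[n]]
    actions += [('delete', n) for n in old if n not in new]
    return actions

# Stack create and delete are the special cases:
#   plan({}, model)  ->  all creates
#   plan(model, {})  ->  all deletes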


Workflows don't belong in HOT because they are a one-off thing. You need 
a different one for every situation, and this is exactly why Heat exists 
- to infer the correct workflow to reify a model in any given situation.


cheers,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] HOT Software configuration proposal

2013-10-16 Thread Zane Bitter

On 16/10/13 00:48, Steve Baker wrote:

I've just written some proposals to address Heat's HOT software
configuration needs, and I'd like to use this thread to get some feedback:
https://wiki.openstack.org/wiki/Heat/Blueprints/hot-software-config
https://wiki.openstack.org/wiki/Heat/Blueprints/native-tools-bootstrap-config


Wow, nice job, thanks for writing all of this up :)


Please read the proposals and reply to the list with any comments or
suggestions.


For me the crucial question is, how do we define the interface for 
synchronising and passing data from and to arbitrary applications 
running under an arbitrary configuration management system?


Compared to this, defining the actual format in which software 
applications are specified in HOT seems like a Simple Matter of 
Bikeshedding ;)


(BTW +1 for not having the relationships, hosted_on always reminded me 
uncomfortably of INTERCAL[1]. We already have DependsOn for resources 
though, and might well need it here too.)


I'm not a big fan of having Heat::Puppet, Heat::CloudInit, Heat::Ansible 
&c. component types insofar as they require your cloud provider to 
support your preferred configuration management system before you can 
use it. (In contrast, it's much easier to teach your configuration 
management system about Heat because you control it yourself, and 
configuration management systems are already designed for plugging in 
arbitrary applications.)


I'd love to be able to put this control in the user's hands by just 
using provider templates - i.e. you designate PuppetServer.yaml as the 
provider for an OS::Nova::Server in your template and it knows how to 
configure Puppet and handle the various components. We could make 
available a library of such provider templates, but users wouldn't be 
limited to only using those.
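
Concretely, that designation would just be an environment entry along these 
lines (the file name is made up):

# environment passed at stack create/update time
resource_registry:
  "OS::Nova::Server": PuppetServer.yaml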


cheers,
Zane.


[1] https://en.wikipedia.org/wiki/COMEFROM

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] HOT Software configuration proposal

2013-10-16 Thread Zane Bitter

On 16/10/13 15:58, Mike Spreitzer wrote:

Zane Bitter  wrote on 10/16/2013 08:25:38 AM:

 > To answer your question, the key thing that Heat does is take in two
 > declarative models and generate a workflow to transform one into the
 > other. (The general case of this is a stack update, where the two models
 > are defined in the previous and new templates. Stack create and delete
 > are special cases where one or the other of the models is empty.)
 >
 > Workflows don't belong in HOT because they are a one-off thing. You need
 > a different one for every situation, and this is exactly why Heat exists
 > - to infer the correct workflow to reify a model in any given situation.

Thanks for a great short sharp answer.  In that light, I see a concern.
  Once a workflow has been generated, the system has lost the ability to
adapt to changes in either model.  In a highly concurrent and dynamic
environment, that could be problematic.


I think you're referring to the fact if reality diverges from the model 
we have no way to bring it back in line (and even when doing an update, 
things can and usually will go wrong if Heat's idea of the existing 
template does not reflect reality any more). If so, then I agree that we 
are weak in this area. You're obviously aware of 
http://summit.openstack.org/cfp/details/95 so it is definitely on the radar.


cheers,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] HOT Software configuration proposal

2013-10-17 Thread Zane Bitter

On 16/10/13 21:02, Clint Byrum wrote:

Excerpts from Zane Bitter's message of 2013-10-16 06:16:33 -0700:

>I'd love to be able to put this control in the user's hands by just
>using provider templates - i.e. you designate PuppetServer.yaml as the
>provider for an OS::Nova::Server in your template and it knows how to
>configure Puppet and handle the various components. We could make
>available a library of such provider templates, but users wouldn't be
>limited to only using those.
>

This I don't think I understand well enough to pass judgement on. My
understanding of providers is that they are meant to make templates more
portable between clouds that have different capabilities.


That's certainly one use, and the main one we've been concentrating on. 
One of the cool things about providers IMHO is that it's a completely 
generic feature, not tied to one particular use case, so it's 
potentially very powerful, possibly in ways that we haven't even thought 
of yet.


So the idea here would be that you have a 
[Puppet|Chef|Salt|Ansible|CFEngine]Server provider template that 
contains an OS::Nova::Server resource with the UserData filled in to 
configure the CM system to start and e.g. grab the component configs 
from the metadata. You then use this template as the provider for one or 
more of your servers, link the components somehow and off you go.
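
A sketch of what such a provider template might contain - the parameters shown 
and the user_data contents are invented for illustration, and a real one would 
need to mirror the server's properties and pull the component configs from the 
metadata:

# PuppetServer.yaml - used as the provider for OS::Nova::Server
HeatTemplateFormatVersion: '2012-12-12'
Parameters:
  image: {Type: String}
  flavor: {Type: String}
Resources:
  server:
    Type: OS::Nova::Server
    Properties:
      image: {Ref: image}
      flavor: {Ref: flavor}
      user_data: |
        #!/bin/bash
        # bootstrap the CM agent, then have it fetch the component
        # configs published for this server
        apt-get install -y puppet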



Mapping that
onto different CM systems feels like a stretch and presents a few
questions for me. How do I have two OS::Nova::Server's using different
CM systems when I decide to deploy something that uses salt instead of
chef?


You can specify a provider for an individual resource, you don't have to 
map a whole resource type to it in the environment (although you can). 
Even if you did, you could just make up your own resource types (say, 
OS::Custom::ChefServer and OS::Custom::SaltServer).



As long as components can be composed of other components, then it seems
to me that you would just want the CM system to be a component. If you
find yourself including the same set of components constantly.. just
make a bigger component out of them.


This is a good point, and may well be the way to go. I was thinking that 
the CM system had to be effectively one step higher in the chain than 
the components that run under it, but maybe that's really only true of 
cloud-init.


(BTW I agree with Steve B here, addressing the problem is much more 
important to me than any particular solution.)


cheers,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Plugin to use Docker containers a resources in a template

2013-10-21 Thread Zane Bitter

On 18/10/13 03:06, Sam Alba wrote:

Hi all,

I've been recently working on a Docker plugin for Heat that makes it
possible to use Docker containers as resources.

I've just opened the repository:
https://github.com/dotcloud/openstack-heat-docker


Cool, nice work. Thanks for sharing! :)

I agree that we shouldn't see this as a replacement for a Nova driver 
(mainly because it doesn't take advantage of Keystone for authenticating 
the user, nor abstract the pool of available hosts away from the user), 
but it is a really interesting concept to play around with. I too would 
definitely welcome it in Heat's /contrib directory where it can be 
subject to continuous testing to make sure that any changes in Heat 
don't break it.


So, here's a crazy, half-baked idea that I almost posted to the list 
last week: we've been discussing adding software configurations to the 
HOT format, to allow users (amongst other things) to deploy multiple 
independent software configurations to the same Nova VM... when we do so 
should we deploy each config in a Linux container?


Discuss.



It's now possible to do that via Nova (since there is now a Docker
driver for it). But the idea here is not to replace the Nova driver
with this Heat plugin, the idea is just to propose a different path.

Basically, Docker itself has a Rest API[1] with all features needed to
deploy and manage containers, the Nova driver uses this API. However
having the Nova API in front of it makes it hard to bring all Docker
features to the user, basically everything has to fit into the Nova
API. For instance, docker commit/push are mapped to nova snapshots,
etc... And a lot of Docker features are not available yet; I admit
that some of them will be hard to support (docker Env variables,
Volumes, etc... how should they fit in Nova?).

The idea of this Docker plugin for Heat is to use the whole Docker API
directly from a template. All possible parameters for creating a
container from the Docker API[2] can be defined from the template.
This allows more flexibility.

Since this approach is a bit different from the normal OpenStack
workflow (for instance, Nova's role is to abstract all computing units
right now), I am interested to get feedback on this.

Obviously, I'll keep maintaining the Docker driver for Nova and I'm
also working on putting together some new features I'll propose for
the next release.


[1] http://docs.docker.io/en/latest/api/docker_remote_api_v1.5/
[2] 
http://docs.docker.io/en/latest/api/docker_remote_api_v1.5/#create-a-container




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] HOT Software orchestration proposal for workflows

2013-10-21 Thread Zane Bitter

On 18/10/13 20:24, John Davidge -X (jodavidg - AAP3 INC at Cisco) wrote:

It looks like this discussion involves many of the issues faced when
developing the Curvature & Donabe frameworks, which were presented at the
Portland Summit - slides and video here:

http://www.openstack.org/summit/portland-2013/session-videos/presentation/interactive-visual-orchestration-with-curvature-and-donabe

Much of the work on the Donabe side revolved around defining a simple
JSON-based API for describing the sorts of virtual application templates
being discussed. All of the code for both Curvature and Donabe has
recently been made open source and is available here:

http://ciscosystems.github.io/curvature/

http://ciscosystems.github.io/donabe/


Hey John,
Congrats on getting this stuff Open-Sourced BTW (I know it's been out 
for a while now).


Can you be more specific about the parts that are relevant to this 
discussion? I'd be interested to know how Donabe handles configuring the 
software on Nova servers for a start.



It looks like some of the ground covered by these projects can be helpful
to this discussion.


Yep, it would be great to get input from any folks in the community who 
have experience with this problem.


cheers,
Zane.



John Davidge
jodav...@cisco.com




-- Forwarded message --
From: Thomas Spatzier 
Date: Wed, Oct 9, 2013 at 12:40 AM
Subject: Re: [openstack-dev] [Heat] HOT Software orchestration
proposal for workflows
To: OpenStack Development Mailing List 


Excerpts from Clint Byrum's message


From: Clint Byrum 
To: openstack-dev ,
Date: 09.10.2013 03:54
Subject: Re: [openstack-dev] [Heat] HOT Software orchestration
proposal for workflows

Excerpts from Stan Lagun's message of 2013-10-08 13:53:45 -0700:

Hello,


That is why it is necessary to have some central coordination service which
would handle the deployment workflow and perform specific actions (create VMs
and other OpenStack resources, do something on that VM) on each stage
according to that workflow. We think that Heat is the best place for such
a service.



I'm not so sure. Heat is part of the Orchestration program, not
workflow.



I agree. HOT so far was thought to be a format for describing templates in
a structural, declarative way. Adding workflows would stretch it quite a
bit. Maybe we should see what aspects make sense to be added to HOT, and
then how to do workflow-like orchestration in a layer on top.


Our idea is to extend the HOT DSL by adding workflow definition capabilities
as an explicit list of resources, components' states and actions. States may
depend on each other so that you can reach state X only after you've reached
states Y and Z that X depends on. The goal is to go from the initial state to
some final state, "Deployed".



We also would like to add some mechanisms to HOT for declaratively doing
software component orchestration in Heat, e.g. saying that one component
depends on another one, or needs input from another one once it has been
deployed etc. (I BTW started to write a wiki page, which is admittedly far
from complete, but I would be happy to work on it with interested folks -
https://wiki.openstack.org/wiki/Heat/Software-Configuration-Provider).
However, we must be careful not to make such features so complicated that
nobody will be able to use them any more. That said, I believe we could make
HOT cover some levels of complexity, but not all. And then maybe workflow-based
orchestration on top is needed.



Orchestration is not workflow, and HOT is an orchestration templating
language, not a workflow language. Extending it would just complect two
very different (though certainly related) tasks.

I think the appropriate thing to do is actually to join up with the
TaskFlow project and consider building it into a workflow service or
tools (it is just a library right now).


There is such state graph for each of our deployment entities (service,
VMs, other things). There is also an action that must be performed on
each state.


Heat does its own translation of the orchestration template into a
workflow right now, but we have already discussed using TaskFlow to
break up the orchestration graph into distributable jobs. As we get more
sophisticated on updates (rolling/canary for instance) we'll need to
be able to reason about the process without having to glue all the
pieces together.


We propose to extend HOT DSL with workflow definition capabilities where
you can describe step by step instruction to install service and properly
handle errors on each step.

We already have an experience in implementation of the DSL, workflow
description and processing mechanism for complex deployments and believe
we'll all benefit by re-using this experience and existing code, having
properly discussed and agreed on abstraction layers and distribution of
responsibilities between OS components. There is an idea of implementing
part of workflow processing mechanism as a part of Convection


Re: [openstack-dev] Call for a clear COPYRIGHT-HOLDERS file in all OpenStack projects (and [trove] python-troveclient_0.1.4-1_amd64.changes REJECTED)

2013-10-22 Thread Zane Bitter

On 21/10/13 19:45, Thomas Goirand wrote:

On 10/20/2013 09:00 PM, Jeremy Stanley wrote:

On 2013-10-20 22:20:25 +1300 (+1300), Robert Collins wrote:
[...]

OTOH registering one's nominated copyright holder on the first
patch to a repository is probably a sustainable overhead. And it's
probably amenable to automation - a commit hook could do it locally
and a check job can assert that it's done.


I know the Foundation's got work underway to improve the affiliate
map from the member database, so it might be possible to have some
sort of automated job which proposes changes to a copyright holders
list in each project by running a query with the author and date of
each commit looking for new affiliations. That seems like it would
be hacky, fragile and inaccurate, but probably still more reliable
than expecting thousands of contributors to keep that information up
to date when submitting patches?


My request wasn't to go *THAT* far. The main problem I was facing was
that troveclient has a few files stating that HP was the sole copyright
holder, when it clearly was not (since I have discussed a bit with some
of the dev team in Portland, IIRC some of them are from Rackspace...).

Just writing HP as copyright holder to please the FTP masters because it
would match some of the source content, then seemed wrong to me, which
is why I raised the topic. Also, they didn't like that I list the
authors (from a "git log" output) in the copyright files.


Can't we just write "Copyright OpenStack Contributors"? (Where 
'contributors' means individuals or organisations who have signed the 
CLA.) As others have pointed out, this is how other large projects 
handle it.


It's not that knowing the copyright holders isn't important - it *is* 
important because the licence is meaningless unless granted/permitted by 
the actual copyright holders. But the actual individual names are 
irrelevant to Debian. Gerrit ensures that only OpenStack Contributors 
(those that have signed the CLA) can contribute to OpenStack, and 
contributors declare via the CLA that they have the legal right to 
licence the code (which is the best that you can do here). The paper 
trail is complete, everybody should be happy.



So, for me, the clean and easy way to fix this problem is to have a
simple copyright-holder.txt file, containing a list of company or
individuals. It doesn't really matter if some entities forget to write
themselves in. After all, that'd be their fault, no? The point is, at
least I'd have an upstream source file to show to the FTP masters as
something which has a chance to be a bit more accurate than
second-guessing through "git log" or reading a few source code files
which represent a wrong view of the reality.


This seems like an orthogonal question, but if we're going to relitigate 
it then I remain +1 on maintaining it in one file per project as you 
suggest, and -10 on trying to maintain it in every single source file. 
Not because that would be inaccurate - though clearly it would be worse 
than useless in terms of accuracy - but because it would add a whole 
layer of overhead that's as annoying as it is pointless.


cheers,
Zane.



Any thoughts?

Thomas Goirand (zigo)

P.S: I asked the FTP masters to write in this thread, though it seems
nobody had time to do so...



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] HOT Software configuration proposal

2013-10-22 Thread Zane Bitter

On 22/10/13 09:15, Thomas Spatzier wrote:

BTW, the convention of properties being input and attributes being output,
i.e. that subtle distinction between properties and attributes is not
really intuitive, at least not to me as non-native speaker, because I used
to use both words as synonyms.


As a native speaker, I can confidently state that it's not intuitive to 
anyone ;)


We unfortunately inherited these names from the Properties section and 
the Fn::GetAtt function in cfn templates. It's even worse than that, 
because there's a whole category of... uh... things (DependsOn, 
DeletionPolicy, &c.) that don't even have a name - I always have to 
resist the urge to call them 'attributes' too.
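
To make the three categories concrete, here is a minimal HOT-style sketch 
(the image/flavor values and resource names are placeholders chosen purely 
for illustration):

  heat_template_version: 2013-05-23

  resources:
    data_volume:
      type: OS::Cinder::Volume
      properties:               # "properties" are the inputs to a resource
        size: 1

    server:
      type: OS::Nova::Server
      depends_on: data_volume   # one of the unnamed "things" (cfn: DependsOn)
      deletion_policy: Retain   # another one (cfn: DeletionPolicy)
      properties:
        image: fedora-20        # placeholder image/flavor names
        flavor: m1.small

  outputs:
    server_ip:
      # "attributes" are the outputs, read back with get_attr (cfn: Fn::GetAtt)
      value: { get_attr: [ server, first_address ] }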


- ZB

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] HOT Software configuration proposal

2013-10-22 Thread Zane Bitter

On 22/10/13 16:35, Thomas Spatzier wrote:

Zane Bitter  wrote on 22.10.2013 15:24:28:

From: Zane Bitter 
To: openstack-dev@lists.openstack.org,
Date: 22.10.2013 15:27
Subject: Re: [openstack-dev] [Heat] HOT Software configuration proposal

On 22/10/13 09:15, Thomas Spatzier wrote:

BTW, the convention of properties being input and attributes being
output, i.e. that subtle distinction between properties and attributes is
not really intuitive, at least not to me as non-native speaker, because I
used to use both words as synonyms.


As a native speaker, I can confidently state that it's not intuitive to
anyone ;)


Phew, good to read that ;-)



We unfortunately inherited these names from the Properties section and
the Fn::GetAtt function in cfn templates. It's even worse than that,
because there's a whole category of... uh... things (DependsOn,
DeletionPolicy, &c.) that don't even have a name - I always have to
resist the urge to call them 'attributes' too.


So is this something we should try to get straight in HOT while we still
have the flexibility?


Y-yes. Provided that we can do it without making things *more* 
confusing, +1. That's hard though, because there are a number of places 
we have to refer to them, all with different audiences:

 - HOT users
 - cfn users
 - Existing developers
 - New developers
 - Plugin developers

and using different names for the same thing can cause problems. My test 
for this is: if you were helping a user on IRC debug an issue, is there 
a high chance you would spend 15 minutes talking past each other because 
they misunderstand the terminology?



Regarding properties/attributes for example, to me I would call both just
properties of a resource or component, and then I can write them or read
them like:

components:
  my_component:
    type: ...
    properties:
      my_prop: { get_property: [ other_component, other_component_prop ] }

  other_component:
    # ...

I.e. you write property 'my_prop' of 'my_component' in its properties
section, and you read property 'other_component_prop' of 'other_component'
using the get_property function.
... we can also call them attributes, but use one name, not two different
names for the same thing.


IMO inputs (Properties) and outputs (Fn::GetAtt) are different things 
(and they exist in different namespaces), so -1 for giving them the same 
name.


In an ideal world I'd like HOT to use something like get_output_data (or 
maybe just get_data), but OTOH we have e.g. FnGetAtt() and 
attributes_schema baked in to the plugin API that we can't really 
change, so it seems likely to lead to developers and users adopting 
different terminology, or making things very difficult for new 
developers, or both :(


cheers,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] CLA

2013-10-22 Thread Zane Bitter

On 22/10/13 16:22, Jeremy Stanley wrote:

(Disclaimers: I am not a lawyer, which likely explains my lack of
interest in perversely pointless paperwork. Also, these opinions are
my own and do not necessarily reflect those of my employer. Setting
MFT to legal-discuss as a more appropriate forum for these sorts of
discussions.)

On 2013-10-22 15:11:25 +0200 (+0200), Zane Bitter wrote:
[...]

Can't we just write "Copyright OpenStack Contributors"? (Where
'contributors' means individuals or organisations who have signed
the CLA.)

[...]

Actually, technically not. There are other avenues through which
patches come (posts on mailing lists, attachments to bugs) and I
know that from time to time contributors git-am other authors' bug
fixes without first asking them to go agree to an OpenStack CLA and
prove that they have done so. The actual copyright belongs with the
author (or their employer under a work-for-hire agreement), not the
contributor who uploaded that work--and they aren't necessarily
always the same people.


Fair point, although as you note below if the contributor does not 
identify the actual copyright holder in the submission, that is their 
responsibility not OpenStack's responsibility. Likely a few copyright 
holders will fall through the cracks here (e.g. from legitimately 
identified external code like https://review.openstack.org/#/c/40330/), 
but many, many *more* will fall through the cracks in trying to compile 
a list of them.


I'm not suggesting here that the CLA can provide an accurate list of 
copyright holders (which is impossible anyway), I'm saying that it 
provides a paper-trail back to somebody who warrants that they have the 
right to licence the code under the ASL (however mistaken they may be 
about that), and that this is precisely the paper trail that the Debian 
FTP masters are looking for.



Gerrit ensures that only OpenStack Contributors (those that have
signed the CLA) can contribute to OpenStack

[...]

To echo Monty's sentiments earlier in the thread, and also as the
person who spear-headed the current CLA enforcement configuration in
our project's Gerrit instance, I don't see how our CLAs add anything
of value. It's patronizing, almost insulting, to ask developers to
pinky-swear that they're authorized to license the code they
contribute under the license included with the code they contribute.


It's exactly as silly as Debian requiring the copyright holders to be 
identified alongside the licence. As an engineer, I'm inclined to agree 
that it's pretty silly, because it doesn't actually change anything - 
nobody is ever surprised when their contribution to open source ends up 
as open source, and if it turns out that they were not entitled to so 
licence it then it's still effectively everyone's problem, CLA or no. 
Clearly there are lawyers who disagree though.



At best it may provide a warm fuzzy feeling for companies who are
unfamiliar with contributing to free software projects, since free
software licenses are all about waiving your rights rather than
enforcing them and that might sound scary to the uninitiated... but
better efforts toward educating them about free software may prove
more productive than relying on a legal security blanket.

Also as mentioned above, Gerrit does not enforce that the copyright
holder has agreed to this, it only enforces that the person
*uploading* the code into Gerrit has agreed to it... and section 7
of the ICLA has some interesting things to say about submitting
third-party contributions, which looks to me like a permitted
loophole for getting ASL code into the project without the author
directly agreeing to a CLA at all.


7. Should You wish to submit work that is not Your original
creation, You may submit it to the Project Manager separately
from any Contribution, identifying the complete details of its
source and of any license or other restriction (including, but
not limited to, related patents, trademarks, and license
agreements) of which you are personally aware, and conspicuously
marking the work as "Submitted on behalf of a third-party:
[named here]".


I wonder if the current de facto practice of allowing git's author
header to reflect the identity of the third-party counts as a
conspicuous mark for the purposes of ICLA section 7? And whether
submitting it to Gerrit where it can be openly inspected by the
entire project counts as a submission to the Project Manager (the
OpenStack Foundation) as well? At any rate, it seems that the
agreement boils down to "copyright holders promise that they're
contributing code under this license, or that they're submitting
someone else's work who probably is okay with it."


That's exactly what it boils down to, and coincidentally exactly what 
the requirement to list copyright holders in Debian also boils down to 
afaict. We sh

Re: [openstack-dev] [Heat] Network topologies

2013-10-28 Thread Zane Bitter

On 27/10/13 16:37, Edgar Magana wrote:

Heat Developers,

I am one of the core developers for Neutron who is lately working on the
concept of "Network Topologies". I want to discuss with you if the
following blueprint will make sense to have in heat or neutron code:
https://blueprints.launchpad.net/neutron/+spec/network-topologies-api

Basically, I want to let tenants “save”, “duplicate” and “share” network
topologies by means of an API and a standardized JSON format that
describes network topologies. This google document provides detailed
description:
https://docs.google.com/document/d/1nPkLcUma_nkmuHYxCuUZ8HuryH752gQnkmrZdXeE2LM/edit#


It sounds to me like the only thing there that Heat is not already doing 
is to dump the existing network configuration. What if you were to 
implement just that part and do it in the format of a Heat template? (An 
independent tool to convert the JSON output to a Heat template would 
also work, I guess.)


A non-Heat JSON output like the one in the document might conceivably be 
easier for people to build tools such as visualisations on top of. On 
the other hand, Heat templates can take advantage of the visualisation 
code that already exists in Horizon.


If this becomes read-only then maybe the API can change to something like:

/v2/{tenant_id}/networks/{network_id}/topology


There is a concern in Neutron of not duplicating efforts done by Heat
team and also to find the right place for this API.

The intended work does NOT include any application driven orchestration
system, does NOT include any already existing or vender-specific
standard format for describing topologies, actually we want to
standardized one based on Neutron but it is still on discussion.

If Heat developers could provide their point of view about this
proposal, wether it should be moved to Heat or it is fine to keep it in
Neutron.


It does sound very much like you're trying to solve the same problem as 
Heat.


cheers,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Network topologies [and more]

2013-10-28 Thread Zane Bitter

On 28/10/13 15:07, Mike Spreitzer wrote:

Zane Bitter  wrote on 10/28/2013 06:47:50 AM:
 > On 27/10/13 16:37, Edgar Magana wrote:
 > > Heat Developers,
 > >
 > > I am one of the core developers for Neutron who is lately working
on the
 > > concept of "Network Topologies". I want to discuss with you if the
 > > following blueprint will make sense to have in heat or neutron code:
 > > https://blueprints.launchpad.net/neutron/+spec/network-topologies-api
 > >
 > > ...
 >
 > It sounds to me like the only thing there that Heat is not already doing
 > is to dump the existing network configuration. What if you were to
 > implement just that part and do it in the format of a Heat template? (An
 > independent tool to convert the JSON output to a Heat template would
 > also work, I guess.)
 >
 > ...
 >
 > It does sound very much like you're trying to solve the same problem as
 > Heat.
 >

In my templates I have more than a network topology.  How would I
combine the extracted/shared network topology with the other stuff I
want in my heat template?


Copy and paste?


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Comments on Steve Baker's Proposal on HOT Software Config

2013-10-29 Thread Zane Bitter

On 28/10/13 04:23, Lakshminaraya Renganarayana wrote:

Sorry, Re-posting this with [Heat] in the subject line, because many of
us have filters based on [Heat] in the subject line.

Hello,

A few of us at IBM studied Steve Baker's proposal on HOT Software
Configuration. Overall the proposed constructs and syntax are great --
we really like the clean syntax and concise specification of components.
We would like to propose a few minor extensions that help with better
expression of dependencies among components and resources, and in-turn
enable cross-vm coordination. We have captured our thoughts on this on
the following Wiki page
https://wiki.openstack.org/wiki/Heat/Blueprints/hot-software-config-ibm-response

We would like to discuss these further ... please post your comments and
suggestions.


Thanks for posting this! It seems to me like there may be one basic idea 
behind many of the differences between this and Steve's proposal. I'm 
going to try to explain the dichotomy, and let me know if I haven't 
understood correctly, but if I have then this is probably the first 
question we want to answer. (Note that some of the suggestions are 
definitely applicable regardless.)


So basically, the two options I see are:

1) Configurations represent entire software 'packages'; outputs are 
provided asynchronously as they become available and dependent packages 
can synchronise on them. Basically like WaitConditions now. (Steve B.)
2) Configurations represent individual steps in the deployment of a 
package; outputs are provided at the end of each configuration and 
dependent packages are synchronised purely on the graph of configs. 
(Lakshmi et. al.)


(I wrote this before seeing Georgy's comments, but I believe he has 
picked up on the same thing.)


I'm on the fence here. I lean toward the latter for 
implementation-related reasons, though the former seems like it would 
probably be tidier for users. I think getting a concrete idea of how 
outputs would be signalled within the config management system in the 
former system would allow us to compare it better to the latter system 
(which is described to a moderate level of detail already on this wiki 
page).



As brief feedback on these suggestions:
E1: Probably +1 for inputs, but tentative -1 for attributes. I'm not 
sure we can check anything useful with that other than internal 
consistency of the template. I'd like to see some more detail about how 
inputs/outputs would be exposed in the configuration management systems 
- or, more specifically, how the user can extend this to arbitrary 
configuration management systems.


E2: +1 for Opt1
-1 for Opt2 (mixing namespaces is bad)
-1 for Opt3

E3: Sounds like a real issue (also, the solution for E2 has to take this 
into account too); not sure about the implementation. In this method 
(i.e. option (2) above) shouldn't we be building the dependency graph in 
Heat rather than running through them sequentially as specified by the 
user? In that case, we should use a dictionary not a list:


  app_server:
    type: OS::Nova::Server
    properties:
      components:
        install_user_profile:
          definition: InstallWasProfile
          params:
            user_id
        install_admin_profile:
          definition: InstallWasProfile
          params:
            admin_id

E4: +1

E5: +1 but a question on where this is specified. In the component 
definition itself, or in the particular invocation of it on a server? 
Seems like it would have to be the latter.


cheers,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Comments on Steve Baker's Proposal on HOT Software Config

2013-10-29 Thread Zane Bitter

On 28/10/13 14:53, Steven Hardy wrote:

On Sun, Oct 27, 2013 at 11:23:20PM -0400, Lakshminaraya Renganarayana wrote:

A few of us at IBM studied Steve Baker's proposal on HOT Software
Configuration. Overall the proposed constructs and syntax are great -- we
really like the clean syntax and concise specification of components. We
would like to propose a few minor extensions that help with better
expression of dependencies among components and resources, and in-turn
enable cross-vm coordination. We have captured our thoughts on this on the
following Wiki page

https://wiki.openstack.org/wiki/Heat/Blueprints/hot-software-config-ibm-response


Thanks for putting this together.  I'll post inline below with cut/paste
from the wiki followed by my response/question:


E2: Allow usage of component outputs (similar to resources):
There are fundamental differences between components and resources...


So... lately I've been thinking this is not actually true, and that
components are really just another type of resource.  If we can implement
the software-config functionality without inventing a new template
abstraction, IMO a lot of the issues described in your wiki page no longer
exist.

Can anyone provide me with a clear argument for what the "fundamental
differences" actually are?


Here's an argument: Component deployments exist within a server 
resource, so the dependencies don't work in the same way. The static 
part of the configuration has to happen before the server is created, 
but the actual runtime part is created after. So the dependencies are 
inherently circular.



My opinion is we could do the following:
- Implement software config "components" as ordinary resources, using the
   existing interfaces (perhaps with some enhancements to dependency
   declaration)
- Give OS::Nova::Server a components property, which simply takes a list of
   resources which describe the software configuration(s) to be applied


I think to overcome the problem described above, we would also need to 
create a third type of resource. So we'd have a Configuration, a Server 
and a Deployment. (In dependency terms, these are analogous to 
WaitConditionHandle, Server and WaitCondition, or possibly EIP, Server 
and EIPAssociation.) The deployment would reference the server and the 
configuration, you could pass parameters into it, get attributes out of
it, add explicit dependencies on it &c.
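
To make that three-resource shape concrete, a rough sketch follows; the 
My::Software::Config / My::Software::Deployment type names, their 
properties and the deploy_stdout attribute are invented purely for 
illustration and are not an existing Heat API:

  heat_template_version: 2013-05-23

  resources:
    web_config:                       # the Configuration (cf. WaitConditionHandle / EIP)
      type: My::Software::Config      # hypothetical type name
      properties:
        script: |
          #!/bin/sh
          echo "configuring the web tier"

    web_server:                       # the Server
      type: OS::Nova::Server
      properties:
        image: fedora-20              # placeholder values
        flavor: m1.small

    web_deployment:                   # the Deployment (cf. WaitCondition / EIPAssociation)
      type: My::Software::Deployment  # hypothetical type name
      properties:
        config: { get_resource: web_config }
        server: { get_resource: web_server }
        input_values:
          listen_port: 8080

  outputs:
    deploy_stdout:
      value: { get_attr: [ web_deployment, deploy_stdout ] }  # hypothetical attribute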


What I'm not clear on in this model is how much of the configuration 
needs to be built to go onto the server (in the UserData?) before the 
server is created, and how that would be represented in such a way as to 
inherently create the correct dependency relationship (i.e. get the 
_Server_ as well as the Deployment to depend on the configuration).



This provides a lot of benefits:
- Uniformity of interfaces (solves many of the interface-mapping issues you
   discuss in the wiki)
- Can use provider resources and environments functionality unmodified
- Conceptually simple, we don't have to confuse everyone with a new
   abstraction sub-type and related terminology
- Resources describing software components will be stateful, as described
   in (E4), only the states would be the existing resource states, e.g
   CREATE, IN_PROGRESS == CONFIGURING, and CREATE, COMPLETE ==
   CONFIG_COMPLETE


+1 if we can do it.

cheers,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Network topologies

2013-10-29 Thread Zane Bitter

On 29/10/13 19:33, Edgar Magana wrote:

Tim,

You statement "building an api that manages a network topology more than
one that needs to build out the dependencies between resources to help
create the network topology"
Is exactly what we are proposing, and this is why we believe this is not
under Heat domain.

This is why we are NOT proposing to manage any dependency between network
elements, that part is what I call "intelligence" of the orchestration and
we are not proposing any orchestration system, you are already have that
in place :-)


Well, if you don't manage the dependencies then it won't work. 
Dependencies are not dependencies by definition unless it won't work 
without them. What I think you mean is that you infer the dependencies 
internally to Neutron and don't include them in the artefact that you 
give to the user. Although actually you probably do, it's just not quite 
as explicit.


So it's like Heat, but it only handles networks and the templates are 
harder to read and write.



So, we simple want an API that tenats may use to "save", "retrieve" and
"share" topologies. For instance, tenant A creates a topology with two
networks (192.168.0.0/24 and 192.168.1.0/24) both with dhcp enabled and a
router connecting them. So, we first create it using CLI commands or
Horizon and then we call the API to save the topology for that tenant,
that topology can be also share  between tenants if the owner wanted to do
that, the same concept that we have in Neutron for "share networks", So
Tenant B or any other Tenants, don't need to re-create the whole topology,
just "open" the shared topology from tenant A. Obviously, overlapping IPs
will be a "must" requirement.


So, to be clear, my interpretation is that in this case you will spit 
out a JSON file to the user in tenantA that says "two networks 
(192.168.0.0/24 and 192.168.1.0/24) both with dhcp enabled and a router 
connecting them" (BTW does "networks" here mean "subnets"?) and a user 
in tenant B loads the JSON file and it creates two *different* networks 
(192.168.0.0/24 and 192.168.1.0/24) both with dhcp enabled and a router 
connecting them in tenantB.
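
For comparison, that same tenant-A topology expressed as a Heat template 
would look roughly like the sketch below, using the Neutron resource 
types Heat already ships (exact property names may differ slightly):

  heat_template_version: 2013-05-23

  resources:
    net_a:
      type: OS::Neutron::Net
    subnet_a:
      type: OS::Neutron::Subnet
      properties:
        network_id: { get_resource: net_a }
        cidr: 192.168.0.0/24
        enable_dhcp: true

    net_b:
      type: OS::Neutron::Net
    subnet_b:
      type: OS::Neutron::Subnet
      properties:
        network_id: { get_resource: net_b }
        cidr: 192.168.1.0/24
        enable_dhcp: true

    router:
      type: OS::Neutron::Router
    router_if_a:
      type: OS::Neutron::RouterInterface
      properties:
        router_id: { get_resource: router }
        subnet_id: { get_resource: subnet_a }
    router_if_b:
      type: OS::Neutron::RouterInterface
      properties:
        router_id: { get_resource: router }
        subnet_id: { get_resource: subnet_b }

A second tenant creating a stack from this template gets two *new* 
networks with the same addressing, which is exactly the 
duplicate-rather-than-share behaviour described above.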


I just want to confirm that, because parts of your preceding paragraph 
could be read as implying that you just open up access to tenant A's 
networks from tenant B, rather than creating new ones.



I am including in this thread to Mark McClain who is the Neutron PTL and
the main guy expressing concerns in not  having overlapping
functionalities between Neutron and Heat or any other project.


I think he is absolutely right.


I am absolutely, happy to discuss further with you but if you are ok with
this approach we could start the development in Neutron umbrella, final
thoughts?


I stand by my original analysis that the input part of the API is 
basically just a subset of Heat reimplemented inside Neutron.


As a consumer of the Neutron API, this is something that we really 
wouldn't want to interact with, because it duplicates what we do in a 
different way and that just makes everything difficult for our model.


I strongly recommend that you only implement the output side of the 
proposed API, and that the output be in the format of a Heat template. 
As Clint already pointed out, when combined with the proposed stack 
adopt/abandon features (http://summit.openstack.org/cfp/details/200) 
this will give you a very tidy interface to exactly the functionality 
you want without reinventing half of Heat inside Neutron.


We would definitely love to discuss this stuff with you at the Design 
Summit. So far, however you don't seem to have convinced anybody that 
this does not overlap with Heat. That would appear to forecast a high 
probability of wasted effort were you to embark on implementing the 
blueprint as written before then.


cheers,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Comments on Steve Baker's Proposal on HOT Software Config

2013-10-30 Thread Zane Bitter

On 30/10/13 20:35, Lakshminaraya Renganarayana wrote:

 >I'd like to see some more detail about how
 > inputs/outputs would be exposed in the configuration management systems
 > - or, more specifically, how the user can extend this to arbitrary
 > configuration management systems.

The way inputs/outputs are exposed in a CM system
would depend on its conventions. In our use with Chef, we expose these
inputs and outputs as a Chef's node attributes, i.e., via the node[][]
hash. I could imagine a similar scheme for Puppet. For a shell type of
CM provider the inputs/outputs can be exposed as Shell environment
variables. To avoid name conflicts, these inputs/outputs can be prefixed
by a namespace, say Heat.


Right, so who writes the code that exposes the inputs/outputs to the CM 
system in that way? If it is the user, where does that code go and how 
does it work? And if it's not the user, how would the user accommodate a 
CM system that has not been envisioned by their provider? That's what 
I'm trying to get at with this question.


cheers,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [trove] [heat] Multi region support

2015-09-01 Thread Zane Bitter

On 01/09/15 11:41, Lowery, Mathew wrote:

This is a Trove question but including Heat as they seem to have solved
this problem.

Summary: Today, it seems that Trove is not capable of creating a cluster
spanning multiple regions. Is that the case and, if so, are there any
plans to work on that? Also, are we aware of any precedent solutions
(e.g. remote stacks in Heat) or even partially completed spec/code in Trove?

More details:

I found this nice diagram created for Heat. As far as I understand it,


Clarifications below...


#1 is the absence of multi-region
support (i.e. what we have today). #2 seems to be a 100% client-based
solution. In other words, the Heat services never know about the other
stacks.


I guess you could say that.


In fact, there is nothing tying these stacks together at all.


I wouldn't go that far. The regional stacks still appear as resources in 
their parent stack, so they're tied together by whatever inputs and 
outputs are connected up in that stack.



#3
seems to show a "master" Heat server that understands "remote stacks"
and simply converts those "remote stacks" into calls on regional Heats.
I assume here the master stack record is stored by the master Heat.
Because the "remote stacks" are full-fledged stacks, they can be managed
by their regional Heats if availability of master or other regional
Heats is lost.


Yeah.


#4, the diagram doesn't seem to match the description
(instead of one global Heat, it seems the diagram should show two
regional Heats).


It does (they're the two orange boxes).


In this one, a single arbitrary region becomes the
owner of the stack and remote (low-level not stack) resources are
created as needed. One problem is the manageability is lost if the Heat
in the owning region is lost. Finally, #5. In #5, it's just #4 but with
one and only one Heat.

It seems like Heat solved this 
using #3 (Master Orchestrator)


No, we implemented #2.


but where there isn't necessarily a
separate master Heat. Remote stacks can be created by any regional stack.


Yeah, that was the difference between #3 and #2 :)
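
As a minimal sketch of that #2 arrangement, assuming the OS::Heat::Stack 
resource's context/region_name property (the regional.yaml child 
template, its vip output, the peer_address parameter and the region 
names are all placeholders):

  heat_template_version: 2015-04-30

  resources:
    stack_region_one:
      type: OS::Heat::Stack
      properties:
        context:
          region_name: RegionOne              # placeholder region name
        template: { get_file: regional.yaml }
        parameters:
          # wiring an output of one regional stack into another is what
          # ties the pieces together in the parent stack
          peer_address: { get_attr: [ stack_region_two, outputs, vip ] }

    stack_region_two:
      type: OS::Heat::Stack
      properties:
        context:
          region_name: RegionTwo              # placeholder region name
        template: { get_file: regional.yaml }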

cheers,
Zane.


Trove questions:

 1. Having sub-clusters (aka remote clusters aka nested clusters) seems
to be useful (i.e. manageability isn't lost when a region is lost).
But then again, does it make sense to perform a cluster operation on
a sub-cluster?
 2. You could forego sub-clusters and just create full-fledged remote
standalone Trove instances.
 3. If you don't create full-fledged remote Trove instances (but instead
just call remote Nova), then you cannot do simple things like
getting logs from a node without going through the owning region's
Trove. This is an extra hop and a single point of failure.
 4. Even with sub-clusters, the only record of them being related lives
only in the "owning" region. Then again, some ID tying them all
together could be passed to the remote regions.
 5. Do we want to allow the piecing together of clusters (sort of like
Heat's "adopt")?

These are some questions floating around my head and I'm sure there are
plenty more. Any thoughts on any of this?

Thanks,
Mat


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [trove] [heat] Multi region support

2015-09-02 Thread Zane Bitter

On 01/09/15 19:47, Angus Salkeld wrote:

On Wed, Sep 2, 2015 at 8:30 AM Lowery, Mathew <mlow...@ebay.com> wrote:

Thank you Zane for the clarifications!

I misunderstood #2 and that led to the other misunderstandings.

Further questions:
* Are nested stacks aware of their nested-ness? In other words,
given any
nested stack (colocated with parent stack or not), can I trace it
back to
the parent stack? (On a possibly related note, I see that adopting a
stack


Yes, there is a link (url) to the parent_stack in the links section of
show stack.


That's true only for resources which derive from StackResource, and 
which are manipulated through the RPC API. Mat was, I think, asking 
specifically about OS::Heat::Stack resources, which may (or may not) be 
in remote regions and are manipulated through the ReST API. Those ones 
are not aware of their nested-ness.



is an option to reassemble a new parent stack from its regional parts in
the event that the old parent stack is lost.)
* Has this design met the users' needs? In other words, are there any
plans to make major modifications to this design?


AFAIK we have had zero feedback from the multi region feature.
No more plans, but we would obviously love feedback and suggestions
on how to improve region support.


Yeah, this has not been around so long that there has been a lot of 
feedback.


I know people want to also do multi-cloud (i.e. where the remote region 
has a different keystone). It's tricky to implement because we need 
somewhere to store the credentials... we'll possibly end up saying that 
Keystone federation is required, and then we'll only have to pass the 
keystone auth URL in addition to what we already have.


cheers,
Zane.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] convergence rally test results (so far)

2015-09-02 Thread Zane Bitter

On 02/09/15 04:55, Steven Hardy wrote:

On Wed, Sep 02, 2015 at 04:33:36PM +1200, Robert Collins wrote:

On 2 September 2015 at 11:53, Angus Salkeld  wrote:


1. limit the number of resource actions in parallel (maybe base on the
number of cores)


I'm having trouble mapping that back to 'and heat-engine is running on
3 separate servers'.


I think Angus was responding to my test feedback, which was a different
setup, one 4-core laptop running heat-engine with 4 worker processes.

In that environment, the level of additional concurrency becomes a problem
because all heat workers become so busy that creating a large stack
DoSes the Heat services, and in my case also the DB.

If we had a configurable option, similar to num_engine_workers, which
enabled control of the number of resource actions in parallel, I probably
could have controlled that explosion in activity to a more managable series
of tasks, e.g I'd set num_resource_actions to (num_engine_workers*2) or
something.


I think that's actually the opposite of what we need.

The resource actions are just sent to the worker queue to get processed 
whenever. One day we will get to the point where we are overflowing the 
queue, but I guarantee that we are nowhere near that day. If we are 
DoSing ourselves, it can only be because we're pulling *everything* off 
the queue and starting it in separate greenthreads.


In an ideal world, we might only ever pull one task off that queue at a 
time. Any time the task is sleeping, we would use for processing stuff 
off the engine queue (which needs a quick response, since it is serving 
the ReST API). The trouble is that you need a *huge* number of 
heat-engines to handle stuff in parallel. In the reductio-ad-absurdum 
case of a single engine only processing a single task at a time, we're 
back to creating resources serially. So we probably want a higher number 
than 1. (Phase 2 of convergence will make tasks much smaller, and may 
even get us down to the point where we can pull only a single task at a 
time.)


However, the fewer engines you have, the more greenthreads we'll have to 
allow to get some semblance of parallelism. To the extent that more 
cores means more engines (which assumes all running on one box, but 
still), the number of cores is negatively correlated with the number of 
tasks that we want to allow.


Note that all of the greenthreads run in a single CPU thread, so having 
more cores doesn't help us at all with processing more stuff in parallel.


cheers,
Zane.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

