On 15.1.2014 14:07, James Slagle wrote:
I'll start by laying out how I see editing or updating nodes working
in TripleO without Tuskar:

To do my initial deployment:
1.  I build a set of images for my deployment, one per role. The
images differ based on their role and only contain the software
components needed for the role they are intended to fill.
2.  I load the images into glance
3.  I create the Heat template for my deployment, likely from
fragments that are already available. Set quantities and indicate
which images (via image UUID) are used for which resources in Heat.
4.  heat stack-create with my template to do the deployment
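
Assuming the role images from step 1 are already built, a minimal sketch
of steps 2-4 with the standard CLIs might look like the following (the
image name, template file and Heat parameter name are made up for
illustration, not taken from an actual overcloud template):

    # Step 2: load the image into Glance and note the UUID it returns
    glance image-create --name overcloud-compute \
        --disk-format qcow2 --container-format bare \
        --file overcloud-compute.qcow2

    # Steps 3-4: reference that UUID in the template (or via a
    # parameter; "NovaImage" here is a hypothetical parameter name)
    # and deploy
    heat stack-create overcloud -f overcloud.yaml \
        -P "NovaImage=<image uuid from step 2>"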

To update my deployment:
1.  If I need to edit a role (or create a new one), I create a new image.
2.  I load the new image(s) into glance
3.  I edit my Heat template, update any quantities, update any image uuids, etc.
4.  heat stack-update my deployment
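
Again just a sketch, reusing the made-up names from above:

    # Upload the rebuilt image as a new Glance image (new UUID)
    glance image-create --name overcloud-compute-v2 \
        --disk-format qcow2 --container-format bare \
        --file overcloud-compute.qcow2

    # Point the template/parameters at the new UUID, adjust quantities
    # in the template if needed, and push the change
    heat stack-update overcloud -f overcloud.yaml \
        -P "NovaImage=<new image uuid>"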

In both cases above, I see the role of Tuskar being around steps 3 and 4.

+1. Although it's worth noting that if we want zero-downtime updates, we'll probably need the ability to migrate content off the machines being updated - that would be a pre-3 step. (And for that we need spare capacity equal to the number of nodes being updated, so in the future we'll probably want to update in chunks rather than the whole overcloud at once.)


I may be misinterpreting, but let me say that I don't think Tuskar
should be building images. There's been a fair amount of discussion
around a Nova-native image building service [1][2]. I'm actually not
sure what the status/consensus on that is, but maybe longer term
Tuskar might call an API to kick off an image build.

Yeah, I don't think image building should be driven through the Tuskar API (and probably not even the Tuskar UI?). Tuskar should just fetch images from Glance, IMHO. However, we should be aware that image building *is* our concern, as it's an important prerequisite for deployment. We should at least provide directions on how to easily build images for use with Tuskar, not leave users in doubt.
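
For example, the directions could boil down to one documented
diskimage-builder invocation per role (the element lists below are
illustrative, not a tested recipe):

    # Control-plane image (illustrative element list)
    disk-image-create -a amd64 -o overcloud-control fedora boot-stack

    # Compute image
    disk-image-create -a amd64 -o overcloud-compute fedora nova-compute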

<snip>

"We will have to store image metadata in tuskar probably, that would map to
glance, once the image is generated. I would say we need to store the list
of the elements and probably the commit hashes (because elements can
change). Also it should be versioned, as the images in glance will be also
versioned.

I'm not sure why this image metadata would be in Tuskar. I definitely
like the idea of knowing the versions/commit hashes of the software
components in your images, but that should probably be in Glance.
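
For instance, whatever builds and uploads the image could attach that
information as free-form Glance image properties; something along these
lines (the property names here are made up):

    glance image-update <image-uuid> \
        --property dib_elements="boot-stack nova-compute" \
        --property dib_elements_commit="abc1234"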

+1


We probably can't store it in Glance, because we will first store the
metadata and then generate the image. Right?

I'm not sure I follow this point. But, mainly, I don't think Tuskar
should be automatically generating images.

+1


Then we could see whether an image was created from the metadata and whether
that image was used in the Heat template. With versions we could also see
what has changed.

We'll be able to tell what image was used in the Heat template, and
thus the deployment, based on its UUID.

I love the idea of seeing differences between images, especially
installed software versions, but I'm not sure that belongs in Tuskar.
That sort of utility functionality seems like it could apply to any
image you might want to launch in OpenStack, not just to do a
deployment.  So, I think it makes sense to have that as Glance
metadata or in Glance somehow. For instance, if I wanted to launch an
image that had a specific version of apache, it'd be nice to be able
to see that when I'm choosing an image to launch.

Yes. We might want to show the data to the user, but I don't see a need to run this through the Tuskar API. Tuskar UI could query Glance directly and display the metadata to the user. (When using the CLI, one could use the Glance CLI directly. We're not adding any special logic on top.)
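
In other words, something as simple as the following, with no Tuskar
involvement (property names match whatever was set at build time; the
property filter is an assumption about the Glance CLI's filtering
support):

    # Show everything Glance knows about an image, including custom properties
    glance image-show <image-uuid>

    # List only images built with a given element set (assumes property
    # filter support in the Glance CLI)
    glance image-list --property-filter dib_elements=nova-compute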


But there was also the idea that there would be some generic image containing
all services, and we would just configure which services to start. In that
case we would need to version this as well.

-1 to this. I think we should stick with specialized images per role.
I replied on the wireframes thread, but I don't see how
enabling/disabling services in a prebuilt image would work. Plus, I
don't really think it fits with the TripleO model of having an image
created based on its specific "role" (I hate to use that term and
muddy the water... I mean it in the generic sense here).


= New Comments =

My comments on this train of thought:

- I'm afraid of the idea of applying changes immediately, for the same
reasons I'm worried about a few other things. Very little of what we do will
actually finish executing immediately; instead these will be long-running
operations. If I edit a few roles in a row, we're looking at a lot of
outstanding operations executing against other OpenStack pieces (namely
Heat).

The immediate-apply approach also suffers from a sort of "Oh shit, that's not
what I meant" moment when hitting save. There's no way for the user to review
the larger picture before deciding to make it so.

+1

Yeah, we probably can't immediately update everything. Apart from the "that's not what I meant" problem, this would probably not work when attempting a zero-downtime update.

On the other hand, I'd say we should aim at keeping all machines in sync as much as possible. So it would be nice to have machines somehow displayed as "needs update", and the user could then say "OK, update these machines".

We need some sort of task tracking that prevents overlapping operations from
executing at the same time. Tuskar needs to know what's happening instead of
simply having the UI fire off requests to other OpenStack components when the
user presses a button.

To rehash an earlier argument, this is why I advocate for having the
business logic in the API itself instead of in the UI. Even if it's just a
queue to make sure operations don't execute concurrently (that's not enough
IMO, but as an example), the server is where that sort of orchestration
should take place, and it should be able to understand the differences
between the configured state in Tuskar and the actual deployed state.

Just to clarify: I think the prevailing opinion on the list was that there should be no significant business logic in the UI. It should either live in a library (e.g. as a separate part of tuskarclient) or in the API. For details, there's a long openstack-dev thread [5].

Even though I'm still not 100% convinced which path is right, I see that having logic in the API will let us do long-running tasks and track their progress (e.g. the chunked updates I mentioned earlier, if we want to do them in the future). However, we should be careful not to mirror data in our DB that belongs elsewhere.


I'm off topic a bit, though. Rather than talk about how we pull it off, I'd
like to come to an agreement on what the actual policy should be. My
concerns center on the time it takes to create an image and get it into
Glance, where it's available to actually be deployed. When do we bite that
time off, and how do we let the user know whether it's ready yet?

I think this becomes simpler if you're not worried about building
images. Even so, some task tracking will likely be needed. TaskFlow [3]
and Mistral [4] may be relevant.


- Editing a node is going to run us into versioning complications. So far,
all we've entertained are ways to map a node back to the resource category
it was created under. If the configuration of that category changes, we have
no way of indicating that the node is out of sync.

We could store versioned resource categories in the Tuskar DB and have the
version information also find its way to the nodes (note: the idea is to use
the metadata field on a Heat resource to store the res-cat information, so
including version is possible). I'm less concerned with eventual reaping of
old versions here since it's just DB data, though we still hit the question
of when to delete old images.

Is "resource category" the same as "role"? Sorry :), I probably need to
go back and re-read the terminology thread. If so, I think versioning
them in the Tuskar DB makes sense. That way you know what's been
deployed and what hasn't, as well as any differences.

I'm wondering if we can fly without Resource Category versioning and linking the versions to nodes, but maybe we can't. If I update a Resource Category config but the image stays the same, I don't have any means of learning which nodes run the updated config, do I? So maybe we'll need to keep track of this ourselves. However, I'd like to keep such node metadata in Ironic, not in Tuskar.

This is a wider-reaching issue - we should try to avoid creating Node and Image resources in Tuskar if possible. I think these should belong to Ironic and Glance, respectively.
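
One possible way to do that (just a sketch; the extra/ key name is made
up) would be to tag nodes through Ironic's existing node-update call
rather than adding a Tuskar-side table:

    # Record which Resource Category config version a node was deployed with
    ironic node-update <node-uuid> add extra/resource_category_version=2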


- For the comment on a generic image with service configuration, the first
thing that came to mind was the thread on creating images from packages [1].
It's not the exact same problem, but see Clint Byrum's comments in there
about drift. My gut feeling is that having specific images for each res-cat
will be easier to manage than trying to edit what services are running on a
node.

+1.

+1. An image per role will mean greater flexibility. I imagine that in advanced deployments users might want to deploy resource categories that do not run OpenStack services at all (e.g. RDO Foreman has such components that help with high availability / load balancing setups).


[1] http://lists.openstack.org/pipermail/openstack-dev/2013-August/013122.html
[2] https://wiki.openstack.org/wiki/NovaImageBuilding
[3] https://wiki.openstack.org/wiki/TaskFlow
[4] https://wiki.openstack.org/wiki/Mistral
[5] http://lists.openstack.org/pipermail/openstack-dev/2013-December/021919.html

