On 20 January 2016 at 10:03, Jiří Stránský <ji...@redhat.com> wrote:
> On 18.1.2016 19:49, Tzu-Mainn Chen wrote:
>
>> ----- Original Message -----
>>
>>> On Thu, 2016-01-14 at 16:04 -0500, Tzu-Mainn Chen wrote:
>>>
>>>> ----- Original Message -----
>>>>
>>>>> On Wed, Jan 13, 2016 at 04:41:28AM -0500, Tzu-Mainn Chen wrote:
>>>>>
>>>>>> Hey all,
>>>>>>
>>>>>> I realize now from the title of the other TripleO/Mistral thread [1]
>>>>>> that the discussion there may have gotten confused. I think using
>>>>>> Mistral for TripleO processes that are obviously workflows - stack
>>>>>> deployment, node registration - makes perfect sense. That thread is
>>>>>> exploring practicalities for doing that, and I think that's great
>>>>>> work.
>>>>>>
>>>>>> What I inappropriately started to address in that thread was a
>>>>>> somewhat orthogonal point that Dan asked in his original email,
>>>>>> namely:
>>>>>>
>>>>>> "what it might look like if we were to use Mistral as a replacement
>>>>>> for the TripleO API entirely"
>>>>>>
>>>>>> I'd like to create this thread to talk about that; more of a
>>>>>> 'should we' than 'can we'. And to do that, I want to indulge in a
>>>>>> thought exercise stemming from an IRC discussion with Dan and
>>>>>> others. All, please correct me if I've misstated anything.
>>>>>>
>>>>>> The IRC discussion revolved around one use case: deploying a Heat
>>>>>> stack directly from a Swift container. With an updated patch, the
>>>>>> Heat CLI can support this functionality natively. Then we don't
>>>>>> need a TripleO API; we can use Mistral to access that
>>>>>> functionality, and we're done, with no need for additional code
>>>>>> within TripleO. And, as I understand it, that's the true motivation
>>>>>> for using Mistral instead of a TripleO API: avoiding custom code
>>>>>> within TripleO.
>>>>>>
>>>>>> That's definitely a worthy goal... except from my perspective, the
>>>>>> story doesn't quite end there.
>>>>>> A GUI needs additional functionality, which boils down to:
>>>>>> understanding the Heat deployment templates in order to provide
>>>>>> options for a user; and persisting those options within a Heat
>>>>>> environment file.
>>>>>>
>>>>>> Right away I think we hit a problem. Where does the code for
>>>>>> 'understanding options' go? Much of that understanding comes from
>>>>>> the capabilities map in tripleo-heat-templates [2]; it would make
>>>>>> sense to me that responsibility for that would fall to a TripleO
>>>>>> library.
>>>>>>
>>>>>> Still, perhaps we can limit the amount of TripleO code. So to give
>>>>>> API access to 'getDeploymentOptions', we can create a Mistral
>>>>>> workflow.
>>>>>>
>>>>>> Retrieve Heat templates from Swift -> Parse capabilities map
>>>>>>
>>>>>> Which is fine-ish, except from an architectural perspective
>>>>>> 'getDeploymentOptions' violates the abstraction layer between
>>>>>> storage and business logic, a problem that is compounded because
>>>>>> 'getDeploymentOptions' is not the only functionality that accesses
>>>>>> the Heat templates and needs exposure through an API. And, as has
>>>>>> been discussed on a separate TripleO thread, we're not even sure
>>>>>> Swift is sufficient for our needs; one possible consideration right
>>>>>> now is allowing deployment from templates stored in multiple
>>>>>> places, such as the file system or git.
>>>>>
>>>>> Actually, that whole capabilities map thing is a workaround for a
>>>>> missing feature in Heat, which I have proposed, but am having a hard
>>>>> time reaching consensus on within the Heat community:
>>>>>
>>>>> https://review.openstack.org/#/c/196656/
>>>>>
>>>>> Given that is a large part of what's anticipated to be provided by
>>>>> the proposed TripleO API, I'd welcome feedback and collaboration so
>>>>> we can move that forward, vs solving only for TripleO.
>>>>>
>>>>>> Are we going to have duplicate 'getDeploymentOptions' workflows for
>>>>>> each storage mechanism? If we consolidate the storage code within a
>>>>>> TripleO library, do we really need a *workflow* to call a single
>>>>>> function? Is a thin TripleO API that contains no additional
>>>>>> business logic really so bad at that point?
>>>>>
>>>>> Actually, this is an argument for making the validation part of the
>>>>> deployment a workflow - then the interface with the storage
>>>>> mechanism becomes more easily pluggable vs baked into an
>>>>> opaque-to-operators API.
>>>>>
>>>>> E.g., in the long term, imagine the capabilities feature exists in
>>>>> Heat, you then have a pre-deployment workflow that looks something
>>>>> like:
>>>>>
>>>>> 1. Retrieve golden templates from a template store
>>>>> 2. Pass templates to Heat, get capabilities map which defines
>>>>>    features user must/may select.
>>>>> 3. Prompt user for input to select required capabilities
>>>>> 4. Pass user input to Heat, validate the configuration, get a
>>>>>    mapping of required options for the selected capabilities
>>>>>    (nested validation)
>>>>> 5. Push the validated pieces ("plan" in TripleO API terminology) to
>>>>>    a template store
>>>>>
>>>>> This is a pre-deployment validation workflow, and it's a superset of
>>>>> the getDeploymentOptions feature you refer to.
>>>>>
>>>>> Historically, TripleO has had a major gap wrt workflow, meaning that
>>>>> we've always implemented it either via shell scripts
>>>>> (tripleo-incubator) or python code (tripleo-common/tripleo-client,
>>>>> potentially TripleO API).
>>>>>
>>>>> So I think what Dan is exploring is, how do we avoid reimplementing
>>>>> a workflow engine, when a project exists which already does that.
>>>>>
>>>>>> My gut reaction is to say that proposing Mistral in place of a
>>>>>> TripleO API is to look at the engineering concerns from the wrong
>>>>>> direction. The Mistral alternative comes from a desire to limit
>>>>>> custom TripleO code at all costs. I think that is an extremely
>>>>>> dangerous attitude that leads to compromises and workarounds that
>>>>>> will quickly lead to a shaky code base full of design flaws that
>>>>>> make it difficult to implement or extend any functionality cleanly.
>>>>>
>>>>> I think it's not about limiting TripleO code at all costs, it's
>>>>> about learning from past mistakes, where long-term TripleO specific
>>>>> workarounds for gaps in other projects have become serious technical
>>>>> debt.
>>>>>
>>>>> For example, the old merge.py approach to template composition was a
>>>>> workaround for missing heat features, then Tuskar was another
>>>>> workaround (arguably) for missing heat features, and now we're again
>>>>> proposing a long-term workaround for some missing heat features,
>>>>> some of which are already proposed (referring to the API for
>>>>> capabilities resolution).
>>>>
>>>> This is an important point, thanks for bringing it up!
>>>>
>>>> I think that I might have a different understanding of the lessons to
>>>> be learned from Tuskar's limitations. There were actually two issues
>>>> that arose. The first was that Tuskar was far too specific in how it
>>>> tried to manipulate Heat pieces. The second - and more serious, from
>>>> my point of view - was that there literally was no way for an
>>>> API-based GUI to perform the tasks it needed to in order to do the
>>>> correct manipulation (environment selection), because there was no
>>>> Heat API in place for doing so.
>>>>
>>>> My takeaway from the first issue was that any potential TripleO API
>>>> in the future needed to be very low-level, a light skimming on top of
>>>> the OpenStack services it uses. The plan creation process that the
>>>> tripleo-common library spec describes is that: it's just a couple of
>>>> methods designed to allow a user to create an environment file, which
>>>> can then be used for deploying the overcloud.
>>>>
>>>> My takeaway from the second issue was a bit more complicated. A
>>>> required feature was missing, and although the proper functionality
>>>> needed to enable it in Heat was identified, it was unclear (and
>>>> remains unclear) whether that feature truly belonged in Heat. What
>>>> does a GUI do then? The GUI could take a cycle off, which is
>>>> essentially what happened here; I don't think that's a reasonable
>>>> solution. We could hope that we arrive at a 100% foolproof and
>>>> immutable deployment solution in the future, arriving at a point
>>>> where no new features would ever be needed; I don't think that's a
>>>> practical hope.
>>>>
>>>> The third solution that came to mind was the idea of creating the
>>>> TripleO API. It gives us a place to add in missing features if
>>>> needed. And I think it also gives us a useful layer of indirection.
>>>> The consumers of TripleO want a stable API, so that a new release
>>>> doesn't force them to do a massive update of their code; the TripleO
>>>> API would provide that, allowing us to switch code behind the scenes
>>>> (say, if the capabilities feature lands in Heat).
>>>
>>> I think the above example would work equally well in a generic
>>> workflow sort of tool. You could imagine that the inputs to the
>>> workflow remain the same... but rather than running our own code in
>>> some interim step we simply call Heat directly for the capabilities
>>> map feature.
>>>
>>> So regardless of whether we build our own API or use a generic
>>> workflow tool I think we still have what I would call a "release
>>> valve" to let us inject some custom code (actions) into the workflow.
>>> Like we discussed last week on IRC I would like to minimize the number
>>> of custom actions we have (with an eye towards things living in the
>>> upstream OpenStack projects) but it is fine to do this either way and
>>> would work equally well w/ Mistral and TripleO API.
>>>
>>>> I think I kinda view TripleO as a 'best practices' project. Using
>>>> OpenStack is a confusing experience, with a million different options
>>>> and choices to make. TripleO provides users with an excellent guide.
>>>> But the problem is that best practices change, and I think that
>>>> perceived instability is dangerous for adoption of TripleO.
>>>>
>>>> So having a TripleO library and its associated API be a 'best
>>>> practices' library makes sense to me. It gives consumers a stable
>>>> platform upon which to use TripleO, while allowing us to be flexible
>>>> behind the scenes. The 'best practice' for Heat capabilities right
>>>> now is a workaround, because it hasn't been judged to be suitable to
>>>> go into Heat itself. If that changes, we get to shift as well - and
>>>> all of these changes are invisible to the API consumer.
>>>
>>> I mentioned this in my "Driving workflows with Mistral" thread but
>>> with regards to stability I view say Heat's v1 API or Mistral's v2 API
>>> as both being way more stable than what we could ever achieve with
>>> TripleO API. The real trick to API stability with something like Heat
>>> or Mistral is how we manage the inputs and outputs to Stacks and
>>> Workflows themselves. So long as we are mindful of this I can't
>>> imagine an end user (say a GUI writer or whoever) would really care
>>> whether they POST to Mistral or something we've created.
>>> The nice thing about using other OpenStack projects like Heat or
>>> Mistral is that they very likely have better community and
>>> documentation around these things as well than we would ever have.
>>>
>>> The more I look at using Mistral for some of the cases that have been
>>> brought up the more it seems to make sense for a lot of the workflows
>>> we need. I don't believe we can achieve better stability by creating
>>> what sounds more and more like a shim/proxy API rather than using the
>>> versioned APIs that OpenStack already provides.
>>>
>>> There may be some corner cases where a "GUI helper" API comes into
>>> play for some sort of caching or something. I'm not blocking anyone
>>> from creating these sorts of features if they need them. And again if
>>> it is something that could be added to an upstream OpenStack project
>>> like Heat or Mistral I would look there first. So perhaps Zaqar for
>>> websockets instead of rolling our own, this sort of thing.
>>>
>>> What does concern me is that we are overstating what TripleO API
>>> should actually contain should we choose to pursue it. Initially it
>>> was positioned as the "TripleO workflow API". I think we now agree
>>> that we probably shouldn't put all of our workflows behind it. So if
>>> our stance has changed would it make sense to compile a new list of
>>> what we believe belongs behind our own TripleO API vs. what we
>>> consider workflows?
>>
>> I wonder if it would be helpful to get operator feedback here - show
>> them the advantages/disadvantages of both options and to get a sense of
>> what might be useful/necessary for them to use TripleO effectively?
>
> (I'm going off on a tangent a bit, but please bear with me, i'm using
> all that to support the point in the end. The implications of building a
> TripleO API touch on various topics.)
>
> Yes i think we should gather operator feedback. We already got some, but
> we should gather more whenever possible.
>
> One kind of (negative) feedback i've heard is that overcloud management
> is too much of a "blackbox" compared to what operators are used to. The
> feedback i recall was that it's hard to tell what is going to happen
> when running an overcloud stack update, and that we cannot re-execute
> the software config management independently.
>
> Building another umbrella API to rule the already largely umbrella-like
> deployment process (think what all responsibilities lie within the
> tripleo-heat-templates codebase, and within the single 'overcloud' Heat
> stack) would probably make matters more blackboxy and go further in the
> direction of "i feel like i don't know what's happening to my cloud when
> i use the management tool".

I completely agree that we want to make the tool less of a blackbox. I am
not convinced that Mistral will do this (do tripleo-heat-templates make
things less blackbox-y because they are YAML users can look at? Maybe for
some users but they still confuse me!). However, given that I think we all
agree Mistral is a good fit for some of the workflow tasks (introspection,
deploying, etc.) I think it is a good idea to see if Mistral will work
well, or well enough, for the other tasks we need (essentially some
template introspection/processing). It will certainly be more obvious what
is going on if all the actions are in Mistral and not split between it and
a custom API.

> What i think could improve the situation for operators is trying to
> chunk up what we already have into smaller, more independently operable
> parts. The split-stack approach already discussed on the TripleO meeting
> and on #tripleo could help with this. Essentially separating our
> hardware management from our software config management. Being able to
> re-apply software configuration without being afraid of having nodes
> accidentally re-provisioned from scratch.

+1, this would be a very valuable change for the project generally.

> In general i think TripleO could be a little more "UNIXy" - composed of
> smaller parts that make sense on their own, transparent to the operator,
> more modular and modifiable, and in effect more receptive of how varying
> are the real world deployment environments (various Neutron and Cinder
> plugins, Keystone backends, composable set of services, custom node
> types etc.).
>
> Workflow persisted in a data-like fashion is probably more modifiable by
> the operator than Python code of a REST API. We've seen hard assumptions
> cause problems in the past. (Think the unoverridable CLI parameters
> issue we used to have, and how we had to move to a model of "CLI
> provides its values, but you can always override them or provide
> additional ones with an environment file if needed", which we now use
> extensively). I'm a bit concerned that building a new REST API on top of
> everything would impose new rigid assumptions that could cause more harm
> than good in the end. I'm concerned that it would be usable only for
> very basic deployments, while the world of real deployments has its own
> pace and requirements not fitting the "best practices" as defined by the
> API, having to bypass the API far too often and slowly pushing it into
> abandonment over time.
>
> My mind is probably biased towards the operator feedback that resonated
> with me the most, i've heard pro-blackbox opinions too (though not from
> operators yet IIRC). So take what i wrote just as my 2 cents, but i
> think it's necessary to consider the above issues when thinking about
> the implications of building a TripleO API.
>
> Regarding the non-workflow kind of features we need for empowering GUI,
> wouldn't those be useful for normal (tenant) Heat stack deployments in
> the overcloud too?
> It sounds to me that features like "driving a Heat stack deployment with
> the same powers from CLI or GUI", "updating a CLI-created stack from GUI
> and vice versa", "understanding/parsing what are the configuration
> options of my Heat templates" are all features that are not specific to
> TripleO, and could be useful for tenant Heat stacks too. So perhaps
> these should be implemented in Heat? If that can't happen fast enough,
> then we might need to put some workarounds in place for now, but it
> might be better if we didn't advertise those as a stable solution.
>
> Jirka
>
>> Mainn
>>
>>> Dan
>>>
>>>> Mainn
>>>>
>>>>>> I think the correct attitude is to simply look at the problem we're
>>>>>> trying to solve and find the correct architecture. For these
>>>>>> get/set methods that the API needs, it's pretty simple: storage ->
>>>>>> some logic -> a REST API. Adding a workflow engine on top of that
>>>>>> is unneeded, and I believe that means it's an incorrect solution.
>>>>>
>>>>> What may help is if we can work through the proposed API spec, and
>>>>> identify which calls can reasonably be considered workflows vs those
>>>>> where it's really just proxying an API call with some logic?
>>>>>
>>>>> When we have a defined list of "not workflow" API requirements,
>>>>> it'll probably be much easier to rationalize over the value of a
>>>>> bespoke API vs mistral?
>>>>>
>>>>> Steve
>>>>>
>>>>> __________________________________________________________________________
>>>>> OpenStack Development Mailing List (not for usage questions)
>>>>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev