On 2013/11/27 00:00, Robert Collins wrote:
On 26 November 2013 07:41, Jaromir Coufal <jcou...@redhat.com> wrote:
Hey Rob,

can we add 'Slick Overcloud deployment through the UI' to the list? There
was no session about that, but we discussed it afterwards and agreed that it
is high priority for Icehouse as well.

I just want to keep it on the list, so we are aware of that.
Certainly. Please add a blueprint for that and I'll mark it up appropriately.
I will do.

Related to that, we had a long chat on IRC that I was to follow up on here, so - ...

Tuskar is refocusing on getting the basics really right - slick basic
install, and then work up. At the same time, just about every nova
person I've spoken to (a /huge/ sample of three, but meh :)) has
expressed horror that Tuskar is doing its own scheduling, and
confusion about the need to manage flavors in such detail.
So the discussion on IRC was about getting back to basics - a clean
core design, so that we aren't left with technical debt that we'd have
to eliminate in order to move forward - which the scheduler
stuff would be.

So: my question/proposal was this: let's set a couple of MVPs.

0: slick install homogeneous nodes:
  - ask about nodes and register them with nova baremetal / Ironic (can
use those APIs directly)
  - apply some very simple heuristics to turn that into a cloud:
    - 1 machine - all in one
    - 2 machines - separate hypervisor and the rest
    - 3 machines - two hypervisors and the rest
    - 4 machines - two hypervisors, HA the rest
    - 5+ machines - scale out hypervisors
  - so total forms needed = 1 (gather hardware details)
  - internals: a heat template with a single machine flavor; the role
heuristic is sketched below
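
A rough sketch of that heuristic, with invented role names (none of
this is an existing Tuskar interface):

    # Hypothetical sketch of the node-count heuristic above; role names
    # and the output shape are illustrative only.
    def assign_roles(node_count):
        """Map a count of homogeneous nodes to overcloud roles."""
        if node_count == 1:
            return {'all-in-one': 1}
        if node_count == 2:
            return {'hypervisor': 1, 'everything-else': 1}
        if node_count == 3:
            return {'hypervisor': 2, 'everything-else': 1}
        if node_count == 4:
            return {'hypervisor': 2, 'ha-control': 2}
        # 5+: keep the HA control plane fixed, scale out hypervisors
        return {'hypervisor': node_count - 2, 'ha-control': 2}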

1: add support for heterogeneous nodes:
  - for each service (storage, compute, etc.) supply a list of flavors
we're willing to run it on
  - pass that into the heat template
  - teach heat to deal with flavor-specific resource exhaustion by
asking for a different flavor (or perhaps have nova accept multiple
flavors and 'choose one that works'): details to be discussed with
heat // nova at the right time. A fallback sketch follows this list.
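
To make the fallback idea concrete, a sketch assuming a hypothetical
provision() callable that raises when a flavor's capacity is exhausted -
how Heat or Nova would actually surface that error is exactly the
detail to discuss with them:

    # Hypothetical per-service flavor fallback; FlavorExhausted and
    # provision() stand in for whatever Heat/Nova would really expose.
    SERVICE_FLAVORS = {
        'compute': ['baremetal-big', 'baremetal-medium'],
        'storage': ['baremetal-storage', 'baremetal-medium'],
    }

    class FlavorExhausted(Exception):
        """No free node of the requested flavor."""

    def provision_service(service, provision):
        for flavor in SERVICE_FLAVORS[service]:
            try:
                return provision(service, flavor)
            except FlavorExhausted:
                continue  # fall back to the next acceptable flavor
        raise FlavorExhausted('no acceptable flavor left for %s' % service)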

2: add support for anti-affinity for HA setups:
  - here we get into the question of short-term deliverables vs
long-term desires, but at least we'll already have a polished
installer. One existing mechanism is sketched below.
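
One mechanism that already exists for this is Nova server groups with
an anti-affinity policy; a hedged python-novaclient sketch (credentials
and image/flavor names are placeholders):

    # Sketch only: HA control nodes kept on distinct hosts via a Nova
    # server group with the anti-affinity policy.
    from novaclient import client

    nova = client.Client('2', 'user', 'password', 'project',
                         'http://keystone:5000/v2.0')

    group = nova.server_groups.create(name='ha-control',
                                      policies=['anti-affinity'])

    image = nova.images.find(name='overcloud-control')
    flavor = nova.flavors.find(name='baremetal')
    for i in range(2):
        nova.servers.create(name='control-%d' % i, image=image,
                            flavor=flavor,
                            scheduler_hints={'group': group.id})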

-Rob

The important point here is that we agree on starting with the very basics and growing from there. Which is great.

The whole deployment workflow (not just the UI) is all about user experience, which is built on top of TripleO's approach. Here I see two important factors:
- There are *users* who have *needs and expectations*.
- There is the underlying *concept of TripleO*, which we use to *implement* the features that satisfy those needs.

We are circling around, trying to approach the problem from the wrong end - the implementation point of view (how to avoid doing our own scheduling).

Let's try to step out of the box and start by thinking about our audience first - what they expect, what they need. Then we go back, put our implementation hats on, and work out how we are going to re-use OpenStack components to achieve those goals. In the end we have a detailed plan.


=== Users ===

I would like to start with our target audience first - without milestones, without implementation details.

I think this is the main point where I disagree, and which leads to the different approaches. I don't think that the TripleO user cares *only* about deploying infrastructure, without any knowledge of where things go. That is the overcloud user's approach - 'I want a VM and I don't care where it runs'. Those are self-service / cloud users. I know we are OpenStack on OpenStack, but we shouldn't go so far as to expect the same behavior from undercloud users. I can give various examples of why the operator will care about where an image goes and what runs on a specific node.

/One quick example:/
I have three racks of homogeneous hardware and I want to lay them out so that each rack has one control node, three storage nodes and the rest compute. With that smart deployment, I'll never know what a given rack contains in the end. But if I have control, I can say that this node is a controller, those three are storage and the rest are compute - and I am happy from the very beginning.

Our target audience is sysadmins and operators. They hate 'magic'. They want control over what they are doing. If we put a workflow in front of them where they click one button and get a cloud installed, they will be horrified.

That's why I am convinced that we need to give the user control - over which node plays which role. We can be smart, suggest and advise, but we must not hide this functionality from the user. Otherwise I am afraid we will fail.

Furthermore, if we put lots of restrictions (like homogeneous hardware) in front of users from the very beginning, we discourage people from using the TripleO UI. We are a young project trying to reach as broad an audience as possible. If our approach is flexible enough to get a large audience interested and solve their problems, we will get more feedback, more early adopters, more contributors, etc.

First, let's help the cloud operator who has some nodes and wants to deploy OpenStack on them. He wants control over which node is a controller and which is compute or storage. Then we can get smarter and start guiding.


=== Milestones ===

Based on the different user behavior I describe above, I suggest different milestones:

V0: basic slick installer - flexibility and control first
- enable the user to auto-discover (or manually register) nodes
- let the user decide which node will be a controller and which will be compute or storage
- associate images with those nodes
- deploy (flow sketched below)
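
A minimal sketch of that flow - the helpers are trivial stand-ins for
real discovery/registration and Heat-driven deployment; the point is
that the role decision stays in the user's hands:

    # Hypothetical V0 flow with placeholder helpers.
    def discover_nodes():
        # stand-in for auto-discovery or manual registration
        return [{'uuid': 'node-%d' % i} for i in range(3)]

    def deploy(assignments, images):
        # stand-in for building and launching the Heat stack
        for uuid, role in assignments.items():
            print('%s -> %s (%s)' % (uuid, role, images[role]))

    nodes = discover_nodes()

    assignments = {                      # the *user* decides the roles
        nodes[0]['uuid']: 'controller',
        nodes[1]['uuid']: 'compute',
        nodes[2]['uuid']: 'storage',
    }

    images = {'controller': 'overcloud-control',
              'compute': 'overcloud-compute',
              'storage': 'overcloud-storage'}

    deploy(assignments, images)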

V1: monitoring and support for node profiles
- monitor the deployment, services and nodes
- allow the user to define 'node profiles' (which help suggest where a node belongs, while the user always keeps control)
- give the user smart guidance on where the hardware belongs (matching sketched below)
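
What a 'node profile' suggestion could look like - profiles and
thresholds here are invented for illustration:

    # Hypothetical node-profile matching: suggest the roles a node is
    # fit for, leaving the final decision to the user.
    PROFILES = {
        'controller': {'min_ram_mb': 16384, 'min_disk_gb': 100},
        'compute':    {'min_ram_mb': 8192,  'min_disk_gb': 40},
        'storage':    {'min_ram_mb': 4096,  'min_disk_gb': 500},
    }

    def suggest_roles(node):
        return sorted(role for role, p in PROFILES.items()
                      if node['ram_mb'] >= p['min_ram_mb']
                      and node['disk_gb'] >= p['min_disk_gb'])

    print(suggest_roles({'ram_mb': 32768, 'disk_gb': 600}))
    # -> ['compute', 'controller', 'storage']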

V2: advanced settings
- allow choosing which services go where (at the moment all controller services would live on one node)
- enhance networking setup

V?: grow, add functionality
- more views on infrastructure (network, physical reality - racking, etc).
- more monitoring
- more ways to manage various resources
- scheduled maintenance
- smart power consumption
- ...?


=== Implementation ===

The approach above shouldn't lead to reimplementing the scheduler. We can still use nova-scheduler, but take advantage of extra parameters (like a unique identifier) so that we can specify more concretely what goes where.
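
One concrete (and hedged) way to do that with existing pieces: have
each Ironic / nova-baremetal node advertise a unique identifier in its
capabilities and match it from flavor extra_specs via the scheduler's
ComputeCapabilitiesFilter. A python-novaclient sketch with placeholder
names:

    # Sketch: pin a role to a specific node by matching flavor
    # extra_specs against node capabilities; assumes the node side
    # advertises capabilities:node_id=rack1-node3.
    from novaclient import client

    nova = client.Client('2', 'user', 'password', 'project',
                         'http://keystone:5000/v2.0')

    flavor = nova.flavors.create(name='controller-rack1-node3',
                                 ram=16384, vcpus=8, disk=100)
    flavor.set_keys({'capabilities:node_id': 'rack1-node3'})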

More details should follow here - how to achieve the goals above: what should go through Heat, what through Nova, Ironic, etc.

But first, let's agree on approach and goals.

-- Jarda