Re: [openstack-dev] [ironic][edge] Notes from the PTG

2018-09-28 Thread Csatari, Gergely (Nokia - HU/Budapest)
Hi Jim,

Thanks for sharing your notes.

One note about jumping to the requirement of an autonomous control plane
per edge site: this requirement was already identified during the Dublin
PTG workshop [1]. It is needed for two reasons: the edge cloud instance
should stay operational even if there is a network break towards other
edge cloud instances, and the edge cloud instance should work together
with other edge cloud instances running other versions of the control
plane. In Denver we decided to leave these requirements out of the MVP
architecture discussions.

Br,
Gerg0

[1]: 
https://wiki.openstack.org/w/index.php?title=OpenStack_Edge_Discussions_Dublin_PTG




Re: [openstack-dev] [ironic][edge] Notes from the PTG

2018-09-19 Thread Jay Pipes

On 09/19/2018 11:03 AM, Jim Rollenhagen wrote:
On Wed, Sep 19, 2018 at 8:49 AM, Jim Rollenhagen wrote:


Tuesday: edge


Since cdent asked in IRC, when we talk about edge and far edge, we 
defined these roughly like this:

https://usercontent.irccloud-cdn.com/file/NunkkS2y/edge_architecture1.JPG


Far out, man.

-jay

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [ironic][edge] Notes from the PTG

2018-09-19 Thread Jim Rollenhagen
On Wed, Sep 19, 2018 at 8:49 AM, Jim Rollenhagen wrote:
>
> Tuesday: edge
>

Since cdent asked in IRC, when we talk about edge and far edge, we defined
these roughly like this:
https://usercontent.irccloud-cdn.com/file/NunkkS2y/edge_architecture1.JPG

// jim
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [ironic][edge] Notes from the PTG

2018-09-19 Thread Jim Rollenhagen
I wrote up some notes from my perspective at the PTG for some internal
teams and figured I may as well share them here. They're primarily from the
ironic and edge WG rooms. Fairly raw, very long, but hopefully useful to
someone. Enjoy.

Tuesday: edge

Edge WG (IMHO) has historically just talked about use cases, hand-waved a
bit, and jumped to requiring an autonomous control plane per edge site -
thus spending all of their time talking about how they will make glance and
keystone sync data between control planes.

penick described roughly what we do with keystone/athenz and how that can
be used in a federated keystone deployment to provide autonomy for any
control plane, but also a single view via a global keystone.
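
(For the curious: keystone federation hangs this off "mapping" rules
that translate assertions from an external identity provider (Athenz,
in this case) into local users and groups, so every control plane that
shares the mapping hands out the same authorization. A rough sketch
using python-keystoneclient follows; the IdP, group, and assertion
attribute names are made-up placeholders, not Oath's actual setup.)

    from keystoneauth1 import session
    from keystoneauth1.identity import v3
    from keystoneclient.v3 import client

    # Admin credentials and endpoint are illustrative assumptions.
    auth = v3.Password(auth_url='https://keystone.example.com/v3',
                       username='admin', password='secret',
                       project_name='admin',
                       user_domain_id='default',
                       project_domain_id='default')
    keystone = client.Client(session=session.Session(auth=auth))

    # Map an externally-asserted identity onto a local group; any
    # control plane that shares this mapping grants the same roles.
    rules = [{
        "local": [
            {"user": {"name": "{0}"}},
            {"group": {"domain": {"name": "Default"},
                       "name": "edge-operators"}},   # assumed group
        ],
        "remote": [
            {"type": "openstack_user"},  # assertion attribute: assumed
        ],
    }]
    keystone.federation.mappings.create(mapping_id='athenz-mapping',
                                        rules=rules)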

penick and I both kept pushing for people to define a real architecture,
and we ended up with 10-15 people huddled around an easel for most of the
afternoon. Of note:

- Windriver (and others?) refuse to budge on the many control plane thing
  - This means that they will need some orchestration tooling up top in
    the main DC / client machines to even come close to reasonably
    managing all of these sites
  - They will probably need some syncing tooling
    - glance->glance isn’t a thing, no matter how many people say it is.
    - Glance PTL recommends syncing metadata outside of the glance
      process, and a global(ly distributed?) glance backend.
- We also defined the single pane of glass architecture that Oath plans
  to deploy
  - Okay with losing connectivity from the central control plane to a
    single edge site
  - Each edge site is a cell
  - Each far edge site is just compute nodes
  - Still may want to consider image distribution to edge sites so we
    don’t have to go back to the main DC?
  - Keystone can be distributed the same as in the first architecture
  - Nova folks may start investigating putting API hosts at the cell
    level to get the best of both worlds - if there’s a network
    partition, you can still talk to the cell API to manage things
  - Need to think about removing the need for rabbitmq between edge and
    far edge
    - Kafka was suggested in the edge room for oslo.messaging in general
      (see the sketch after this list)
    - Etcd watchers may be another option for an o.msg driver
    - Other options are more invasive into nova - they involve changing
      how nova-compute talks to conductor (etcd, etc.) or even putting
      REST APIs in nova-compute (and nova-conductor?)
  - Neutron is going to work on an OVS “superagent” - the superagent
    does the RPC handling and talks some other way to child agents.
    Intended to scale to thousands of children. The primary use case is
    smart NICs, but it seems like a win for the edge case as well.
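
(On the messaging options: as far as I know the kafka driver in
oslo.messaging has been aimed at notifications rather than RPC, so a
first step for far-edge-to-edge traffic might look like the minimal
sketch below. The broker URL, publisher id, event type and payload are
all made up for illustration, and it assumes oslo.messaging is
installed with its kafka extra.)

    import oslo_messaging
    from oslo_config import cfg

    # Point the notification transport at a Kafka broker instead of
    # RabbitMQ; the URL is an assumption for the sketch.
    transport = oslo_messaging.get_notification_transport(
        cfg.CONF, url='kafka://kafka.edge-site-1.example.com:9092/')

    notifier = oslo_messaging.Notifier(
        transport,
        driver='messaging',   # emit over the transport configured above
        publisher_id='nova-compute.far-edge-42',
        topics=['notifications'])

    # A far-edge node reporting state without holding a RabbitMQ
    # connection back to the edge site.
    notifier.info({}, 'compute.node.heartbeat',
                  {'hypervisor': 'far-edge-42'})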

penick took an action item to draw up the architecture diagrams in a
digestible format.

Wednesday: ironic things

Started with a retrospective. See
https://etherpad.openstack.org/p/ironic-stein-ptg-retrospective for the
notes; there weren’t many surprising things here. We did discuss trying
to target some quick wins for the beginning of the cycle, so that we
didn’t have all of our features trying to land at the end. Using WSGI
with the ironic-api was mentioned as a potential regression, but we
agreed it’s a config/documentation issue. I took an action to make a
task to document this better.

Next we quickly reviewed our vision doc, and people didn’t have much to say
about it.

Metalsmith: it’s a thing, and it’s being included in the ironic project.
Dmitry is open to optionally supporting placement. Multiple instances
will be a feature in the future. Otherwise it’s mostly feature-complete;
the goal is to keep it simple.

Networking-ansible: Red Hat is building tooling that integrates with
upstream ansible modules for networking gear. It’s kind of an
alternative to n-g-s. Not really much on plans here; RH just wanted to
introduce it to the community. There was some discussion about it
possibly replacing n-g-s later, but no hard plans.

Deploy steps/templates: we talked about what the next steps are, and
what an MVP looks like. Deploy templates are triggered by the traits
that nodes are scheduled against, and can add steps before or after (or
in between?) the default deploy steps. We agreed that we should add a
RAID deploy step, with standing questions about how arguments are
passed to that deploy step, and what the defaults look like. mgoddard
and I took an action item to open an RFE for this. We also agreed that
we should start thinking about how the current (only) deploy step
should be split into multiple steps.
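
(To make the RAID case concrete, here is a rough sketch of what a
trait-matched deploy template might look like, written as the Python
dict a client would send to a future deploy templates API. The trait
name, step arguments, and priority are assumptions; none of this was
settled in the room.)

    # A node scheduled against the CUSTOM_HW_RAID1 trait would get this
    # template's steps merged into the default deploy steps.
    deploy_template = {
        "name": "CUSTOM_HW_RAID1",       # trait name: assumed
        "steps": [
            {
                "interface": "raid",
                "step": "apply_configuration",
                "args": {
                    "raid_config": {
                        "logical_disks": [
                            {"size_gb": "MAX",
                             "raid_level": "1",
                             "is_root_volume": True},
                        ],
                    },
                },
                # Where this runs relative to the default deploy steps
                # (one of the open questions above).
                "priority": 50,
            },
        ],
    }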

Graphical console: we discussed what the next steps are for this work. We
agreed that we should document the interface and what is returned (a URL),
and also start working on a redfish driver for graphical consoles. We also
noted that we can test in the gate with qemu, but we only need to test that
a correct URL is returned, not that the console actually works (because we
don’t really care that qemu’s console works).
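
(As a sketch of "document the interface and what is returned", a
graphical console driver could look roughly like the class below. The
class name, method set, and returned fields are hypothetical; only the
"returns a URL" part reflects what was agreed.)

    class RedfishGraphicalConsole(object):
        """Hypothetical console interface sketch; names are assumptions.

        The only contract discussed was that the driver hands back a
        URL, and the gate job only checks that the URL is well-formed,
        not that the console behind it works.
        """

        def start_console(self, task):
            # A real driver would ask the BMC (e.g. via redfish) to
            # enable its KVM/HTML5 console here.
            pass

        def get_console(self, task):
            # Return enough for a client to connect.
            return {'type': 'graphical',
                    'url': 'https://bmc.example.com/console/html5'}

        def stop_console(self, task):
            pass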

Python 3: we talked about the changes to our jobs that are needed. We
agreed to use the base name of the jobs