[openstack-dev] [qa] Create subnetpool on dynamic credentials

2017-05-20 Thread Hongbin Lu
Hi QA team,

I have a proposal to create subnetpool/subnet pair on dynamic credentials: 
https://review.openstack.org/#/c/466440/ . We (Zun team) have use cases for 
using subnets with subnetpools. I wanted to get some early feedback on this 
proposal. Will this proposal be accepted? If not, would appreciate alternative 
suggestion if any. Thanks in advance.

Best regards,
Hongbin
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [ptg] ptgbot: how to make "what's currently happening" emerge

2017-05-20 Thread Jay Bryant



On 5/18/2017 4:57 AM, Thierry Carrez wrote:

Hi again,

For the PTG events we have, by design, a pretty loose schedule. Each
room is free to organize their agenda in whatever way they see fit, and
take breaks whenever they need. This flexibility is key to keep our
productivity at those events at a maximum. In Atlanta, most teams ended
up dynamically building a loose agenda on a room etherpad.

This approach is optimized for team meetups and people who strongly
identify with one team in particular. In Atlanta during the first two
days, a lot of vertical team contributors did not really know
which room to go to, and it was very difficult to get a feel for what was
currently being discussed and where they could go. Looking into 20
etherpads and trying to figure out what is currently being discussed is
just not practical. In the feedback we received, the need to expose the
schedule more visibly was the #1 request.

It is a thin line to walk. We clearly don't want to publish a
schedule in advance or be tied to pre-established timeboxes for every
topic. We want it to be pretty fluid and natural, but we still need to
somehow make "what's currently happening" (and "what will be discussed
next") emerge globally.

One lightweight solution I've been working on is an IRC bot ("ptgbot")
that would produce a static webpage. Room leaders would update it on
#openstack-ptg using commands like:

#swift now discussing ring placement optimizations
#swift next at 14:00 we plan to discuss better #keystone integration

and the bot would collect all those "now" and "next" items and publish a
single (mobile-friendly) webpage, (which would also include
ethercalc-scheduled things, if we keep any).

The IRC commands double as natural language announcements for those that
are following activity on the IRC channel. Hashtags can be used to
attract other teams' attention. You can announce later discussions, but
the commitment on exact timing is limited. Every "now" command would
clear "next" entries, so that there wouldn't be any stale entries and
the command interface would be kept dead simple (at the cost of a bit of
repetition).
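The "now"/"next" semantics described above — every "now" announcement clearing that track's pending "next" entries — can be sketched in a few lines. This is only an illustration of the command handling, not the actual ptgbot code; the `handle` function and the `now`/`nxt` dicts are invented names:

```python
import re

# Track the latest "now" topic and announced "next" items per team track.
now = {}    # track -> topic currently being discussed
nxt = {}    # track -> list of upcoming topics

def handle(line):
    """Parse a '#track now ...' or '#track next ...' IRC command."""
    m = re.match(r'#(\w+)\s+(now|next)\s+(.*)', line)
    if not m:
        return
    track, verb, topic = m.groups()
    if verb == 'now':
        now[track] = topic
        nxt[track] = []          # every "now" clears stale "next" entries
    else:
        nxt.setdefault(track, []).append(topic)

handle('#swift now discussing ring placement optimizations')
handle('#swift next at 14:00 we plan to discuss better #keystone integration')
handle('#swift now discussing better keystone integration')
```

The bot would then render `now` and `nxt` into the static page on each update.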

I have POC code for this bot already. Before I publish it (and start
work to make infra support it), I just wanted to see if this is the
right direction and if I should continue to work on it :) I feel like
it's an incremental improvement that preserves the flexibility and
self-scheduling while addressing the main visibility concern. If you
have better ideas, please let me know !


Thierry,

I like this idea and it is consistent with what Cinder tried to do in 
the PTG channel at the last event.  I think formalizing it would be great.


Thanks!
Jay




Re: [openstack-dev] Is the pendulum swinging on PaaS layers?

2017-05-20 Thread Monty Taylor

On 05/19/2017 04:27 PM, Matt Riedemann wrote:

On 5/19/2017 3:03 PM, Monty Taylor wrote:

On 05/19/2017 01:04 PM, Sean Dague wrote:

On 05/19/2017 01:38 PM, Dean Troyer wrote:

On Fri, May 19, 2017 at 11:53 AM, Chris Friesen
 wrote:

..., but it seems to me that the logical
extension of that is to expose simple orthogonal APIs where the
nova boot
request should only take neutron port ids and cinder volume ids.
The actual
setup of those ports/volumes would be done by neutron and cinder.

It seems somewhat arbitrary to say "for historical reasons this
subset of
simple things can be done directly in a nova boot command, but for
more
complicated stuff you have to go use these other commands".  I
think there's
an argument to be made that it would be better to be consistent
even for the
simple things.


cdent mentioned enamel[0] above, and there is also oaktree[1], both of
which are wrapper/proxy services in front of existing OpenStack APIs.
I don't know enough about enamel yet, but one of the things I like
about oaktree is that it is not required to be deployed by the cloud
operator to be useful, I could set it up and proxy Rax and/or
CityCloud and/or mtreinish's closet cloud equally well.

The fact that these exist, and things like shade itself, are clues
that we are not meeting the needs of API consumers.  I don't think
anyone disagrees with that; let me know if you do and I'll update my
thoughts.


It's fine to have other ways to consume things. I feel like "make
OpenStack easier to use by requiring you to install a client-side API
server for your own requests" misses the point of the easier part. It's
cool you can do it as a power user. It's cool things like Heat exist for
people that don't want to write API calls (and just do templates). But
it's also not helping on the number of pieces of complexity to manage in
OpenStack to have a workable cloud.


Yup. Agree. Making forward progress on that is paramount.


I consider those things duct tape, leading us to the eventually
consistent place where we actually do that work internally. Because,
having seen with the ec2-api proxy, the moment you get beyond trivial
mapping, you now end up with a complex state tracking system, that's
going to need to be highly available, and replicate a bunch of your data
to be performant, and then have inconsistency issues, because a user
deployed API proxy can't have access to the notification bus, and...
boom.


You can actually get fairly far (with a few notable exceptions - I'm
looking at you unattached floating ips) without state tracking. It
comes at the cost of more API spidering after a failure/restart. Being
able to cache stuff aggressively combined with batching/rate-limiting
of requests to the cloud API allows one to do most of this to a fairly
massive scale statelessly. However, caching, batching and
rate-limiting are all pretty much required else you wind up crashing
public clouds. :)
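As a minimal illustration of the caching/batching point above (not shade's actual code — `ServerListCache` and its interface are invented for this sketch), many concurrent consumers can be served from one periodic list call instead of one API call each:

```python
import time

class ServerListCache:
    """Serve many concurrent consumers from one cached list call,
    instead of one cloud API call per interested caller."""

    def __init__(self, fetch, ttl=5.0):
        self._fetch = fetch      # callable doing the real GET /servers/detail
        self._ttl = ttl
        self._data = None
        self._stamp = 0.0
        self.api_calls = 0       # counter, for demonstration only

    def servers(self):
        # Refresh at most once per TTL window; everyone else hits the cache.
        if self._data is None or time.monotonic() - self._stamp > self._ttl:
            self._data = self._fetch()
            self._stamp = time.monotonic()
            self.api_calls += 1
        return self._data

# Simulated cloud API returning a detailed server list.
cloud = ServerListCache(lambda: [{'id': 'a', 'status': 'ACTIVE'},
                                 {'id': 'b', 'status': 'BUILD'}])

# A hundred status checks within the TTL still cost a single API call.
for _ in range(100):
    statuses = {s['id']: s['status'] for s in cloud.servers()}
```

A real client would add rate limiting and request coalescing on top of this, but even this much keeps a polling loop from hammering a public cloud.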

I agree that the things are currently duct tape, but I don't think
that has to be a bad thing. The duct tape is currently needed
client-side no matter what we do, and will be for some time no matter
what we do because of older clouds. What's often missing is closing
the loop so that we can, as OpenStack, eventually provide out of the
box the consume experience that people currently get from using one of
the client-side duct tapes. That part is harder, but it's definitely
important.


You end up replicating the Ceilometer issue where there was a break down
in getting needs expressed / implemented, and the result was a service
doing heavy polling of other APIs (because that's the only way it could
get the data it needed). Literally increasing the load on the API
surfaces by a factor of 10.


Right. This is why communication is essential. I'm optimistic we can
do well on this topic, because we are MUCH better at talking to each
other now than we were back when ceilometer was started.

Also, a REST-consuming porcelain like oaktree gets to draw on
real-world experience consuming OpenStack's REST APIs at scale. So
it's also not the same problem setup, since it's not a from-scratch
new thing.

This is, incidentally, why experience with caching and batching is
important. There is a reason why we do GET /servers/detail once every
5 seconds rather than doing specific GET /server/{id}/detail calls
for each booting VM.

Look at what we could learn just from that... Users using shade are
doing a full detailed server list because it scales better for
concurrency. It's obviously more expensive on a single-call basis. BUT
- maybe it's useful information that doing optimization work on GET
/servers/detail could be beneficial.


This reminds me that I suspect we're lazy-loading server detail
information in certain cases, i.e. going back to the DB to do a join
per-instance after we've already pulled all instances in an initial set
(with some initial joins). I need to pull this thread again...





Re: [openstack-dev] Is the pendulum swinging on PaaS layers?

2017-05-20 Thread Monty Taylor

On 05/19/2017 03:13 PM, Monty Taylor wrote:

On 05/19/2017 01:53 PM, Sean Dague wrote:

On 05/19/2017 02:34 PM, Dean Troyer wrote:

On Fri, May 19, 2017 at 1:04 PM, Sean Dague  wrote:

These should be used as ways to experiment with the kinds of interfaces
we want cheaply, then take them back into services (which is a more
expensive process involving compatibility stories, deeper
documentation,
performance implications, and the like), not an end game on their own.


I totally agree here.  But I also see the rate of progress for many
and varied reasons, and want to make users lives easier now.

Have any of the lessons already learned from Shade or OSC made it into
services yet?  I think a few may have, "get me a network" being the
obvious one.  But that still took a lot of work (granted that one _is_
complicated).


Doing hard things is hard. I don't expect changing APIs to be easy at
this level of deployedness of OpenStack.


You can get the behavior. It also has other behaviors. I'm not sure any
user has actually argued for "please make me do more rest calls to
create a server".


Maybe not in those words, but "give me the tools to do what I need"
has been heard often.  Sometimes those tools are composable
primitives, sometimes they are helpful opinionated interfaces.  I've
already done the helpful opinionated stuff in OSC here (accept flavor
and image names when the non-unique names _do_ identify a single
result).  Having that control lets me give the user more options in
handling edge cases.


Sure, it does. The fact that it makes 3 API calls every time when doing
flavors by name (404 on the name, list all flavors, local search, get
the flavor by real id) on mostly read only data (without any caching) is
the kind of problem that rises from "just fix it in an upper layer". So
it does provide an experience at a cost.


We also search all resources by name-or-id in shade. But it's one
call - GET /images - and then we test to see if the given value matches
the name field or the id field. And there is caching, so the list call
is done once in the session.
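The shade-style lookup described above — one cached list call plus local matching against either the name or the id field — might look roughly like this. All names here (`find_flavor`, `fake_fetch`) are illustrative, not shade's API:

```python
# One cached GET /flavors per session, then local name-or-id matching,
# instead of the 404-on-name + list-all + get-by-id round trips.
_flavor_cache = None

def list_flavors(fetch):
    global _flavor_cache
    if _flavor_cache is None:        # list call happens once per session
        _flavor_cache = fetch()
    return _flavor_cache

def find_flavor(name_or_id, fetch):
    matches = [f for f in list_flavors(fetch)
               if name_or_id in (f['id'], f['name'])]
    if len(matches) != 1:
        raise LookupError('no unique flavor for %r' % name_or_id)
    return matches[0]

calls = []
def fake_fetch():
    # Stand-in for the real GET /flavors call; counts invocations.
    calls.append(1)
    return [{'id': '42', 'name': 'm1.small'},
            {'id': '84', 'name': 'm1.large'}]

small = find_flavor('m1.small', fake_fetch)   # matched by name
large = find_flavor('84', fake_fetch)          # matched by id, from cache
```

Both lookups are served by a single list call, which is the trade-off Monty describes: one slightly larger request instead of several small ones per resolution.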

The thing I'm the saddest about is the Nova flavor "extra_info" that one
needs to grab for backwards compat but almost never has anything useful
in it. This causes me to make a billion API calls for the initial flavor
list (which is then cached of course). It would be WAY nicer if there was
a GET /flavors/detail that would just get me the whole lot in one go, fwiw.


Quick follow up on this one.

It was "extra_specs" I was thinking about - not "extra_info"

It used to be in the flavor as part of an extension (with a longer name) 
- we fetch them in shade for backwards compat with the past when they 
were just there. However, I've also learned from a follow up in IRC that 
these aren't really things that were intended for me.


So I'll re-frame this point slightly ...

As a user it's often quite difficult to tell what general intent is 
related to use of resources - whether they are intended for general 
users, or whether they are intended for admins. I guess a lot, and 
sometimes I get it right, and sometimes I don't. I know, I know - policy 
makes it so that different cloud deployers can have _vastly_ different 
opinions on this. But a clear intent from us (greatly helped, btw, by 
putting default policy in code) of "this call, this resource, this field 
is intended for normal users, but this one is intended for admin users" 
would have certainly helped me many times in the past.


Thanks for the IRC chat!


Dean has a harder time than I do with that one because osc interactions
are lots of process invocations from scratch. We chatted a bit about how
to potentially share caching things in Boston, but not sure we've come
up with more.


All for new and better experiences. I think that's great. Where I think
we want to be really careful is deciding the path to creating better
experiences is by not engaging with the services and just writing around
it. That feedback has to come back. Those reasons have to come back, and
we need to roll sensible improvements back into base services.

If you want to go fast, go alone, if you want to go far, go together.


Couldn't agree more . I think we're getting better at that communication.

We still have a hole, which is that the path from "this is a problem and
here's how I'm working around it" to "there are devs tasked to work on
solving that problem" is a hard one, because while the communication
from those of us doing client-layer stuff with the folks doing the
servers is pretty good - the communication loop with the folks at the
companies who are prioritizing work ... not so much. Look at the number
of people hacking on shade or python-openstackclient or writing
user-facing docs compared to folks adding backend features to the services.

So - yes, I totally agree. But also, we can make and are making a lot of
progress in some areas with tiny crews. That's gonna likely be the state
of the world 

Re: [openstack-dev] [oslo] Can we stop global requirements update?

2017-05-20 Thread Julien Danjou
On Fri, May 19 2017, Mike Bayer wrote:

> IMO that's a bug for them.

Of course it's a bug. IIRC Mehdi tried to fix it without much success.

> I'm inspired to see that Keystone, Nova etc. are
> able to move between and eventlet backend and a mod_wsgi backend.IMO
> eventlet is really not needed for those services that present a REST 
> interface.
> Although for a message queue with lots of long-running connections that 
> receive
> events, that's a place where I *would* want to use a polling / non-blocking
> model.  But I'd use it explicitly, not with monkeypatching.

+1

> I'd ask why not oslo.cotyledon but it seems there's a faction here that is
> overall moving out of the Openstack umbrella in any case.

Not oslo because it can be used by other projects than just OpenStack.
And it's a condition of success. As Mehdi said, Oslo has been deserted
in recent cycles, so putting a lib there has very little chance of
seeing its community and maintenance grow, whereas trying to reach
the whole Python ecosystem is more likely to get traction.

As a maintainer of SQLAlchemy I'm surprised you even suggest that. Or do
you plan on doing oslo.sqlalchemy? ;)

> Basically I think openstack should be getting off eventlet in a big way so I
> guess my sentiment here is that the Gnocchi / Cotyledon /etc. faction is just
> splitting off rather than serving as any kind of direction for the rest of
> Openstack to start looking.  But that's only an impression, maybe projects 
> will
> use Cotyledon anyway.   If every project goes off and uses something 
> completely
> different though, then I think we're losing.   The point of oslo was to 
> prevent
> that.

I understand your concern and opinion. I think you, me and Mehdi don't
have the same experience as contributors in OpenStack. I invite you to try
moving any major OpenStack project to something like oslo.service2 or
Cotyledon, or to achieve any technical-debt resolution in OpenStack, to
get a view on how hard it is to tackle. Then you'll see where we stand. :)

Especially when your job is not doing that, but e.g. working on
Telemetry. :)

-- 
Julien Danjou
-- Free Software hacker
-- https://julien.danjou.info




Re: [openstack-dev] [release] Proposal to change the timing of Feature Freeze

2017-05-20 Thread Chris Jones
Hey

Well, it definitely seems like there's not much support for the idea ;)

Thanks everyone who replied. I'll go away and think about ways we can improve 
things without moving FF :)

Cheers,
--
Chris Jones

> On 18 May 2017, at 11:18, Thierry Carrez  wrote:
> 
> Chris Jones wrote:
>> I have a fairly simple proposal to make - I'd like to suggest that
>> Feature Freeze move to being much earlier in the release cycle (no
>> earlier than M.1 and no later than M.2 would be my preference).
>> [...]
> 
> Hey Chris,
> 
> From my (admittedly too long) experience in release management, forcing
> more time for stabilization work does not magically yield better
> results. There is nothing like a "perfect" release, it's always a "good
> enough" trade-off. Holding releases in the hope that more bugs will be
> discovered and fixed only works so far: some bugs will only emerge once
> people start deploying software in their unique environments and use
> cases. It's better to put it out there when it's "good enough".
> 
> So a Feature Freeze should be placed early enough to give you an
> opportunity to slow down, fix known blockers, have documentation and
> translations catch up. Currently that means 5-6 weeks. Moving it earlier
> than this reasonable trade-off just brings more pain for little benefit.
> It is hard enough to get people to stop pushing features and feature
> freeze exceptions and do stabilization work for 5 weeks. Forcing a
> longer freeze would just see an explosion of local feature branches, not
> a more "stable" release.
> 
> Furthermore, we have a number of projects (newly-created ones that need
> to release early, or mature ones that want to push that occasional new
> feature more often) that bypass the feature freeze / RC system
> completely. With more constraints, I'd expect most projects to switch to
> that model instead.
> 
>> Rather than getting hung up on the specific numbers of weeks, perhaps it
>> would be helpful to start with opinions on whether or not there is
>> enough stabilisation time in the current release schedules.
> 
> Compared to the early days of OpenStack (where we'd still use a 5-6-week
> freeze period) our automated testing has come a long way. The cases
> where we need to respin release candidates due to a major blocker that
> was not caught in automated testing are becoming rarer. If anything, the
> data points to a need for shorter freezes rather than longer ones. The
> main reason we are still at 5-6 weeks these days is for translations and
> docs, rather than real stabilization work. I'm not advocating for making
> it shorter, I still think it's the right trade-off :)
> 
> -- 
> Thierry Carrez (ttx)
> 



Re: [openstack-dev] [Keystone] Cockroachdb for Keystone Multi-master

2017-05-20 Thread lebre . adrien


- Mail original -
> De: "Curtis" 
> À: "OpenStack Development Mailing List (not for usage questions)" 
> 
> Cc: openst...@lists.openstack.org
> Envoyé: Vendredi 19 Mai 2017 01:43:39
> Objet: Re: [openstack-dev] [Keystone] Cockroachdb for Keystone Multi-master
> 
> On Thu, May 18, 2017 at 4:13 PM, Adrian Turjak
>  wrote:
> > Hello fellow OpenStackers,
> >
> > For the last while I've been looking at options for multi-region
> > multi-master Keystone, as well as multi-master for other services
> > I've
> > been developing and one thing that always came up was there aren't
> > many
> > truly good options for a true multi-master backend. Recently I've
> > been
> > looking at Cockroachdb and while I haven't had the chance to do any
> > testing I'm curious if anyone else has looked into it. It sounds
> > like
> > the perfect solution, and if it can be proved to be stable enough
> > it
> > could solve a lot of problems.
> >
> > So, specifically in the realm of Keystone, since we are using
> > sqlalchemy
> > we already have Postgresql support, and since Cockroachdb does talk
> > Postgres it shouldn't be too hard to back Keystone with it. At that
> > stage you have a Keystone DB that could be multi-region,
> > multi-master,
> > consistent, and mostly impervious to disaster. Is that not the holy
> > grail for a service like Keystone? Combine that with fernet tokens
> > and
> > suddenly Keystone becomes a service you can't really kill, and can
> > mostly forget about.
> >
> > I'm welcome to being called mad, but I am curious if anyone has
> > looked
> > at this. I'm likely to do some tests at some stage regarding this,
> > because I'm hoping this is the solution I've been hoping to find
> > for
> > quite a long time.
> 
> I was going to take a look at this a bit myself, just try it out. I
> can't completely speak for the Fog/Edge/Massively Distributed working
> group in OpenStack, but I feel like this might be something they look
> into.
> 

Thanks Curtis for highlighting this. 

Indeed, among the actions we defined during our F2F meeting in Boston, one is 
to investigate the feasibility of using CockroachDB as the backend. We made 
some really preliminary investigations and identified several aspects that we 
should consider (dependence on Postgres, maturity of the Cockroach code, ...).

For your information, we did a PoC one year ago to show that it is possible to 
use a non-SQL backend (i.e. Redis in our case) instead of the historical MySQL 
backends. Our goal was to investigate a solution to distribute the storage 
backends across several nodes/sites. Although it was "just" a PoC (i.e., the 
quality of the code was definitely improvable), it demonstrated that using a 
non-SQL backend can make sense (for more information see [1] [2]). Among the 
remarks we got, the fact of losing the SQL API seemed to be an issue.

Using NewSQL systems such as CockroachDB can solve this issue, and that's why we 
want to study how this can be done (1. identify the effort in terms of 
development, 2. find a way to evaluate performance pros/cons, 3. do the job).

Among the different systems (Cockroach, Vitess, ...), the important point for 
the FEMDC working group [3] is that we do not want to have a centralized 
service. From what I learnt by browsing a few pages of the Vitess website, its 
architecture does not satisfy this aspect.

In any case, we need to spend more time on these questions before having 
enough material to get valuable answers.
In other words, if you are interested in such questions, do not 
hesitate to take part in our IRC meetings and follow the mailing list.

Best, 
Ad_rien_


[1] https://hal.inria.fr/hal-01273427/
[2] https://www.openstack.org/summit/austin-2016/summit-schedule/events/7342
[3] https://wiki.openstack.org/wiki/Fog_Edge_Massively_Distributed_Clouds


> For standard multi-site I don't know how much it would help, say if
> you only had a couple or three clouds, but more than that maybe this
> starts to make sense. Also running Galera has gotten easier but still
> not that easy.
> 
> I had thought that the OpenStack community was deprecating Postgres
> support though, so that could make things a bit harder here (I might
> be wrong about this).
> 
> Thanks,
> Curtis.
> 
> >
> > Further reading:
> > https://www.cockroachlabs.com/
> > https://github.com/cockroachdb/cockroach
> > https://www.cockroachlabs.com/docs/build-a-python-app-with-cockroachdb-sqlalchemy.html
> >
> > Cheers,
> > - Adrian Turjak
> >
> >
> 
> 
> 
> --
> Blog: serverascode.com
> 
> 

Re: [openstack-dev] [vitrage] [nova] [HA] [masakari] VM Heartbeat / Healthcheck Monitoring

2017-05-20 Thread Vikash Kumar
Thanks Sam

On Sat, 20 May 2017, 06:51 Sam P,  wrote:

> Hi Vikash,
>  Great... I will add you as reviewer to this spec.
>  Thank you..
> --- Regards,
> Sampath
>
>
>
> On Fri, May 19, 2017 at 1:06 PM, Vikash Kumar
>  wrote:
> > Hi Greg,
> >
> > Please include my email in this spec also. We are also dealing with
> HA
> > of Virtual Instances (especially for Vendors) and will participate.
> >
> > On Thu, May 18, 2017 at 11:33 PM, Waines, Greg <
> greg.wai...@windriver.com>
> > wrote:
> >>
> >> Yes I am good with writing spec for this in masakari-spec.
> >>
> >>
> >>
> >> Do you use gerrit for this git ?
> >>
> >> Do you have a template for your specs ?
> >>
> >>
> >>
> >> Greg.
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> From: Sam P 
> >> Reply-To: "openstack-dev@lists.openstack.org"
> >> 
> >> Date: Thursday, May 18, 2017 at 1:51 PM
> >> To: "openstack-dev@lists.openstack.org"
> >> 
> >> Subject: Re: [openstack-dev] [vitrage] [nova] [HA] [masakari] VM
> Heartbeat
> >> / Healthcheck Monitoring
> >>
> >>
> >>
> >> Hi Greg,
> >>
> >> Thank you Adam for followup.
> >>
> >> This is a new feature for masakari-monitors, and I think Masakari can
> >>
> >> accommodate this feature in  masakari-monitors.
> >>
> >> From the implementation perspective, it is not that hard to do.
> >>
> >> However, as you can see in our Boston presentation, Masakari will
> >>
> >> replace its monitoring parts ( which is masakari-monitors) with,
> >>
> >> nova-host-alerter, **-process-alerter, and **-instance-alerter. (**
> >>
> >> part is not defined yet..:p)...
> >>
> >> Therefore, I would like to save these specifications, and make sure we
> >>
> >> will not miss anything in the transformation.
> >>
> >> Does it make sense to write a simple spec for this in masakari-spec [1]?
> >>
> >> So we can discuss the requirements and how to implement it.
> >>
> >>
> >>
> >> [1] https://github.com/openstack/masakari-specs
> >>
> >>
> >>
> >> --- Regards,
> >>
> >> Sampath
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> On Thu, May 18, 2017 at 2:29 AM, Adam Spiers  wrote:
> >>
> >> I don't see any reason why masakari couldn't handle that, but you'd
> >>
> >> have to ask Sampath and the masakari team whether they would consider
> >>
> >> that in scope for their roadmap.
> >>
> >>
> >>
> >> Waines, Greg  wrote:
> >>
> >>
> >>
> >> Sure.  I can propose a new user story.
> >>
> >>
> >>
> >> And then are you thinking of including this user story in the scope of
> >>
> >> what masakari would be looking at ?
> >>
> >>
> >>
> >> Greg.
> >>
> >>
> >>
> >>
> >>
> >> From: Adam Spiers 
> >>
> >> Reply-To: "openstack-dev@lists.openstack.org"
> >>
> >> 
> >>
> >> Date: Wednesday, May 17, 2017 at 10:08 AM
> >>
> >> To: "openstack-dev@lists.openstack.org"
> >>
> >> 
> >>
> >> Subject: Re: [openstack-dev] [vitrage] [nova] [HA] VM Heartbeat /
> >>
> >> Healthcheck Monitoring
> >>
> >>
> >>
> >> Thanks for the clarification Greg.  This sounds like it has the
> >>
> >> potential to be a very useful capability.  May I suggest that you
> >>
> >> propose a new user story for it, along similar lines to this existing
> >>
> >> one?
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> http://specs.openstack.org/openstack/openstack-user-stories/user-stories/proposed/ha_vm.html
> >>
> >>
> >>
> >> Waines, Greg wrote:
> >>
> >> Yes that’s correct.
> >>
> >> VM Heartbeating / Health-check Monitoring would introduce intrusive /
> >>
> >> white-box type monitoring of VMs / Instances.
> >>
> >>
> >>
> >> I realize this is somewhat in the gray-zone of what a cloud should be
> >>
> >> monitoring or not,
> >>
> >> but I believe it provides an alternative for Applications deployed in
> VMs
> >>
> >> that do not have an external monitoring/management entity like a VNF
> >> Manager
> >>
> >> in the MANO architecture.
> >>
> >> And even for VMs with VNF Managers, it provides a highly reliable
> >>
> >> alternate monitoring path that does not rely on Tenant Networking.
> >>
> >>
> >>
> >> You’re correct, that VM HB/HC Monitoring would leverage
> >>
> >> https://wiki.libvirt.org/page/Qemu_guest_agent
> >>
> >> that would require the agent to be installed in the images for talking
> >>
> >> back to the compute host.
> >>
> >> ( there are other examples of similar approaches in openstack ... the
> >>
> >> murano-agent for installation, the swift-agent for object store
> management
> >> )
> >>
> >> Although here, in the case of VM HB/HC Monitoring, via the QEMU Guest
> >>
> >> Agent, the messaging path is internal thru a QEMU virtual serial device.
> >>
> >> i.e. a very simple interface with very few dependencies ... it’s up and
> >>
> >> available very early in VM lifecycle
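For reference, the QEMU guest agent channel mentioned above carries simple JSON messages over that virtio serial device; the most basic liveness probe is the standard guest-ping exchange, shown here as the raw protocol message rather than as any particular monitor implementation:

```json
{"execute": "guest-ping"}
```

A healthy guest with the agent running answers `{"return": {}}`; no answer within a timeout indicates the guest or agent is unresponsive, which is the heartbeat signal a monitor like the one proposed here would act on.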