[openstack-dev] [nova] Nova migration policy

2015-08-28 Thread Brian Elliott
In an effort to clarify expectations around good practices in writing schema 
and data migrations in nova with respect to live upgrades, I’ve added some 
extra bits to the live upgrade devref.  Please check it out and add your 
thoughts:

https://review.openstack.org/#/c/218362/

Thanks,
Brian


Re: [openstack-dev] [Nova] Concerns around the Extensible Resource Tracker design - revert maybe?

2014-08-13 Thread Brian Elliott

On Aug 12, 2014, at 5:21 AM, Nikola Đipanov ndipa...@redhat.com wrote:

 Hey Nova-istas,
 
 While I was hacking on [1] I was considering how to approach the fact
 that we now need to track one more thing (NUMA node utilization) in our
 resources. I went with - I'll add it to the compute nodes table - thinking
 it's a fundamental enough property of a compute host that it deserves to
 be there, although I was considering the Extensible Resource Tracker at one
 point (ERT from now on - see [2]) but looking at the code - it did not
 seem to provide anything I desperately needed, so I went with keeping it
 simple.
 
 So fast-forward a few days, and I caught myself solving a problem that I
 kept thinking ERT should have solved - but apparently hasn't, and I
 think it is fundamentally a broken design without it - so I'd really
 like to see it re-visited.
 
 The problem can be described by the following lemma (if you take 'lemma'
 to mean 'a sentence I came up with just now' :)):
 
 
 “Due to the way scheduling works in Nova (roughly: pick a host based on
 stale(ish) data, rely on claims to trigger a re-schedule), the _exact same_
 information that the scheduling service used when making a placement
 decision needs to be available to the compute service when testing the
 placement.”

Correct

 
 This is not the case right now, and the ERT does not propose any way to
 solve it - (see how I hacked around needing to be able to get
 extra_specs when making claims in [3], without hammering the DB). The
 result will be that any resource we add that needs user-supplied info for
 scheduling an instance against it will need a buggy re-implementation of
 gathering all the bits from the request that the scheduler sees, in order
 to work properly.
Agreed, ERT does not attempt to solve this problem of ensuring the RT has an 
identical set of information for testing claims.  I don’t think it was intended 
to.

ERT does solve the issue of bloat in the RT when adding just-one-more-thing to 
track and test usage-wise.  It gives a nice hook for inserting the claim logic 
for your specific use case.
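
To make that hook concrete, here is a rough sketch of the kind of per-resource 
plugin it enables.  This is purely illustrative - the class name and method 
signatures below are made up and are not nova's actual extensible-resource 
interface:

    # Illustrative only: the class name and method signatures below are made
    # up and are not nova's real extensible-resource plugin interface.

    class ExampleResource(object):
        """A per-resource hook the tracker could call when testing claims."""

        def __init__(self, total):
            self.total = total
            self.used = 0

        def test(self, usage, limit=None):
            """Return None if the claim fits, otherwise a reason string."""
            requested = usage.get('example_resource', 0)
            limit = self.total if limit is None else limit
            if self.used + requested > limit:
                return ('example_resource: requested %d, used %d, limit %d'
                        % (requested, self.used, limit))
            return None

        def add_usage(self, usage):
            """Apply a successful claim to the tracked usage."""
            self.used += usage.get('example_resource', 0)

        def write_to(self, compute_node):
            """Write current usage into the compute_node record/stats."""
            compute_node['example_resource_used'] = self.used

The win is that the test/apply logic for a new resource lives in one pluggable 
place instead of growing the resource tracker itself.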

 
 This is obviously a bigger concern when we want to allow users to pass
 data (through image or flavor) that can affect scheduling, but still a
 huge concern IMHO.
I think passing additional data through to compute just wasn’t a problem that 
ERT aimed to solve.  (Paul Murray?)  That being said, coordinating the passing 
of any extra data required to test a claim that is *not* sourced from the host 
itself would be a very nice addition.  You are working around it with some 
caching in your flavor DB lookup use case, although one could of course cook up 
a cleaner patch to pass such data through on the “build this” request to the 
compute, roughly along the lines of the sketch below.
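
Purely as a sketch of that cleaner approach - the dict keys and RPC call shapes 
here are illustrative, not existing nova signatures:

    # Hypothetical sketch: carry the scheduler's inputs along with the build
    # request so the compute-side claim sees the exact same information.

    def schedule_and_build(scheduler_rpc, compute_rpc, context, instance,
                           flavor, image):
        host = scheduler_rpc.select_destination(context, instance, flavor, image)

        # Everything the scheduler consulted travels with the request.
        request_data = {
            'extra_specs': flavor.get('extra_specs', {}),
            'image_properties': image.get('properties', {}),
        }
        compute_rpc.build_and_run_instance(context, host, instance,
                                           request_data=request_data)

    def claim_and_spawn(resource_tracker, context, instance, request_data):
        # Compute side: the claim is tested against the data the scheduler
        # used, with no flavor/extra_specs DB round trip.
        with resource_tracker.instance_claim(context, instance, request_data):
            pass  # ... spawn the instance ...

The claim then gets tested against exactly what the scheduler saw, rather than 
against a fresh (and possibly different) lookup on the compute side.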

 
 I see that there are already BPs proposing to use this (IMHO broken) ERT
 ([4] for example), which will surely add to the proliferation of code that
 hacks around these design shortcomings in what is already a messy, but also
 crucial (for perf as well as features), bit of Nova code.
 
 I propose to revert [2] ASAP since it is still fresh, and see how we can
 come up with a cleaner design.
 
I think the ERT is forward progress here, but I am willing to review 
patches/specs for improvements or replacements.

 Would like to hear opinions on this, before I propose the patch tho!
 
 Thanks all,
 
 Nikola
 
 [1] https://blueprints.launchpad.net/nova/+spec/virt-driver-numa-placement
 [2] https://review.openstack.org/#/c/109643/
 [3] https://review.openstack.org/#/c/111782/
 [4] https://review.openstack.org/#/c/89893
 




Re: [openstack-dev] [Nova] Thoughts from the PTL

2014-04-15 Thread Brian Elliott

On Apr 13, 2014, at 11:58 PM, Michael Still mi...@stillhq.com wrote:

 First off, thanks for electing me as the Nova PTL for Juno. I find the
 outcome of the election both flattering and daunting. I'd like to
 thank Dan and John for running as PTL candidates as well -- I strongly
 believe that a solid democratic process is part of what makes
 OpenStack so successful, and that isn't possible without people being
 willing to stand up during the election cycle.

Congrats!

 
 I'm hoping to send out regular emails to this list with my thoughts
 about our current position in the release process. It's early in the
 cycle, so the ideas here aren't fully formed yet -- however I'd rather
 get feedback early and often, in case I'm off on the wrong path. What
 am I thinking about at the moment? The following things:
 
 * a mid cycle meetup. I think the Icehouse meetup was a great success,
 and I'd like to see us do this again in Juno. I'd also like to get the
 location and venue nailed down as early as possible, so that people
 who have complex travel approval processes have a chance to get travel
 sorted out. I think it's pretty much a foregone conclusion this meetup
 will be somewhere in the continental US. If you're interested in
 hosting a meetup in approximately August, please mail me privately so
 we can chat.

Yeah, this was a great opportunity to collaborate and keep the project pointed 
in the right direction during Icehouse.

 
 * specs review. The new blueprint process is a work of genius, and I
 think it's already working better than what we've had in previous
 releases. However, there are a lot of blueprints there in review, and
 we need to focus on making sure these get looked at sooner rather than
 later. I'd especially like to encourage operators to take a look at
 blueprints relevant to their interests. Phil Day from HP has been
 doing a really good job at this, and I'd like to see more of it.

I have mixed feelings about the nova-specs repo.  I dig the open collaboration 
of the blueprints process, but I also think there is a danger of getting too 
process-oriented here.  Are these design documents expected to call out every 
detail of a feature?  Ideally, I’d like to see only very high-level 
documentation in the specs repo.  Basically, the spec could include just enough 
detail for people to agree that a feature is worth inclusion.  More detailed 
discussion could remain on the code reviews, since they are the actual end 
work product.

Thanks,
Brian


Re: [openstack-dev] [Nova] why don't we deal with claims when live migrating an instance?

2014-01-16 Thread Brian Elliott

On Jan 15, 2014, at 4:34 PM, Clint Byrum cl...@fewbar.com wrote:

 Hi Chris. Your thread may have gone unnoticed as it lacked the Nova tag.
 I've added it to the subject of this reply... that might attract them.  :)
 
 Excerpts from Chris Friesen's message of 2014-01-15 12:32:36 -0800:
 When we create a new instance via _build_instance() or 
 _build_and_run_instance(), in both cases we call instance_claim() to 
 reserve and test for resources.
 
 During a cold migration I see us calling prep_resize() which calls 
 resize_claim().
 
 How come we don't need to do something like this when we live migrate an 
 instance?  Do we track the hypervisor overhead somewhere in the instance?
 
 Chris
 

It is a good point, and it should be done.  The lack of a claim on the 
destination host during a live migration is effectively a bug.
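
Roughly, a fix would have the destination compute test and hold a claim before 
the migration proceeds.  A hypothetical sketch - live_migration_claim() does 
not exist today; the shape just mirrors instance_claim()/resize_claim():

    # Hypothetical: claim resources on the destination host before a live
    # migration, mirroring what _build_instance() and prep_resize() do.
    # live_migration_claim() is made up here for illustration.

    def pre_live_migration(resource_tracker, context, instance, limits):
        claim = resource_tracker.live_migration_claim(context, instance, limits)
        try:
            # ... set up networking, volumes, etc. on the destination host ...
            return claim
        except Exception:
            claim.abort()  # release the reserved resources if setup fails
            raise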

Brian



Re: [openstack-dev] [Nova] Proposal to add Matt Riedemann to nova-core

2013-11-22 Thread Brian Elliott
+1

Solid reviewer!

Sent from my iPad

 On Nov 22, 2013, at 2:53 PM, Russell Bryant rbry...@redhat.com wrote:
 
 Greetings,
 
 I would like to propose adding Matt Riedemann to the nova-core review team.
 
 Matt has been involved with nova for a long time, taking on a wide range
 of tasks.  He writes good code.  He's very engaged with the development
 community.  Most importantly, he provides good code reviews and has
 earned the trust of other members of the review team.
 
 https://review.openstack.org/#/dashboard/6873
 https://review.openstack.org/#/q/owner:6873,n,z
 https://review.openstack.org/#/q/reviewer:6873,n,z
 
 Please respond with +1/-1, or any further comments.
 
 Thanks,
 
 -- 
 Russell Bryant
 



Re: [openstack-dev] XML Support for Nova v3 API

2013-06-20 Thread Brian Elliott
On Jun 19, 2013, at 7:34 PM, Christopher Yeoh cbky...@gmail.com wrote:

 Hi,
 
 Just wondering what people thought about how necessary it is to keep XML 
 support for the Nova v3 API, given that if we want to drop it, the v2-to-v3 
 transition is pretty much the ideal time to do so.
 
 The current plan is to keep it and is what we have been doing so far when 
 porting extensions, but there are pretty obvious long term development and 
 test savings if we only have one API format to support.
 
 Regards,
 
 Chris
 

Can we support CORBA?

No, really, it'd be great to drop XML support while we can.

 




Re: [openstack-dev] Compute node stats sent to the scheduler

2013-06-17 Thread Brian Elliott

On Jun 17, 2013, at 3:50 PM, Chris Behrens cbehr...@codestud.com wrote:

 
 On Jun 17, 2013, at 7:49 AM, Russell Bryant rbry...@redhat.com wrote:
 
 On 06/16/2013 11:25 PM, Dugger, Donald D wrote:
 Looking into the scheduler a bit there's an issue of duplicated effort that 
 is a little puzzling.  The database table `compute_nodes' is being updated 
 periodically with data about capabilities and resources used (memory, 
 vcpus, ...) while at the same time a periodic RPC call is being made to the 
 scheduler sending pretty much the same data.
 
  Does anyone know why we are updating the same data in two different places 
  using two different mechanisms?  Also, assuming we were to remove one of 
  these updates, which one should go?  (I thought at one point in time there 
  was a goal to create a database-free compute node, which would imply we 
  should remove the DB update.)
 
 Have you looked around to see if any code is using the data from the db?
 
 Having schedulers hit the db for the current state of all compute nodes
 all of the time would be a large additional db burden that I think we
 should avoid.  So, it makes sense to keep the rpc fanout_cast of current
 stats to schedulers.
 
 This is actually what the scheduler uses. :)   The fanout messages are too 
 infrequent and can be too laggy.  So, the scheduler was moved to using the DB 
 a long, long time ago… but it was very inefficient, at first, because it 
 looped through all instances.  So we added things we needed into compute_node 
 and compute_node_stats so we only had to look at the hosts.  You have to pull 
 the hosts anyway, so we pull the stats at the same time.
 
  The problem is… when we stopped using certain data from the fanout messages, 
  we never removed it.  We should AT LEAST do this.  But… (see below).
 
 
 The scheduler also does a fanout_cast to all compute nodes when it
 starts up to trigger the compute nodes to populate the cache in the
 scheduler.  It would be nice to never fanout_cast to all compute nodes
 (given that there may be a *lot* of them).  We could replace this with
 having the scheduler populate its cache from the database.
 
 I think we should audit the remaining things that the scheduler uses from 
 these messages and move them to the DB.  I believe it's limited to the 
 hypervisor capabilities to compare against aggregates or some such.  I 
 believe it's things that change very rarely… so an alternative can be to only 
 send fanout messages when capabilities change!   We could always do that as a 
 first step.
 
 
 Removing the db usage completely would be nice if nothing is actually
 using it, but we'd have to look into an alternative solution for
 removing the scheduler fanout_cast to compute.
 
 Relying on anything but the DB for current memory free, etc, is just too 
 laggy… so we need to stick with it, IMO.
 
 - Chris
 
 

As Chris said, the reason it ended up this way (using the DB) is to quickly get 
up-to-date host usage to the scheduler.  I certainly understand the point 
that it's a whole lot of increased load on the DB, but the RPC data was quite 
stale.  If there is interest in moving away from the DB updates, I think we 
have to either:

1) Send RPC updates to the scheduler on essentially every state change during a 
build.

or

2) Change the scheduler architecture so there is some memory of resources 
consumed between requests.  The scheduler would have to remember which hosts 
recent builds were assigned to.  This could be a bit of a data synchronization 
problem if you're talking about using multiple scheduler instances.
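
A minimal sketch of option 2, assuming a single scheduler process for 
simplicity.  This is entirely illustrative; a real version would need expiry of 
the remembered placements and a synchronization story across scheduler 
instances:

    # Illustrative only: an in-memory host view the scheduler consumes from
    # as it places instances, periodically refreshed from compute_nodes data.

    class HostStateCache(object):
        def __init__(self):
            self.hosts = {}  # hostname -> {'free_ram_mb': ..., 'free_vcpus': ...}

        def refresh(self, compute_node_rows):
            """Periodic refresh; overwrites deltas consumed since last time."""
            for row in compute_node_rows:
                self.hosts[row['hypervisor_hostname']] = {
                    'free_ram_mb': row['free_ram_mb'],
                    'free_vcpus': row['vcpus'] - row['vcpus_used'],
                }

        def pick_host(self, ram_mb, vcpus):
            """Pick a host and remember the placement until the next refresh."""
            for name, state in self.hosts.items():
                if (state['free_ram_mb'] >= ram_mb and
                        state['free_vcpus'] >= vcpus):
                    state['free_ram_mb'] -= ram_mb
                    state['free_vcpus'] -= vcpus
                    return name
            return None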

Brian