Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-05-03 Thread Eric Fried
>> verify with placement
>> whether the image traits requested are 1) supported by the compute
>> host the instance is residing on and 2) coincide with the
>> already-existing allocations.

Note that #2 is a subset of #1.  The only potential advantage of
including #1 is efficiency: We can do #1 in one API call and bail early
if it fails; but if it passes, we have to do #2 anyway, which is
multiple steps.  So would we rather save one step in the "good path" or
potentially N-1 steps in the failure case?  IMO the cost of the
additional dev/test to implement #1 is higher than that of the potential
extra API calls.  (TL;DR: just implement #2.)
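For concreteness, check #2 (the required traits must be satisfied by the providers in the instance's existing allocations) could look roughly like the sketch below. This is not nova code: `get_json` is a hypothetical stand-in for whatever placement client the conductor would use, and the response shapes are simplified versions of the placement API's GET /allocations/{consumer} and GET /resource_providers/{uuid}/traits replies.

```python
# Hedged sketch of check #2: every trait required by the new image must be
# present on at least one resource provider the instance already allocates
# from. get_json is a hypothetical placement-client helper.

def rebuild_traits_satisfied(get_json, instance_uuid, required_traits):
    # GET /allocations/{consumer_uuid} returns roughly
    # {"allocations": {rp_uuid: {"resources": {...}}, ...}}
    allocs = get_json('/allocations/%s' % instance_uuid)['allocations']
    traits_found = set()
    for rp_uuid in allocs:
        # GET /resource_providers/{uuid}/traits returns {"traits": [...]}
        rp = get_json('/resource_providers/%s/traits' % rp_uuid)
        traits_found |= set(rp['traits'])
    # One placement call per allocated provider: these are the potential
    # N-1 extra calls mentioned above, with no separate up-front #1 check.
    return set(required_traits) <= traits_found
```

Note this treats the union of traits across the allocated providers as the target set; a granular (per-provider) interpretation would be stricter.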

-efried

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-05-03 Thread Matt Riedemann

On 5/3/2018 3:26 PM, Dan Smith wrote:

Well, it's a little icky in that it makes a random part of conductor a
bit like the scheduler in its understanding of and interaction with
placement. I don't love it, but I think it's what we have to do. Trying
to do the trait math with what was used before, or conservatively
rejecting the request and being potentially wrong about that, is not
reasonable, IMHO.


The upside to doing the check in conductor is we have a specific code 
flow for rebuild in conductor and we should be able to just put a 
private method off to the side for this validation scenario. That's 
preferable to baking more rebuild logic into the scheduler. It also 
means we are always going to do this validation regardless of whether or 
not the ImagePropertiesFilter is enabled, but that (1) seems OK and (2) 
no one probably ever disables the ImagePropertiesFilter anyway.


--

Thanks,

Matt



Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-05-03 Thread Dan Smith
> I'm late to this thread but I finally went through the replies and my
> thought is, we should do a pre-flight check to verify with placement
> whether the image traits requested are 1) supported by the compute
> host the instance is residing on and 2) coincide with the
> already-existing allocations. Instead of making an assumption based on
> "last image" vs "new image" and artificially limiting a rebuild that
> should be valid to go ahead. I can imagine scenarios where a user is
> trying to do a rebuild that their cloud admin says should be perfectly
> valid on their hypervisor, but it's getting rejected because old image
> traits != new image traits. It seems like unnecessary user and admin
> pain.

Yeah, I think we have to do this.

> It doesn't seem correct to reject the request if the current compute
> host can fulfill it, and if I understood correctly, we have placement
> APIs we can call from the conductor to verify the image traits
> requested for the rebuild can be fulfilled. Is there a reason not to
> do that?

Well, it's a little icky in that it makes a random part of conductor a
bit like the scheduler in its understanding of and interaction with
placement. I don't love it, but I think it's what we have to do. Trying
to do the trait math with what was used before, or conservatively
rejecting the request and being potentially wrong about that, is not
reasonable, IMHO.

--Dan



Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-05-02 Thread melanie witt

On Wed, 2 May 2018 17:45:37 -0500, Matt Riedemann wrote:

On 5/2/2018 5:39 PM, Jay Pipes wrote:

My personal preference is to add less technical debt and go with a
solution that checks if image traits have changed in nova-api and if so,
simply refuse to perform a rebuild.


So, what if when I created my server, the image I used, let's say
image1, had required trait A and that fit the host.

Then some external service removes (or somehow changes) trait A from the
compute node resource provider (because people can and will do this,
there are a few vmware specs up that rely on being able to manage traits
out of band from nova), and then I rebuild my server with image2 that
has required trait A. That would match the original trait A in image1
and we'd say, "yup, lgtm!" and do the rebuild even though the compute
node resource provider wouldn't have trait A anymore.

Having said that, it could technically happen before traits if the
operator changed something on the underlying compute host which
invalidated instances running on that host, but I'd think if that
happened the operator would be migrating everything off the host and
disabling it from scheduling before making whatever that kind of change
would be, let's say they change the hypervisor or something less drastic
but still image property invalidating.


This is a scenario I was thinking about too. In the land of software 
licenses, this would be analogous to removing a license from a compute 
host, say. The instance is already there but should we let a rebuild 
proceed that is going to violate the image traits currently supported by 
that host? Do we potentially prolong the life of that instance by 
letting it be re-imaged?


I'm late to this thread but I finally went through the replies and my 
thought is, we should do a pre-flight check to verify with placement 
whether the image traits requested are 1) supported by the compute host 
the instance is residing on and 2) coincide with the already-existing 
allocations. Instead of making an assumption based on "last image" vs 
"new image" and artificially limiting a rebuild that should be valid to 
go ahead. I can imagine scenarios where a user is trying to do a rebuild 
that their cloud admin says should be perfectly valid on their 
hypervisor, but it's getting rejected because old image traits != new 
image traits. It seems like unnecessary user and admin pain.


It doesn't seem correct to reject the request if the current compute 
host can fulfill it, and if I understood correctly, we have placement 
APIs we can call from the conductor to verify the image traits requested 
for the rebuild can be fulfilled. Is there a reason not to do that?


-melanie







Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-05-02 Thread Arvind N
Isn't this an existing issue with traits specified in the flavor as well?

A server is created using flavor1 requiring trait A on RP1. Before the
rebuild is called, the underlying RP1 can be updated to remove trait A, and
when a rebuild is requested (regardless of whether the image is updated or
not), we skip scheduling and allow the rebuild to go through.

Now, even though flavor1 requests trait A and the underlying RP1 no longer
has that trait, the rebuild will succeed...

I think maybe there should be some kind of report or query which runs
periodically to ensure that running instances remain in conformance with
their required traits. But since traits are intended to provide hints for
scheduling, this is a different problem to solve IMO.
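Such a periodic conformance report could be sketched as a pure function over data pulled from nova and placement. Everything here (the input shapes, the function name) is hypothetical, not an existing nova tool:

```python
# Hypothetical conformance audit: report instances whose required traits
# are no longer present on the resource provider they run on.

def audit_trait_conformance(instances, provider_traits):
    """instances: iterable of (instance_id, rp_uuid, required_traits);
    provider_traits: mapping of rp_uuid -> set of trait names.
    Returns a list of (instance_id, sorted list of missing traits)."""
    report = []
    for instance_id, rp_uuid, required in instances:
        missing = set(required) - provider_traits.get(rp_uuid, set())
        if missing:
            report.append((instance_id, sorted(missing)))
    return report
```

An operator-facing version would populate the inputs from the nova DB and placement and emit the report for out-of-band remediation, since (per the point above) scheduling hints alone can't enforce ongoing conformance.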

On Wed, May 2, 2018 at 3:45 PM, Matt Riedemann  wrote:

> On 5/2/2018 5:39 PM, Jay Pipes wrote:
>
>> My personal preference is to add less technical debt and go with a
>> solution that checks if image traits have changed in nova-api and if so,
>> simply refuse to perform a rebuild.
>>
>
> So, what if when I created my server, the image I used, let's say image1,
> had required trait A and that fit the host.
>
> Then some external service removes (or somehow changes) trait A from the
> compute node resource provider (because people can and will do this, there
> are a few vmware specs up that rely on being able to manage traits out of
> band from nova), and then I rebuild my server with image2 that has required
> trait A. That would match the original trait A in image1 and we'd say,
> "yup, lgtm!" and do the rebuild even though the compute node resource
> provider wouldn't have trait A anymore.
>
> Having said that, it could technically happen before traits if the
> operator changed something on the underlying compute host which invalidated
> instances running on that host, but I'd think if that happened the operator
> would be migrating everything off the host and disabling it from scheduling
> before making whatever that kind of change would be, let's say they change
> the hypervisor or something less drastic but still image property
> invalidating.
>
> --
>
> Thanks,
>
> Matt
>
>
>



-- 
Arvind N


Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-05-02 Thread Matt Riedemann

On 5/2/2018 5:39 PM, Jay Pipes wrote:
My personal preference is to add less technical debt and go with a 
solution that checks if image traits have changed in nova-api and if so, 
simply refuse to perform a rebuild.


So, what if when I created my server, the image I used, let's say 
image1, had required trait A and that fit the host.


Then some external service removes (or somehow changes) trait A from the 
compute node resource provider (because people can and will do this, 
there are a few vmware specs up that rely on being able to manage traits 
out of band from nova), and then I rebuild my server with image2 that 
has required trait A. That would match the original trait A in image1 
and we'd say, "yup, lgtm!" and do the rebuild even though the compute 
node resource provider wouldn't have trait A anymore.


Having said that, it could technically happen before traits if the 
operator changed something on the underlying compute host which 
invalidated instances running on that host, but I'd think if that 
happened the operator would be migrating everything off the host and 
disabling it from scheduling before making whatever that kind of change 
would be - say they change the hypervisor, or something less drastic 
that still invalidates image properties.


--

Thanks,

Matt



Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-05-02 Thread Jay Pipes

On 05/02/2018 10:07 AM, Matt Riedemann wrote:

On 5/1/2018 5:26 PM, Arvind N wrote:
In cases of rebuilding of an instance using a different image where 
the image traits have changed between the original launch and the 
rebuild, is it reasonable to ask to just re-launch a new instance with 
the new image?


The argument for this approach is that given that the requirements 
have changed, we want the scheduler to pick and allocate the 
appropriate host for the instance.


We don't know if the requirements have changed with the new image until 
we check them.


Here is another option:

What if the API compares the original image required traits against the 
new image required traits, and if the new image has required traits 
which weren't in the original image, then (punt) fail in the API? Then 
you would at least have a chance to rebuild with a new image that has 
required traits as long as those required traits are less than or equal 
to the originally validated traits for the host on which the instance is 
currently running.


That's pretty much what I had suggested earlier, yeah.

Option 10: Don't support image-defined traits at all. I know that won't 
happen though.


At this point I'm exhausted with this entire issue and conversation and 
will probably bow out and need someone else to step in with different 
perspective, like melwitt or dansmith.


All of the solutions are bad in their own way, either because they add 
technical debt and poor user experience, or because they make rebuild 
more complicated and harder to maintain for the developers.


I hear your frustration. And I agree all of the solutions are bad in 
their own way.


My personal preference is to add less technical debt and go with a 
solution that checks if image traits have changed in nova-api and if so, 
simply refuse to perform a rebuild.


Best,
-jay



Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-05-02 Thread Arvind N
> What if the API compares the original image required traits against the
> new image required traits, and if the new image has required traits which
> weren't in the original image, then (punt) fail in the API? Then you would
> at least have a chance to rebuild with a new image that has required
> traits as long as those required traits are less than or equal to the
> originally validated traits for the host on which the instance is currently
> running.

This is what I was proposing with #1, sorry if it was unclear. Will make it
more explicit.

1. Reject the rebuild request, indicating that rebuilding with a new image
with **different** required traits compared to the original request is not
supported.
If the new image has the same or a reduced set of traits as the old image,
then the request will be passed through to the conductor etc.

Pseudo code:

    if not set(new_image.traits_required).issubset(
            set(original_image.traits_required)):
        raise RebuildNotSupported()

On Wed, May 2, 2018 at 7:07 AM, Matt Riedemann  wrote:

> On 5/1/2018 5:26 PM, Arvind N wrote:
>
>> In cases of rebuilding of an instance using a different image where the
>> image traits have changed between the original launch and the rebuild, is
>> it reasonable to ask to just re-launch a new instance with the new image?
>>
>> The argument for this approach is that given that the requirements have
>> changed, we want the scheduler to pick and allocate the appropriate host
>> for the instance.
>>
>
> We don't know if the requirements have changed with the new image until we
> check them.
>
> Here is another option:
>
> What if the API compares the original image required traits against the
> new image required traits, and if the new image has required traits which
> weren't in the original image, then (punt) fail in the API? Then you would
> at least have a chance to rebuild with a new image that has required traits
> as long as those required traits are less than or equal to the originally
> validated traits for the host on which the instance is currently running.
>
>
>> The approach above also gives you consistent results vs the other
>> approaches where the rebuild may or may not succeed depending on how the
>> original allocation of resources went.
>>
>>
> Consistently frustrating, I agree. :) Because as a user, I can rebuild
> with some images (that don't have required traits) and can't rebuild with
> other images (that do have required traits).
>
> I see no difference with this and being able to rebuild (with a new image)
> some instances (image-backed) and not others (volume-backed). Given that, I
> expect if we punt on this, someone will just come along asking for the
> support later. Could be a couple of years from now when everyone has moved
> on and it then becomes someone else's problem.
>
> For example (from Alex Xu): suppose you launched an instance on a host which
>> has two SRIOV nics. One is a normal SRIOV nic (A), the other has some kind
>> of offload feature (B).
>>
>> So, the original request is: resources=SRIOV_VF:1 The instance gets a VF
>> from the normal SRIOV nic(A).
>>
>> But with a new image, the new request is: resources=SRIOV_VF:1
>> traits=HW_NIC_OFFLOAD_XX
>>
>> With all the solutions discussed in the thread, a rebuild request like
>> above may or may not succeed depending on whether nic A or nic B was
>> allocated during the initial launch.
>>
>> Remember that in a rebuild new allocations don't happen; we have to reuse
>> the existing allocations.
>>
>> Given the above background, there seems to be 2 competing options.
>>
>> 1. Fail in the API saying you can't rebuild with a new image with new
>> required traits.
>>
>> 2. Look at the current allocations for the instance and try to match the
>> new requirement from the image with the allocations.
>>
>> With #1, we get consistent results in regards to how rebuilds are treated
>> when the image traits changed.
>>
>> With #2, the rebuild may or may not succeed, depending on how well the
>> original allocations match up with the new requirements.
>>
>> #2 will also need to account for handling preferred traits or
>> granular resource traits if we decide to implement them for images at some
>> point...
>>
>
> Option 10: Don't support image-defined traits at all. I know that won't
> happen though.
>
> At this point I'm exhausted with this entire issue and conversation and
> will probably bow out and need someone else to step in with different
> perspective, like melwitt or dansmith.
>
> All of the solutions are bad in their own way, either because they add
> technical debt and poor user experience, or because they make rebuild more
> complicated and harder to maintain for the developers.
>
> --
>
> Thanks,
>
> Matt
>
>

Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-05-02 Thread Matt Riedemann

On 5/1/2018 5:26 PM, Arvind N wrote:
In cases of rebuilding of an instance using a different image where the 
image traits have changed between the original launch and the rebuild, 
is it reasonable to ask to just re-launch a new instance with the new image?


The argument for this approach is that given that the requirements have 
changed, we want the scheduler to pick and allocate the appropriate host 
for the instance.


We don't know if the requirements have changed with the new image until 
we check them.


Here is another option:

What if the API compares the original image required traits against the 
new image required traits, and if the new image has required traits 
which weren't in the original image, then (punt) fail in the API? Then 
you would at least have a chance to rebuild with a new image that has 
required traits as long as those required traits are less than or equal 
to the originally validated traits for the host on which the instance is 
currently running.




The approach above also gives you consistent results vs the other 
approaches where the rebuild may or may not succeed depending on how the 
original allocation of resources went.




Consistently frustrating, I agree. :) Because as a user, I can rebuild 
with some images (that don't have required traits) and can't rebuild 
with other images (that do have required traits).


I see no difference with this and being able to rebuild (with a new 
image) some instances (image-backed) and not others (volume-backed). 
Given that, I expect if we punt on this, someone will just come along 
asking for the support later. Could be a couple of years from now when 
everyone has moved on and it then becomes someone else's problem.


For example (from Alex Xu): suppose you launched an instance on a host 
which has two SRIOV nics. One is a normal SRIOV nic (A), the other has 
some kind of offload feature (B).


So, the original request is: resources=SRIOV_VF:1 The instance gets a VF 
from the normal SRIOV nic(A).


But with a new image, the new request is: resources=SRIOV_VF:1 
traits=HW_NIC_OFFLOAD_XX


With all the solutions discussed in the thread, a rebuild request like 
above may or may not succeed depending on whether nic A or nic B was 
allocated during the initial launch.


Remember that in a rebuild new allocations don't happen; we have to reuse 
the existing allocations.


Given the above background, there seems to be 2 competing options.

1. Fail in the API saying you can't rebuild with a new image with new 
required traits.


2. Look at the current allocations for the instance and try to match the 
new requirement from the image with the allocations.


With #1, we get consistent results in regards to how rebuilds are 
treated when the image traits changed.


With #2, the rebuild may or may not succeed, depending on how well the 
original allocations match up with the new requirements.


#2 will also need to account for handling preferred traits or 
granular resource traits if we decide to implement them for images at 
some point...


Option 10: Don't support image-defined traits at all. I know that won't 
happen though.


At this point I'm exhausted with this entire issue and conversation and 
will probably bow out and need someone else to step in with different 
perspective, like melwitt or dansmith.


All of the solutions are bad in their own way, either because they add 
technical debt and poor user experience, or because they make rebuild 
more complicated and harder to maintain for the developers.


--

Thanks,

Matt



Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-05-02 Thread Matt Riedemann

On 4/24/2018 8:26 AM, Sylvain Bauza wrote:
We also have pre-flight checks for move operations like live and cold 
migrations, and I'd really like to keep all the conditionals in the 
conductor, because it knows better than the scheduler which operation is 
asked.


I'm not sure what "pre-flight checks" we have for cold migration. The 
conductor migrate task asks the scheduler for a host and then casts to 
the destination compute to start the migration.


The conductor live migration task does do some checking on the source 
and dest computes before proceeding, I agree with you there.


> I'm not really happy with adding more in the scheduler about "yeah, 
it's a rebuild, so please do something exceptional"


Agree that building more special rebuild logic into the scheduler isn't 
ideal and hopefully we could resolve this in conductor if possible, 
despite the fact that ImagePropertiesFilter is optional (although I'm 
pretty sure everyone enables it).


--

Thanks,

Matt



Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-05-02 Thread Matt Riedemann

On 4/24/2018 3:25 AM, Balázs Gibizer wrote:
The algorithm Eric provided in a previous mail does the filtering for the 
RPs that are part of the instance allocation, so that sounds good to me.


Yeah I've been wondering if that solves this VF case.

I think we should not try to adjust allocations during a rebuild. 
Changing the allocation would mean it is not a rebuild any more but a 
resize.


Agree.

--

Thanks,

Matt



Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-05-02 Thread Matt Riedemann

On 4/23/2018 4:51 PM, Arvind N wrote:
For #1, we can make the explanation very clear that we rejected the 
request because the original traits specified in the original image and 
the new traits specified in the new image do not match and hence rebuild 
is not supported.


We don't reject rebuild requests today where you rebuild with a new 
image as long as that new image passes the scheduler filters for the 
host on which the instance is already running. I don't see why we'd just 
immediately fail in the API because the new image has required traits, 
when we have no idea, from the nova-api service, whether or not those 
image-defined required traits are going to match the current host or 
not. That's just adding technical debt to rebuild, like we have for 
rebuilding a volume-backed instance with a new image (you can't do it 
today because it wasn't thought about early enough in the design process).




For #2,

Other Cons:

 1. None of the filters currently make other API requests, and my
understanding is we want to avoid reintroducing such a pattern. But it
is definitely a workable solution.


For a rebuild-specific request (which we can determine already), I'm OK 
with this - we're already not calling GET /allocation_candidates in this 
case, so if people are worried about performance, it's just a trade of 
one REST API call for another.



 2. If the user disables the image properties filter, then trait-based
filtering will not be run in the rebuild case


The user doesn't disable the filter, the operator does, and likely for 
good reason. I don't see a problem with this.




For #3,

Even though it handles the nested provider, there is a potential issue.

Let's say a host has two SRIOV nics. One is a normal SRIOV nic (VF1), 
the other has some kind of offload feature (VF2). (Described by Alex.)


The initial instance launch happens with VF1 allocated; the rebuild comes 
in with a modified request with traits=HW_NIC_OFFLOAD_X, so basically we 
want the instance to be allocated VF2.


But the original allocation happened against VF1, and since in a rebuild the 
original allocations are not changed, we end up with wrong allocations.


I don't know what to say about this. We shouldn't have any quantitative 
resource allocation changes as a result of a rebuild. This actually 
sounds like a case for option #4 with using GET /allocation_candidates 
and then being able to filter out if rebuilding the instance with the 
new image with new required traits but on the same host would result in 
new allocation requests, and if so, we should fail - but we can (only?) 
determine that via the response from GET /allocation_candidates.
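Under that option-#4 idea, the conductor-side comparison might be sketched like this. The data shapes are simplified stand-ins for the real GET /allocation_candidates response (which returns allocation requests keyed by provider), not actual nova code:

```python
# Hedged sketch: allow the rebuild only if some allocation candidate for
# the new image's request, restricted to the current host, matches the
# instance's existing allocations exactly - i.e. the rebuild would not
# require moving any resource to a different provider.

def rebuild_allowed(allocation_candidates, current_allocations):
    """Both arguments use a simplified shape:
    {rp_uuid: {"resources": {resource_class: amount}}};
    allocation_candidates is a list of such allocation requests."""
    return any(cand == current_allocations for cand in allocation_candidates)
```

In the SRIOV example above, the only candidate satisfying HW_NIC_OFFLOAD_XX would be against VF2, which differs from the existing VF1 allocation, so the check fails and the rebuild is rejected rather than left with wrong allocations.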


--

Thanks,

Matt



Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-05-02 Thread Matt Riedemann

On 4/23/2018 4:43 PM, Jay Pipes wrote:
How about just having the conductor call GET 
/resource_providers?in_tree=<compute_node_rp_uuid>&required=<image_traits>, 
see if there is a result, and if not, don't even call the scheduler at all 
(because conductor would already know there would be a NoValidHost 
returned)?


This makes filtering on image properties required, which is something I 
was pushing back on because the ImagePropertiesFilter today, by design 
of all scheduler filters, is configurable and optional, which is why I 
wanted to add the filtering logic for image-defined required traits into 
the ImagePropertiesFilter itself.


--

Thanks,

Matt



Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-05-01 Thread Arvind N
Reminder for operators: please provide feedback either way.

When rebuilding an instance using a different image, where the image traits
have changed between the original launch and the rebuild, is it reasonable
to ask users to just launch a new instance with the new image?

The argument for this approach is that given that the requirements have
changed, we want the scheduler to pick and allocate the appropriate host
for the instance.

The approach above also gives you consistent results vs the other
approaches where the rebuild may or may not succeed depending on how the
original allocation of resources went.

For example (from Alex Xu): suppose you launched an instance on a host which
has two SRIOV nics. One is a normal SRIOV nic (A), the other has some kind
of offload feature (B).

So, the original request is: resources=SRIOV_VF:1 The instance gets a VF
from the normal SRIOV nic(A).

But with a new image, the new request is: resources=SRIOV_VF:1
traits=HW_NIC_OFFLOAD_XX
With all the solutions discussed in the thread, a rebuild request like
above may or may not succeed depending on whether nic A or nic B was
allocated during the initial launch.

Remember that in a rebuild new allocations don't happen; we have to reuse
the existing allocations.

Given the above background, there seems to be 2 competing options.

1. Fail in the API saying you can't rebuild with a new image with new
required traits.

2. Look at the current allocations for the instance and try to match the
new requirement from the image with the allocations.

With #1, we get consistent results in regards to how rebuilds are treated
when the image traits changed.

With #2, the rebuild may or may not succeed, depending on how well the
original allocations match up with the new requirements.

#2 will also need to account for handling preferred traits or
granular resource traits if we decide to implement them for images at some
point...


[1]
https://specs.openstack.org/openstack/nova-specs/specs/rocky/approved/glance-image-traits.html
[2] https://review.openstack.org/#/c/560718/

On Tue, Apr 24, 2018 at 6:26 AM, Sylvain Bauza  wrote:

> Sorry folks for the late reply, I'll try to also weigh in the Gerrit
> change.
>
> On Tue, Apr 24, 2018 at 2:55 PM, Jay Pipes  wrote:
>
>> On 04/23/2018 05:51 PM, Arvind N wrote:
>>
>>> Thanks for the detailed options Matt/eric/jay.
>>>
>>> Just few of my thoughts,
>>>
>>> For #1, we can make the explanation very clear that we rejected the
>>> request because the original traits specified in the original image and the
>>> new traits specified in the new image do not match and hence rebuild is not
>>> supported.
>>>
>>
>> I believe I had suggested that on the spec amendment patch. Matt had
>> concerns about an error message being a poor user experience (I don't
>> necessarily disagree with that) and I had suggested a clearer error message
>> to try and make that user experience slightly less sucky.
>>
>> For #3,
>>>
>>> Even though it handles the nested provider, there is a potential issue.
>>>
>>> Let's say a host has two SRIOV nics. One is a normal SRIOV nic (VF1),
>>> the other has some kind of offload feature (VF2). (Described by Alex.)
>>>
>>> Initial instance launch happens with VF:1 allocated, rebuild launches
>>> with modified request with traits=HW_NIC_OFFLOAD_X, so basically we want
>>> the instance to be allocated VF2.
>>>
>>> But the original allocation happens against VF1 and since in rebuild the
>>> original allocations are not changed, we have wrong allocations.
>>>
>>
>> Yep, that is certainly an issue. The only solution to this that I can see
>> would be to have the conductor ask the compute node to do the pre-flight
>> check. The compute node already has the entire tree of providers, their
>> inventories and traits, along with information about providers that share
>> resources with the compute node. It has this information in the
>> ProviderTree object in the reportclient that is contained in the compute
>> node resource tracker.
>>
>> The pre-flight check, if run on the compute node, would be able to grab
>> the allocation records for the instance and determine if the required
>> traits for the new image are present on the actual resource providers
>> allocated against for the instance (and not including any child providers
>> not allocated against).
>>
>>
> Yup, that. We also have pre-flight checks for move operations like live
> and cold migrations, and I'd really like to keep all the conditionals in
> the conductor, because it knows better than the scheduler which operation
> is asked.
> I'm not really happy with adding more in the scheduler about "yeah, it's a
> rebuild, so please do something exceptional", and I'm also not happy with
> having a filter (that can be disabled) calling the Placement API.
>
>
>> Or... we chalk this up as a "too bad" situation and just either go with
>> option #1 or simply don't care about it.
>
>
> Also, that too. 

Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-04-24 Thread Eric Fried
Alex-

On 04/24/2018 09:21 AM, Alex Xu wrote:
> 
> 
> 2018-04-24 20:53 GMT+08:00 Eric Fried  >:
> 
> > The problem isn't just checking the traits in the nested resource
> > provider. We also need to ensure the trait in the exactly same child
> > resource provider.
> 
> No, we can't get "granular" with image traits.  We accepted this as a
> limitation for the spawn aspect of this spec [1], for all the same
> reasons [2].  And by the time we've spawned the instance, we've lost the
> information about which granular request groups (from the flavor) were
> satisfied by which resources - retrofitting that information from a new
> image would be even harder.  So we need to accept the same limitation
> for rebuild.
> 
> [1] "Due to the difficulty of attempting to reconcile granular request
> groups between an image and a flavor, only the (un-numbered) trait group
> is supported. The traits listed there are merged with those of the
> un-numbered request group from the flavor."
> 
> (http://specs.openstack.org/openstack/nova-specs/specs/rocky/approved/glance-image-traits.html#proposed-change)
> [2]
> https://review.openstack.org/#/c/554305/2/specs/rocky/approved/glance-image-traits.rst@86
> 
> Why we can return a RP which has a specific trait but we won't consume
> any resources on it?
> If the case is that we request two VFs, and this two VFs have different
> required traits, then that should be granular request.

We don't care about RPs we're not consuming resources from.  Forget
rebuild - if the image used for the original spawn request has traits
pertaining to VFs, we folded those traits into the un-numbered request
group, which means the VF resources would have needed to be in the
un-numbered request group in the flavor as well.  That was the
limitation discussed at [2]: trying to correlate granular groups from an
image to granular groups in a flavor would require nontrivial invention
beyond what we're willing to do at this point.  So we're limited at
spawn time to VFs (or whatever) where we can't tell which trait belongs
to which.  The best we can do is ensure that the end result of the
un-numbered request group will collectively satisfy all the traits from
the image.  And this same limitation exists, for the same reasons, on
rebuild.  It even goes a bit further, because if there are *other* VFs
(or whatever) that came from numbered groups in the original request, we
have no way to know that; so if *those* guys have traits required by the
new image, we'll still pass.  Which is almost certainly okay.
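The merge rule quoted from the spec in [1] can be illustrated in a few lines. The trait names and variable names below are invented for the example; this is only a sketch of the rule, not Nova code.

```python
# Illustration of the spec's merge rule: traits from the image are
# folded into the flavor's un-numbered request group, while numbered
# (granular) groups from the flavor are left untouched.

flavor_unnumbered_traits = {"HW_CPU_X86_AVX"}       # from flavor extra specs
flavor_granular_groups = {1: {"CUSTOM_FANCY_VF"}}   # e.g. trait1:... style
image_traits = {"HW_NIC_OFFLOAD_X"}                 # from image properties

merged_unnumbered = flavor_unnumbered_traits | image_traits
# merged_unnumbered == {"HW_CPU_X86_AVX", "HW_NIC_OFFLOAD_X"}

# flavor_granular_groups is unchanged -- which is why, at rebuild time,
# we cannot tell which trait was satisfied by which provider.
```

Because the image traits only ever land in the un-numbered group, any resources they pertain to must also have been in the un-numbered group, which is the limitation being described above.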

-efried

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-04-24 Thread Alex Xu
2018-04-24 20:53 GMT+08:00 Eric Fried :

> > The problem isn't just checking the traits in the nested resource
> > provider. We also need to ensure the trait in the exactly same child
> > resource provider.
>
> No, we can't get "granular" with image traits.  We accepted this as a
> limitation for the spawn aspect of this spec [1], for all the same
> reasons [2].  And by the time we've spawned the instance, we've lost the
> information about which granular request groups (from the flavor) were
> satisfied by which resources - retrofitting that information from a new
> image would be even harder.  So we need to accept the same limitation
> for rebuild.
>
> [1] "Due to the difficulty of attempting to reconcile granular request
> groups between an image and a flavor, only the (un-numbered) trait group
> is supported. The traits listed there are merged with those of the
> un-numbered request group from the flavor."
> (http://specs.openstack.org/openstack/nova-specs/specs/rocky/approved/glance-image-traits.html#proposed-change)
> [2]
> https://review.openstack.org/#/c/554305/2/specs/rocky/approved/glance-image-traits.rst@86


Why would we return an RP which has a specific trait but from which we won't
consume any resources?
If the case is that we request two VFs, and these two VFs have different
required traits, then that should be a granular request.




Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-04-24 Thread Sylvain Bauza
Sorry folks for the late reply, I'll try to also weigh in on the Gerrit change.

On Tue, Apr 24, 2018 at 2:55 PM, Jay Pipes  wrote:

> On 04/23/2018 05:51 PM, Arvind N wrote:
>
>> Thanks for the detailed options Matt/eric/jay.
>>
>> Just few of my thoughts,
>>
>> For #1, we can make the explanation very clear that we rejected the
>> request because the original traits specified in the original image and the
>> new traits specified in the new image do not match and hence rebuild is not
>> supported.
>>
>
> I believe I had suggested that on the spec amendment patch. Matt had
> concerns about an error message being a poor user experience (I don't
> necessarily disagree with that) and I had suggested a clearer error message
> to try and make that user experience slightly less sucky.
>
> For #3,
>>
>> Even though it handles the nested provider, there is a potential issue.
>>
>> Lets say a host with two SRIOV nic. One is normal SRIOV nic(VF1), another
>> one with some kind of offload feature(VF2).(Described by alex)
>>
>> Initial instance launch happens with VF:1 allocated, rebuild launches
>> with modified request with traits=HW_NIC_OFFLOAD_X, so basically we want
>> the instance to be allocated VF2.
>>
>> But the original allocation happens against VF1 and since in rebuild the
>> original allocations are not changed, we have wrong allocations.
>>
>
> Yep, that is certainly an issue. The only solution to this that I can see
> would be to have the conductor ask the compute node to do the pre-flight
> check. The compute node already has the entire tree of providers, their
> inventories and traits, along with information about providers that share
> resources with the compute node. It has this information in the
> ProviderTree object in the reportclient that is contained in the compute
> node resource tracker.
>
> The pre-flight check, if run on the compute node, would be able to grab
> the allocation records for the instance and determine if the required
> traits for the new image are present on the actual resource providers
> allocated against for the instance (and not including any child providers
> not allocated against).
>
>
Yup, that. We also have pre-flight checks for move operations like live and
cold migrations, and I'd really like to keep all the conditionals in the
conductor, because it knows better than the scheduler which operation was
requested.
I'm not really happy with adding more logic in the scheduler along the lines
of "yeah, it's a rebuild, so please do something exceptional", and I'm also
not happy with having a filter (that can be disabled) calling the Placement
API.


> Or... we chalk this up as a "too bad" situation and just either go with
> option #1 or simply don't care about it.


Also, that too. Maybe just providing an error would be enough, no?
Operators, what do you think? (cross-calling openstack-operators@)

 -Sylvain


>
> Best,
> -jay
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-04-24 Thread Jay Pipes

On 04/23/2018 05:51 PM, Arvind N wrote:

Thanks for the detailed options, Matt/Eric/Jay.

Just a few of my thoughts,

For #1, we can make the explanation very clear that we rejected the 
request because the original traits specified in the original image and 
the new traits specified in the new image do not match and hence rebuild 
is not supported.


I believe I had suggested that on the spec amendment patch. Matt had 
concerns about an error message being a poor user experience (I don't 
necessarily disagree with that) and I had suggested a clearer error 
message to try and make that user experience slightly less sucky.



For #3,

Even though it handles the nested provider, there is a potential issue.

Let's say a host has two SR-IOV NICs. One is a normal SR-IOV NIC (VF1), 
the other has some kind of offload feature (VF2). (Described by Alex)


Initial instance launch happens with VF1 allocated; rebuild launches 
with a modified request with traits=HW_NIC_OFFLOAD_X, so basically we want 
the instance to be allocated VF2.


But the original allocation happened against VF1, and since in rebuild the 
original allocations are not changed, we have wrong allocations.


Yep, that is certainly an issue. The only solution to this that I can 
see would be to have the conductor ask the compute node to do the 
pre-flight check. The compute node already has the entire tree of 
providers, their inventories and traits, along with information about 
providers that share resources with the compute node. It has this 
information in the ProviderTree object in the reportclient that is 
contained in the compute node resource tracker.


The pre-flight check, if run on the compute node, would be able to grab 
the allocation records for the instance and determine if the required 
traits for the new image are present on the actual resource providers 
allocated against for the instance (and not including any child 
providers not allocated against).


Or... we chalk this up as a "too bad" situation and just either go with 
option #1 or simply don't care about it.


Best,
-jay

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-04-24 Thread Eric Fried
> The problem isn't just checking the traits in the nested resource
> provider. We also need to ensure the trait in the exactly same child
> resource provider.

No, we can't get "granular" with image traits.  We accepted this as a
limitation for the spawn aspect of this spec [1], for all the same
reasons [2].  And by the time we've spawned the instance, we've lost the
information about which granular request groups (from the flavor) were
satisfied by which resources - retrofitting that information from a new
image would be even harder.  So we need to accept the same limitation
for rebuild.

[1] "Due to the difficulty of attempting to reconcile granular request
groups between an image and a flavor, only the (un-numbered) trait group
is supported. The traits listed there are merged with those of the
un-numbered request group from the flavor."
(http://specs.openstack.org/openstack/nova-specs/specs/rocky/approved/glance-image-traits.html#proposed-change)
[2]
https://review.openstack.org/#/c/554305/2/specs/rocky/approved/glance-image-traits.rst@86

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-04-24 Thread Balázs Gibizer



On Tue, Apr 24, 2018 at 9:08 AM, Alex Xu  wrote:



2018-04-24 5:51 GMT+08:00 Arvind N :

Thanks for the detailed options, Matt/Eric/Jay.

Just a few of my thoughts,

For #1, we can make the explanation very clear that we rejected the 
request because the original traits specified in the original image 
and the new traits specified in the new image do not match and hence 
rebuild is not supported.


For #2,

Other Cons:
None of the filters currently make other API requests and my 
understanding is we want to avoid reintroducing such a pattern. But 
definitely workable solution.
If the user disables the image properties filter, then traits based 
filtering will not be run in rebuild case

For #3,

Even though it handles the nested provider, there is a potential 
issue.


Let's say a host has two SR-IOV NICs. One is a normal SR-IOV NIC (VF1), 
the other has some kind of offload feature (VF2). (Described by 
Alex)


Initial instance launch happens with VF1 allocated; rebuild 
launches with a modified request with traits=HW_NIC_OFFLOAD_X, so 
basically we want the instance to be allocated VF2.


But the original allocation happened against VF1, and since in rebuild 
the original allocations are not changed, we have wrong allocations.



Yes, that is the case I mentioned, and none of #1, 2, 3, 4 nor the 
proposal in this thread handles it.


The problem isn't just checking the traits in the nested resource 
provider. We also need to ensure the trait is on exactly the same child 
resource provider. Or we need to adjust allocations for the child 
resource provider.


I agree that in_tree only ensures that the compute node tree has the 
required traits, but it does not take into account that only some of 
those RPs from the tree provide resources for the current allocation. 
The algorithm Eric provided in a previous mail does the filtering for the 
RPs that are part of the instance allocation, so that sounds good to me.


I think we should not try to adjust allocations during a rebuild. 
Changing the allocation would mean it is not a rebuild any more but a 
resize.


Cheers,
gibi






for #4, there is a good amount of pushback against modifying the 
allocation_candidates API to not have resources.


Jay:
for the GET /resource_providers?in_tree==, 
nested resource providers and allocations pose a problem; see #3 above.


I will investigate erics option and update the spec.
--
Arvind N

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: 
openstack-dev-requ...@lists.openstack.org?subject:unsubscribe

http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev






__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-04-24 Thread Alex Xu
2018-04-24 5:51 GMT+08:00 Arvind N :

> Thanks for the detailed options Matt/eric/jay.
>
> Just few of my thoughts,
>
> For #1, we can make the explanation very clear that we rejected the
> request because the original traits specified in the original image and the
> new traits specified in the new image do not match and hence rebuild is not
> supported.
>
> For #2,
>
> Other Cons:
>
>1. None of the filters currently make other API requests and my
>understanding is we want to avoid reintroducing such a pattern. But
>definitely workable solution.
>2. If the user disables the image properties filter, then traits based
>filtering will not be run in rebuild case
>
> For #3,
>
> Even though it handles the nested provider, there is a potential issue.
>
> Lets say a host with two SRIOV nic. One is normal SRIOV nic(VF1), another
> one with some kind of offload feature(VF2).(Described by alex)
>
> Initial instance launch happens with VF:1 allocated, rebuild launches with
> modified request with traits=HW_NIC_OFFLOAD_X, so basically we want the
> instance to be allocated VF2.
>
> But the original allocation happens against VF1 and since in rebuild the
> original allocations are not changed, we have wrong allocations.
>


Yes, that is the case I mentioned, and none of #1, 2, 3, 4 nor the proposal in
this thread handles it.

The problem isn't just checking the traits in the nested resource provider.
We also need to ensure the trait is on exactly the same child resource
provider. Or we need to adjust allocations for the child resource provider.



> for #4, there is good amount of pushback against modifying the
> allocation_candidates api to not have resources.
>
> Jay:
> for the GET /resource_providers?in_tree==,
> nested resource providers and allocation pose a problem see #3 above.
>
> I will investigate erics option and update the spec.
> --
> Arvind N
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-04-23 Thread Arvind N
>
> 1. Fail in the API saying you can't rebuild with a new image with new
> required traits.



Pros: - Simple way to keep the new image off a host that doesn't support it.
> - Similar solution to volume-backed rebuild with a new image.



Cons: - Confusing user experience since they might be able to rebuild with
> some new images but not others with no clear explanation about the
> difference.



I still want to get thoughts on option #1 from the community; the only main
con can be addressed by a better error message.

My main concern is the amount of complexity being introduced now, but also
what we are setting ourselves up for in the future.

When/if we decide to support forbidden traits, granular resource traits,
preferred traits, etc. based on image properties, we would have to handle all
of those complexities for the rebuild case and possibly re-implement some of
the logic already within placement to handle these cases.

IMHO, I don't see a whole lot of benefit when weighed against the cost.
Feedback is appreciated. :)

Arvind

On Mon, Apr 23, 2018 at 3:02 PM, Eric Fried  wrote:

> > for the GET
> > /resource_providers?in_tree==, nested
> > resource providers and allocation pose a problem see #3 above.
>
> This *would* work as a quick up-front check as Jay described (if you get
> no results from this, you know that at least one of your image traits
> doesn't exist anywhere in the tree) except that it doesn't take sharing
> providers into account :(
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>



-- 
Arvind N
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-04-23 Thread Eric Fried
> for the GET
> /resource_providers?in_tree==, nested
> resource providers and allocation pose a problem see #3 above.

This *would* work as a quick up-front check as Jay described (if you get
no results from this, you know that at least one of your image traits
doesn't exist anywhere in the tree) except that it doesn't take sharing
providers into account :(

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-04-23 Thread Arvind N
Thanks for the detailed options, Matt/Eric/Jay.

Just a few of my thoughts,

For #1, we can make the explanation very clear that we rejected the request
because the original traits specified in the original image and the new
traits specified in the new image do not match and hence rebuild is not
supported.

For #2,

Other Cons:

   1. None of the filters currently make other API requests and my
   understanding is we want to avoid reintroducing such a pattern. But
   definitely workable solution.
   2. If the user disables the image properties filter, then traits based
   filtering will not be run in rebuild case

For #3,

Even though it handles the nested provider, there is a potential issue.

Let's say a host has two SR-IOV NICs. One is a normal SR-IOV NIC (VF1), the
other has some kind of offload feature (VF2). (Described by Alex)

Initial instance launch happens with VF1 allocated; rebuild launches with a
modified request with traits=HW_NIC_OFFLOAD_X, so basically we want the
instance to be allocated VF2.

But the original allocation happened against VF1, and since in rebuild the
original allocations are not changed, we have wrong allocations.

for #4, there is a good amount of pushback against modifying the
allocation_candidates API to not have resources.

Jay:
for the GET /resource_providers?in_tree==,
nested resource providers and allocations pose a problem; see #3 above.

I will investigate Eric's option and update the spec.
-- 
Arvind N
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-04-23 Thread Jay Pipes

On 04/23/2018 03:48 PM, Matt Riedemann wrote:
We seem to be at a bit of an impasse in this spec amendment [1] so I 
want to try and summarize the alternative solutions as I see them.


The overall goal of the blueprint is to allow defining traits via image 
properties, like flavor extra specs. Those image-defined traits are used 
to filter hosts during scheduling of the instance. During server create, 
that filtering happens during the normal "GET /allocation_candidates" 
call to placement.


The problem is during rebuild with a new image that specifies new 
required traits. A rebuild is not a move operation, but we run through 
the scheduler filters to make sure the new image (if one is specified), 
is valid for the host on which the instance is currently running.


What you are discussing above is simply a validation that the compute 
node performing the rebuild for an instance supports the capabilities 
that are required by the image.


How about just having the conductor call GET 
/resource_providers?in_tree==, see if 
there is a result, and if not, don't even call the scheduler at all 
(because conductor would already know there would be a NoValidHost 
returned)?


If there's no image traits,  or if there is a result from GET 
/resource_providers, continue to do the existing call-the-scheduler 
behaviour in order to fulfill the ComputeCapabilitiesFilter and 
ImageMetadataFilter requirements that exist today.


So, in short, just do a quick pre-flight check from the conductor if 
image traits are found before ever calling the scheduler. Otherwise, 
proceed as normal.


Best,
-jay

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-04-23 Thread Eric Fried
Following the discussion on IRC, here's what I think you need to do:

- Assuming the set of traits from your new image is called image_traits...
- Use GET /allocations/{instance_uuid} and pull out the set of all RP
UUIDs.  Let's call this instance_rp_uuids.
- Use the SchedulerReportClient.get_provider_tree_and_ensure_root method
[1] to populate and return the ProviderTree for the host.  (If we're
uncomfortable about the `ensure_root` bit, we can factor that away.)
Call this ptree.
- Collect all the traits in the RPs you've got allocated to your instance:

 traits_in_instance_rps = set()
 for rp_uuid in instance_rp_uuids:
 traits_in_instance_rps.update(ptree.data(rp_uuid).traits)

- See if any of your image traits are *not* in those RPs.

 missing_traits = image_traits - traits_in_instance_rps

- If there were any, it's a no go.

 if missing_traits:
 FAIL(_("The following traits were in the image but not in the
instance's RPs: %s") % ', '.join(missing_traits))

[1]
https://github.com/openstack/nova/blob/master/nova/scheduler/client/report.py#L986
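Putting the steps above together, a self-contained sketch looks like the following. The placement and report-client lookups are faked with plain sets and dicts here; instance_rp_uuids would come from GET /allocations/{instance_uuid} and rp_traits stands in for the ProviderTree's per-provider trait data, so none of these names are real Nova interfaces.

```python
# Consolidated sketch of the pre-flight check described above: fail the
# rebuild if the new image requires a trait that none of the instance's
# allocated resource providers carries.

def preflight_check(image_traits, instance_rp_uuids, rp_traits):
    """image_traits: set of trait names from the new image.
    instance_rp_uuids: RP UUIDs from the instance's allocations.
    rp_traits: rp_uuid -> set of traits (from the ProviderTree)."""
    traits_in_instance_rps = set()
    for rp_uuid in instance_rp_uuids:
        traits_in_instance_rps.update(rp_traits[rp_uuid])
    missing_traits = image_traits - traits_in_instance_rps
    if missing_traits:
        raise ValueError(
            "The following traits were in the image but not in the "
            "instance's RPs: %s" % ', '.join(sorted(missing_traits)))

# Passes silently when the allocated providers cover the image traits:
preflight_check({"HW_NIC_OFFLOAD_X"}, {"vf2"}, {"vf2": {"HW_NIC_OFFLOAD_X"}})
```

Note the check unions traits across all allocated providers, matching the un-numbered-group limitation discussed earlier in the thread: it cannot (and does not try to) tell which trait must live on which provider.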

On 04/23/2018 03:47 PM, Matt Riedemann wrote:
> On 4/23/2018 3:26 PM, Eric Fried wrote:
>> No, the question you're really asking in this case is, "Do the resource
>> providers in this tree contain (or not contain) these traits?"  Which to
>> me, translates directly to:
>>
>>   GET /resource_providers?in_tree=$rp_uuid={$TRAIT|!$TRAIT, ...}
>>
>> ...which we already support.  The answer is a list of providers. Compare
>> that to the providers from which resources are already allocated, and
>> Bob's your uncle.
> 
> OK and that will include filtering the required traits on nested
> providers in that tree rather than just against the root provider? If
> so, then yeah that sounds like an improvement on option 2 or 3 in my
> original email and resolves the issue without having to call (or change)
> "GET /allocation_candidates". I still think it should happen from within
> ImagePropertiesFilter, but that's an implementation detail.
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-04-23 Thread Matt Riedemann

On 4/23/2018 3:26 PM, Eric Fried wrote:

No, the question you're really asking in this case is, "Do the resource
providers in this tree contain (or not contain) these traits?"  Which to
me, translates directly to:

  GET /resource_providers?in_tree=$rp_uuid&required={$TRAIT|!$TRAIT, ...}

...which we already support.  The answer is a list of providers. Compare
that to the providers from which resources are already allocated, and
Bob's your uncle.


OK and that will include filtering the required traits on nested 
providers in that tree rather than just against the root provider? If 
so, then yeah that sounds like an improvement on option 2 or 3 in my 
original email and resolves the issue without having to call (or change) 
"GET /allocation_candidates". I still think it should happen from within 
ImagePropertiesFilter, but that's an implementation detail.


--

Thanks,

Matt

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-04-23 Thread Eric Fried
Semantically, GET /allocation_candidates where we don't actually want to
allocate anything (i.e. we don't want to use the returned candidates) is
goofy, and talking about what the result would look like when there's no
`resources` is going to spider into some weird questions.

Like what does the response payload look like?  In the "good" scenario,
you would be expecting an allocation_request like:

"allocations": {
$rp_uuid: {
"resources": {
# Nada
}
},
}

...which is something we discussed recently [1] in relation to "anchor"
providers, and killed.

No, the question you're really asking in this case is, "Do the resource
providers in this tree contain (or not contain) these traits?"  Which to
me, translates directly to:

 GET /resource_providers?in_tree=$rp_uuid&required={$TRAIT|!$TRAIT, ...}

...which we already support.  The answer is a list of providers. Compare
that to the providers from which resources are already allocated, and
Bob's your uncle.

(I do find it messy/weird that the required/forbidden traits in the
image meta are supposed to apply *anywhere* in the provider tree.  But I
get that that's probably going to make the most sense.)

[1]
http://lists.openstack.org/pipermail/openstack-dev/2018-April/129408.html

On 04/23/2018 02:48 PM, Matt Riedemann wrote:
> We seem to be at a bit of an impasse in this spec amendment [1] so I
> want to try and summarize the alternative solutions as I see them.
> 
> The overall goal of the blueprint is to allow defining traits via image
> properties, like flavor extra specs. Those image-defined traits are used
> to filter hosts during scheduling of the instance. During server create,
> that filtering happens during the normal "GET /allocation_candidates"
> call to placement.
> 
> The problem is during rebuild with a new image that specifies new
> required traits. A rebuild is not a move operation, but we run through
> the scheduler filters to make sure the new image (if one is specified),
> is valid for the host on which the instance is currently running.
> 
> We don't currently call "GET /allocation_candidates" during rebuild
> because that could inadvertently filter out the host we know we need
> [2]. Also, since flavors don't change for rebuild, we haven't had a need
> for getting allocation candidates during rebuild since we're not
> allocating new resources (pretend bug 1763766 [3] does not exist for now).
> 
> Now that we know the problem, here are some of the solutions that have
> been discussed in the spec amendment, again, only for rebuild with a new
> image that has new traits:
> 
> 1. Fail in the API saying you can't rebuild with a new image with new
> required traits.
> 
> Pros:
> 
> - Simple way to keep the new image off a host that doesn't support it.
> - Similar solution to volume-backed rebuild with a new image.
> 
> Cons:
> 
> - Confusing user experience since they might be able to rebuild with
> some new images but not others with no clear explanation about the
> difference.
> 
> 2. Have the ImagePropertiesFilter call "GET
> /resource_providers/{rp_uuid}/traits" and compare the compute node root
> provider traits against the new image's required traits.
> 
> Pros:
> 
> - Avoids having to call "GET /allocation_candidates" during rebuild.
> - Simple way to compare the required image traits against the compute
> node provider traits.
> 
> Cons:
> 
> - Does not account for nested providers so the scheduler could reject
> the image due to its required traits which actually apply to a nested
> provider in the tree. This is somewhat related to bug 1763766.
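A minimal sketch of option 2's client-side check (hypothetical helper; `root_traits` stands for the trait list in the body of `GET /resource_providers/{rp_uuid}/traits`), which also illustrates the nested-provider blind spot described in the con above:

```python
def root_supports_image(required_traits, root_traits):
    # Option 2 consults only the root compute node provider's traits.
    return set(required_traits) <= set(root_traits)


root_traits = {"HW_CPU_X86_AVX2"}
# A trait that actually lives on a nested provider (e.g. a PF) is
# invisible to this check, so the rebuild would be wrongly rejected.
print(root_supports_image({"HW_CPU_X86_AVX2"}, root_traits))        # True
print(root_supports_image({"CUSTOM_PHYSNET_PUBLIC"}, root_traits))  # False
```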
> 
> 3. Slight variation on #2 except build a set of all traits from all
> providers in the same tree.
> 
> Pros:
> 
> - Handles the nested provider traits issue from #2.
> 
> Cons:
> 
> - Duplicates filtering in ImagePropertiesFilter that could otherwise
> happen in "GET /allocation_candidates".
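Option 3 could look something like the following (hypothetical data shapes; `providers` would be assembled from `GET /resource_providers?in_tree=...` plus a traits lookup per provider):

```python
def tree_supports_image(required_traits, providers):
    """providers: iterable of (uuid, traits) pairs covering every
    provider in the compute node's tree."""
    tree_traits = set()
    for _uuid, traits in providers:
        tree_traits |= set(traits)
    # Every image-required trait must appear somewhere in the tree.
    return set(required_traits) <= tree_traits


tree = [
    ("rp-root", ["HW_CPU_X86_AVX2"]),
    ("rp-pf1", ["CUSTOM_PHYSNET_PUBLIC"]),  # nested provider
]
print(tree_supports_image({"CUSTOM_PHYSNET_PUBLIC"}, tree))  # True
```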
> 
> 4. Add a microversion to change "GET /allocation_candidates" to make two
> changes:
> 
> a) Add an "in_tree" filter like in "GET /resource_providers". This would
> be needed to limit the scope of what gets returned since we know we only
> want to check against one specific host (the current host for the
> instance).
> 
> b) Make "resources" optional since on a rebuild we don't want to
> allocate new resources (again, notwithstanding bug 1763766).
> 
> Pros:
> 
> - We can call "GET /allocation_candidates?in_tree=<current node
> UUID>&required=<new image traits>" and if nothing is returned,
> we know the new image's required traits don't work with the current node.
> - The filtering is baked into "GET /allocation_candidates" and not
> client-side in ImagePropertiesFilter.
> 
> Cons:
> 
> - Changes to the "GET /allocation_candidates" API which is going to be
> more complicated and more up-front work, but I don't have a good idea of
> how hard this would be to add since we already have the same "in_tree"
> logic in "GET