On 03/28/2018 07:03 PM, Nadathur, Sundar wrote:
Thanks, Eric. Looks like there are no good solutions even as candidates, only options with varying levels of unacceptability. It is funny that the option considered the least unacceptable is to let the problem happen and then fail the request (the last one in your list).

Could I ask what is the objection to the scheme that applies multiple traits and removes one as needed, apart from the fact that it has races?

The fundamental objection that I've had to various discussions that involve abusing traits in this fashion is that you are essentially trying to "consume" traits. But traits are *not consumable things*. Only resource classes are consumable things.

If you want to track the inventory of a certain thing -- and consume those things during scheduling -- then you need to use resource classes for that thing. The inventory management system in placement already has race protections in it. This means that you won't be able to over-allocate a particular consumable accelerated function if there isn't inventory capacity for that particular function on an FPGA. Likewise, you would not be able to *remove* inventory for a particular function on an FPGA if some instance is consuming that particular function. This protection does *not* exist if you are tracking particular functions with traits, because an instance doesn't *consume* a trait. There's no such thing as "I started an instance with accelerated function X and therefore I am consuming trait Y on this FPGA."
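
To make the difference concrete, here is a rough in-memory sketch. It is *not* the placement API: the class, method, and exception names are all made up for illustration, and CUSTOM_FPGA_FUNC_X is a hypothetical resource class / trait name. It only models the two guarantees described above (no over-allocation, no removal of inventory that is in use) and shows that traits get neither.

```python
class CapacityError(Exception):
    """Raised when an allocation would exceed the inventory total."""


class InUseError(Exception):
    """Raised when inventory that still has allocations would be removed."""


class ToyProvider:
    """Hypothetical stand-in for one FPGA resource provider."""

    def __init__(self):
        self.inventory = {}   # resource class -> total units
        self.used = {}        # resource class -> allocated units
        self.traits = set()   # plain labels, nothing tracks their usage

    def set_inventory(self, rc, total):
        self.inventory[rc] = total

    def allocate(self, rc, amount=1):
        # Consumable: refuse to hand out more than the inventory holds.
        used = self.used.get(rc, 0)
        if used + amount > self.inventory.get(rc, 0):
            raise CapacityError(rc)
        self.used[rc] = used + amount

    def delete_inventory(self, rc):
        # Consumable: refuse to remove something an instance is consuming.
        if self.used.get(rc, 0) > 0:
            raise InUseError(rc)
        self.inventory.pop(rc, None)

    def remove_trait(self, trait):
        # Not consumable: nothing records who "uses" a trait, so nothing
        # can stop its removal.
        self.traits.discard(trait)
```

With one unit of CUSTOM_FPGA_FUNC_X in inventory, a second allocate() fails and delete_inventory() fails while the first allocation exists. remove_trait() always succeeds, whether or not an instance depends on the function.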

So, the bottom line for me is: make sure we're using resource classes for consumable items and traits for representing non-consumable capabilities **of the resource provider**.

That means that for the (re)-programming scenarios you need to dynamically adjust the inventory of a particular FPGA resource provider.

You will need to *add* an inventory item of a custom resource class representing the specific function you are flashing *to an empty region*.

You *may* want to *delete* an inventory item of a custom resource class representing the specific function *when an instance that was using that specific function is terminated*. When the instance is terminated, Nova will *automatically* delete allocations of that custom resource class associated with the instance if you use a custom resource class to represent the particular accelerated function. No such automatic removal of allocations is done if you use traits to represent particular accelerated functions (again, because traits aren't consumable things).
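
The (re)-programming lifecycle above could be sketched roughly like this. Again, this is a toy model and not Nova or placement code: ToyFpgaProvider and its methods are invented for illustration, and the resource class and instance names in the usage note are hypothetical. Allocations are keyed by consumer (the instance), which is what lets them be dropped in one step when the instance goes away.

```python
class ToyFpgaProvider:
    """Hypothetical model of an FPGA resource provider's inventory."""

    def __init__(self):
        self.inventory = {}     # resource class -> total units
        self.allocations = {}   # consumer id -> {resource class: units}

    def used(self, rc):
        # Sum what every consumer currently holds of this resource class.
        return sum(a.get(rc, 0) for a in self.allocations.values())

    def add_inventory(self, rc, total=1):
        # Step 1: a function was flashed to an empty region.
        self.inventory[rc] = total

    def allocate(self, consumer, rc, amount=1):
        # Step 2: scheduling an instance consumes the function.
        if self.used(rc) + amount > self.inventory.get(rc, 0):
            raise ValueError("no capacity for %s" % rc)
        held = self.allocations.setdefault(consumer, {})
        held[rc] = held.get(rc, 0) + amount

    def delete_allocations(self, consumer):
        # Step 3: instance terminated, all of its allocations go away.
        self.allocations.pop(consumer, None)

    def delete_inventory(self, rc):
        # Step 4 (optional): only possible once nothing consumes it.
        if self.used(rc) > 0:
            raise ValueError("%s is still in use" % rc)
        self.inventory.pop(rc, None)
```

For example: add_inventory("CUSTOM_FPGA_FUNC_X"), then allocate("instance-1", "CUSTOM_FPGA_FUNC_X"). At that point delete_inventory() is refused. After delete_allocations("instance-1"), delete_inventory() goes through and the region can be treated as empty again.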

Best,
-jay

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
