Re: [openstack-dev] [nova] [cyborg] Race condition in the Cyborg/Nova flow

Jay Pipes Thu, 29 Mar 2018 10:03:06 -0700

On 03/28/2018 07:03 PM, Nadathur, Sundar wrote:

Thanks, Eric. Looks like there are no good solutions even as candidates, but only options with varying levels of unacceptability. It is funny that the option that is considered the least unacceptable is to let the problem happen and then fail the request (last one in your list).
Could I ask what is the objection to the scheme that applies multiple traits and removes one as needed, apart from the fact that it has races?

The fundamental objection that I've had to various discussions that involve abusing traits in this fashion is that you are essentially trying to "consume" traits. But traits are *not consumable things*. Only resource classes are consumable things.

If you want to track the inventory of a certain thing -- and consume those things during scheduling -- then you need to use resource classes for that thing. The inventory management system in placement already has race protections in it. This means that you won't be able to over-allocate a particular consumable accelerated function if there isn't inventory capacity for that particular function on an FPGA. Likewise, you would not be able to *remove* inventory for a particular function on an FPGA if some instance is consuming that particular function. This protection does *not* exist if you are tracking particular functions with traits; the reason is because an instance doesn't *consume* a trait. There's no such thing as "I started an instance with accelerated function X and therefore I am consuming trait Y on this FPGA."
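
To make the contrast concrete, here is a toy in-memory sketch. It is not placement code: `ToyProvider`, `InventoryError`, `refused` and the name `CUSTOM_FPGA_FUNC_X` are all made up for illustration. It only models the accounting rules described above: inventory of a resource class cannot be over-allocated or deleted while in use, while a trait can be added or removed at any time because nothing consumes it.

```python
class InventoryError(Exception):
    """Raised when a request would violate inventory accounting."""


class ToyProvider:
    """Hypothetical stand-in for one FPGA resource provider (not placement code)."""

    def __init__(self):
        self.inventory = {}    # resource class -> total units
        self.allocations = {}  # resource class -> {consumer: units}
        self.traits = set()    # capability flags; no accounting attached

    def used(self, rc):
        return sum(self.allocations.get(rc, {}).values())

    def set_inventory(self, rc, total):
        if total < self.used(rc):
            raise InventoryError(f"{rc}: total {total} is below current usage")
        self.inventory[rc] = total

    def delete_inventory(self, rc):
        # Inventory that some consumer still holds cannot be removed.
        if self.used(rc) > 0:
            raise InventoryError(f"{rc}: still consumed")
        self.inventory.pop(rc, None)
        self.allocations.pop(rc, None)

    def allocate(self, consumer, rc, units=1):
        # Refuse to allocate beyond the recorded capacity.
        if self.used(rc) + units > self.inventory.get(rc, 0):
            raise InventoryError(f"{rc}: no capacity left")
        consumers = self.allocations.setdefault(rc, {})
        consumers[consumer] = consumers.get(consumer, 0) + units

    def release(self, consumer):
        # Drop every allocation held by this consumer.
        for consumers in self.allocations.values():
            consumers.pop(consumer, None)


def refused(action, *args):
    """Return True if inventory accounting rejects the action."""
    try:
        action(*args)
    except InventoryError:
        return True
    return False


fpga = ToyProvider()
rc = "CUSTOM_FPGA_FUNC_X"  # hypothetical custom resource class name
fpga.set_inventory(rc, 1)
fpga.allocate("instance-1", rc)

second_refused = refused(fpga.allocate, "instance-2", rc)
delete_refused = refused(fpga.delete_inventory, rc)

# A trait carries no usage record, so nothing guards its removal.
fpga.traits.add("CUSTOM_FUNC_Y")
fpga.traits.discard("CUSTOM_FUNC_Y")

# Once the consumer is gone, the inventory can be removed.
fpga.release("instance-1")
fpga.delete_inventory(rc)
```

In the real service these checks happen inside placement; the sketch only shows which operations have something to check against.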

So, bottom line for me is make sure we're using resource classes for consumable items and traits for representing non-consumable capabilities **of the resource provider**.

That means that for the (re)-programming scenarios you need to dynamically adjust the inventory of a particular FPGA resource provider.

You will need to *add* an inventory item of a custom resource class representing the specific function you are flashing *to an empty region*.

You *may* want to *delete* an inventory item of a custom resource class representing the specific function *when an instance that was using that specific function is terminated*. When the instance is terminated, Nova will *automatically* delete allocations of that custom resource class associated with the instance if you use a custom resource class to represent the particular accelerated function. No such automatic removal of allocations is done if you use traits to represent particular accelerated functions (again, because traits aren't consumable things).
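
As a rough sketch of the add and delete steps, the helpers below only build request descriptions and send nothing. The helper names and the `CUSTOM_FPGA_FUNC_` naming scheme are hypothetical. The paths and the `resource_provider_generation` and `total` fields reflect my understanding of the placement REST API and should be checked against the API reference before use.

```python
import re


def function_resource_class(function_name):
    """Map a function name to a custom resource class name (hypothetical scheme)."""
    normalized = re.sub(r"[^A-Z0-9]+", "_", function_name.upper()).strip("_")
    return f"CUSTOM_FPGA_FUNC_{normalized}"


def add_function_inventory(rp_uuid, generation, function_name, total=1):
    """Describe the request that adds inventory after flashing an empty region."""
    rc = function_resource_class(function_name)
    body = {
        # The generation lets placement reject updates based on a stale view.
        "resource_provider_generation": generation,
        "total": total,
    }
    return ("PUT", f"/resource_providers/{rp_uuid}/inventories/{rc}", body)


def delete_function_inventory(rp_uuid, function_name):
    """Describe the request that removes inventory once no instance uses it."""
    rc = function_resource_class(function_name)
    return ("DELETE", f"/resource_providers/{rp_uuid}/inventories/{rc}", None)


# Example with a placeholder provider UUID and function name.
add_req = add_function_inventory("RP_UUID", generation=3, function_name="gzip-v2")
del_req = delete_function_inventory("RP_UUID", "gzip-v2")
```

The delete request is only expected to succeed after the instance's allocations are gone, which is the part Nova handles automatically on termination when a resource class is used.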


Best,
-jay

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
