2018-03-06 22:45 GMT+08:00 Mooney, Sean K <sean.k.moo...@intel.com>:

> *From:* Matthew Booth [mailto:mbo...@redhat.com]
> *Sent:* Saturday, March 3, 2018 4:15 PM
> *To:* OpenStack Development Mailing List (not for usage questions)
> <openstack-dev@lists.openstack.org>
> *Subject:* Re: [openstack-dev] [Nova] [Cyborg] Tracking multiple functions
>
> On 2 March 2018 at 14:31, Jay Pipes <jaypi...@gmail.com> wrote:
>
> On 03/02/2018 02:00 PM, Nadathur, Sundar wrote:
>
> Hello Nova team,
>
> During the Cyborg discussion at Rocky PTG, we proposed a flow for
> FPGAs wherein the request spec asks for a device type as a resource class,
> and optionally a function (such as encryption) in the extra specs. This
> does not seem to work well for the usage model that I'll describe below.
>
> An FPGA device may implement more than one function. For example, it may
> implement both compression and encryption. Say a cluster has 10 devices of
> device type X, and each of them is programmed to offer 2 instances of
> function A and 4 instances of function B. More specifically, the device may
> implement 6 PCI functions, with 2 of them tied to function A, and the other
> 4 tied to function B. So, we could have 6 separate instances accessing
> functions on the same device.
>
> Does this imply that Cyborg can't reprogram the FPGA at all?
>
> *[Mooney, Sean K] Cyborg is intended to support fixed-function accelerators
> as well, so it will not always be able to program the accelerator. In the
> case where an FPGA is preprogrammed with a statically provisioned
> multi-function bitstream, Cyborg will not be able to reprogram the slot if
> any of the functions from that slot are already allocated to an instance.
> It will then have to treat the FPGA like a fixed-function device and simply
> allocate an unused VF of the correct type, if one is available.*
>
> In the current flow, the device type X is modeled as a resource class, so
> Placement will count how many of them are in use. A flavor for 'RC
> device-type-X + function A' will consume one instance of the RC
> device-type-X. But this is not right because this precludes other
> functions on the same device instance from getting used.
>
> One way to solve this is to declare functions A and B as resource classes
> themselves and have the flavor request the function RC. Placement will then
> correctly count the function instances. However, there is still a problem:
> if the requested function A is not available, Placement will return an
> empty list of RPs, but we need some way to reprogram some device to create
> an instance of function A.
>
> Clearly, nova is not going to be reprogramming devices with an instance of
> a particular function.
>
> Cyborg might need to have a separate agent that listens to the nova
> notifications queue and, upon seeing an event that indicates a failed build
> due to lack of resources, tries to reprogram a device and then rebuild the
> original request.
>
> It was my understanding from that discussion that we intend to insert
> Cyborg into the spawn workflow for device configuration in the same way
> that we currently insert resources provided by Cinder and Neutron. So while
> Nova won't be reprogramming a device, it will be calling out to Cyborg to
> reprogram a device, and waiting while that happens.
>
> My understanding is (and I concede some areas are a little hazy):
>
> * The flavor says device type X with function Y
> * Placement tells us everywhere with device type X
> * A weigher orders these by devices which already have an available
>   function Y (where is this metadata stored?)
> * Nova schedules to host Z
> * Nova host Z asks Cyborg for a local function Y and blocks
> * Cyborg hopefully returns function Y which is already available
> * If not, Cyborg reprograms a function Y, then returns it
>
> Can anybody correct me/fill in the gaps?
>
> *[Mooney, Sean K] That correlates closely to my recollection also. As for
> the metadata, I think the weigher may need to call Cyborg to retrieve
> it, as it will not be available in the host state object.*

Is this the Nova scheduler weigher, or do we want to support weighing in
Placement? As I see it, a function is a trait, so could we have
preferred_traits? I remember we discussed that parameter in the past, but we
did not have a good use case at the time. This is a good use case.
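
To make the preferred_traits idea concrete, here is a rough Python sketch of
the ordering step only. It is not Nova or Placement code: the Candidate class,
the weigh_by_preferred_traits function, and the CUSTOM_FUNCTION_A and
CUSTOM_FUNCTION_B trait names are all made up for illustration, and it assumes
each candidate host already carries the set of traits its devices expose.
Hosts without the preferred trait are kept at the end rather than dropped,
since Cyborg could still reprogram a device there.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """Hypothetical stand-in for a host returned for device type X."""
    host: str
    traits: set = field(default_factory=set)


def weigh_by_preferred_traits(candidates, preferred_traits):
    """Order candidates so hosts exposing more preferred traits come first.

    Nothing is filtered out: a host with no matching trait stays in the
    list, it is only ranked lower.
    """
    preferred = set(preferred_traits)
    # sorted() is stable, so hosts with equal scores keep their input order.
    return sorted(
        candidates,
        key=lambda c: len(preferred & c.traits),
        reverse=True,
    )


# Illustrative data only: three hosts with device type X.
candidates = [
    Candidate("host1", {"CUSTOM_FUNCTION_B"}),
    Candidate("host2", {"CUSTOM_FUNCTION_A"}),
    Candidate("host3"),
]
ordered = weigh_by_preferred_traits(candidates, ["CUSTOM_FUNCTION_A"])
print([c.host for c in ordered])
```

With this data, host2 is ranked first because it already offers function A,
while host1 and host3 remain as candidates for reprogramming.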

> Matt
>
> --
> Matthew Booth
> Red Hat OpenStack Engineer, Compute DFG
>
> Phone: +442070094448 (UK)
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev