* Alex Williamson (alex.william...@redhat.com) wrote:
> On Mon, 16 Nov 2020 14:52:26 +0100
> Cornelia Huck <coh...@redhat.com> wrote:
>
> > On Mon, 16 Nov 2020 11:02:51 +0000
> > Stefan Hajnoczi <stefa...@redhat.com> wrote:
> >
> > > On Wed, Nov 11, 2020 at 04:35:43PM +0100, Cornelia Huck wrote:
> > > > On Wed, 11 Nov 2020 15:14:49 +0000
> > > > Stefan Hajnoczi <stefa...@redhat.com> wrote:
> > > >
> > > > > On Wed, Nov 11, 2020 at 12:48:53PM +0100, Cornelia Huck wrote:
> > > > > > On Tue, 10 Nov 2020 13:14:04 -0700
> > > > > > Alex Williamson <alex.william...@redhat.com> wrote:
> > > > > > > On Tue, 10 Nov 2020 09:53:49 +0000
> > > > > > > Stefan Hajnoczi <stefa...@redhat.com> wrote:
> > > > > > >
> > > > > > > > Device models supported by an mdev driver and their details
> > > > > > > > can be read from the migration_info.json attr. Each mdev type
> > > > > > > > supports one device model. If a parent device supports
> > > > > > > > multiple device models then each device model has an mdev
> > > > > > > > type. There may be multiple mdev types for a single device
> > > > > > > > model when they offer different migration parameters such as
> > > > > > > > resource capacity or feature availability.
> > > > > > > >
> > > > > > > > For example, a graphics card that supports 4 GB and 8 GB
> > > > > > > > device instances would provide gfx-4GB and gfx-8GB mdev types
> > > > > > > > with memory=4096 and memory=8192 migration parameters,
> > > > > > > > respectively.
> > > > > > >
> > > > > > > I think this example could be expanded for clarity. I think
> > > > > > > this is suggesting we have mdev_types of gfx-4GB and gfx-8GB,
> > > > > > > which each implement some common device model, ie. com.gfx/GPU,
> > > > > > > where the migration parameter 'memory' for each defaults to a
> > > > > > > value matching the type name. But it seems like this can also
> > > > > > > lead to some combinatorial challenges for management tools if
> > > > > > > these parameters are writable. For example, should a management
> > > > > > > tool create a gfx-4GB device and change to memory parameter to
> > > > > > > 8192 or a gfx-8GB device with the default parameter?
> > > > > >
> > > > > > I would expect that the mdev types need to match in the first
> > > > > > place. What role would the memory= parameter play, then? Allowing
> > > > > > gfx-4GB to have memory=8192 feels wrong to me.
> > > > >
> > > > > Yes, I expected these mdev types to only accept a fixed "memory"
> > > > > value, but there's nothing stopping a driver author from making
> > > > > "memory" accept any value.
> > > >
> > > > I'm wondering how useful the memory parameter is, then. The layer
> > > > checking for compatibility can filter out inconsistent settings, but
> > > > why would we need to express something that is already implied in the
> > > > mdev type separately?
> > >
> > > To avoid tying device instances to specific mdev types. An mdev type is
> > > a device implementation, but the goal is to enable migration between
> > > device implementations (new/old or completely different
> > > implementations).
> > >
> > > Imagine a new physical device that now offers variable memory because
> > > users found the static mdev types too constraining. How do you migrate
> > > back and forth between new and old physical devices if the migration
> > > parameters don't describe the memory size? Migration parameters make it
> > > possible. Without them the management tool needs to hard-code knowledge
> > > of specific mdev types that support migration.
> >
> > But doesn't the management tool *still* need to keep hardcoded
> > information about what the value of that memory parameter was for an
> > existing mdev type? If we have gfx-variable with a memory parameter,
> > fine; but if the target is supposed to accept a gfx-4GB device, it
> > should simply instantiate a gfx-4GB device.
> >
> > I'm getting a bit worried about the complexity of the checking that
> > management software is supposed to perform. Is it really that bad to
> > restrict the models to a few, well-defined ones? Especially in the mdev
> > case, where we have control about what is getting instantiated?
>
> This is exactly what I was noting with the combinatorial challenges of
> the management tool. If a vendor chooses to use a generic base device
> model which they modify with parameters to match an assortment of mdev
> types, then management tools will need to match every mdev type
> implementing that device model to determine if compatible parameters
> exist. OTOH, the vendor could choose to create a device model that
> specifically describes a single configuration of known parameters.
>
> For example, mdev type gfx-4GB might be a device model com.gfx/GPU with
> a fixed memory parameter of 4GB or it could be a device model
> com.gfx/GPU-4G with no additional parameter. The hard part is when the
> vendor offers an mdev type gfx-varGB with device model com.gfx/GPU and
> available memory options of 1GB, 2GB, 4GB, 8GB. At that point a
> management tool might decide to create a gfx-varGB device instance and
> tune the memory parameter or create a gfx-4GB instance, either would be
> correct and we've expressed no preference for one or the other. Thanks,
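
To make the ambiguity in the quoted example concrete, here is a minimal
sketch of the matching a management tool would have to do. Everything in
it is an assumption made for illustration: the dictionary layout, the
MDEV_TYPES table and the compatible_types() helper are hypothetical and
are not the proposed migration_info.json format or any existing
QEMU/libvirt API. Memory values are in MB, following the memory=4096 /
memory=8192 convention used earlier in the thread.

```python
# Hypothetical description of the mdev types from the quoted example.
# The layout is illustrative only, not the real migration_info.json schema.
MDEV_TYPES = {
    "gfx-4GB": {"model": "com.gfx/GPU", "memory": [4096]},
    "gfx-8GB": {"model": "com.gfx/GPU", "memory": [8192]},
    "gfx-varGB": {"model": "com.gfx/GPU", "memory": [1024, 2048, 4096, 8192]},
}


def compatible_types(model, memory, mdev_types=MDEV_TYPES):
    """Return the sorted names of mdev types able to host the given device.

    A type matches when it implements the requested device model and lists
    the requested memory size among the values it accepts.
    """
    return sorted(
        name
        for name, info in mdev_types.items()
        if info["model"] == model and memory in info["memory"]
    )


if __name__ == "__main__":
    # Two candidates come back for the same request; nothing here
    # expresses a preference between them.
    print(compatible_types("com.gfx/GPU", 4096))
```

For a com.gfx/GPU device with memory=4096, both gfx-4GB and gfx-varGB
match, so the tool has to scan every mdev type implementing the model
and then pick one by some policy of its own.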

What you've described here is exactly what happens with QEMU/libvirt's
confusion of CPU models. Both QEMU and libvirt have their own idea of
what a named CPU model means and then add/subtract flags to get what
they want. When libvirt wants a CPU model that doesn't quite match what
it has (e.g. a host-compatibility thing where the host is a CPU it
didn't know), it uses heuristics to either start from above and remove
things or start from below and add them.

Dave

> Alex
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK