Re: [PATCH v6 21/21] s390: doc: detailed specifications for AP virtualization

Tony Krowiak Thu, 05 Jul 2018 06:30:49 -0700

On 07/04/2018 12:31 PM, Boris Fiuczynski wrote:

On 07/03/2018 06:36 PM, Tony Krowiak wrote:
On 07/02/2018 07:10 PM, Halil Pasic wrote:
On 06/29/2018 11:11 PM, Tony Krowiak wrote:
This patch provides documentation describing the AP architecture and
design concepts behind the virtualization of AP devices. It also
includes an example of how to configure AP devices for exclusive
use of KVM guests.

Signed-off-by: Tony Krowiak <[email protected]>
I don't like the design of external interfaces except for:
* cpu model features, and
* reset handling.

In particular:
1) The architecture is such that authorizing access (via APM, AQM and ADM) to an AP queue that is currently not configured (e.g. the card is not physically plugged, or just configured off) is possible. That seems to be a perfectly normal use
case.
Your assign operations however enforce that the resource is bound to your
driver, and thus the existence of the resource in the host.
It is clear: we need to avoid passing through resources to guests that are not
dedicated for this purpose (e.g. a queue utilized by zcrypt). But IMHO
we need a different mechanism.
Interesting that you wait until v6 to bring this up. I agree, this is a normal use case, but there is currently no mechanism in the AP bus for drivers to reserve devices that are not yet configured. There is a proposed solution in the works, but until such time that is available the only choice is to disallow assignment of AP queues to a guest that are not bound to the vfio_ap device driver.
2) I see no benefit in deferring the exclusivity check to vfio_ap_mdev_open(). The downside is however pretty obvious: management software is notified about a 'bad configuration' only at an attempted guest start-up. And your current QEMU
patches are not very helpful in conveying this piece of information.
It only becomes a 'bad configuration' if the two guests are started concurrently. Is there value in being able to configure two mediated devices with the same queue if the intent is to never run two guests using those mediated devices simultaneously? If so, then the only time the exclusivity check can be done is when the guest opens the mediated device. If not, then we can certainly
prevent multiple mediated devices from being assigned the same queue.
In my view, while a mediated device is used by a guest, it is not a guest and can be configured any way an administrator prefers. If we get concurrence
that the exclusivity check should be done when an adapter or domain is assigned to
the mediated device, I'll make that change.
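
To make the two options concrete, here is a minimal sketch in Python (not kernel code) of what an assignment-time exclusivity check would have to compute. It assumes an APQN is simply an (adapter, domain) pair and that each mediated device is described by its set of assigned adapters and its set of assigned domains. The names apqns(), shared_apqns() and "mdev1" are hypothetical and exist only for this illustration.

```python
from itertools import product


def apqns(adapters, domains):
    """Return every (adapter, domain) pair implied by the two assignments."""
    return set(product(adapters, domains))


def shared_apqns(candidate, others):
    """Map each other mediated device to the APQNs it shares with candidate.

    candidate is an (adapters, domains) tuple; others maps a device name
    to such a tuple. An empty result means the assignment is exclusive.
    """
    wanted = apqns(*candidate)
    conflicts = {}
    for name, (adapters, domains) in others.items():
        overlap = wanted & apqns(adapters, domains)
        if overlap:
            conflicts[name] = overlap
    return conflicts


# Hypothetical configuration: one existing mediated device.
existing = {"mdev1": ({5, 6}, {4, 0x47})}
print(shared_apqns(({6}, {4}), existing))
```

The same function could equally be called at open time against only the devices currently in use by a guest; the difference between the two proposals is which set of "others" is consulted and when the error is reported.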
I've talked with Boris, and AFAIR he said this is not acceptable to him (@Boris
can you confirm).
Then I suggest Boris participate in the review and explain why.
[To make things a bit easier I am not going to address the aspect of not-currently-existing host resources.]

Your current implementation does provide active configurations that work with existing host resources. These need to be bound to the vfio_ap driver.

Libvirt allows one to define objects (e.g. domains or networks). These are just definitions and do NOT bind any resources. The defined resources are bound once the definition is started.

Currently I am assuming that an ap matrix device is defined in libvirt outside of a libvirt domain (an ap definition). The mediated device of the ap matrix device is used in a libvirt domain by referencing it via its UID.

When a libvirt domain is started the mediated device should exist and be configured correctly as every other host resource.

Therefore there needs to be something new in libvirt that allows one to define, start, stop and undefine an ap matrix device. After a define the ap definition for an ap matrix device would exist in libvirt only.

Once you start the ap definition the result should be a well configured, ready to be used mediated device representing the ap definition which can be used configuration-error free by a libvirt domain. Please note that the start of an ap definition is independent from the start of a libvirt domain using the ap definition.
Can you explain to me how that can be accomplished?

I can make a similar case for the mediated devices. Mediated devices play no role in guest configuration until a vfio-ap device is specified on the QEMU command line when starting a guest. In other words, a mediated device configuration is independent from the start of a guest using the mediated device. To answer your question then, if there are two or more mediated devices with the same APQN(s) assigned, then only start one libvirt domain that uses one of these mediated devices. This begs the question: Does libvirt preclude one from defining a domain that uses a host device (of any kind) that must be dedicated to a single guest? If not, then isn't it incumbent upon the administrator to ensure he doesn't start two guests with the same dedicated host device? Wouldn't that same logic apply to AP devices?

Having said that, I have no problem disallowing assignment of an AP queue to more than one mediated device. However, suppose an administrator - for whatever reason - wants to create multiple mediated devices with the same APQN(s) assigned, but never intends to run more than one guest using one of those mediated devices concurrently. The question is - as I have asked in another response - is there a use case for allowing an administrator to configure multiple mediated devices with the same APQN assigned?

3) We indicate the reason for failure due to a configuration problem (exclusivity or resource allocation) via pr_err() that is via kernel messages. I don't think this is very tooling/management software friendly, and I hope we don't expect admins to work with the sysfs interface long term. I mean the effects of the admin actions are not very persistent. Thus if the interface is a painful one, we are talking
about potentially frequent pain.
We have multiple layers of software, each with its own logging facilities. Figuring out what went wrong when a guest fails to start is always a painful process IMHO. Typically, one has to view the log for each component in the stack to figure out what went wrong and often times, still can't figure it out. Of course, we can help out here by having QEMU put out a better message when this problem occurs. But the bottom line is, does the community think that allowing an administrator to configure multiple mediated devices with the same queues has value? In other words, are
there potential use cases that would require this?
4) If I were to act out the role of the administrator, I would prefer to think of specifying or changing the access controls of a guest in respect to AP (that is setting the AP matrix) as a single atomic operation -- which either succeeds or fails.
I don't understand what you are describing here. How would this be done? Are you
suggesting the admin somehow provides the masks en masse?
The operation should succeed for any valid configuration, and fail for any invalid
one.
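
As an illustration of the "single atomic operation" idea, the following is a minimal Python sketch, not a proposal for the actual interface. It assumes the whole requested matrix is handed over at once, that a queue is valid only if it is bound to the passthrough driver and not assigned elsewhere, and that the caller replaces the old matrix only when the call returns. MatrixError, set_matrix(), bound and taken are hypothetical names.

```python
from itertools import product


class MatrixError(ValueError):
    """Raised when a requested matrix is rejected as a whole."""


def set_matrix(adapters, domains, bound, taken):
    """Validate the complete requested matrix and return it, or raise.

    bound: APQNs currently bound to the passthrough driver (assumption).
    taken: APQNs already assigned to another mediated device (assumption).
    Nothing is applied piecemeal: either every queue passes or none does.
    """
    wanted = set(product(adapters, domains))
    problems = sorted(
        [(q, "not bound to the passthrough driver") for q in wanted - bound]
        + [(q, "already assigned elsewhere") for q in wanted & taken]
    )
    if problems:
        # Report every offending queue in one error instead of the first.
        raise MatrixError(problems)
    return frozenset(wanted)


bound = {(5, 4), (6, 4)}
taken = {(6, 4)}
matrix = set_matrix({5}, {4}, bound, taken)        # accepted as a unit
try:
    matrix = set_matrix({5, 6}, {4}, bound, taken)  # rejected as a unit
except MatrixError as err:
    print("rejected:", err.args[0])
```

After the rejected call, matrix still holds the previously accepted configuration, which is the property an all-or-nothing interface would give management software.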
The current piecemeal approach seems even less fitting if we consider changing the access controls of a running guest. AFAIK changing access controls for a running guest is possible, and I don't see a reason why we should artificially prohibit this.
Setting and clearing bits in the APM/AQM/ADM of a guest's CRYCB is certainly possible, but there is a lot more to it than merely setting and clearing bits. What you seem to be describing here is hot plug/unplug which I stated in the cover letter is
forthcoming. It is currently prohibited for good reason.
I think the current sysfs interface for manipulating the matrix is good for manual playing around, but I would prefer having an interface that is better
suited for programs (e.g. ioctl).
That wouldn't be a problem, but do we have a use case for it?
Regards,
Halil
