On Wed, Nov 23, 2022 at 12:51:39PM +0100, Jiri Denemark wrote: > On Wed, Nov 23, 2022 at 11:41:03 +0000, Daniel P. Berrangé wrote: > > On Wed, Nov 23, 2022 at 12:29:24PM +0100, Jiri Denemark wrote: > > > On Wed, Nov 23, 2022 at 11:18:01 +0000, Daniel P. Berrangé wrote: > > > > On Tue, Nov 22, 2022 at 01:48:39PM +0100, Jiri Denemark wrote: > > > > > On Mon, Nov 21, 2022 at 12:47:58 +0530, manish.mishra wrote: > > > > > > > > > > > > On 18/11/22 5:00 pm, Jiri Denemark wrote: > > > > > > > On Fri, Nov 18, 2022 at 10:18:59 +0000, John Levon wrote: > > > > > > >> On Fri, Nov 18, 2022 at 10:52:32AM +0100, Jiri Denemark wrote: > > > > > > >> > > > > > > >>>> * Qemu already provide an option 'enforce' to validate if > > > > > > >>>> features > > > > > > >>>> with which vm is started is exactly same as one provided > > > > > > >>>> and nothing > > > > > > >>>> is silently dropped. > > > > > > >>> Right, but it's not enough. In addition to removed features > > > > > > >>> libvirt also > > > > > > >>> checks for unexpectedly added features. And you really need to > > > > > > >>> do both. > > > > > > >>> Because if you ask for -cpu Model,feat1=on,feat2=on,enforce and > > > > > > >>> QEMU > > > > > > >>> says everything is fine, the guest might see more than what you > > > > > > >>> asked. > > > > > > >>> For example, if a feature is enabled only if a host supports it > > > > > > >>> you may > > > > > > >>> or may not get it without any complains from QEMU. But if you > > > > > > >>> get it you > > > > > > >>> really need to explicitly ask for it during migration, > > > > > > >>> otherwise the > > > > > > >>> feature can just silently disappear. Of course, this would be a > > > > > > >>> really > > > > > > >>> bad behavior from QEMU, but that does not mean it can't happen > > > > > > >>> (I think > > > > > > >>> SVM is a bit problematic in this way) and the whole point of > > > > > > >>> libvirt's > > > > > > >>> checks is to prevent this kind of issues. > > > > > > >> Hi Jiri, I'm not following this very well. I think you're saying > > > > > > >> that qemu has > > > > > > >> had bugs previously where features get silently enabled, > > > > > > > Personally, I haven't actually witnessed any bug in this area (as > > > > > > > far as > > > > > > > I can remember, which is not that far :-)), but got some reports > > > > > > > of at > > > > > > > least one, even though without any proof. > > > > > > > > > > > > > > Specifically, SVM is quite strange as it is included in all AMD > > > > > > > CPU > > > > > > > models in QEMU and yet if you try to start it on a host without > > > > > > > nesting > > > > > > > enabled "enforce" does not complain. I saw the feature is enabled > > > > > > > with > > > > > > > older machine types, but I was told the magic behind this feature > > > > > > > looks > > > > > > > like not only machine type but even host configuration itself is > > > > > > > involved. Anyway, although I saw reports of that the feature > > > > > > > could be > > > > > > > enabled without an explicit request even with new machine types I > > > > > > > still > > > > > > > haven't seen any proof of this happening. So I hope it just does > > > > > > > not > > > > > > > happen and users are only afraid of this possibility. > > > > > > > > > > > > > >> and it's libvirt's job/role to paper over those issues? Do you > > > > > > >> have > > > > > > >> some specific cases of this? > > > > > > > This is not about papering. It's actually the opposite, that is > > > > > > > about > > > > > > > detecting if something like this happens and reporting it as a > > > > > > > failure > > > > > > > rather than papering it and hoping everything goes well. So I > > > > > > > think > > > > > > > doing this is a good idea even though I don't think we actually > > > > > > > saw any > > > > > > > issue of this kind. > > > > > > > > > > > > > > > > > > Hi Jiri, I see now with libvirt master, with check=='full' we verify > > > > > > both silently dropped as well as added features. But as you already > > > > > > stated Qemu silently adding feature is a Qemu bug and libvirt just > > > > > > reports that bug, so it should be very unlikely, i agree that is > > > > > > not a > > > > > > good reasoning :). Our requirement is that we want to use CPU Models > > > > > > and features which are defined in Qemu but not in libvirt for e.g if > > > > > > we want to use Icelake-Server-V4 directly or newly added vmx > > > > > > feature, > > > > > > libvirt does not allow. Currently we take help of qemu to do > > > > > > validations but for cpu feature verfication and model definations we > > > > > > still use libvirt defined definations which prevent us to use > > > > > > anything > > > > > > which is not defined in libvirt. I see there are already efforts > > > > > > going > > > > > > on to get all model and feature defination from qemu itself, but not > > > > > > sure how much time it will take. Till that happens we thought safest > > > > > > option is to have an option to remove all validations from libvirt > > > > > > and > > > > > > rely on qemu 'enforce' for migration safetly. I understand > > > > > > qemu-enforce does not check for silently added features, but that > > > > > > case > > > > > > we can assume is very unlikely and Qemu should fix otherwise VMs > > > > > > will > > > > > > not poweron anyway with check=='full'. Basically we want it as an > > > > > > modification of check='none' but also skipping things like > > > > > > virCPUValidateFeatures and passing option 'enforce' to Qemu. Or if > > > > > > silently adding features is that big concern we can have a check in > > > > > > Qemu itself? I understand current qemu-enforce is not as migration > > > > > > safe as check=='full' but probably suitable for our use case for > > > > > > time > > > > > > being? > > > > > > > > > > I understand why you want this, but on the other hand adding just a > > > > > plain pass-through for CPU model and features is quite dangerous as it > > > > > can be used to set any -cpu option even if it's not a feature. And > > > > > especially the parts that are configured in other areas of domain XML > > > > > (such as hypervisor features). So I think instead of adding a new mode > > > > > for <cpu> we should go another way. For things that are not yet > > > > > supported by libvirt we support various elements in qemu namespace, > > > > > e.g., setting QEMU command line options, environment variables, or > > > > > even > > > > > overriding options libvirt would normally set on the command line. So > > > > > I > > > > > guess we could have a special <qemu:cpu> element which would be used > > > > > when a user wants full control over the -cpu option without any > > > > > libvirt > > > > > intervention. > > > > > > > > I really don't think we should allow this, especially not for something > > > > which is clearly intended to be used for production deployment. Our > > > > hypervisor CLI passthrough is there to allow for short term workarounds > > > > for bugs, or to experiment with a feature before it is mapped into > > > > libvirt in a supported manner. > > > > > > > > If there are aspects related to QEMU configuration thiat are not in > > > > libvirt yet, we need to address those gaps, not provide yet another > > > > way to bypass libvirt mapping. > > > > > > Indeed, this was definitely meant as a short term workaround for stuff > > > we don't have support for yet, for testing or experimental purposes. The > > > supported approach is for implement the missing parts in libvirt (and > > > QEMU) as soon as possible. I don't see why would a properly documented > > > support for experiments with -cpu would be an issue. > > > > Why can't it just use the exsting QEMU passthrough syntax we have. > > I don't think we should be adding specifial support just for CPUs > > That would be nice, but the QEMU passthrough syntax cannot be used for > changing options that libvirt already passes to QEMU. So using it would > likely result in two separate -cpu options on QEMU command line. And it > would not rule out the CPU verification code in libvirt. Remember, we > add a default -cpu model in case there's none configured in the XML.
Two -cpu options isn't a problem, the latter will override the former, which is fine from the level of support intended for QEMU passthrough usage. For the libvirt checking, isn't the 'check=none' attr sufficient to skip checks libvirt does. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|