On Wed, Nov 23, 2022 at 12:51:39PM +0100, Jiri Denemark wrote:
> On Wed, Nov 23, 2022 at 11:41:03 +0000, Daniel P. Berrangé wrote:
> > On Wed, Nov 23, 2022 at 12:29:24PM +0100, Jiri Denemark wrote:
> > > On Wed, Nov 23, 2022 at 11:18:01 +0000, Daniel P. Berrangé wrote:
> > > > On Tue, Nov 22, 2022 at 01:48:39PM +0100, Jiri Denemark wrote:
> > > > > On Mon, Nov 21, 2022 at 12:47:58 +0530, manish.mishra wrote:
> > > > > > 
> > > > > > On 18/11/22 5:00 pm, Jiri Denemark wrote:
> > > > > > > On Fri, Nov 18, 2022 at 10:18:59 +0000, John Levon wrote:
> > > > > > >> On Fri, Nov 18, 2022 at 10:52:32AM +0100, Jiri Denemark wrote:
> > > > > > >>
> > > > > > >>>>    * Qemu already provide an option 'enforce' to validate if 
> > > > > > >>>> features
> > > > > > >>>>    with which vm is started is exactly same as one provided 
> > > > > > >>>> and nothing
> > > > > > >>>>    is silently dropped.
> > > > > > >>> Right, but it's not enough. In addition to removed features 
> > > > > > >>> libvirt also
> > > > > > >>> checks for unexpectedly added features. And you really need to 
> > > > > > >>> do both.
> > > > > > >>> Because if you ask for -cpu Model,feat1=on,feat2=on,enforce and 
> > > > > > >>> QEMU
> > > > > > >>> says everything is fine, the guest might see more than what you 
> > > > > > >>> asked.
> > > > > > >>> For example, if a feature is enabled only if a host supports it 
> > > > > > >>> you may
> > > > > > >>> or may not get it without any complains from QEMU. But if you 
> > > > > > >>> get it you
> > > > > > >>> really need to explicitly ask for it during migration, 
> > > > > > >>> otherwise the
> > > > > > >>> feature can just silently disappear. Of course, this would be a 
> > > > > > >>> really
> > > > > > >>> bad behavior from QEMU, but that does not mean it can't happen 
> > > > > > >>> (I think
> > > > > > >>> SVM is a bit problematic in this way) and the whole point of 
> > > > > > >>> libvirt's
> > > > > > >>> checks is to prevent this kind of issues.
> > > > > > >> Hi Jiri, I'm not following this very well. I think you're saying 
> > > > > > >> that qemu has
> > > > > > >> had bugs previously where features get silently enabled,
> > > > > > > Personally, I haven't actually witnessed any bug in this area (as 
> > > > > > > far as
> > > > > > > I can remember, which is not that far :-)), but got some reports 
> > > > > > > of at
> > > > > > > least one, even though without any proof.
> > > > > > >
> > > > > > > Specifically, SVM is quite strange as it is included in all AMD 
> > > > > > > CPU
> > > > > > > models in QEMU and yet if you try to start it on a host without 
> > > > > > > nesting
> > > > > > > enabled "enforce" does not complain. I saw the feature is enabled 
> > > > > > > with
> > > > > > > older machine types, but I was told the magic behind this feature 
> > > > > > > looks
> > > > > > > like not only machine type but even host configuration itself is
> > > > > > > involved. Anyway, although I saw reports of that the feature 
> > > > > > > could be
> > > > > > > enabled without an explicit request even with new machine types I 
> > > > > > > still
> > > > > > > haven't seen any proof of this happening. So I hope it just does 
> > > > > > > not
> > > > > > > happen and users are only afraid of this possibility.
> > > > > > >
> > > > > > >> and it's libvirt's job/role to paper over those issues? Do you 
> > > > > > >> have
> > > > > > >> some specific cases of this?
> > > > > > > This is not about papering. It's actually the opposite, that is 
> > > > > > > about
> > > > > > > detecting if something like this happens and reporting it as a 
> > > > > > > failure
> > > > > > > rather than papering it and hoping everything goes well. So I 
> > > > > > > think
> > > > > > > doing this is a good idea even though I don't think we actually 
> > > > > > > saw any
> > > > > > > issue of this kind.
> > > > > > 
> > > > > > 
> > > > > > Hi Jiri, I see now with libvirt master, with check=='full' we verify
> > > > > > both silently dropped as well as added features. But as you already
> > > > > > stated Qemu silently adding feature is a Qemu bug and libvirt just
> > > > > > reports that bug, so it should be very unlikely, i agree that is 
> > > > > > not a
> > > > > > good reasoning :). Our requirement is that we want to use CPU Models
> > > > > > and features which are defined in Qemu but not in libvirt for e.g if
> > > > > > we want to use Icelake-Server-V4 directly or newly added vmx 
> > > > > > feature,
> > > > > > libvirt does not allow. Currently we take help of qemu to do
> > > > > > validations but for cpu feature verfication and model definations we
> > > > > > still use libvirt defined definations which prevent us to use 
> > > > > > anything
> > > > > > which is not defined in libvirt. I see there are already efforts 
> > > > > > going
> > > > > > on to get all model and feature defination from qemu itself, but not
> > > > > > sure how much time it will take. Till that happens we thought safest
> > > > > > option is to have an option to remove all validations from libvirt 
> > > > > > and
> > > > > > rely on qemu 'enforce' for migration safetly. I understand
> > > > > > qemu-enforce does not check for silently added features, but that 
> > > > > > case
> > > > > > we can assume is very unlikely and Qemu should fix otherwise VMs 
> > > > > > will
> > > > > > not poweron anyway with check=='full'. Basically we want it as an
> > > > > > modification of check='none' but also skipping things like
> > > > > > virCPUValidateFeatures and passing option 'enforce' to Qemu. Or if
> > > > > > silently adding features is that big concern we can have a check in
> > > > > > Qemu itself? I understand current qemu-enforce is not as migration
> > > > > > safe as check=='full' but probably suitable for our use case for 
> > > > > > time
> > > > > > being?
> > > > > 
> > > > > I understand why you want this, but on the other hand adding just a
> > > > > plain pass-through for CPU model and features is quite dangerous as it
> > > > > can be used to set any -cpu option even if it's not a feature. And
> > > > > especially the parts that are configured in other areas of domain XML
> > > > > (such as hypervisor features). So I think instead of adding a new mode
> > > > > for <cpu> we should go another way. For things that are not yet
> > > > > supported by libvirt we support various elements in qemu namespace,
> > > > > e.g., setting QEMU command line options, environment variables, or 
> > > > > even
> > > > > overriding options libvirt would normally set on the command line. So 
> > > > > I
> > > > > guess we could have a special <qemu:cpu> element which would be used
> > > > > when a user wants full control over the -cpu option without any 
> > > > > libvirt
> > > > > intervention.
> > > > 
> > > > I really don't think we should allow this, especially not for something
> > > > which is clearly intended to be used for production deployment. Our
> > > > hypervisor CLI passthrough is there to allow for short term workarounds
> > > > for bugs, or to experiment with a feature before it is mapped into
> > > > libvirt in a supported manner.
> > > > 
> > > > If there are aspects related to QEMU configuration thiat are not in
> > > > libvirt yet, we need to address those gaps, not provide yet another
> > > > way to bypass libvirt mapping.
> > > 
> > > Indeed, this was definitely meant as a short term workaround for stuff
> > > we don't have support for yet, for testing or experimental purposes. The
> > > supported approach is for implement the missing parts in libvirt (and
> > > QEMU) as soon as possible. I don't see why would a properly documented
> > > support for experiments with -cpu would be an issue.
> > 
> > Why can't it just use the exsting QEMU passthrough syntax we have.
> > I don't think we should be adding specifial support just for CPUs
> 
> That would be nice, but the QEMU passthrough syntax cannot be used for
> changing options that libvirt already passes to QEMU. So using it would
> likely result in two separate -cpu options on QEMU command line. And it
> would not rule out the CPU verification code in libvirt. Remember, we
> add a default -cpu model in case there's none configured in the XML.

Two -cpu options isn't a problem, the latter will override the former,
which is fine from the level of support intended for QEMU passthrough
usage.

For the libvirt checking, isn't the 'check=none' attr sufficient
to skip checks libvirt does.


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Reply via email to