Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
On 10/27/2017 07:12 PM, Jiri Denemark wrote:
> On Fri, Oct 27, 2017 at 17:18:44 +0200, Halil Pasic wrote:
>> On 10/27/2017 04:06 PM, Christian Borntraeger wrote:
> ...
>>> I talked to several people and it seems that on x86 the host model will
>>> also enable new features that are not known by older QEMUs, and it's
>>> considered works as designed. (See also Jiri's mail.)
>>
>> Yes, I've seen that. It would be nice though if this design was easier to
>> find in writing. Unfortunately I can read minds only to a very limited
>> extent, and the written stuff I've read did not give me a full
>> understanding of the design -- although the entity to blame for this
>> could be my limited intellect.
>
> I think it's more likely the documentation is just not perfect. I'll
> look at it and try to make it better. I know about some parts which need
> to be clarified thanks to this discussion.
>
> Jirka
>

Feel free to put me on cc for the patch. I would be happy to give it a
look, but since I'm not actively reading the libvirt list, I'm afraid it
will slip past me if I'm not in cc.

Regards,
Halil

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
On Fri, Oct 27, 2017 at 17:18:44 +0200, Halil Pasic wrote:
> On 10/27/2017 04:06 PM, Christian Borntraeger wrote:
...
> > I talked to several people and it seems that on x86 the host model will
> > also enable new features that are not known by older QEMUs, and it's
> > considered works as designed. (See also Jiri's mail.)
>
> Yes, I've seen that. It would be nice though if this design was easier to
> find in writing. Unfortunately I can read minds only to a very limited
> extent, and the written stuff I've read did not give me a full
> understanding of the design -- although the entity to blame for this
> could be my limited intellect.

I think it's more likely the documentation is just not perfect. I'll
look at it and try to make it better. I know about some parts which need
to be clarified thanks to this discussion.

Jirka
Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
On 10/27/2017 04:06 PM, Christian Borntraeger wrote:
>
>
> On 10/27/2017 03:40 PM, Halil Pasic wrote:
>>
>>
>> On 10/27/2017 02:57 PM, Christian Borntraeger wrote:
>>>
>>>
>>> On 10/27/2017 02:45 PM, Christian Borntraeger wrote:
>>>> On 10/27/2017 02:31 PM, Halil Pasic wrote:
>>>>> gs is explicitly disabled.
>>>>>
>>>>> Now that I think about it, maybe the 2.9 binary is going to reject
>>>>> the explicit gs flag altogether, because it's unknown.
>>>>>
>>>>> Isn't this a problem?
>>>>
>>>> No. This is exactly the _solution_ and not the problem. The target
>>>> will reject unknown cpu features and migration will be aborted. This
>>>> is exactly what the CPU model is for.
>>
>> I'm not sure we talk about the same thing. I'm talking about the
>> following. I want to disable a cpu-model feature for the sake of
>> migration (because I know that binary version X does not support the
>> feature, because it does not know about it). Now if I do it via let's
>> say -cpu z13,gs=off on let's say 2.11, and start with the exact same
>> command line (-cpu z13,gs=off) on let's say 2.9, my migration will
>> explode because of the unknown feature I'm specifying not to be used.
>
> The migration will be rejected because the target qemu will not start up.
> You can easily simulate that, e.g. by doing
>
> qemu-system-s390x -cpu z13,notyetknown=off
> qemu-system-s390x: can't apply global z13-s390x-cpu.notyetknown=off:
> Property '.notyetknown' not found
>
> But libvirt will not use a full model (and then disable things); instead
> it will use the base model and add things. (So libvirt should never use
> xxx=off)
>

That piece of the puzzle was missing for me (no xxx=off for M minus
M-base features).

>
> I think this is really not an issue. If you specify a feature that is
> not known then QEMU will not start on the target and migration is
> rejected. The guest continues to run on the source. So if you specify a
> "too new" facility yourself it's really a user error.
> Everything that uses an explicit model (e.g. -cpu z13 or -cpu,sief2=on)
> will work, but only as long as the conditions are met. If you specify
> -cpu z14, it does not matter if it fails because the kernel or QEMU is
> too old, or because you just happen to run on a z13.
>
> The only question is/was: what about "host-model"?
> With my patch (+ the gs fixup) the following things will work:
> - host-model will work on z13
> - host-model will work on z14 (any machine version)
> - host model on z13 and then migrating to z14 will work (any machine
>   version)
> - host model on z13 and then migrating to z14 and then migrating back
>   will work (any version)
> - qemu with fixup + host model on z14 with machine version 2.10 can be
>   migrated
>
> The only thing that does not work is
> - qemu with fixup + host model on z14 with machine 2.9 can not be
>   migrated to qemu 2.9 on z14.
>

I agree.

> Now: this would have not worked anyway, because qemu 2.9 does not know
> z14. So in theory QEMU must forbid z14 for compat machines (which we do
> not know).
>

Noted.

> I talked to several people and it seems that on x86 the host model will
> also enable new features that are not known by older QEMUs, and it's
> considered works as designed. (See also Jiri's mail.)
>

Yes, I've seen that. It would be nice though if this design was easier to
find in writing. Unfortunately I can read minds only to a very limited
extent, and the written stuff I've read did not give me a full
understanding of the design -- although the entity to blame for this
could be my limited intellect.

>
>>
>> Well I'm not sure what I describe is relevant. My thinking is along the
>> lines: some features are added incrementally. How do I use those of the
>> features not included in the -base model which both of my environments
>> support, and disable those that are unsupported by one of the
>> environments?
>>
>> I will think about it some more. I've asked Boris about this situation,
>> and he did not put my mind at ease (to be more precise he seemed to
>> see this as a potential problem too), so I've decided to mention it.
>> Sorry if I've generated some unnecessary noise.
>>
>> I think the root of the problem is that I don't understand the
>> difference between z13-base and z13, and the associated rules and
>> expected/intended usages.
>
> z13-base contains only those features that are guaranteed to be there
> (there is the list of non-hypervisor managed features). z13 is z13-base
> + all features that will be available in a reasonably recent kernel+qemu
> combination and make sense to be there as a default. So it might happen
> that you cannot start -cpu z14, e.g. if you run on a kernel < 4.12.
>

OK. I will have to learn more about this. IMHO it does not make sense to
burden the community with the holes in my knowledge any more, but I will
have to stuff them to feel comfortable in this area.

Regards,
Halil
Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
On 10/27/2017 03:40 PM, Halil Pasic wrote:
>
>
> On 10/27/2017 02:57 PM, Christian Borntraeger wrote:
>>
>>
>> On 10/27/2017 02:45 PM, Christian Borntraeger wrote:
>>>
>>>
>>> On 10/27/2017 02:31 PM, Halil Pasic wrote:
>>>> gs is explicitly disabled.
>>>>
>>>> Now that I think about it, maybe the 2.9 binary is going to reject
>>>> the explicit gs flag altogether, because it's unknown.
>>>>
>>>> Isn't this a problem?
>>>
>>> No. This is exactly the _solution_ and not the problem. The target will
>>> reject unknown cpu features and migration will be aborted. This is
>>> exactly what the CPU model is for.
>
> I'm not sure we talk about the same thing. I'm talking about the
> following. I want to disable a cpu-model feature for the sake of
> migration (because I know that binary version X does not support the
> feature, because it does not know about it). Now if I do it via let's
> say -cpu z13,gs=off on let's say 2.11, and start with the exact same
> command line (-cpu z13,gs=off) on let's say 2.9, my migration will
> explode because of the unknown feature I'm specifying not to be used.

The migration will be rejected because the target qemu will not start up.
You can easily simulate that, e.g. by doing

qemu-system-s390x -cpu z13,notyetknown=off
qemu-system-s390x: can't apply global z13-s390x-cpu.notyetknown=off:
Property '.notyetknown' not found

But libvirt will not use a full model (and then disable things); instead
it will use the base model and add things. (So libvirt should never use
xxx=off)

I think this is really not an issue. If you specify a feature that is not
known then QEMU will not start on the target and migration is rejected.
The guest continues to run on the source. So if you specify a "too new"
facility yourself it's really a user error.
Everything that uses an explicit model (e.g. -cpu z13 or -cpu,sief2=on)
will work, but only as long as the conditions are met. If you specify
-cpu z14, it does not matter if it fails because the kernel or QEMU is
too old, or because you just happen to run on a z13.

The only question is/was: what about "host-model"?
With my patch (+ the gs fixup) the following things will work:
- host-model will work on z13
- host-model will work on z14 (any machine version)
- host model on z13 and then migrating to z14 will work (any machine
  version)
- host model on z13 and then migrating to z14 and then migrating back
  will work (any version)
- qemu with fixup + host model on z14 with machine version 2.10 can be
  migrated

The only thing that does not work is
- qemu with fixup + host model on z14 with machine 2.9 can not be
  migrated to qemu 2.9 on z14.

Now: this would have not worked anyway, because qemu 2.9 does not know
z14. So in theory QEMU must forbid z14 for compat machines (which we do
not know).

I talked to several people and it seems that on x86 the host model will
also enable new features that are not known by older QEMUs, and it's
considered works as designed. (See also Jiri's mail.)

>
> Well I'm not sure what I describe is relevant. My thinking is along the
> lines: some features are added incrementally. How do I use those of the
> features not included in the -base model which both of my environments
> support, and disable those that are unsupported by one of the
> environments?
>
> I will think about it some more. I've asked Boris about this situation,
> and he did not put my mind at ease (to be more precise he seemed to
> see this as a potential problem too), so I've decided to mention it.
> Sorry if I've generated some unnecessary noise.
>
> I think the root of the problem is that I don't understand the
> difference between z13-base and z13, and the associated rules and
> expected/intended usages.

z13-base contains only those features that are guaranteed to be there
(there is the list of non-hypervisor managed features). z13 is z13-base +
all features that will be available in a reasonably recent kernel+qemu
combination and make sense to be there as a default. So it might happen
that you cannot start -cpu z14, e.g. if you run on a kernel < 4.12.

>> FWIW, I think in your particular case the QEMU will reject the z14 cpu
>> and not even come to checking the gs.
>>
>
> I had a z13 cpu model in mind. I don't mention a z14 cpu-model (QEMU,
> not hw) in my whole email.
>
> Regards,
> Halil
>
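To make the point about unknown properties concrete, here is a minimal sketch (not QEMU code) of why "-cpu z13,gs=off" cannot start on a binary that has never heard of gs, while a base model plus added features only names things both sides know. The tables known_2_9 and known_2_11 and the function can_start are hypothetical names, and which feature names each real release recognises is an assumption made purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical feature tables: property names a given binary recognises.
 * The real contents per QEMU release are an assumption of this sketch. */
static const char *const known_2_9[]  = { "vx", "ri", NULL };
static const char *const known_2_11[] = { "vx", "ri", "gs", NULL };

/* True if the binary recognises the property name. */
static bool knows(const char *const *known, const char *name)
{
    assert(known != NULL && name != NULL);
    for (size_t i = 0; known[i] != NULL; i++) {
        if (strcmp(known[i], name) == 0) {
            return true;
        }
    }
    return false;
}

/* A binary refuses to start if any named property is unknown to it,
 * regardless of whether the property was given as =on or =off. */
bool can_start(const char *const *known, const char *const *props, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!knows(known, props[i])) {
            return false;
        }
    }
    return true;
}
```

In this model, a command line that mentions gs at all (even as gs=off) is rejected by the older table, which is the behaviour Christian describes as the intended safety net: the target does not start, the migration is refused, and the guest keeps running on the source.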
Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
On 10/27/2017 03:31 PM, Cornelia Huck wrote:
> On Fri, 27 Oct 2017 14:42:57 +0200
> Christian Borntraeger wrote:
>
>> Yes, we should also replace that with
>>
>> return s390_has_feat(S390_FEAT_GUARDED_STORAGE)
>>
>> I can fixup my patch or provide a 2nd one.
>>
>
> Consider a fixed up patch acked by me.
>

+1

You can keep my ack too. I will try to find some time and read the v2
though.
Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
On 10/27/2017 02:57 PM, Christian Borntraeger wrote:
>
>
> On 10/27/2017 02:45 PM, Christian Borntraeger wrote:
>>
>>
>> On 10/27/2017 02:31 PM, Halil Pasic wrote:
>>> gs is explicitly disabled.
>>>
>>> Now that I think about it, maybe the 2.9 binary is going to reject
>>> the explicit gs flag altogether, because it's unknown.
>>>
>>> Isn't this a problem?
>>
>> No. This is exactly the _solution_ and not the problem. The target will
>> reject unknown cpu features and migration will be aborted. This is
>> exactly what the CPU model is for.

I'm not sure we talk about the same thing. I'm talking about the
following. I want to disable a cpu-model feature for the sake of
migration (because I know that binary version X does not support the
feature, because it does not know about it). Now if I do it via let's say
-cpu z13,gs=off on let's say 2.11, and start with the exact same command
line (-cpu z13,gs=off) on let's say 2.9, my migration will explode
because of the unknown feature I'm specifying not to be used.

Well I'm not sure what I describe is relevant. My thinking is along the
lines: some features are added incrementally. How do I use those of the
features not included in the -base model which both of my environments
support, and disable those that are unsupported by one of the
environments?

I will think about it some more. I've asked Boris about this situation,
and he did not put my mind at ease (to be more precise he seemed to
see this as a potential problem too), so I've decided to mention it.
Sorry if I've generated some unnecessary noise.

I think the root of the problem is that I don't understand the difference
between z13-base and z13, and the associated rules and expected/intended
usages.

> FWIW, I think in your particular case the QEMU will reject the z14 cpu
> and not even come to checking the gs.
>

I had a z13 cpu model in mind. I don't mention a z14 cpu-model (QEMU, not
hw) in my whole email.

Regards,
Halil
Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
On Fri, 27 Oct 2017 14:42:57 +0200
Christian Borntraeger wrote:

> Yes, we should also replace that with
>
> return s390_has_feat(S390_FEAT_GUARDED_STORAGE)
>
> I can fixup my patch or provide a 2nd one.
>

Consider a fixed up patch acked by me.
Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
On 10/27/2017 02:45 PM, Christian Borntraeger wrote:
>
>
> On 10/27/2017 02:31 PM, Halil Pasic wrote:
>> gs is explicitly disabled.
>>
>> Now that I think about it, maybe the 2.9 binary is going to reject
>> the explicit gs flag altogether, because it's unknown.
>>
>> Isn't this a problem?
>
> No. This is exactly the _solution_ and not the problem. The target will
> reject unknown cpu features and migration will be aborted. This is
> exactly what the CPU model is for.

FWIW, I think in your particular case the QEMU will reject the z14 cpu
and not even come to checking the gs.
Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
On 10/27/2017 02:31 PM, Halil Pasic wrote:
> gs is explicitly disabled.
>
> Now that I think about it, maybe the 2.9 binary is going to reject
> the explicit gs flag altogether, because it's unknown.
>
> Isn't this a problem?

No. This is exactly the _solution_ and not the problem. The target will
reject unknown cpu features and migration will be aborted. This is
exactly what the CPU model is for.
Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
On 10/27/2017 02:31 PM, Halil Pasic wrote:
>
>
> On 10/25/2017 08:13 PM, Jason J. Herne wrote:
>> On 10/20/2017 10:54 AM, Christian Borntraeger wrote:
>>> Starting a guest with
>>>
>>> hvm
>>>
>>>
>>>
>>> on an IBM z14 results in
>>>
>>> "qemu-system-s390x: Some features requested in the CPU model are not
>>> available in the configuration: gs"
>>>
>>> This is because guarded storage is fenced for compat machines that did
>>> not have guarded storage support, but libvirt expands the cpu model
>>> according to the latest available machine.
>>>
>>> While this prevents future migration abort (by not starting the guest
>>> at all), not being able to start a "host-model" guest is very much
>>> unexpected. As it turns out, even if we would modify libvirt to not
>>> expand the cpu model to contain "gs" for compat machines, it cannot
>>> guarantee that a migration will succeed. For example if the kernel
>>> changes its features (or the user has nested=1 on one host but not on
>>> the other) the migration will fail nevertheless. So instead of fencing
>>> "gs" for machines <= 2.9 lets allow it for all machine types that
>>> support the CPU model. This will make "host-model" runnable all the
>>> time, while relying on the CPU model to reject invalid migration
>>> attempts.
>> ...
>>> -    if (gs_allowed()) {
>>> +    if (cpu_model_allowed()) {
>>>          if (kvm_vm_enable_cap(s, KVM_CAP_S390_GS, 0) == 0) {
>>>              cap_gs = 1;
>
>
> @Jason
>
> Hi Jason,
>
> I don't have access to a z14 at the moment, and since you do, I would
> like to try out something.
>
> I will first describe my concern, and then the test scenario.
>
> The last line above, cap_gs = 1, has the side effect of returning
> true ever after.
>
> int kvm_s390_get_gs(void)
> {
>     return cap_gs;
> }
>
> Now considering
> static bool gscb_needed(void *opaque)
> {
>     return kvm_s390_get_gs();
> }

Yes, we should also replace that with

    return s390_has_feat(S390_FEAT_GUARDED_STORAGE)

I can fixup my patch or provide a 2nd one.
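The difference between the two predicates discussed here can be shown with a small stand-alone sketch. It is not the QEMU code: struct gs_state_sketch and both function names are hypothetical, and the two booleans merely stand in for "the KVM capability was enabled once" (cap_gs in the thread) and "the guest's CPU model contains gs" (what s390_has_feat(S390_FEAT_GUARDED_STORAGE) is said to report).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two pieces of state in the discussion. */
struct gs_state_sketch {
    bool cap_gs;   /* capability was enabled on this host; stays set */
    bool feat_gs;  /* the guest's CPU model contains the gs feature */
};

/* Subsection predicate keyed on the host capability: once the capability
 * is enabled, the subsection is sent even if the model says gs=off. */
bool gscb_needed_by_cap(const struct gs_state_sketch *st)
{
    assert(st != NULL);
    return st->cap_gs;
}

/* Subsection predicate keyed on the CPU model: the subsection is only
 * sent when the guest can actually use guarded storage. */
bool gscb_needed_by_model(const struct gs_state_sketch *st)
{
    assert(st != NULL);
    return st->feat_gs;
}
```

With cap_gs set and feat_gs clear, the first predicate still asks for the subsection and the second does not, which is the case Halil worries about when the target binary does not know the subsection.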
Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
On 10/25/2017 08:13 PM, Jason J. Herne wrote:
> On 10/20/2017 10:54 AM, Christian Borntraeger wrote:
>> Starting a guest with
>>
>> hvm
>>
>>
>>
>> on an IBM z14 results in
>>
>> "qemu-system-s390x: Some features requested in the CPU model are not
>> available in the configuration: gs"
>>
>> This is because guarded storage is fenced for compat machines that did
>> not have guarded storage support, but libvirt expands the cpu model
>> according to the latest available machine.
>>
>> While this prevents future migration abort (by not starting the guest
>> at all), not being able to start a "host-model" guest is very much
>> unexpected. As it turns out, even if we would modify libvirt to not
>> expand the cpu model to contain "gs" for compat machines, it cannot
>> guarantee that a migration will succeed. For example if the kernel
>> changes its features (or the user has nested=1 on one host but not on
>> the other) the migration will fail nevertheless. So instead of fencing
>> "gs" for machines <= 2.9 lets allow it for all machine types that
>> support the CPU model. This will make "host-model" runnable all the
>> time, while relying on the CPU model to reject invalid migration
>> attempts.
> ...
>> -    if (gs_allowed()) {
>> +    if (cpu_model_allowed()) {
>>          if (kvm_vm_enable_cap(s, KVM_CAP_S390_GS, 0) == 0) {
>>              cap_gs = 1;

@Jason

Hi Jason,

I don't have access to a z14 at the moment, and since you do, I would
like to try out something.

I will first describe my concern, and then the test scenario.

The last line above, cap_gs = 1, has the side effect of
kvm_s390_get_gs() returning true ever after.

int kvm_s390_get_gs(void)
{
    return cap_gs;
}

Now considering

static bool gscb_needed(void *opaque)
{
    return kvm_s390_get_gs();
}

const VMStateDescription vmstate_gscb = {
    .name = "cpu/gscb",
    .version_id = 1,
    .minimum_version_id = 1,
    .needed = gscb_needed,
    .fields = (VMStateField[]) {
        VMSTATE_UINT64_ARRAY(env.gscb, S390CPU, 4),
        VMSTATE_END_OF_LIST()
    }
};

const VMStateDescription vmstate_s390_cpu = {
    .name = "cpu",
    .post_load = cpu_post_load,
    .pre_save = cpu_pre_save,
    .version_id = 4,
    .minimum_version_id = 3,
    .fields = (VMStateField[]) {
        VMSTATE_UINT64_ARRAY(env.regs, S390CPU, 16),
        VMSTATE_UINT64(env.psw.mask, S390CPU),
        VMSTATE_UINT64(env.psw.addr, S390CPU),
        VMSTATE_UINT64(env.psa, S390CPU),
        VMSTATE_UINT32(env.todpr, S390CPU),
        VMSTATE_UINT64(env.pfault_token, S390CPU),
        VMSTATE_UINT64(env.pfault_compare, S390CPU),
        VMSTATE_UINT64(env.pfault_select, S390CPU),
        VMSTATE_UINT64(env.cputm, S390CPU),
        VMSTATE_UINT64(env.ckc, S390CPU),
        VMSTATE_UINT64(env.gbea, S390CPU),
        VMSTATE_UINT64(env.pp, S390CPU),
        VMSTATE_UINT32_ARRAY(env.aregs, S390CPU, 16),
        VMSTATE_UINT64_ARRAY(env.cregs, S390CPU, 16),
        VMSTATE_UINT8(env.cpu_state, S390CPU),
        VMSTATE_UINT8(env.sigp_order, S390CPU),
        VMSTATE_UINT32_V(irqstate_saved_size, S390CPU, 4),
        VMSTATE_VBUFFER_UINT32(irqstate, S390CPU, 4, NULL,
                               irqstate_saved_size),
        VMSTATE_END_OF_LIST()
    },
    .subsections = (const VMStateDescription*[]) {
        &vmstate_fpu,
        &vmstate_vregs,
        &vmstate_riccb,
        &vmstate_exval,
        &vmstate_gscb,
        NULL
    },
};

I would expect the vmstate_gscb subsection being sent, even if gs is
disabled via cpu-model, if kernel and possibly machine has gs support
(and qemu has cpu-models).

So the test scenario I want you to play through is the following. Take
the latest-greatest qemu with this patch applied. Make sure gs works (is
provided to the guest) with a 2.9 machine version, and a fully specified
cpu-model. Now disable gs explicitly. Try to migrate this to another
machine having a 2.9 binary.

I expect the migration to fail because the subsection is going to be
sent by the latest-greatest binary, but is unknown to the 2.9 binary.
Notice this is despite the fact that gs is explicitly disabled.

Now that I think about it, maybe the 2.9 binary is going to reject
the explicit gs flag altogether, because it's unknown.

Isn't this a problem?

I'm afraid like this the only migration-safe variant is -base, but that
would essentially make adding features incrementally impossible. But I
hypothesize trying to migrate with z13 or even z13-base would also
trigger the unknown subsection problem.

Unfortunately I can't test this because my kernel never makes
kvm_vm_enable_cap(s, KVM_CAP_S390_GS, 0) return 0, because I lack HW
support in the host.

Regards,
Halil

>
> Ok, honestly, I dislike this idea because it means we are effectively
> lying now. W
Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
On 10/26/2017 10:13 AM, Christian Borntraeger wrote:
>
>
> On 10/26/2017 01:35 AM, Halil Pasic wrote:
>> try the most interesting scenarios out.
>>
>> The idea of the patch is very clear, but I don't understand the bigger
>> gs feature context fully.
>>
>> From what I read in the code, the attempt to enable the gs capability
>> in the kernel is made regardless of the cpu model. If the attempt was
>> successful kvm_s390_get_gs will keep returning true. That would in turn
>> affect migration, as far as I see (usages of kvm_s390_get_gs). I could
>> not figure out how gs being turned off via cpu-model (-cpu z14,gs=off)
>> does turn off gs support -- at least not the details. I wanted to give
>> a timely review, so I've limited myself there.
>
> When the CPU model turns off gs, facility bit 133 will be turned off.
> Then the kernel does the right thing, see priv.c handle_gs.
>
> Guarded storage is enabled lazily. Whenever the guest issues a GS
> instruction we will get an exit and only enable GS if facility 133 is
> set.
>
>>
>> From what I see, this patch does what it advertises, and since I think
>> it's the right thing to do in the current situation I gonna give it an:
>> Acked-by: Halil Pasic
>>
>> At the same time, I would prefer the commit message being reworded.
>> IMHO this patch is a good stop-gap measure, but essentially it trades
>> an annoying and obvious bug for a subtle and hopefully painless one.
>>
>> Let me explain this last statement. For starters, I do share some of
>> the concerns Boris has voiced. I won't repeat those. Same goes for the
>> example Christian paraphrased previously, and the fear of an implicit
>> requirement for having to support a Cartesian product of the advertised
>> machine-types and cpu-models (for each qemu binary).
>
> I will try to come up with a patch description that explains the open
> issues and it will tell that additional might or might not be necessary
> depending on followup discussions.

I would be already happy with adding something like:

During the discussion enabling cpu-model features for preexisting
machine-types came out as at least controversial. We agreed to implement
this patch as a stop-gap measure for the next release. What is a clean
enough solution shall be decided without time pressure.

> I will schedule this patch for 2.11 then.
>

Sounds like a plan.

Cheers,
Halil
Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
On 10/26/2017 01:35 AM, Halil Pasic wrote:
> try the most interesting scenarios out.
>
> The idea of the patch is very clear, but I don't understand the bigger
> gs feature context fully.
>
> From what I read in the code, the attempt to enable the gs capability in
> the kernel is made regardless of the cpu model. If the attempt was
> successful kvm_s390_get_gs will keep returning true. That would in turn
> affect migration, as far as I see (usages of kvm_s390_get_gs). I could
> not figure out how gs being turned off via cpu-model (-cpu z14,gs=off)
> does turn off gs support -- at least not the details. I wanted to give a
> timely review, so I've limited myself there.

When the CPU model turns off gs, facility bit 133 will be turned off.
Then the kernel does the right thing, see priv.c handle_gs.

Guarded storage is enabled lazily. Whenever the guest issues a GS
instruction we will get an exit and only enable GS if facility 133 is
set.

>
> From what I see, this patch does what it advertises, and since I think
> it's the right thing to do in the current situation I gonna give it an:
> Acked-by: Halil Pasic
>
> At the same time, I would prefer the commit message being reworded. IMHO
> this patch is a good stop-gap measure, but essentially it trades an
> annoying and obvious bug for a subtle and hopefully painless one.
>
> Let me explain this last statement. For starters, I do share some of the
> concerns Boris has voiced. I won't repeat those. Same goes for the
> example Christian paraphrased previously, and the fear of an implicit
> requirement for having to support a Cartesian product of the advertised
> machine-types and cpu-models (for each qemu binary).

I will try to come up with a patch description that explains the open
issues and it will tell that additional might or might not be necessary
depending on followup discussions.

I will schedule this patch for 2.11 then.
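The lazy enabling keyed on facility bit 133 can be sketched as follows. This is a simplified model and not the kernel's handle_gs: the function names and the gs_enabled flag are hypothetical, and what the real handler does when the facility is missing is not modelled. The bit numbering follows the s390 convention that facility bit 0 is the most significant bit of byte 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FAC_GUARDED_STORAGE 133u  /* facility bit named in the thread */

/* Test a facility bit; bit 0 is the MSB of byte 0 (s390 convention). */
bool fac_test(const uint8_t *fac, unsigned nr)
{
    assert(fac != NULL);
    return (fac[nr >> 3] & (0x80u >> (nr & 7u))) != 0;
}

/* Set a facility bit using the same numbering. */
void fac_set(uint8_t *fac, unsigned nr)
{
    assert(fac != NULL);
    fac[nr >> 3] |= (uint8_t)(0x80u >> (nr & 7u));
}

/* Hypothetical exit handler: called when the guest first issues a GS
 * instruction. GS is switched on only if the guest's facility list
 * (derived from its CPU model) has bit 133; otherwise nothing changes
 * and false is returned. */
bool handle_gs_exit_sketch(const uint8_t *fac, bool *gs_enabled)
{
    assert(gs_enabled != NULL);
    if (!fac_test(fac, FAC_GUARDED_STORAGE)) {
        return false;
    }
    *gs_enabled = true;
    return true;
}
```

The point of the model is that a CPU model with gs=off never reaches the enabling step, independent of whether the host capability was switched on at VM creation.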
Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
On 10/20/2017 04:54 PM, Christian Borntraeger wrote:
> Starting a guest with
>
> hvm
>
>
>
> on an IBM z14 results in
>
> "qemu-system-s390x: Some features requested in the CPU model are not
> available in the configuration: gs"
>
> This is because guarded storage is fenced for compat machines that did
> not have guarded storage support, but libvirt expands the cpu model
> according to the latest available machine.
>
> While this prevents future migration abort (by not starting the guest at
> all), not being able to start a "host-model" guest is very much
> unexpected. As it turns out, even if we would modify libvirt to not
> expand the cpu model to contain "gs" for compat machines, it cannot
> guarantee that a migration will succeed. For example if the kernel
> changes its features (or the user has nested=1 on one host but not on
> the other) the migration will fail nevertheless. So instead of fencing
> "gs" for machines <= 2.9 lets allow it for all machine types that
> support the CPU model. This will make "host-model" runnable all the
> time, while relying on the CPU model to reject invalid migration
> attempts.
>
> Suggested-by: David Hildenbrand
> Signed-off-by: Christian Borntraeger

I've tried to review this patch. Unfortunately I don't have access to a
z14, so I could not try the most interesting scenarios out.

The idea of the patch is very clear, but I don't understand the bigger gs
feature context fully.

From what I read in the code, the attempt to enable the gs capability in
the kernel is made regardless of the cpu model. If the attempt was
successful kvm_s390_get_gs will keep returning true. That would in turn
affect migration, as far as I see (usages of kvm_s390_get_gs). I could
not figure out how gs being turned off via cpu-model (-cpu z14,gs=off)
does turn off gs support -- at least not the details. I wanted to give a
timely review, so I've limited myself there.

From what I see, this patch does what it advertises, and since I think
it's the right thing to do in the current situation I gonna give it an:
Acked-by: Halil Pasic

At the same time, I would prefer the commit message being reworded. IMHO
this patch is a good stop-gap measure, but essentially it trades an
annoying and obvious bug for a subtle and hopefully painless one.

Let me explain this last statement. For starters, I do share some of the
concerns Boris has voiced. I won't repeat those. Same goes for the
example Christian paraphrased previously, and the fear of an implicit
requirement for having to support a Cartesian product of the advertised
machine-types and cpu-models (for each qemu binary).

In my eyes, a cpu isn't all that different from the other devices which
get attached to a board (represented by machine-type). So I don't see why
it should be exempt from the usual compatibility requirements tied to
machine-types (for the sake of stability and compatibility). (I basically
mean: no new features are added to a device in the context of a given
(fully qualified) machine-type (with new QEMU -- binary -- versions).) As
far as I understand all (other) devices shall respect these requirements.
Or am I wrong? If I am, please enlighten me: how is a CPU fundamentally
different from let's say a FLIC?

A related thing is that to implement some features indicated/controlled
via cpu-model features, we need to add capabilities to certain devices.
Now if the devices shall obey the 'no new features for the same
machine-type' rule, but the cpu-model feature shall obey our new
'retroactively introduce/enable for all machines supporting cpu-models'
rule, I think we have a conflict. An example for what I'm talking about
is zPCI, AIS and FLIC. In case of the AIS and the FLIC, AFAIK the
conflict was resolved so that the AIS feature/code of the FLIC is not
subject to the usual compat-macro mechanism. Another example: the AP
facility is not just about the CPU instructions, but also about a device.

It's also true for the last paragraph: I might very well be wrong, and if
I am please do tell (where is the hole in my reasoning). I will try to
re-check my statements tomorrow -- again trying to deliver today (along
the lines: better a small bird today than a big one tomorrow).

And another question. If we adopt this introducing features for
machine-types retroactively, how should the machine-type versions be
understood? My current understanding is that the machine-type (version)
is supposed to limit the observable changes when upgrading the binary to
the bare minimum (basically possibly modified timings -- which can't be
avoided -- and bug-fixes) for the sake of updating the binary being as
safe as possible.

Last but not least, I have to say, I'm neither a domain expert for
cpu-models nor for libvirt and its models. For that reason, I've
personally asked Jason to do a more detailed review on this -- and am
hoping to wiggle out with a weak ack. I do intend to keep on following
the discussion (despite
Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines
On 10/20/2017 10:54 AM, Christian Borntraeger wrote:
> Starting a guest with
>
> hvm
>
> on an IBM z14 results in
>
> "qemu-system-s390x: Some features requested in the CPU model are not
> available in the configuration: gs"
>
> This is because guarded storage is fenced for compat machines that did
> not have guarded storage support, but libvirt expands the cpu model
> according to the latest available machine.
>
> While this prevents future migration abort (by not starting the guest at
> all), not being able to start a "host-model" guest is very much
> unexpected. As it turns out, even if we would modify libvirt to not
> expand the cpu model to contain "gs" for compat machines, it cannot
> guarantee that a migration will succeed. For example if the kernel
> changes its features (or the user has nested=1 on one host but not on
> the other) the migration will fail nevertheless. So instead of fencing
> "gs" for machines <= 2.9 lets allow it for all machine types that
> support the CPU model. This will make "host-model" runnable all the
> time, while relying on the CPU model to reject invalid migration
> attempts.
...
> -    if (gs_allowed()) {
> +    if (cpu_model_allowed()) {
>          if (kvm_vm_enable_cap(s, KVM_CAP_S390_GS, 0) == 0) {
>              cap_gs = 1;

Ok, honestly, I dislike this idea because it means we are effectively
lying now. We will happily accept a +gs cpu model with a 2.9 compat
machine when the point of compat machines is to mimic the older version
of QEMU, which does not support GS. Yes, model checking will prevent the
worst side effects, namely, exploding migrations.

I'd far prefer a solution that makes host-model dependent on the machine
type. But I understand some of the backlash on that idea. Still, it seems
like the cleanest approach even if it will be more work.

With all of that said, I know we want this fixed and your patch seems to
fix the problem. So if you need an R-b...

Reviewed-by: Jason J. Herne

Can we have a new tag? :-D
Reviewed-by-with-reservations: Jason J. Herne

--
-- Jason J. Herne (jjhe...@linux.vnet.ibm.com)
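The one-line change under review swaps the machine-type gate in front of the capability probe. A minimal sketch of the before and after, assuming a machine description with two boolean flags, is given below. struct machine_sketch and both functions are hypothetical and do not reproduce the QEMU implementation; kernel_has_gs stands in for kvm_vm_enable_cap(s, KVM_CAP_S390_GS, 0) returning 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-machine-type flags. */
struct machine_sketch {
    bool gs_allowed;         /* false on compat machines that fenced gs */
    bool cpu_model_allowed;  /* true on machines that support CPU models */
};

/* Before the patch: gs is only probed if the machine type allows gs. */
bool gs_cap_before(const struct machine_sketch *m, bool kernel_has_gs)
{
    assert(m != NULL);
    return m->gs_allowed && kernel_has_gs;
}

/* After the patch: gs is probed on every machine type that supports the
 * CPU model; the CPU model is then relied on to refuse bad migrations. */
bool gs_cap_after(const struct machine_sketch *m, bool kernel_has_gs)
{
    assert(m != NULL);
    return m->cpu_model_allowed && kernel_has_gs;
}
```

For a 2.9 compat machine, which per the thread fences gs but supports CPU models, the result flips from false to true on a gs-capable host. That is what makes "host-model" startable there, and it is also what Jason's objection is about.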