Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines

2017-10-27 Thread Halil Pasic


On 10/27/2017 07:12 PM, Jiri Denemark wrote:
> On Fri, Oct 27, 2017 at 17:18:44 +0200, Halil Pasic wrote:
>> On 10/27/2017 04:06 PM, Christian Borntraeger wrote:
> ...
>>> I talked to several people and it seems that on x86 the host model will
>>> also enable new features that are not known by older QEMUs, and it's
>>> considered "works as designed" (see also Jiri's mail).
>>
>> Yes, I've seen that. It would be nice though if this design was easier to
>> find in writing. Unfortunately I can read minds only to a very limited
>> extent, and the written stuff I've read did not give me a full
>> understanding of the design -- although the entity to blame for this
>> could be my limited intellect.
> 
> I think it's more likely the documentation is just not perfect. I'll
> look at it and try to make it better. I know about some parts which need
> to be clarified thanks to this discussion.
> 
> Jirka
> 

Feel free to put me on cc for the patch. I would be happy to give it a
look, but since I'm not actively reading the libvirt list, I'm afraid it will
slip past me if I'm not in cc.

Regards,
Halil

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines

2017-10-27 Thread Jiri Denemark
On Fri, Oct 27, 2017 at 17:18:44 +0200, Halil Pasic wrote:
> On 10/27/2017 04:06 PM, Christian Borntraeger wrote:
...
> > I talked to several people and it seems that on x86 the host model will
> > also enable new features that are not known by older QEMUs, and it's
> > considered "works as designed" (see also Jiri's mail).
> 
> Yes, I've seen that. It would be nice though if this design was easier to
> find in writing. Unfortunately I can read minds only to a very limited extent,
> and the written stuff I've read did not give me a full understanding of the
> design -- although the entity to blame for this could be my limited intellect.

I think it's more likely the documentation is just not perfect. I'll
look at it and try to make it better. I know about some parts which need
to be clarified thanks to this discussion.

Jirka



Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines

2017-10-27 Thread Halil Pasic


On 10/27/2017 04:06 PM, Christian Borntraeger wrote:
> 
> 
> On 10/27/2017 03:40 PM, Halil Pasic wrote:
>>
>>
>> On 10/27/2017 02:57 PM, Christian Borntraeger wrote:
>>>
>>>
>>> On 10/27/2017 02:45 PM, Christian Borntraeger wrote:


>>>> On 10/27/2017 02:31 PM, Halil Pasic wrote:
>>>>> gs is explicitly disabled.
>>>>>
>>>>> Now that I think about it, maybe the 2.9 binary is going to reject
>>>>> the explicit gs flag altogether, because it's unknown.
>>>>>
>>>>> Isn't this a problem?
>>>>
>>>> No. This is exactly the _solution_ and not the problem. The target will
>>>> reject unknown cpu features and migration will be aborted. This is
>>>> exactly what the CPU model is for.
>>
>> I'm not sure we talk about the same thing. I'm talking about the following. I
>> want to disable a cpu-model feature for the sake of migration (because I know
>> that binary version X does not support the feature, because it does not know
>> about it). Now if I do it via let's say -cpu z13,gs=off on let's say 2.11,
>> and start with the exact same command line (-cpu z13,gs=off) on let's say 2.9
>> my migration will explode because of the unknown feature I'm specifying
>> not to be used.
> 
> The migration will be rejected because the target qemu will not start up.
> You can easily simulate that, e.g. by doing
> 
> qemu-system-s390x -cpu z13,notyetknown=off
> qemu-system-s390x: can't apply global z13-s390x-cpu.notyetknown=off: Property '.notyetknown' not found
> 
> But libvirt will not use a full model (and then disable things); instead it will
> use the base model and add things. (So libvirt should never use xxx=off)
> 

That piece of the puzzle was missing for me (no xxx=off for M minus M-base
features).

> 
> I think this is really not an issue. If you specify a feature that is not
> known, then QEMU will not start on the target and migration is rejected. The
> guest continues to run on the source. So if you specify a "too new" facility
> yourself, it's really a user error.
> Everything that uses an explicit model (e.g. -cpu z13 or -cpu,sief2=on) will
> work, but only as long as the conditions are met. If you specify -cpu z14, it
> does not matter whether it fails because the kernel or QEMU is too old, or
> because you just happen to run on a z13.
> 
> The only question is/was: what about "host-model"?
> With my patch (+ the gs fixup) the following things will work:
> - host-model will work on z13
> - host-model will work on z14 (any machine version)
> - host-model on z13 and then migrating to z14 will work (any machine version)
> - host-model on z13 and then migrating to z14 and then migrating back will
>   work (any version)
> - qemu with fixup + host-model on z14 with machine version 2.10 can be
>   migrated
> 
> The only thing that does not work is
> - qemu with fixup + host-model on z14 with machine 2.9 cannot be migrated to
>   qemu 2.9 on z14.
> 


I agree.

> Now: this would not have worked anyway, because qemu 2.9 does not know z14.
> So in theory QEMU must forbid z14 for compat machines (which we do not know).
>

Noted.
 
> I talked to several people and it seems that on x86 the host model will also
> enable new features that are not known by older QEMUs, and it's considered
> "works as designed" (see also Jiri's mail).
> 

Yes, I've seen that. It would be nice though if this design was easier to
find in writing. Unfortunately I can read minds only to a very limited extent,
and the written stuff I've read did not give me a full understanding of the
design -- although the entity to blame for this could be my limited intellect.

> 
>>
>> Well I'm not sure what I describe is relevant. My thinking is along the lines
>> that some features are added incrementally. How do I use those features not
>> included in the -base model which both of my environments support, and
>> disable those that are unsupported by one of the environments?
>>
>> I will think about it some more. I've asked Boris about this situation,
>> and he did not put my mind at ease (to be more precise he seemed to
>> see this as a potential problem too), so I've decided to mention it.
>> Sorry if I've generated some unnecessary noise.
>>
>> I think the root of the problem is that I don't understand the difference
>> between z13-base and z13, and the associated rules and expected/intended
>> usages.
> 
> z13-base contains only those features that are guaranteed to be there (there
> is the list of non-hypervisor managed features). z13 is z13-base + all
> features that will be available in a reasonably recent kernel+qemu
> combination and make sense to be there by default. So it might happen that
> you cannot start -cpu z14, e.g. if you run on a kernel < 4.12.
> 

OK. I will have to learn more about this. IMHO it does not make sense
to burden the community with the holes in my knowledge any more, but I will
have to fill them to feel comfortable in this area.

Regards,
Halil


Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines

2017-10-27 Thread Christian Borntraeger


On 10/27/2017 03:40 PM, Halil Pasic wrote:
> 
> 
> On 10/27/2017 02:57 PM, Christian Borntraeger wrote:
>>
>>
>> On 10/27/2017 02:45 PM, Christian Borntraeger wrote:
>>>
>>>
>>> On 10/27/2017 02:31 PM, Halil Pasic wrote:
>>>> gs is explicitly disabled.
>>>>
>>>> Now that I think about it, maybe the 2.9 binary is going to reject
>>>> the explicit gs flag altogether, because it's unknown.
>>>>
>>>> Isn't this a problem?
>>>
>>> No. This is exactly the _solution_ and not the problem. The target will
>>> reject unknown cpu features and migration will be aborted. This is exactly
>>> what the CPU model is for.
> 
> I'm not sure we talk about the same thing. I'm talking about the following. I
> want to disable a cpu-model feature for the sake of migration (because I know
> that binary version X does not support the feature, because it does not know
> about it). Now if I do it via let's say -cpu z13,gs=off on let's say 2.11,
> and start with the exact same command line (-cpu z13,gs=off) on let's say 2.9
> my migration will explode because of the unknown feature I'm specifying
> not to be used.

The migration will be rejected because the target qemu will not start up.
You can easily simulate that, e.g. by doing

qemu-system-s390x -cpu z13,notyetknown=off
qemu-system-s390x: can't apply global z13-s390x-cpu.notyetknown=off: Property '.notyetknown' not found

But libvirt will not use a full model (and then disable things); instead it will
use the base model and add things. (So libvirt should never use xxx=off)
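
The rejection described above can be illustrated with a minimal C sketch.
It assumes a much simplified model of property lookup and is not QEMU code:
the tables known_props_old / known_props_new and both helper functions are
hypothetical, and "vx" and "ri" are listed only as example feature names.
The point it shows is that the value after "=" does not matter; a binary
that has never heard of a property rejects it even when it is switched off.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical property table of an older binary: it does not know "gs". */
const char *const known_props_old[] = { "vx", "ri", NULL };
/* Hypothetical property table of a newer binary that does know "gs". */
const char *const known_props_new[] = { "vx", "ri", "gs", NULL };

/* Return true if 'name' is listed in the NULL-terminated table. */
bool prop_known(const char *const *table, const char *name)
{
    for (size_t i = 0; table[i] != NULL; i++) {
        if (strcmp(table[i], name) == 0) {
            return true;
        }
    }
    return false;
}

/*
 * Check one "feature=on|off" option against a property table.
 * Only the name is looked up: an unknown name is rejected even for "=off".
 */
bool cpu_option_accepted(const char *const *table, const char *opt)
{
    const char *eq = strchr(opt, '=');
    char name[32];
    size_t len;

    if (eq == NULL) {
        return false;               /* malformed option, no '=' */
    }
    len = (size_t)(eq - opt);
    if (len == 0 || len >= sizeof(name)) {
        return false;               /* empty or overlong name */
    }
    memcpy(name, opt, len);
    name[len] = '\0';
    return prop_known(table, name);
}
```

With these tables, cpu_option_accepted(known_props_new, "gs=off") succeeds
while cpu_option_accepted(known_props_old, "gs=off") fails, which is why a
model built additively from the base model avoids the problem.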


I think this is really not an issue. If you specify a feature that is not
known, then QEMU will not start on the target and migration is rejected. The
guest continues to run on the source. So if you specify a "too new" facility
yourself, it's really a user error.
Everything that uses an explicit model (e.g. -cpu z13 or -cpu,sief2=on) will
work, but only as long as the conditions are met. If you specify -cpu z14, it
does not matter whether it fails because the kernel or QEMU is too old, or
because you just happen to run on a z13.

The only question is/was: what about "host-model"?
With my patch (+ the gs fixup) the following things will work:
- host-model will work on z13
- host-model will work on z14 (any machine version)
- host-model on z13 and then migrating to z14 will work (any machine version)
- host-model on z13 and then migrating to z14 and then migrating back will
  work (any version)
- qemu with fixup + host-model on z14 with machine version 2.10 can be migrated

The only thing that does not work is
- qemu with fixup + host-model on z14 with machine 2.9 cannot be migrated to
  qemu 2.9 on z14.

Now: this would not have worked anyway, because qemu 2.9 does not know z14. So
in theory QEMU must forbid z14 for compat machines (which we do not know).

I talked to several people and it seems that on x86 the host model will also
enable new features that are not known by older QEMUs, and it's considered
"works as designed" (see also Jiri's mail).


> 
> Well I'm not sure what I describe is relevant. My thinking is along the lines
> that some features are added incrementally. How do I use those features not
> included in the -base model which both of my environments support, and
> disable those that are unsupported by one of the environments?
> 
> I will think about it some more. I've asked Boris about this situation,
> and he did not put my mind at ease (to be more precise he seemed to
> see this as a potential problem too), so I've decided to mention it.
> Sorry if I've generated some unnecessary noise.
> 
> I think the root of the problem is that I don't understand the difference
> between z13-base and z13, and the associated rules and expected/intended
> usages.

z13-base contains only those features that are guaranteed to be there (there is
the list of non-hypervisor managed features). z13 is z13-base + all features
that will be available in a reasonably recent kernel+qemu combination and make
sense to be there by default. So it might happen that you cannot start -cpu z14,
e.g. if you run on a kernel < 4.12.
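
As a rough illustration of the base/default distinction, here is a small C
sketch that treats a CPU model as a bitmask of features. Everything in it is
an assumption made for the example: the FEAT_* bits, MODEL_BASE,
MODEL_DEFAULT and both functions are invented names, and real models contain
far more features than four.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative feature bits; the names are made up for this sketch. */
enum {
    FEAT_A  = 1 << 0,   /* guaranteed on this hardware generation */
    FEAT_B  = 1 << 1,   /* guaranteed on this hardware generation */
    FEAT_C  = 1 << 2,   /* needs a reasonably recent kernel + QEMU */
    FEAT_GS = 1 << 3    /* needs a reasonably recent kernel + QEMU */
};

/* "-base" model: only the features that are guaranteed to be there. */
const uint32_t MODEL_BASE = FEAT_A | FEAT_B;
/* Default model: base plus what a recent kernel + QEMU stack provides. */
const uint32_t MODEL_DEFAULT = FEAT_A | FEAT_B | FEAT_C | FEAT_GS;

/* A model can start only if the host provides every feature it contains. */
bool model_runnable(uint32_t model, uint32_t host_features)
{
    return (model & ~host_features) == 0;
}

/* Build a custom model the additive way: start from base, then add. */
uint32_t model_from_base(uint32_t extra_features)
{
    return MODEL_BASE | extra_features;
}
```

On a host that offers FEAT_A, FEAT_B and FEAT_C but not FEAT_GS, the default
model is not runnable, whereas model_from_base(FEAT_C) is. This mirrors the
case where a full model cannot start on an older kernel.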

 
>> FWIW, I think in your particular case the QEMU will reject the z14 cpu and
>> not even come to checking the gs.
>>
> 
> I had a z13 cpu model in mind. I don't mention a z14 cpu-model (QEMU, not hw)
> in my whole email.
> 
> Regards,
> Halil
> 



Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines

2017-10-27 Thread Halil Pasic


On 10/27/2017 03:31 PM, Cornelia Huck wrote:
> On Fri, 27 Oct 2017 14:42:57 +0200
> Christian Borntraeger  wrote:
> 
>> Yes, we should also replace that with
>>
>>  return s390_has_feat(S390_FEAT_GUARDED_STORAGE)
>>
>> I can fixup my patch or provide a 2nd one.
>>
> 
> Consider a fixed up patch acked by me.
> 

+1 You can keep my ack too. I will try to find some time and read
the v2 though.



Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines

2017-10-27 Thread Halil Pasic


On 10/27/2017 02:57 PM, Christian Borntraeger wrote:
> 
> 
> On 10/27/2017 02:45 PM, Christian Borntraeger wrote:
>>
>>
>> On 10/27/2017 02:31 PM, Halil Pasic wrote:
>>> gs is explicitly disabled.
>>>
>>> Now that I think about it, maybe the 2.9 binary is going to reject
>>> the explicit gs flag altogether, because it's unknown.
>>>
>>> Isn't this a problem? 
>>
>> No. This is exactly the _solution_ and not the problem. The target will
>> reject unknown cpu features and migration will be aborted. This is exactly
>> what the CPU model is for.

I'm not sure we talk about the same thing. I'm talking about the following. I
want to disable a cpu-model feature for the sake of migration (because I know
that binary version X does not support the feature, because it does not know
about it). Now if I do it via let's say -cpu z13,gs=off on let's say 2.11,
and start with the exact same command line (-cpu z13,gs=off) on let's say 2.9
my migration will explode because of the unknown feature I'm specifying
not to be used.

Well I'm not sure what I describe is relevant. My thinking is along the lines
that some features are added incrementally. How do I use those features not
included in the -base model which both of my environments support, and
disable those that are unsupported by one of the environments?

I will think about it some more. I've asked Boris about this situation,
and he did not put my mind at ease (to be more precise he seemed to
see this as a potential problem too), so I've decided to mention it.
Sorry if I've generated some unnecessary noise.

I think the root of the problem is that I don't understand the difference
between z13-base and z13, and the associated rules and expected/intended
usages.

> FWIW, I think in your particular case the QEMU will reject the z14 cpu and
> not even come to checking the gs.
> 

I had a z13 cpu model in mind. I don't mention a z14 cpu-model (QEMU, not hw) in
my whole email.

Regards,
Halil



Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines

2017-10-27 Thread Cornelia Huck
On Fri, 27 Oct 2017 14:42:57 +0200
Christian Borntraeger  wrote:

> Yes, we should also replace that with
> 
>  return s390_has_feat(S390_FEAT_GUARDED_STORAGE)
> 
> I can fixup my patch or provide a 2nd one.
> 

Consider a fixed up patch acked by me.



Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines

2017-10-27 Thread Christian Borntraeger


On 10/27/2017 02:45 PM, Christian Borntraeger wrote:
> 
> 
> On 10/27/2017 02:31 PM, Halil Pasic wrote:
>> gs is explicitly disabled.
>>
>> Now that I think about it, maybe the 2.9 binary is going to reject
>> the explicit gs flag altogether, because it's unknown.
>>
>> Isn't this a problem? 
> 
> No. This is exactly the _solution_ and not the problem. The target will reject
> unknown cpu features and migration will be aborted. This is exactly what the
> CPU model is for.

FWIW, I think in your particular case QEMU will reject the z14 cpu and not
even get to checking gs.



Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines

2017-10-27 Thread Christian Borntraeger


On 10/27/2017 02:31 PM, Halil Pasic wrote:
> gs is explicitly disabled.
> 
> Now that I think about it, maybe the 2.9 binary is going to reject
> the explicit gs flag altogether, because it's unknown.
> 
> Isn't this a problem? 

No. This is exactly the _solution_ and not the problem. The target will reject
unknown cpu features and migration will be aborted. This is exactly what the CPU
model is for.




Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines

2017-10-27 Thread Christian Borntraeger


On 10/27/2017 02:31 PM, Halil Pasic wrote:
> 
> 
> On 10/25/2017 08:13 PM, Jason J. Herne wrote:
>> On 10/20/2017 10:54 AM, Christian Borntraeger wrote:
>>> Starting a guest with
>>>     
>>>  hvm
>>>    
>>>    
>>>
>>> on an IBM z14 results in
>>>
>>> "qemu-system-s390x: Some features requested in the CPU model are not
>>> available in the configuration: gs"
>>>
>>> This is because guarded storage is fenced for compat machines that did not
>>> have guarded storage support, but libvirt expands the cpu model according
>>> to the latest available machine.
>>>
>>> While this prevents future migration abort (by not starting the guest at
>>> all), not being able to start a "host-model" guest is very much unexpected.
>>> As it turns out, even if we would modify libvirt to not expand the cpu
>>> model to contain "gs" for compat machines, it cannot guarantee that a
>>> migration will succeed. For example if the kernel changes its features (or
>>> the user has nested=1 on one host but not on the other) the migration will
>>> fail nevertheless. So instead of fencing "gs" for machines <= 2.9 lets
>>> allow it for all machine types that support the CPU model. This will make
>>> "host-model" runnable all the time, while relying on the CPU model to
>>> reject invalid migration attempts.
>> ...
>>> -    if (gs_allowed()) {
>>> +    if (cpu_model_allowed()) {
>>>          if (kvm_vm_enable_cap(s, KVM_CAP_S390_GS, 0) == 0) {
>>>              cap_gs = 1;
> 
> 
> @Jason
> 
> Hi Jason,
> 
> I don't have access to a z14 at the moment, and since you do, I would
> like to try out something.
> 
> I will first describe my concern, and then the test scenario.
> 
> The last line above, cap_gs = 1, has the side effect of returning
> true ever after.
> 
> int kvm_s390_get_gs(void)
> {
>     return cap_gs;
> }
> 
> Now considering
> static bool gscb_needed(void *opaque)
> {
>     return kvm_s390_get_gs();
> }

Yes, we should also replace that with

 return s390_has_feat(S390_FEAT_GUARDED_STORAGE);

I can fix up my patch or provide a 2nd one.
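
To make the difference between the two predicates concrete, here is a small
stand-alone C sketch. It is not the QEMU implementation: struct vm_state and
both function names are hypothetical stand-ins, with cap_gs playing the role
of the kernel capability flag and feat_gs the role of the CPU model feature.

```c
#include <stdbool.h>

/* Simplified stand-in for the relevant state; invented for illustration. */
struct vm_state {
    bool cap_gs;   /* set once the kernel capability could be enabled */
    bool feat_gs;  /* whether the CPU model exposes gs to the guest */
};

/* Variant quoted above: the subsection depends only on the capability. */
bool gscb_needed_by_cap(const struct vm_state *s)
{
    return s->cap_gs;
}

/* Variant proposed here: the subsection follows the CPU model feature. */
bool gscb_needed_by_feat(const struct vm_state *s)
{
    return s->feat_gs;
}
```

For a guest started with gs switched off in the CPU model on a host whose
kernel supports the capability (cap_gs true, feat_gs false), the first
variant would still send the subsection and the second would not.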


Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines

2017-10-27 Thread Halil Pasic


On 10/25/2017 08:13 PM, Jason J. Herne wrote:
> On 10/20/2017 10:54 AM, Christian Borntraeger wrote:
>> Starting a guest with
>>     
>>  hvm
>>    
>>    
>>
>> on an IBM z14 results in
>>
>> "qemu-system-s390x: Some features requested in the CPU model are not
>> available in the configuration: gs"
>>
>> This is because guarded storage is fenced for compat machines that did not
>> have guarded storage support, but libvirt expands the cpu model according
>> to the latest available machine.
>>
>> While this prevents future migration abort (by not starting the guest at
>> all), not being able to start a "host-model" guest is very much unexpected.
>> As it turns out, even if we would modify libvirt to not expand the cpu
>> model to contain "gs" for compat machines, it cannot guarantee that a
>> migration will succeed. For example if the kernel changes its features (or
>> the user has nested=1 on one host but not on the other) the migration will
>> fail nevertheless. So instead of fencing "gs" for machines <= 2.9 lets
>> allow it for all machine types that support the CPU model. This will make
>> "host-model" runnable all the time, while relying on the CPU model to
>> reject invalid migration attempts.
> ...
>> -    if (gs_allowed()) {
>> +    if (cpu_model_allowed()) {
>>          if (kvm_vm_enable_cap(s, KVM_CAP_S390_GS, 0) == 0) {
>>              cap_gs = 1;


@Jason

Hi Jason,

I don't have access to a z14 at the moment, and since you do, I would
like to try out something.

I will first describe my concern, and then the test scenario.

The last line above, cap_gs = 1, has the side effect that kvm_s390_get_gs()
returns true from then on.

int kvm_s390_get_gs(void)
{
    return cap_gs;
}

Now consider
static bool gscb_needed(void *opaque)
{
    return kvm_s390_get_gs();
}

const VMStateDescription vmstate_gscb = {
    .name = "cpu/gscb",
    .version_id = 1,
    .minimum_version_id = 1,
    .needed = gscb_needed,
    .fields = (VMStateField[]) {
        VMSTATE_UINT64_ARRAY(env.gscb, S390CPU, 4),
        VMSTATE_END_OF_LIST()
    }
};

const VMStateDescription vmstate_s390_cpu = {
    .name = "cpu",
    .post_load = cpu_post_load,
    .pre_save = cpu_pre_save,
    .version_id = 4,
    .minimum_version_id = 3,
    .fields  = (VMStateField[]) {
        VMSTATE_UINT64_ARRAY(env.regs, S390CPU, 16),
        VMSTATE_UINT64(env.psw.mask, S390CPU),
        VMSTATE_UINT64(env.psw.addr, S390CPU),
        VMSTATE_UINT64(env.psa, S390CPU),
        VMSTATE_UINT32(env.todpr, S390CPU),
        VMSTATE_UINT64(env.pfault_token, S390CPU),
        VMSTATE_UINT64(env.pfault_compare, S390CPU),
        VMSTATE_UINT64(env.pfault_select, S390CPU),
        VMSTATE_UINT64(env.cputm, S390CPU),
        VMSTATE_UINT64(env.ckc, S390CPU),
        VMSTATE_UINT64(env.gbea, S390CPU),
        VMSTATE_UINT64(env.pp, S390CPU),
        VMSTATE_UINT32_ARRAY(env.aregs, S390CPU, 16),
        VMSTATE_UINT64_ARRAY(env.cregs, S390CPU, 16),
        VMSTATE_UINT8(env.cpu_state, S390CPU),
        VMSTATE_UINT8(env.sigp_order, S390CPU),
        VMSTATE_UINT32_V(irqstate_saved_size, S390CPU, 4),
        VMSTATE_VBUFFER_UINT32(irqstate, S390CPU, 4, NULL,
                               irqstate_saved_size),
        VMSTATE_END_OF_LIST()
    },
    .subsections = (const VMStateDescription*[]) {
        &vmstate_fpu,
        &vmstate_vregs,
        &vmstate_riccb,
        &vmstate_exval,
        &vmstate_gscb,
        NULL
    },
};

I would expect the vmstate_gscb subsection to be sent even if gs is disabled
via the cpu-model, as long as the kernel and possibly the machine have gs
support (and qemu has cpu-models).

So the test scenario I want you to play through is the following. Take
the latest-greatest qemu with this patch applied. Make sure gs works
(is provided to the guest) with a 2.9 machine version, and a fully
specified cpu-model. Now disable gs explicitly.

Try to migrate this to another machine having a 2.9 binary. I expect
the migration to fail, because the subsection is going to be sent by
the latest-greatest binary, but is unknown to the 2.9 binary.

Notice this is despite the fact that gs is explicitly disabled.

Now that I think about it, maybe the 2.9 binary is going to reject
the explicit gs flag altogether, because it's unknown.

Isn't this a problem? I'm afraid that this way the only migration-safe
variant is -base, but that would essentially make adding features
incrementally impossible.

But I hypothesize that trying to migrate with z13 or even z13-base
would also trigger the unknown subsection problem.

Unfortunately I can't test this because my kernel never
makes kvm_vm_enable_cap(s, KVM_CAP_S390_GS, 0) return 0, because
I lack HW support in the host.

Regards,
Halil


> 
> Ok, honestly, I dislike this idea because it means we are effectively lying 
> now. W

Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines

2017-10-26 Thread Halil Pasic


On 10/26/2017 10:13 AM, Christian Borntraeger wrote:
> 
> 
> On 10/26/2017 01:35 AM, Halil Pasic wrote:
>> try the most interesting scenarios out.
>>
>> The idea of the patch is very clear, but I don't understand the bigger gs
>> feature context fully.
>>
>> From what I read in the code, the attempt to enable the gs capability in
>> the kernel is made regardless of the cpu model. If the attempt was
>> successful kvm_s390_get_gs will keep returning true. That would in turn
>> affect migration, as far as I see (usages of kvm_s390_get_gs). I could
>> not figure out how does gs being turned off via cpu-model (-cpu
>> z14,gs=off) does turn of gs support -- at least not the details. I wanted
>> to give a timely review, so I've limited myself there. 
> 
> When the CPU model turns off gs, facility bit 133 will be turned off.
> Then the kernel does the right thing, see priv.c handle_gs.
> 
> guarded storage is enabled lazily. Whenever the guest issues a Gs instruction
> we will get an exit and only enable GS if facility 133 is set.
> 
>>
>> From what I see, this patch does what it advertises, and since I think
>> it's the right thing to do in the current situation I gonna give it an:
>> Acked-by: Halil Pasic 
>>
>> At the same time, I would prefer the commit message being reworded. IMHO
>> this patch is a good stop-gap measure, but essentially it trades an
>> annoying and obvious bug for a subtle and hopefully painless one.
>>
>> Let me explain this last statement. For starters, I  do share some of the
>> concerns Boris has voiced.  I won't repeat those. Same goes for the
>> example Christian paraphrased previously, and the the fear of an implicit
>> requirement for having to support a Cartesian product of the advertised
>> machine-types and cpu-models (for each qemu binary).
> 
> I will try to come up with a patch description that explains the open issues
> and it will tell that additional might or might not be necessary depending
> on followup discussions.

I would already be happy with adding something like:
During the discussion, enabling cpu-model features for preexisting
machine-types came out as at least controversial. We agreed to implement
this patch as a stop-gap measure for the next release. What constitutes a
clean enough solution shall be decided without time pressure.

> I will schedule this patch for 2.11 then.
> 

Sounds like a plan.

Cheers,
Halil

> 



Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines

2017-10-26 Thread Christian Borntraeger


On 10/26/2017 01:35 AM, Halil Pasic wrote:
> try the most interesting scenarios out.
> 
> The idea of the patch is very clear, but I don't understand the bigger gs
> feature context fully.
> 
> From what I read in the code, the attempt to enable the gs capability in
> the kernel is made regardless of the cpu model. If the attempt was
> successful kvm_s390_get_gs will keep returning true. That would in turn
> affect migration, as far as I see (usages of kvm_s390_get_gs). I could
> not figure out how does gs being turned off via cpu-model (-cpu
> z14,gs=off) does turn of gs support -- at least not the details. I wanted
> to give a timely review, so I've limited myself there. 

When the CPU model turns off gs, facility bit 133 will be turned off.
Then the kernel does the right thing, see priv.c handle_gs.

Guarded storage is enabled lazily. Whenever the guest issues a GS instruction
we will get an exit and only enable GS if facility 133 is set.
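
The lazy enablement described above can be sketched in a few lines of C.
This is only a model of the idea, not the kernel code in priv.c: struct
guest, test_facility, set_facility and handle_gs_intercept are hypothetical
names, and the return values are a convention chosen for this example. The
one detail taken from the mail is that facility bit 133 gates the feature.

```c
#include <stdbool.h>
#include <stdint.h>

#define FAC_GS    133   /* facility bit named above */
#define FAC_BYTES 32    /* room for 256 facility bits in this sketch */

struct guest {
    uint8_t fac_list[FAC_BYTES]; /* bit 0 is the MSB of byte 0 */
    bool gs_enabled;             /* set on the first GS instruction */
};

/* Test a facility bit using MSB-first bit numbering. */
bool test_facility(const struct guest *g, unsigned int nr)
{
    return (g->fac_list[nr / 8] & (0x80u >> (nr % 8))) != 0;
}

/* Set or clear a facility bit, e.g. when the CPU model is applied. */
void set_facility(struct guest *g, unsigned int nr, bool on)
{
    uint8_t mask = (uint8_t)(0x80u >> (nr % 8));

    if (on) {
        g->fac_list[nr / 8] |= mask;
    } else {
        g->fac_list[nr / 8] &= (uint8_t)~mask;
    }
}

/*
 * Called when the guest exits on a GS instruction.
 * Returns 0 if GS was enabled and the instruction can be retried,
 * -1 if the facility is off and the request must be refused.
 */
int handle_gs_intercept(struct guest *g)
{
    if (!test_facility(g, FAC_GS)) {
        return -1;
    }
    g->gs_enabled = true;
    return 0;
}
```

A guest whose CPU model has gs switched off never gets gs_enabled set in
this model, no matter how often it tries a GS instruction.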

> 
> From what I see, this patch does what it advertises, and since I think
> it's the right thing to do in the current situation I gonna give it an:
> Acked-by: Halil Pasic 
> 
> At the same time, I would prefer the commit message being reworded. IMHO
> this patch is a good stop-gap measure, but essentially it trades an
> annoying and obvious bug for a subtle and hopefully painless one.
> 
> Let me explain this last statement. For starters, I  do share some of the
> concerns Boris has voiced.  I won't repeat those. Same goes for the
> example Christian paraphrased previously, and the the fear of an implicit
> requirement for having to support a Cartesian product of the advertised
> machine-types and cpu-models (for each qemu binary).

I will try to come up with a patch description that explains the open issues
and says that additional changes might or might not be necessary depending
on follow-up discussions.
I will schedule this patch for 2.11 then.



Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines

2017-10-25 Thread Halil Pasic


On 10/20/2017 04:54 PM, Christian Borntraeger wrote:
> Starting a guest with
>
> hvm
>   
>   
> 
> on an IBM z14 results in
> 
> "qemu-system-s390x: Some features requested in the CPU model are not
> available in the configuration: gs"
> 
> This is because guarded storage is fenced for compat machines that did not
> have guarded storage support, but libvirt expands the cpu model according to
> the latest available machine.
> 
> While this prevents future migration abort (by not starting the guest at all),
> not being able to start a "host-model" guest is very much unexpected.  As it
> turns out, even if we would modify libvirt to not expand the cpu model to
> contain "gs" for compat machines, it cannot guarantee that a migration will
> succeed. For example if the kernel changes its features (or the user has
> nested=1 on one host but not on the other) the migration will fail
> nevertheless.  So instead of fencing "gs" for machines <= 2.9 lets allow it
> for all machine types that support the CPU model. This will make "host-model"
> runnable all the time, while relying on the CPU model to reject invalid
> migration attempts.
> 
> Suggested-by: David Hildenbrand 
> Signed-off-by: Christian Borntraeger 

I've tried to review this patch. Unfortunately I don't have access to a
z14, so I could not try the most interesting scenarios out.

The idea of the patch is very clear, but I don't understand the bigger gs
feature context fully.

From what I read in the code, the attempt to enable the gs capability in
the kernel is made regardless of the cpu model. If the attempt was
successful, kvm_s390_get_gs will keep returning true. That would in turn
affect migration, as far as I see (usages of kvm_s390_get_gs). I could
not figure out how turning gs off via the cpu-model (-cpu z14,gs=off)
actually turns off gs support -- at least not the details. I wanted
to give a timely review, so I've limited myself there.

From what I see, this patch does what it advertises, and since I think
it's the right thing to do in the current situation I'm going to give it an:
Acked-by: Halil Pasic

At the same time, I would prefer the commit message being reworded. IMHO
this patch is a good stop-gap measure, but essentially it trades an
annoying and obvious bug for a subtle and hopefully painless one.

Let me explain this last statement. For starters, I do share some of the
concerns Boris has voiced. I won't repeat those. Same goes for the
example Christian paraphrased previously, and the fear of an implicit
requirement for having to support a Cartesian product of the advertised
machine-types and cpu-models (for each qemu binary).

In my eyes, a cpu isn't all that different from the other devices which
get attached to a board (represented by machine-type). So I don't see why
it should be exempt from the usual compatibility requirements tied to
machine-types (for the sake of stability and compatibility). I basically
mean: no new features are added to a device in the context of a given
(fully qualified) machine-type (with new QEMU -- binary -- versions). As
far as I understand, all (other) devices shall respect these requirements.
Or am I wrong? If I'm not, please enlighten me: how is a CPU
fundamentally different from, let's say, a FLIC?

A related thing is that, to implement some features indicated/controlled
via cpu-model features, we need to add capabilities to certain devices.
Now if the devices shall obey the 'no new features for the same
machine-type' rule, but the cpu-model feature shall obey our new
'retroactively introduce/enable for all machines supporting cpu-models'
rule, I think we have a conflict. An example for what I'm talking about
is zPCI, AIS and FLIC. In case of the AIS and the FLIC, AFAIK the
conflict was resolved so that the AIS feature/code of the FLIC is not
subject to the usual compat-macro mechanism. Another example: the AP
facility is not just about the CPU instructions, but also about a device.

It's also true for the last paragraph: I might very well be wrong, and if
I am, please do tell (where is the hole in my reasoning?). I will try to
re-check my statements tomorrow -- again, I'm trying to deliver today, along
the lines of 'better a small bird today than a big one tomorrow'.

And another question. If we adopt this approach of introducing features
for machine-types retroactively, how should the machine-type versions be
understood? My current understanding is that the machine-type
(version) is supposed to limit the observable changes when upgrading
the binary to the bare minimum (basically possibly modified timings -- which
can't be avoided -- and bug-fixes), so that updating the
binary is as safe as possible.

Last but not least, I have to say, I'm neither a domain expert for
cpu-models nor for libvirt and its models. For that reason, I've
personally asked Jason to do a more detailed review on this -- and am
hoping to wiggle out with a weak ack. I do intend to keep on following
the discussion (despite 

Re: [libvirt] [Qemu-devel] [PATCH/QEMU] s390x/kvm: use cpu_model_available for guarded storage on compat machines

2017-10-25 Thread Jason J. Herne

On 10/20/2017 10:54 AM, Christian Borntraeger wrote:

Starting a guest with

 hvm
   
   

on an IBM z14 results in

"qemu-system-s390x: Some features requested in the CPU model are not
available in the configuration: gs"

This is because guarded storage is fenced for compat machines that did not have
guarded storage support, but libvirt expands the cpu model according to the
latest available machine.

While this prevents future migration abort (by not starting the guest at all),
not being able to start a "host-model" guest is very much unexpected.  As it
turns out, even if we would modify libvirt to not expand the cpu model to
contain "gs" for compat machines, it cannot guarantee that a migration will
succeed. For example if the kernel changes its features (or the user has
nested=1 on one host but not on the other) the migration will fail
nevertheless.  So instead of fencing "gs" for machines <= 2.9 lets allow it for
all machine types that support the CPU model. This will make "host-model"
runnable all the time, while relying on the CPU model to reject invalid
migration attempts.

...

-if (gs_allowed()) {
+if (cpu_model_allowed()) {
  if (kvm_vm_enable_cap(s, KVM_CAP_S390_GS, 0) == 0) {
  cap_gs = 1;


Ok, honestly, I dislike this idea because it means we are effectively 
lying now. We will happily accept a +gs cpu model with a 2.9 compat 
machine when the point of compat machines is to mimic the older version 
of QEMU which does not support GS.  Yes, model checking will prevent the 
worst side effects, namely, exploding migrations.
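
For readers less familiar with the model check mentioned here, the idea can
be sketched roughly as a feature-subset test: a destination accepts a guest
only if it can provide every feature in the guest's CPU model. This is a
simplified illustration under that assumption, not QEMU's implementation;
the feature bits and the function name are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
#define FEAT_GS_SKETCH (1u << 0)  /* guarded storage */
#define FEAT_VX_SKETCH (1u << 1)  /* some other optional feature */

/*
 * A guest's CPU model is acceptable on a destination only if no requested
 * feature is missing there, i.e. requested is a subset of available.
 */
static bool model_accepted_sketch(uint32_t requested, uint32_t available)
{
    return (requested & ~available) == 0;
}
```

With this check in place, a guest whose model includes gs is refused by a
destination that lacks gs, regardless of which compat machine type it uses.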


I'd far prefer a solution that makes host-model dependent on the machine 
type. But I understand some of the backlash on that idea. Still, it 
seems like the cleanest approach even if it will be more work.


With all of that said, I know we want this fixed and your patch seems to 
fix the problem. So if you need an R-b...


Reviewed-by: Jason J. Herne 

Can we have a new tag? :-D
Reviewed-by-with-reservations: Jason J. Herne 

--
-- Jason J. Herne (jjhe...@linux.vnet.ibm.com)
