On 13.05.19 12:55, Christian Borntraeger wrote:
>
>
> On 13.05.19 11:57, David Hildenbrand wrote:
>> On 13.05.19 11:51, Christian Borntraeger wrote:
>>>
>>>
>>> On 13.05.19 11:40, David Hildenbrand wrote:
>>>> On 13.05.19 11:34, Christian Borntraeger wrote:
>>>>>
>>>>>
>>>>> On 13.05.19 10:03, David Hildenbrand wrote:
>>>>>>>> +    if ((SCCB_SIZE - sizeof(ReadInfo)) / sizeof(CPUEntry) <
>>>>>>>> S390_MAX_CPUS)
>>>>>>>> +        mc->max_cpus = S390_MAX_CPUS - 8;
>>>>>>>
>>>>>>> This is too complicated, just set it always to 240.
>>>>>>>
>>>>>>> However, I am still not sure how to best handle this scenario. One
>>>>>>> solution is
>>>>>>>
>>>>>>> 1. Set it statically to 240 for machine > 4.1
>>>>>>> 2. Keep the old machines unmodified
>>>>>>> 3. Don't indicate the CPU feature for machines <= 4.0
>>>>>>>
>>>>>>> #3 is the problematic part, as it mixes host CPU features and machines.
>>>>>>> Bad. The host CPU model should always look the same on all machines. I
>>>>>>> don't like this.
>>>>>>>
>>>>>>
>>>>>> FWIW, #3 is only an issue when modeling it via the CPU model, like
>>>>>> Christian suggested.
>>>>>>
>>>>>> I suggest the following
>>>>>>
>>>>>> 1. Set the max #cpus for 4.1 to 240 (already done)
>>>>>> 2. Keep it for the other machines unmodified (as suggested by Thomas)
>>>>>> 3. Create the layout of the SCCB depending on the machine type (to be
>>>>>> done)
>>>>>>
>>>>>> If we want to model diag318 via a CPU feature (which makes sense for
>>>>>> migration):
>>>>>>
>>>>>> 4. Disable diag318 with a warning if used with a machine < 4.1
>>>>>>
>>>>>
>>>>> I think there is a simpler solution. It is perfectly fine to fail the
>>>>> startup if we cannot fulfil the cpu model. So let's just allow 248 and
>>>>> allow this feature also for older machines. And if somebody chooses both
>>>>> at the same time, let's fail the startup.
>>>>
>>>> To which knob do you want to glue the layout of the SCLP response? Like
>>>> I described?
>>>> Do you mean instead of warning and masking the feature off
>>>> as I suggested, simply failing?
>>>
>>> The sclp response will depend on the diag318 cpu model flag. If it's on,
>>> the sclp response will have it, otherwise not.
>>> - host-passthrough: not migration safe anyway
>>> - host-model: if the target has diag318 good, otherwise we reject migration
>>>>
>>>> In that case, -machine ..-4.0 -cpu host will not work on new HW with new
>>>> KVM. Just noting.
>>>
>>> Only if you have 248 CPUs (which is unlikely). My point was to do that for
>>> all machine levels.
>>>
>>
>> The issue with this approach is that e.g. libvirt is not aware of this
>> restriction. It could query "max_cpus" and expand the host-cpu model,
>> but starting a guest with > 240 cpus would fail. Maybe this is acceptable.
>
> As of today we do the cpu model check in the same way. libvirt actually tries
> to run QEMU and handles failures.
>
> For a failure, the user still has to use >240 CPUs in its XML. The only
> downside is that libvirt will not reject this right away.
>
> During startup we would then print an error message like
>
> "The diag318 cpu feature is only supported for 240 and less CPUs."
>
> This is of similar quality as
> "Selected CPU GA level is too new. Maximum supported model in the
> configuration: \'%s\'",
>

But that can be tested using the runability information if I am not wrong.

> and others that we have today.
>
> So yes, I think this would be acceptable.

I guess it is acceptable yes. I doubt anybody uses that many CPUs in
production either way. But you never know.

-- 

Thanks,

David / dhildenb
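
For illustration only, the capacity arithmetic from the quoted hunk and the
startup check proposed above could be sketched roughly as below. This is not
QEMU code: the helper names and the EXAMPLE_DIAG318_MAX_CPUS constant are
hypothetical, and the sizes are passed in as parameters rather than taken from
the real ReadInfo/CPUEntry definitions. Only the error string is the one
suggested in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical name; 240 is the limit discussed in the thread. */
#define EXAMPLE_DIAG318_MAX_CPUS 240u

/*
 * Number of CPU entries that fit into an SCCB after its fixed header.
 * Mirrors (SCCB_SIZE - sizeof(ReadInfo)) / sizeof(CPUEntry) from the
 * quoted hunk, with the sizes supplied by the caller.
 */
static size_t sccb_cpu_capacity(size_t sccb_size, size_t header_size,
                                size_t entry_size)
{
    assert(entry_size != 0 && header_size <= sccb_size);
    return (sccb_size - header_size) / entry_size;
}

/*
 * Startup check as proposed: independent of the machine version, refuse
 * the combination of diag318 and more than 240 CPUs.
 * Returns true if the configuration is acceptable.
 */
static bool diag318_cpu_count_ok(bool diag318_enabled, unsigned int max_cpus)
{
    if (diag318_enabled && max_cpus > EXAMPLE_DIAG318_MAX_CPUS) {
        fprintf(stderr, "The diag318 cpu feature is only supported for "
                        "240 and less CPUs.\n");
        return false;
    }
    return true;
}
```

With assumed example sizes (4096-byte SCCB, 16-byte CPU entries), a 128-byte
header leaves room for 248 entries and a 256-byte header for 240. The real
structure sizes would have to be checked against the QEMU sources.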