On 14.05.19 11:10, Christian Borntraeger wrote:
>
>
> On 14.05.19 10:59, David Hildenbrand wrote:
>> On 14.05.19 10:49, Cornelia Huck wrote:
>>> On Tue, 14 May 2019 10:37:32 +0200
>>> Christian Borntraeger <borntrae...@de.ibm.com> wrote:
>>>
>>>> On 14.05.19 09:28, David Hildenbrand wrote:
>>>>>>>> But that can be tested using the runability information if I am not
>>>>>>>> wrong.
>>>>>>>
>>>>>>> You mean the cpu level information, right?
>>>>>
>>>>> Yes, query-cpu-definition includes for each model runability information
>>>>> via "unavailable-features" (valid under the started QEMU machine).
>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>> and others that we have today.
>>>>>>>>>
>>>>>>>>> So yes, I think this would be acceptable.
>>>>>>>>
>>>>>>>> I guess it is acceptable yes. I doubt anybody uses that many CPUs in
>>>>>>>> production either way. But you never know.
>>>>>>>
>>>>>>> I think that using that many cpus is a more uncommon setup, but I still
>>>>>>> think that having to wait for actual failure
>>>>>>
>>>>>> That can happen all the time today. You can easily say z14 in the xml
>>>>>> when on a zEC12. Only at startup you get the error. The question is
>>>>>> really:
>>>>>
>>>>> "-smp 248 -cpu host" will no longer work, while e.g. "-smp 248 -cpu z12"
>>>>> will work. Actually, even "-smp 248" will no longer work on affected
>>>>> machines.
>>>>>
>>>>> That is why I wonder if it is better to disable the feature and print a
>>>>> warning. Similar to CMMA, where we want to tolerate when CMMA is not
>>>>> possible in the current environment (huge pages).
>>>>>
>>>>> "Diag318 will not be enabled because it is not compatible with more than
>>>>> 240 CPUs".
>>>>>
>>>>> However, I still think that implementing support for more than one SCLP
>>>>> response page is the best solution. Guests will need adaptions for > 240
>>>>> CPUs with Diag318, but who cares? Existing setups will continue to work.
>>>>>
>>>>> Implementing that SCLP thingy will avoid any warnings and any errors. It
>>>>> just works from the QEMU perspective.
>>>>>
>>>>> Is implementing this realistic?
>>>>
>>>> Yes it is but it will take time. I will try to get this rolling. To make
>>>> progress on the diag318 thing, can we error on startup now and simply
>>>> remove that check when we have implemented a larger sccb? If we would
>>>> now do all kinds of "change the max number games" it would be harder to
>>>> "fix".
>>>
>>> So, the idea right now is:
>>>
>>> - fail to start if you try to specify a diag318 device and more than
>>>   240 cpus (do we need a knob to turn off the device?)
>>> - in the future, support more than one SCLP response page
>>>
>>> I'm getting a bit lost in the discussion; but the above sounds
>>> reasonable to me.
>>>
>>
>> We can
>>
>> 1. Fail to start with #cpus > 240 when diag318=on
>> 2. Remove the error once we support more than one SCLP response page
>>
>> Or
>>
>> 1. Allow to start with #cpus > 240 when diag318=on, but indicate only
>>    240 CPUs via SCLP
>> 2. Print a warning
>> 3. Remove the restriction and the warning once we support more than one
>>    SCLP response page
>>
>> While I prefer the second approach (similar to defining zPCI devices
>> without zpci=on), I could also live with the first approach.
>
> I prefer approach 1.
>
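
To make the difference between the two quoted approaches concrete, here is a rough sketch. It is illustrative Python, not QEMU code: the function names and the constant are made up for this mail, and 240 is simply the limit as quoted above. Each function returns the number of CPUs that would be indicated via SCLP.

```python
import warnings

# Assumption: 240 is the CPU limit with diag318=on, as quoted in this thread.
# All names below are hypothetical and do not exist in QEMU.
MAX_CPUS_WITH_DIAG318 = 240


def check_fail(num_cpus, diag318):
    """Approach 1: refuse to start with diag318=on and more than 240 CPUs."""
    if diag318 and num_cpus > MAX_CPUS_WITH_DIAG318:
        raise ValueError(
            "diag318=on is not compatible with more than %d CPUs"
            % MAX_CPUS_WITH_DIAG318
        )
    return num_cpus


def check_limit(num_cpus, diag318):
    """Approach 2: start anyway, warn, and indicate only 240 CPUs via SCLP."""
    if diag318 and num_cpus > MAX_CPUS_WITH_DIAG318:
        warnings.warn(
            "diag318=on: only %d of %d CPUs are indicated via SCLP"
            % (MAX_CPUS_WITH_DIAG318, num_cpus)
        )
        return MAX_CPUS_WITH_DIAG318
    return num_cpus
```

In both variants the check would go away once more than one SCLP response page is supported; the only difference is whether "-smp 248" with diag318=on is a startup error or a warning.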

Isn't approach #2 what we discussed (limiting sclp, but of course to 247
CPUs), but with an additional warning? I'm confused.

-- 

Thanks,

David / dhildenb