On Thu, Jan 19, 2017 at 8:08 AM, Juergen Gross <jgr...@suse.com> wrote: > On 17/01/17 18:26, Dario Faggioli wrote: >> In fact, relying on the mask of what pCPUs belong to >> which Credit2 runqueue is not enough. If we only do that, >> when Credit2 is the boot scheduler, we may ASSERT() or >> panic when moving a pCPU from Pool-0 to another cpupool. >> >> This is because pCPUs outside of any pool are considered >> part of cpupool0. This puts us at risk of crash when those >> same pCPUs are added to another pool and something >> different than the idle domain is found to be running >> on them. >> >> Note that, even if we prevent the above to happen (which >> is the purpose of this patch), this is still pretty bad, >> in fact, when we remove a pCPU from Pool-0: >> - in Credit1, as we do *not* update prv->ncpus and >> prv->credit, which means we're considering the wrong >> total credits when doing accounting; >> - in Credit2, the pCPU remains part of one runqueue, >> and is hence at least considered during load balancing, >> even if no vCPU should really run there. >> >> In Credit1, this "only" causes skewed accounting and >> no crashes because there is a lot of `cpumask_and`ing >> going on with the cpumask of the domains' cpupool >> (which, BTW, comes at a price). >> >> A quick and not to involved (and easily backportable) >> solution for Credit2, is to do exactly the same. >> >> Signed-off-by: Dario Faggioli <dario.faggi...@citrix.com >> --- >> Cc: George Dunlap <george.dun...@eu.citrix.com> >> Cc: Juergen Gross <jgr...@suse.com> >> Cc: Jan Beulich <jbeul...@suse.com> >> --- >> This is a bugfix, and should be backported to 4.8. >> --- >> The proper solution would mean calling deinit_pdata() when removing a pCPU >> from >> cpupool0, as well as a bit more of code reshuffling. >> >> And, although worth doing, it certainly will take more work, more time, and >> will probably be hard (well, surely harder than this) to backport. >> >> Therefore, I'd argue in favor of both taking and backporting this change, >> which >> at least enables using Credit2 as default scheduler without risking a crash >> when creating a second cpupool. >> >> Afterwards, a proper solution would be proposed for Xen 4.9. >> >> Finally, given the wide number of issues similar to this that I've found and >> fixed in the last release cycle, I think it would be good to take a stab at >> whether the interface between cpupools and the schedulers could not be >> simplified. :-O > > The first patch version for support of cpupools I sent used an own > scheduler instance for the free cpus. Keir merged this instance with > the one for Pool-0.
You mean he just made the change unilaterally without posting it to the list? I just went back and skimmed through the old cpupools threads and didn't see this discussed. Having a "cpupool-remove" operation that doesn't actually remove the cpu from the pool is a bit mad... -George _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel