On Thu, Jan 19, 2017 at 8:08 AM, Juergen Gross <jgr...@suse.com> wrote:
> On 17/01/17 18:26, Dario Faggioli wrote:
>> In fact, relying on the mask of what pCPUs belong to
>> which Credit2 runqueue is not enough. If we only do that,
>> when Credit2 is the boot scheduler, we may ASSERT() or
>> panic when moving a pCPU from Pool-0 to another cpupool.
>>
>> This is because pCPUs outside of any pool are considered
>> part of cpupool0. This puts us at risk of crash when those
>> same pCPUs are added to another pool and something
>> different than the idle domain is found to be running
>> on them.
>>
>> Note that, even if we prevent the above to happen (which
>> is the purpose of this patch), this is still pretty bad,
>> in fact, when we remove a pCPU from Pool-0:
>> - in Credit1, as we do *not* update prv->ncpus and
>>   prv->credit, which means we're considering the wrong
>>   total credits when doing accounting;
>> - in Credit2, the pCPU remains part of one runqueue,
>>   and is hence at least considered during load balancing,
>>   even if no vCPU should really run there.
>>
>> In Credit1, this "only" causes skewed accounting and
>> no crashes because there is a lot of `cpumask_and`ing
>> going on with the cpumask of the domains' cpupool
>> (which, BTW, comes at a price).
>>
>> A quick and not to involved (and easily backportable)
>> solution for Credit2, is to do exactly the same.
>>
>> Signed-off-by: Dario Faggioli <dario.faggi...@citrix.com
>> ---
>> Cc: George Dunlap <george.dun...@eu.citrix.com>
>> Cc: Juergen Gross <jgr...@suse.com>
>> Cc: Jan Beulich <jbeul...@suse.com>
>> ---
>> This is a bugfix, and should be backported to 4.8.
>> ---
>> The proper solution would mean calling deinit_pdata() when removing a pCPU 
>> from
>> cpupool0, as well as a bit more of code reshuffling.
>>
>> And, although worth doing, it certainly will take more work, more time, and
>> will probably be hard (well, surely harder than this) to backport.
>>
>> Therefore, I'd argue in favor of both taking and backporting this change, 
>> which
>> at least enables using Credit2 as default scheduler without risking a crash
>> when creating a second cpupool.
>>
>> Afterwards, a proper solution would be proposed for Xen 4.9.
>>
>> Finally, given the wide number of issues similar to this that I've found and
>> fixed in the last release cycle, I think it would be good to take a stab at
>> whether the interface between cpupools and the schedulers could not be
>> simplified. :-O
>
> The first patch version for support of cpupools I sent used an own
> scheduler instance for the free cpus. Keir merged this instance with
> the one for Pool-0.

You mean he just made the change unilaterally without posting it to
the list?  I just went back and skimmed through the old cpupools
threads and didn't see this discussed.

Having a "cpupool-remove" operation that doesn't actually remove the
cpu from the pool is a bit mad...

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Reply via email to