> On 26 Jun 2016, at 14:10, Boqun Feng <boqun.f...@gmail.com> wrote:
> 
> On Sun, Jun 26, 2016 at 01:21:04PM +0800, panxinhui wrote:
>> 
>>> On 26 Jun 2016, at 03:20, Peter Zijlstra <pet...@infradead.org> wrote:
>>> 
>>> On Sun, Jun 26, 2016 at 01:27:56AM +0800, panxinhui wrote:
>>>>>> Would that not have issues where the owner cpu is kept running but the
>>>>>> spinner (ie. _this_ vcpu) gets preempted? I would think that in that
>>>>>> case we too want to stop spinning.
>>>>>> 
>>>>> 
>>>> Do you mean that the spinner detects that it has itself yielded out
>>>> during the big spin loop?
>>>> 
>>>> That is very possible to happen.  BUT if the spinner (on this vcpu)
>>>> yields out, the next spinner would break the spin loop.  AND if the
>>>> spinner detects that it has yielded out once, it is very likely to get
>>>> the osq lock soon, as long as the owner vcpu is running.
>>>> 
>>>> SO I think we only need to check the owner vcpu's yield_count.
>>> 
>>> I had a quick look at KVM and it looks like it only has
>>> kvm_cpu::preempted, which would suggest the interface boqun proposed.
>>> 
>>> We'll have to look at many of the other virt platforms as well to see
>>> what they can do.
>>> 
>>> We could also provide _both_ interfaces and a platform can implement
>>> whichever variant (or both) it can.
>>> 
>> The KVM code on ppc has already implemented yield_count.
>> That makes me feel a little relieved. :)
>> 
>> It looks like we could introduce an interface like the one below.
>> 
>> bool vcpu_is_preempted(int cpu)
>> {
>>      return arch_vcpu_is_preempted(cpu);
>> }
>> 
>> #ifdef arch_vcpu_has_yield_count
>> bool vcpu_has_preempted_once(int cpu, unsigned int yield_count)
>> {
>> 	return arch_get_vcpu_yield_count(cpu) != yield_count;
>> }
>> 
>> #else
>> bool vcpu_has_preempted_once(int cpu, unsigned int yield_count)
>> {
>> 	/* just let the caller know it has been preempted */
>> 	return vcpu_is_preempted(cpu);
>> }
>> #endif
>> 
> 
> Unfortunately, on PPC, we compile pseries code along with powernv code
> into the same kernel binary; therefore we need to wire up the proper
> primitives at runtime.
> 
No, we can actually use the same vcpu-preempted check everywhere,
just like arch_spin_lock() does:

	if (SHARED_PROCESSOR)
		__spin_yield(lock);

so we can have:

u32 arch_get_vcpu_yield_count(int cpu)
{
	if (SHARED_PROCESSOR)
		return be32_to_cpu(lppaca_of(cpu).yield_count);
	return 0;
}

and all of this can then be done inline.
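
As a sketch of how a spinner might use that (untested; try_lock(),
struct lock and spin_on_owner() are all made-up names here):

/*
 * Sketch only: spin until try_lock() succeeds, but give up once the
 * owner vcpu's yield_count changes, i.e. the owner vcpu has been
 * scheduled out at least once since we took the snapshot.
 */
bool spin_on_owner(struct lock *lock, int owner_cpu)
{
	u32 yc = arch_get_vcpu_yield_count(owner_cpu);

	while (!try_lock(lock)) {
		if (arch_get_vcpu_yield_count(owner_cpu) != yc)
			return false;	/* owner vcpu was preempted */
		cpu_relax();
	}
	return true;
}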

> I've cooked up the following patch. Please note I haven't tested it;
> this is just a POC.
> 
> Regards,
> Boqun
> 
> ---------------->8
> virt: Introduce vcpu preemption detection functions
> 
> Lock holder preemption is a serious problem for locking in virtualized
> environments. Different archs and hypervisors employ different ways to
> try to solve the problem in different locking mechanisms. And recently
> Pan Xinhui found a significant performance drop in osq_lock(), which
> could be solved by providing a mechanism to detect vcpu preemption.
> 
> Therefore this patch introduces several vcpu preemption detection
> primitives, which allow locking primitives to avoid the cost of
> unnecessary spinning. These primitives act as an abstraction layer
> between locking functions and arch or hypervisor related code.
> 
> There are two sets of primitives:
> 
> 1.    vcpu_preempt_count() and vcpu_has_preempted(): these must be used
>       as a pair in the same preempt-disabled critical section, and they
>       detect whether a vcpu preemption happened between them.
> 
> 2.    vcpu_is_preempted(): used to check whether another cpu's vcpu is
>       preempted.
> 
> This patch also implements those primitives on pseries and wires them up.
> 
> Signed-off-by: Boqun Feng <boqun.f...@gmail.com>
> ---
> arch/powerpc/platforms/pseries/Kconfig |  1 +
> arch/powerpc/platforms/pseries/setup.c | 28 +++++++++++
> include/linux/vcpu_preempt.h           | 90 ++++++++++++++++++++++++++++++++++
> 3 files changed, 119 insertions(+)
> create mode 100644 include/linux/vcpu_preempt.h
> 
> diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig
> index bec90fb30425..7d24c3e48878 100644
> --- a/arch/powerpc/platforms/pseries/Kconfig
> +++ b/arch/powerpc/platforms/pseries/Kconfig
> @@ -21,6 +21,7 @@ config PPC_PSERIES
>       select HOTPLUG_CPU if SMP
>       select ARCH_RANDOM
>       select PPC_DOORBELL
> +     select HAS_VCPU_PREEMPTION_DETECTION
>       default y
> 
There are already too many config options there…
As I commented above, we could make these things inline, so it looks like
we don't need to add such a config option.
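
Something like the following could live in an arch header (a sketch only,
reusing the check from your pseries_vcpu_is_preempted() above):

/* sketch: arch-provided inline, compiled into both pseries and powernv */
static inline bool arch_vcpu_is_preempted(int cpu)
{
	if (SHARED_PROCESSOR)
		return !!(be32_to_cpu(lppaca_of(cpu).yield_count) & 1);
	return false;	/* dedicated processor: never preempted */
}

(If my understanding is right, the & 1 works because an odd yield_count
means the vcpu is currently scheduled out.)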
 
> config PPC_SPLPAR
> diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
> index 9883bc7ea007..5d4aed54e039 100644
> --- a/arch/powerpc/platforms/pseries/setup.c
> +++ b/arch/powerpc/platforms/pseries/setup.c
> @@ -42,6 +42,7 @@
> #include <linux/of.h>
> #include <linux/of_pci.h>
> #include <linux/kexec.h>
> +#include <linux/vcpu_preempt.h>
> 
> #include <asm/mmu.h>
> #include <asm/processor.h>
> @@ -501,6 +502,31 @@ static void __init find_and_init_phbs(void)
>       of_pci_check_probe_only();
> }
> 
> +struct vcpu_preempt_ops vcpu_preempt_ops = DEFAULT_VCPU_PREEMPT_OPS;
> +EXPORT_SYMBOL(vcpu_preempt_ops);
> +
> +static long pseries_vcpu_preempt_count(void)
> +{
> +     return be32_to_cpu(get_lppaca()->yield_count);
> +}
> +
> +static bool pseries_vcpu_is_preempted(int cpu)
> +{
> +     return !!(be32_to_cpu(lppaca_of(cpu).yield_count) & 1);
> +}
> +
> +static bool pseries_vcpu_has_preempted(long vpc)
> +{
> +     return pseries_vcpu_preempt_count() != vpc;
> +}
> +
> +static void __init pseries_setup_vcpu_preempt_ops(void)
> +{
> +     vcpu_preempt_ops.preempt_count = pseries_vcpu_preempt_count;
> +     vcpu_preempt_ops.is_preempted = pseries_vcpu_is_preempted;
> +     vcpu_preempt_ops.has_preempted = pseries_vcpu_has_preempted;
> +}
> +
> static void __init pSeries_setup_arch(void)
> {
>       set_arch_panic_timeout(10, ARCH_PANIC_TIMEOUT);
> @@ -549,6 +575,8 @@ static void __init pSeries_setup_arch(void)
>                               "%ld\n", rc);
>               }
>       }
> +
> +     pseries_setup_vcpu_preempt_ops();
> }
> 
> static int __init pSeries_init_panel(void)
> diff --git a/include/linux/vcpu_preempt.h b/include/linux/vcpu_preempt.h
> new file mode 100644
> index 000000000000..4a88414bb83f
> --- /dev/null
> +++ b/include/linux/vcpu_preempt.h

How about adding them to sched.h?
I just think they are similar. Correct me if I am mistaken. :)

thanks
xinhui
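
Also, to check my understanding of the intended usage: a caller would pair
the two primitives like this, right? (a sketch; try_lock() and lock are
stand-ins)

	long vpc;

	preempt_disable();
	vpc = vcpu_preempt_count();	/* snapshot our vcpu's count */
	while (!try_lock(lock)) {
		/* our vcpu was scheduled out at least once: stop spinning */
		if (vcpu_has_preempted(vpc))
			break;
		cpu_relax();
	}
	preempt_enable();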

> @@ -0,0 +1,90 @@
> +/*
> + * Primitives for checking the vcpu preemption from the guest.
> + */
> +
> +static inline long __vcpu_preempt_count(void)
> +{
> +     return 0;
> +}
> +
> +static inline bool __vcpu_has_preempted(long vpc)
> +{
> +     return false;
> +}
> +
> +static inline bool __vcpu_is_preempted(int cpu)
> +{
> +     return false;
> +}
> +
> +struct vcpu_preempt_ops {
> +     /*
> +      * Get the current vcpu's "preempt count", which is going to be used
> +      * for checking whether the current vcpu has ever been preempted.
> +      */
> +     long (*preempt_count)(void);
> +
> +     /*
> +      * Return whether a vcpu is preempted
> +      */
> +     bool (*is_preempted)(int cpu);
> +
> +     /*
> +      * Given a "vcpu preempt count", return whether a vcpu preemption ever
> +      * happened after the .preempt_count() was called.
> +      */
> +     bool (*has_preempted)(long vpc);
> +};
> +
> +extern struct vcpu_preempt_ops vcpu_preempt_ops;
> +
> +/* Default boilerplate */
> +#define DEFAULT_VCPU_PREEMPT_OPS                     \
> +     {                                               \
> +             .preempt_count = __vcpu_preempt_count,  \
> +             .is_preempted = __vcpu_is_preempted,    \
> +             .has_preempted = __vcpu_has_preempted   \
> +     }
> +
> +#ifdef CONFIG_HAS_VCPU_PREEMPTION_DETECTION
> +/*
> + * vcpu_preempt_count: Get the current cpu's "vcpu preempt count" (vpc).
> + *
> + * The vpc is used for checking whether the current vcpu has ever been
> + * preempted via vcpu_has_preempted().
> + *
> + * This function and vcpu_has_preempted() should be called in the same
> + * preemption disabled critical section.
> + */
> +static inline long vcpu_preempt_count(void)
> +{
> +     return vcpu_preempt_ops.preempt_count();
> +}
> +
> +/*
> + * vcpu_is_preempted: Check whether @cpu's vcpu is preempted.
> + */
> +static inline bool vcpu_is_preempted(int cpu)
> +{
> +     return vcpu_preempt_ops.is_preempted(cpu);
> +}
> +
> +/*
> + * vcpu_has_preempted: Check whether the current cpu's vcpu has ever been
> + * preempted.
> + *
> + * The checked duration is between the call to vcpu_preempt_count() that
> + * returned @vpc and the call to this function.
> + *
> + * This function and the corresponding vcpu_preempt_count() should be called
> + * in the same preemption disabled critical section.
> + */
> +static inline bool vcpu_has_preempted(long vpc)
> +{
> +     return vcpu_preempt_ops.has_preempted(vpc);
> +}
> +#else /* CONFIG_HAS_VCPU_PREEMPTION_DETECTION */
> +#define vcpu_preempt_count() __vcpu_preempt_count()
> +#define vcpu_is_preempted(cpu) __vcpu_is_preempted(cpu)
> +#define vcpu_has_preempted(vpc) __vcpu_has_preempted(vpc)
> +#endif /* CONFIG_HAS_VCPU_PREEMPTION_DETECTION */
> -- 
> 2.9.0
> 
