Re: [PATCH] kvm: rename HINTS_DEDICATED to KVM_HINTS_REALTIME

2018-05-18 Thread Paolo Bonzini
On 18/05/2018 19:13, Eduardo Habkost wrote:
>> As much as we'd like to be helpful and validate input, you need a real
>> time host too. I'm not sure how we'd find out - I suggest we do not
>> bother for now.
> I'm worried that people will start enabling the flag in all kinds
> of scenarios where the guarantees can't be kept, and make the
> meaning of the flag in practice completely different from its
> documented meaning.

I don't think we should try to detect anything.  As far as QEMU is
concerned, it's mostly garbage in, garbage out when it comes to invalid
configurations.  It's just a bit, and using it in invalid configurations
is okay if you're doing it (for example) for debugging.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm: rename HINTS_DEDICATED to KVM_HINTS_REALTIME

2018-05-18 Thread Paolo Bonzini
On 18/05/2018 18:04, Eduardo Habkost wrote:
>> Without mlock you should always use pv spinlocks.
>>
>> Otherwise you risk blocking on a lock taken by
>> a VCPU that is in turn blocked on IO, where the IO
>> is not completing because CPU is being used up
>> spinning.
>
> So the stronger guarantee seems necessary.
> 
> Now what should host userspace do if the user is trying to run an
> existing configuration where the CPUID hint was set but memory is
> not pinned?

As mentioned elsewhere in the thread, there are many ways to pin memory,
and mlock is not always necessary.  However, I agree with Michael in
making the hint provide a stronger guarantee.

Paolo


Re: [PATCH] kvm: rename HINTS_DEDICATED to KVM_HINTS_REALTIME

2018-05-18 Thread Paolo Bonzini
On 17/05/2018 20:46, Eduardo Habkost wrote:
> My understanding of the original patch is that the intention is
> to tell the guest that it is very unlikely to be preempted, so it
> can choose a more appropriate spinlock implementation.  This
> description implies that the guest will never be preempted, which
> is a much stronger guarantee.
> 
> Isn't this new description incompatible with existing usage of
> the hint, which might include people who just use vCPU pinning
> but no mlock?

If you use hugetlbfs and vhost-user you don't really need mlock for the
QEMU process, do you?  The QEMU process is not doing much in that case
and hugetlbfs gives you pinned memory automatically.

Paolo


Re: [PATCH] kvm: rename HINTS_DEDICATED to KVM_HINTS_REALTIME

2018-05-17 Thread Paolo Bonzini
On 17/05/2018 16:54, Michael S. Tsirkin wrote:
> HINTS_DEDICATED seems to be somewhat confusing:
> 
> Guest doesn't really care whether it's the only task running on a host
> CPU as long as it's not preempted.
> 
> And there are more reasons for Guest to be preempted than host CPU
> sharing, for example, with memory overcommit it can get preempted on a
> memory access, post copy migration can cause preemption, etc.
> 
> Let's call it KVM_HINTS_REALTIME which seems to better
> match what guests expect.
> 
> Also, the flag must be set on all vCPUs - current guests assume this.
> Note so in the documentation.
> 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  Documentation/virtual/kvm/cpuid.txt  | 6 +++---
>  arch/x86/include/uapi/asm/kvm_para.h | 2 +-
>  arch/x86/kernel/kvm.c                | 8 ++++----
>  3 files changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt
> index d4f33eb8..ab022dc 100644
> --- a/Documentation/virtual/kvm/cpuid.txt
> +++ b/Documentation/virtual/kvm/cpuid.txt
> @@ -72,8 +72,8 @@ KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||    24 || host will warn if no guest-side
>  
>  flag                               || value || meaning
>  ==============================================================================
> -KVM_HINTS_DEDICATED                ||     0 || guest checks this feature bit to
> -                                   ||       || determine if there is vCPU pinning
> -                                   ||       || and there is no vCPU over-commitment,
> +KVM_HINTS_REALTIME                 ||     0 || guest checks this feature bit to
> +                                   ||       || determine that vCPUs are never
> +                                   ||       || preempted for an unlimited time,
>                                     ||       || allowing optimizations
>  ------------------------------------------------------------------------------
> diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
> index 4c851eb..0ede697 100644
> --- a/arch/x86/include/uapi/asm/kvm_para.h
> +++ b/arch/x86/include/uapi/asm/kvm_para.h
> @@ -29,7 +29,7 @@
>  #define KVM_FEATURE_PV_TLB_FLUSH 9
>  #define KVM_FEATURE_ASYNC_PF_VMEXIT  10
>  
> -#define KVM_HINTS_DEDICATED  0
> +#define KVM_HINTS_REALTIME  0
>  
>  /* The last 8 bits are used to indicate how to interpret the flags field
>   * in pvclock structure. If no bits are set, all flags are ignored.
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index 7867417..5b2300b 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -457,7 +457,7 @@ static void __init sev_map_percpu_data(void)
>  static void __init kvm_smp_prepare_cpus(unsigned int max_cpus)
>  {
>   native_smp_prepare_cpus(max_cpus);
> - if (kvm_para_has_hint(KVM_HINTS_DEDICATED))
> + if (kvm_para_has_hint(KVM_HINTS_REALTIME))
>   static_branch_disable(&virt_spin_lock_key);
>  }
>  
> @@ -553,7 +553,7 @@ static void __init kvm_guest_init(void)
>   }
>  
>   if (kvm_para_has_feature(KVM_FEATURE_PV_TLB_FLUSH) &&
> - !kvm_para_has_hint(KVM_HINTS_DEDICATED) &&
> + !kvm_para_has_hint(KVM_HINTS_REALTIME) &&
>   kvm_para_has_feature(KVM_FEATURE_STEAL_TIME))
>   pv_mmu_ops.flush_tlb_others = kvm_flush_tlb_others;
>  
> @@ -649,7 +649,7 @@ static __init int kvm_setup_pv_tlb_flush(void)
>   int cpu;
>  
>   if (kvm_para_has_feature(KVM_FEATURE_PV_TLB_FLUSH) &&
> - !kvm_para_has_hint(KVM_HINTS_DEDICATED) &&
> + !kvm_para_has_hint(KVM_HINTS_REALTIME) &&
>   kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)) {
>   for_each_possible_cpu(cpu) {
>   zalloc_cpumask_var_node(per_cpu_ptr(&__pv_tlb_mask, cpu),
> @@ -745,7 +745,7 @@ void __init kvm_spinlock_init(void)
>   if (!kvm_para_has_feature(KVM_FEATURE_PV_UNHALT))
>   return;
>  
> - if (kvm_para_has_hint(KVM_HINTS_DEDICATED))
> + if (kvm_para_has_hint(KVM_HINTS_REALTIME))
>   return;
>  
>   __pv_init_lock_hash();
> 

Queued, thanks.
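
For readers following the rename, here is a minimal userspace sketch of the guest-side decision made in kvm_spinlock_init() in the patch above. It assumes the two 32-bit words have already been read from CPUID leaf 0x40000001 (features in EAX, hints in EDX); the helper names cpuid_bit_is_set() and use_pv_spinlocks() are made up for this illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as defined in arch/x86/include/uapi/asm/kvm_para.h. */
#define KVM_FEATURE_PV_UNHALT 7
#define KVM_HINTS_REALTIME    0

/* Hypothetical helper: test one bit of a 32-bit CPUID register value. */
static bool cpuid_bit_is_set(uint32_t reg, unsigned int bit)
{
	assert(bit < 32);
	return (reg >> bit) & 1u;
}

/*
 * Hypothetical helper mirroring the two early returns in
 * kvm_spinlock_init(): PV spinlocks are only set up when PV_UNHALT is
 * offered and the host has not set the REALTIME hint.
 */
static bool use_pv_spinlocks(uint32_t features_eax, uint32_t hints_edx)
{
	if (!cpuid_bit_is_set(features_eax, KVM_FEATURE_PV_UNHALT))
		return false;
	if (cpuid_bit_is_set(hints_edx, KVM_HINTS_REALTIME))
		return false;
	return true;
}
```

The hint is only an input to the guest's choice; as discussed elsewhere in the thread, nothing here validates that the host can actually keep the promise.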

Paolo


Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set

2017-11-10 Thread Paolo Bonzini
On 10/11/2017 07:07, Wanpeng Li wrote:
>>> You should also add a cpuid flag in kvm part.
>> It is better without that.  The flag has no dependency on KVM (kernel
>> hypervisor) code.
> Do you mean -cpu host, +,I think it will result in "warning: host
> doesn't support requested feature: CPUID.4001H:eax.XX"

There are some exceptions where QEMU overrides the values of
KVM_GET_SUPPORTED_CPUID.

I think it is better to add the flag to KVM *and* to QEMU's override in
kvm_arch_get_supported_cpuid (target/i386/kvm.c).

Paolo


Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set

2017-11-07 Thread Paolo Bonzini
On 07/11/2017 13:39, Eduardo Valentin wrote:
>> is this still needed after Waiman's patch to adaptively switch between
>> tas and pvqspinlock?
> Can you please point me to it ? Is it already in tip/master?
> 

No, he just posted it:

https://marc.info/?l=linux-kernel&m=150972337909996&w=2

Paolo


Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set

2017-11-07 Thread Paolo Bonzini
On 06/11/2017 21:26, Eduardo Valentin wrote:
> Currently, the existing qspinlock implementation will fallback to
> test-and-set if the hypervisor has not set the PV_UNHALT flag.
> 
> This patch gives the opportunity to guest kernels to select
> between test-and-set and the regular queue fair lock implementation
> based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
> flag is not set, the code will still fall back to test-and-set,
> but when the PV_DEDICATED flag is set, the code will use
> the regular queue spinlock implementation.
> 
> With this patch, when in autoselect mode, the guest will
> use the default spinlock implementation based on host feature
> flags as follows:
> 
> PV_DEDICATED = 1, PV_UNHALT = anything: default is qspinlock
> PV_DEDICATED = 0, PV_UNHALT = 1: default is pvqspinlock
> PV_DEDICATED = 0, PV_UNHALT = 0: default is tas

Hi Eduardo,

besides the suggestion to use a different word from the one used for features,
is this still needed after Waiman's patch to adaptively switch between
tas and pvqspinlock?

Paolo

> Cc: Paolo Bonzini <pbonz...@redhat.com>
> Cc: "Radim Krčmář" <rkrc...@redhat.com>
> Cc: Jonathan Corbet <cor...@lwn.net>
> Cc: Thomas Gleixner <t...@linutronix.de>
> Cc: Ingo Molnar <mi...@redhat.com>
> Cc: "H. Peter Anvin" <h...@zytor.com>
> Cc: x...@kernel.org
> Cc: Peter Zijlstra <pet...@infradead.org>
> Cc: Waiman Long <long...@redhat.com>
> Cc: k...@vger.kernel.org
> Cc: linux-doc@vger.kernel.org
> Cc: linux-ker...@vger.kernel.org
> Cc: Jan H. Schoenherr <jscho...@amazon.de>
> Cc: Anthony Liguori <aligu...@amazon.com>
> Suggested-by: Matt Wilson <m...@amazon.com>
> Signed-off-by: Eduardo Valentin <edu...@amazon.com>
> ---
> V3:
>  - When PV_DEDICATED is set (1), qspinlock is selected,
>    regardless of the value of PV_UNHALT. Suggested by Paolo Bonzini.
>  - Refreshed on top of tip/master.
> V2:
>  - rebase on top of tip/master
> 
>  Documentation/virtual/kvm/cpuid.txt  | 6 ++++++
>  arch/x86/include/asm/qspinlock.h     | 4 ++++
>  arch/x86/include/uapi/asm/kvm_para.h | 1 +
>  arch/x86/kernel/kvm.c                | 2 ++
>  4 files changed, 13 insertions(+)
> 
> diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt
> index 3c65feb..117066a 100644
> --- a/Documentation/virtual/kvm/cpuid.txt
> +++ b/Documentation/virtual/kvm/cpuid.txt
> @@ -54,6 +54,12 @@ KVM_FEATURE_PV_UNHALT              ||     7 || guest checks this feature bit
>                                     ||       || before enabling paravirtualized
>                                     ||       || spinlock support.
>  ------------------------------------------------------------------------------
> +KVM_FEATURE_PV_DEDICATED           ||     8 || guest checks this feature bit
> +                                   ||       || to determine if they run on
> +                                   ||       || dedicated vCPUs, allowing opti-
> +                                   ||       || mizations such as usage of
> +                                   ||       || qspinlocks.
> +------------------------------------------------------------------------------
>  KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||    24 || host will warn if no guest-side
>                                     ||       || per-cpu warps are expected in
>                                     ||       || kvmclock.
> diff --git a/arch/x86/include/asm/qspinlock.h b/arch/x86/include/asm/qspinlock.h
> index 5e16b5d..de42694 100644
> --- a/arch/x86/include/asm/qspinlock.h
> +++ b/arch/x86/include/asm/qspinlock.h
> @@ -3,6 +3,8 @@
>  #define _ASM_X86_QSPINLOCK_H
>  
>  #include 
> +#include 
> +
>  #include 
>  #include 
>  #include 
> @@ -58,6 +60,8 @@ static inline bool virt_spin_lock(struct qspinlock *lock)
>   if (!static_branch_likely(&virt_spin_lock_key))
>   return false;
>  
> + if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
> + return false;
>   /*
>* On hypervisors without PARAVIRT_SPINLOCKS support we fall
>* back to a Test-and-Set spinlock, because fair locks have
> diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
> index 554aa8f..85a9875 100644
> --- a/arch/x86/include/uapi/asm/kvm_para.h
> +++ b/arch/x86/include/uapi/asm/kvm_para.h
> @@ -25,6 +25,7 @@
>  #define KVM_FEATURE_STEAL_TIME   5
>  #define KVM_FEATURE_PV_EOI   6
>  #define KVM_FEATURE_PV_UNHALT7
> +#define KVM_FEATURE_PV_DEDICATED 8
>  
>  /* The last 8 bits are used to indicate how 

Re: [PATCHv2 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set

2017-11-03 Thread Paolo Bonzini
On 02/11/2017 19:43, Eduardo Valentin wrote:
> On Thu, Nov 02, 2017 at 07:24:16PM +0100, Paolo Bonzini wrote:
>> On 02/11/2017 19:08, Eduardo Valentin wrote:
>>> On Thu, Nov 02, 2017 at 06:56:46PM +0100, Paolo Bonzini wrote:
>>>> On 02/11/2017 18:45, Eduardo Valentin wrote:
>>>>> Currently, the existing qspinlock implementation will fallback to
>>>>> test-and-set if the hypervisor has not set the PV_UNHALT flag.
>>>>>
>>>>> This patch gives the opportunity to guest kernels to select
>>>>> between test-and-set and the regular queue fair lock implementation
>>>>> based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
>>>>> flag is not set, the code will still fall back to test-and-set,
>>>>> but when the PV_DEDICATED flag is set, the code will use
>>>>> the regular queue spinlock implementation.
>>>>
>>>> Have you seen Waiman's series that lets you specify this on the guest
>>>> command line instead?  Would this be acceptable for your use case?
>>>
>>> No, can you please share a link to it? is it already merged to tip/master?
>>
>> [PATCH-tip v2 0/2] x86/paravirt: Enable users to choose PV lock type
>> https://lkml.org/lkml/2017/11/1/655
>>
>>>> (In other words, is there a difference for you between making the host
>>>> vs. guest administrator toggle the feature?  "@amazon.com" means you are
>>>> the host admin, how would you use it?)
>>>
>>> The way I think of this is this is a flag set by host side so the
>>> guest adapts accordingly.
>>>
>>> If the admin in guest side wants to ignore what the host is
>>> flagging, that is a different story.
>>
>> Okay, this makes sense.  But perhaps it should be a separate CPUID leaf,
>> such as "configuration hints", rather than properly a feature.
> 
> Oh OK, you don't think this starts to deviate from the feature concept.
> But would the PV_UNHALT also go to "configuration hints" bucket?

PV_UNHALT says whether the pvqspinlock API is available, PV_DEDICATED
says whether it should be used.

> Another way to see this is we have three locking feature options to
> select from, so we need at least two bits here.

PV_DEDICATED = 1, PV_UNHALT = anything: default is qspinlock
PV_DEDICATED = 0, PV_UNHALT = 1: default is pvqspinlock
PV_DEDICATED = 0, PV_UNHALT = 0: default is tas
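
The three defaults listed above can be written down as a small decision function. This is only a sketch of the proposed policy in plain C; the enum and function names are invented for the example and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative names only; not kernel identifiers. */
enum lock_type { LOCK_TAS, LOCK_PVQSPINLOCK, LOCK_QSPINLOCK };

/* Default lock choice in autoselect mode, per the table above. */
static enum lock_type default_lock_type(bool pv_dedicated, bool pv_unhalt)
{
	if (pv_dedicated)
		return LOCK_QSPINLOCK;	/* PV_UNHALT does not matter here */
	return pv_unhalt ? LOCK_PVQSPINLOCK : LOCK_TAS;
}

/* Human-readable name, e.g. for a boot-time log message. */
static const char *lock_type_name(enum lock_type type)
{
	static const char *const names[] = { "tas", "pvqspinlock", "qspinlock" };

	assert((unsigned int)type < sizeof(names) / sizeof(names[0]));
	return names[type];
}
```

Checking PV_DEDICATED first is what makes PV_UNHALT "anything" in the first row: the second bit is only consulted when the host makes no dedication promise.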

What do you think?

Paolo


Re: [PATCHv2 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set

2017-11-02 Thread Paolo Bonzini
On 02/11/2017 19:08, Eduardo Valentin wrote:
> On Thu, Nov 02, 2017 at 06:56:46PM +0100, Paolo Bonzini wrote:
>> On 02/11/2017 18:45, Eduardo Valentin wrote:
>>> Currently, the existing qspinlock implementation will fallback to
>>> test-and-set if the hypervisor has not set the PV_UNHALT flag.
>>>
>>> This patch gives the opportunity to guest kernels to select
>>> between test-and-set and the regular queue fair lock implementation
>>> based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
>>> flag is not set, the code will still fall back to test-and-set,
>>> but when the PV_DEDICATED flag is set, the code will use
>>> the regular queue spinlock implementation.
>>
>> Have you seen Waiman's series that lets you specify this on the guest
>> command line instead?  Would this be acceptable for your use case?
> 
> No, can you please share a link to it? is it already merged to tip/master?

[PATCH-tip v2 0/2] x86/paravirt: Enable users to choose PV lock type
https://lkml.org/lkml/2017/11/1/655

>> (In other words, is there a difference for you between making the host
>> vs. guest administrator toggle the feature?  "@amazon.com" means you are
>> the host admin, how would you use it?)
> 
> The way I think of this is this is a flag set by host side so the
> guest adapts accordingly.
> 
> If the admin in guest side wants to ignore what the host is
> flagging, that is a different story.

Okay, this makes sense.  But perhaps it should be a separate CPUID leaf,
such as "configuration hints", rather than properly a feature.

Paolo


Re: [PATCHv2 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set

2017-11-02 Thread Paolo Bonzini
On 02/11/2017 18:45, Eduardo Valentin wrote:
> Currently, the existing qspinlock implementation will fallback to
> test-and-set if the hypervisor has not set the PV_UNHALT flag.
> 
> This patch gives the opportunity to guest kernels to select
> between test-and-set and the regular queue fair lock implementation
> based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
> flag is not set, the code will still fall back to test-and-set,
> but when the PV_DEDICATED flag is set, the code will use
> the regular queue spinlock implementation.

Have you seen Waiman's series that lets you specify this on the guest
command line instead?  Would this be acceptable for your use case?

(In other words, is there a difference for you between making the host
vs. guest administrator toggle the feature?  "@amazon.com" means you are
the host admin, how would you use it?)

Thanks,

Paolo

> Cc: Paolo Bonzini <pbonz...@redhat.com>
> Cc: "Radim Krčmář" <rkrc...@redhat.com>
> Cc: Jonathan Corbet <cor...@lwn.net>
> Cc: Thomas Gleixner <t...@linutronix.de>
> Cc: Ingo Molnar <mi...@redhat.com>
> Cc: "H. Peter Anvin" <h...@zytor.com>
> Cc: x...@kernel.org
> Cc: Peter Zijlstra <pet...@infradead.org>
> Cc: Waiman Long <long...@redhat.com>
> Cc: k...@vger.kernel.org
> Cc: linux-doc@vger.kernel.org
> Cc: linux-ker...@vger.kernel.org
> Cc: Jan H. Schoenherr <jscho...@amazon.de>
> Cc: Anthony Liguori <aligu...@amazon.com>
> Suggested-by: Matt Wilson <m...@amazon.com>
> Signed-off-by: Eduardo Valentin <edu...@amazon.com>
> ---
> V2:
>  - rebase on top of tip/master
> 
>  Documentation/virtual/kvm/cpuid.txt  | 6 ++++++
>  arch/x86/include/asm/qspinlock.h     | 4 ++++
>  arch/x86/include/uapi/asm/kvm_para.h | 1 +
>  3 files changed, 11 insertions(+)
> 
> diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt
> index 3c65feb..117066a 100644
> --- a/Documentation/virtual/kvm/cpuid.txt
> +++ b/Documentation/virtual/kvm/cpuid.txt
> @@ -54,6 +54,12 @@ KVM_FEATURE_PV_UNHALT              ||     7 || guest checks this feature bit
>                                     ||       || before enabling paravirtualized
>                                     ||       || spinlock support.
>  ------------------------------------------------------------------------------
> +KVM_FEATURE_PV_DEDICATED           ||     8 || guest checks this feature bit
> +                                   ||       || to determine if they run on
> +                                   ||       || dedicated vCPUs, allowing opti-
> +                                   ||       || mizations such as usage of
> +                                   ||       || qspinlocks.
> +------------------------------------------------------------------------------
>  KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||    24 || host will warn if no guest-side
>                                     ||       || per-cpu warps are expected in
>                                     ||       || kvmclock.
> diff --git a/arch/x86/include/asm/qspinlock.h b/arch/x86/include/asm/qspinlock.h
> index 308dfd0..3751898 100644
> --- a/arch/x86/include/asm/qspinlock.h
> +++ b/arch/x86/include/asm/qspinlock.h
> @@ -2,6 +2,8 @@
>  #define _ASM_X86_QSPINLOCK_H
>  
>  #include 
> +#include 
> +
>  #include 
>  #include 
>  #include 
> @@ -57,6 +59,8 @@ static inline bool virt_spin_lock(struct qspinlock *lock)
>   if (!static_branch_likely(&virt_spin_lock_key))
>   return false;
>  
> + if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
> + return false;
>   /*
>* On hypervisors without PARAVIRT_SPINLOCKS support we fall
>* back to a Test-and-Set spinlock, because fair locks have
> diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
> index a965e5b0..d151300 100644
> --- a/arch/x86/include/uapi/asm/kvm_para.h
> +++ b/arch/x86/include/uapi/asm/kvm_para.h
> @@ -24,6 +24,7 @@
>  #define KVM_FEATURE_STEAL_TIME   5
>  #define KVM_FEATURE_PV_EOI   6
>  #define KVM_FEATURE_PV_UNHALT7
> +#define KVM_FEATURE_PV_DEDICATED 8
>  
>  /* The last 8 bits are used to indicate how to interpret the flags field
>   * in pvclock structure. If no bits are set, all flags are ignored.
> 



Re: [RFC PATCH V2] kvm: x86: reduce rtc 0x70 access vm-exit time

2017-08-14 Thread Paolo Bonzini
On 13/08/2017 21:51, Peng Hao wrote:
> Some versions of Windows guests access the RTC frequently because they
> use it as the system tick. A guest accesses the RTC like this: it writes
> the register index to port 0x70, then reads or writes data at port 0x71.
> The write to port 0x70 only sets the index and does nothing else, so we
> can use coalesced MMIO to handle this case and reduce VM-exit time.
>
> Without my patch, the VM-exit time for accessing RTC port 0x70, measured
> with perf tools (guest OS: Windows 7 64-bit):
> IO Port Access  Samples  Samples%  Time%   Min Time  Max Time  Avg time
> 0x70:POUT       86       30.99%    74.59%  9us       29us      10.75us (+- 3.41%)
>
> With my patch:
> IO Port Access  Samples  Samples%  Time%   Min Time  Max Time  Avg time
> 0x70:POUT       106      32.02%    29.47%  0us       10us      1.57us (+- 7.38%)
>
> This patch is one part of optimizing RTC port 0x70 access. The other part
> is in QEMU.
> 
> Signed-off-by: Peng Hao 

Very nice, thanks.

The new documentation file can be used later as a base for documentation
of coalesced MMIO ioctls.  Here is an edited version:


Coalesced MMIO and coalesced PIO can be used to optimize writes to
simple device registers.  Writes to a coalesced-I/O region are not
reported to userspace until the next non-coalesced I/O is issued, in a
similar fashion to write combining hardware.  In KVM, coalesced writes
are handled in the kernel without exits to userspace, and are thus
several times faster.

Examples of devices that can benefit from coalesced I/O include:

- devices whose memory is accessed with many consecutive writes, for
example the EGA/VGA video RAM.

- windowed I/O, such as the real-time clock.  The address register (port
0x70 in the RTC case) can use coalesced I/O, cutting the number of
userspace exits by half when reading or writing the RTC.
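
To make the "cutting the number of userspace exits by half" point concrete, here is a toy model of coalesced port I/O in plain C. It is not the KVM implementation: the struct, the ring size and the function names are assumptions made for this sketch, and the data port simply echoes the selected index so the effect is observable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8	/* arbitrary size for the model */

struct pio_write {
	uint16_t port;
	uint8_t val;
};

struct vm_model {
	struct pio_write ring[RING_SIZE];	/* pending coalesced writes */
	size_t pending;
	uint16_t coalesced_port;		/* port registered as coalesced */
	unsigned int userspace_exits;
	uint8_t rtc_index;			/* device state owned by "userspace" */
};

/* "Userspace": replay buffered writes in order, then handle the exit. */
static void userspace_exit(struct vm_model *vm)
{
	for (size_t i = 0; i < vm->pending; i++)
		vm->rtc_index = vm->ring[i].val;
	vm->pending = 0;
	vm->userspace_exits++;
}

/* Guest OUT: buffered without an exit if the port is coalesced. */
static void guest_out(struct vm_model *vm, uint16_t port, uint8_t val)
{
	if (port == vm->coalesced_port && vm->pending < RING_SIZE) {
		vm->ring[vm->pending].port = port;
		vm->ring[vm->pending].val = val;
		vm->pending++;
		return;
	}
	/* Ring full or non-coalesced port: take a normal exit. */
	userspace_exit(vm);
	if (port == vm->coalesced_port)
		vm->rtc_index = val;
}

/* Guest IN from a non-coalesced port: always exits, draining the ring. */
static uint8_t guest_in(struct vm_model *vm, uint16_t port)
{
	assert(port != vm->coalesced_port);
	userspace_exit(vm);
	return vm->rtc_index;	/* toy device: echo the selected index */
}
```

In this model an RTC read (OUT to 0x70, then IN from 0x71) costs one userspace exit instead of two, and the index write is still seen by the device before the data access is handled.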


Paolo


Re: [PATCH 2/2] x86/idle: use dynamic halt poll

2017-06-27 Thread Paolo Bonzini


On 27/06/2017 16:22, Radim Krčmář wrote:
> vcpu_is_preempted() on current cpu cannot return true, AFAIK.

Of course.  I must have been thinking of an older version of the
vcpu_is_preempted patch (at some point the guest was the one that set
preempted to 0).

Paolo


Re: [PATCH 2/2] x86/idle: use dynamic halt poll

2017-06-27 Thread Paolo Bonzini


On 27/06/2017 15:40, Radim Krčmář wrote:
>> ... which is not necessarily _wrong_.  It's just a different heuristic.
> Right, it's just harder to use than host's single_task_running() -- the
> VCPU calling vcpu_is_preempted() is never preempted, so we have to look
> at other VCPUs that are not halted, but still preempted.
> 
> If we see some ratio of preempted VCPUs (> 0?), then we stop polling and
> yield to the host.  Working under the assumption that there is work for
> this PCPU if other VCPUs have stuff to do.  The downside is that it
> misses information about host's topology, so it would be hard to make it
> work well.

I would just use vcpu_is_preempted on the current CPU.  From guest POV
this option is really a "f*** everyone else" setting just like
idle=poll, only a little more polite.

If we've been preempted and we were polling, there are two cases.  If an
interrupt was queued while the guest was preempted, the poll will be
treated as successful anyway.  If it wasn't, let others run---but really
that's not because the guest wants to be polite, it's to keep the
scheduler from penalizing it excessively.

So until it's preempted, I think it's okay if the guest doesn't care
about others.  You wouldn't use this option anyway in overcommitted
situations.

(I'm still not very convinced about the idea).

Paolo


Re: [PATCH 2/2] x86/idle: use dynamic halt poll

2017-06-27 Thread Paolo Bonzini


On 27/06/2017 14:23, Wanpeng Li wrote:
>>>> I have considered single_task_running() before. But since there is no
>>>> such paravirtual interface currently and i am not sure whether it is a
>>>> information leak from host if introducing such interface, so i didn't do
>>>> it. Do you mean vcpu_is_preempted can do the same thing? I check the
>>>> code and seems it only tells whether the VCPU is scheduled out or not
>>>> which cannot satisfy the needs.
>>> Can you help to answer my confusion? I have double checked the code, but
>>> still not get your point. Do you think it is necessary to introduce an
>>> paravirtual interface to expose single_task_running() to guest?
>
> I think vcpu_is_preempted is a good enough replacement.
> For example, vcpu->arch.st.steal.preempted is 0 when the vCPU is sched
> in and vmentry, then several tasks are enqueued on the same pCPU and
> waiting on cfs red-black tree, the guest should avoid to poll in this
> scenario, however, vcpu_is_preempted returns false and guest decides
> to poll.

... which is not necessarily _wrong_.  It's just a different heuristic.

In the end, the guest could run with "idle=poll" even, and there's
little the host scheduler can do about it, except treating it as a CPU
bound task.

Paolo


Re: [PATCH 2/2] x86/idle: use dynamic halt poll

2017-06-27 Thread Paolo Bonzini


On 27/06/2017 13:22, Yang Zhang wrote:
>>>
>>> Regarding the good/bad idea part, KVM's polling is made much more
>>> acceptable by single_task_running().  At least you need to integrate it
>>> with paravirtualization.  If the VM is scheduled out, you shrink the
>>> polling period.  There is already vcpu_is_preempted for this, it is used
>>> by mutexes.
>>
>> I have considered single_task_running() before. But since there is no
>> such paravirtual interface currently and i am not sure whether it is a
>> information leak from host if introducing such interface, so i didn't do
>> it. Do you mean vcpu_is_preempted can do the same thing? I check the
>> code and seems it only tells whether the VCPU is scheduled out or not
>> which cannot satisfy the needs.
> 
> Can you help to answer my confusion? I have double checked the code, but
> still not get your point. Do you think it is necessary to introduce an
> paravirtual interface to expose single_task_running() to guest?

I think vcpu_is_preempted is a good enough replacement.

Paolo


Re: [PATCH 2/2] x86/idle: use dynamic halt poll

2017-06-22 Thread Paolo Bonzini


On 22/06/2017 13:22, root wrote:
>  ==
>  
> +poll_grow: (X86 only)
> +
> +This parameter is multiplied in the grow_poll_ns() to increase the poll time.
> +By default, the value is 2.
> +
> +==
> +poll_shrink: (X86 only)
> +
> +This parameter is divided in the shrink_poll_ns() to reduce the poll time.
> +By default, the value is 2.

Even before starting the debate on whether this is a good idea or a bad
idea, KVM reduces the polling value to the minimum (10 us) by default
when polling fails.  Also, it shouldn't be bound to
CONFIG_HYPERVISOR_GUEST, since there's nothing specific to virtual
machines here.

Regarding the good/bad idea part, KVM's polling is made much more
acceptable by single_task_running().  At least you need to integrate it
with paravirtualization.  If the VM is scheduled out, you shrink the
polling period.  There is already vcpu_is_preempted for this, it is used
by mutexes.
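
A rough sketch of the kind of policy being discussed, in plain C: grow the polling window after a successful poll, and shrink it when the poll failed or the vCPU was scheduled out in the meantime. This is neither the patch's code nor KVM's; the struct, the function name, and the convention that a shrink divisor of 0 means "drop straight to the minimum" are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tunables, loosely modelled on poll_grow/poll_shrink. */
struct poll_params {
	uint64_t min_ns;	/* floor for the polling window */
	uint64_t max_ns;	/* ceiling for the polling window */
	unsigned int grow;	/* multiplier applied after a good poll */
	unsigned int shrink;	/* divisor after a bad poll; 0 = reset to min */
};

static uint64_t next_poll_ns(const struct poll_params *p, uint64_t cur_ns,
			     bool poll_succeeded, bool was_preempted)
{
	uint64_t next;

	assert(p->min_ns <= p->max_ns && p->grow >= 1);

	if (was_preempted || !poll_succeeded)
		next = p->shrink ? cur_ns / p->shrink : p->min_ns;
	else if (cur_ns > p->max_ns / p->grow)
		next = p->max_ns;	/* avoid overflow, clamp to ceiling */
	else
		next = cur_ns * p->grow;

	if (next < p->min_ns)
		next = p->min_ns;
	if (next > p->max_ns)
		next = p->max_ns;
	return next;
}
```

In a guest, the was_preempted input would have to come from something like vcpu_is_preempted(); how to sample it reliably is exactly what the rest of the thread debates.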

Paolo


Re: [PATCH] KVM: Documentation: remove VM mmap documentation

2017-04-28 Thread Paolo Bonzini


On 27/04/2017 23:57, Jonathan Corbet wrote:
> On Thu, 27 Apr 2017 15:40:42 -0600
> Jonathan Corbet  wrote:
> 
>> On Mon, 24 Apr 2017 11:16:49 +0200
>> Jann Horn  wrote:
>>
>>> Since commit 80f5b5e700fa9c ("KVM: remove vm mmap method"), the VM mmap
>>> handler is gone. Remove the corresponding documentation.  
>>
>> Applied to the docs tree, thanks.
> 
> Actually, I've unapplied it since it leads to conflicts with the kvm
> tree, and poor Stephen has already had to fix up too many of those for me
> this time around.  Paolo, maybe you'd like to pick it up and reconcile
> things?

Yes, I'll apply it for 4.12.

Paolo


Re: [PATCH v6] kvm: better MWAIT emulation for guests

2017-04-21 Thread Paolo Bonzini


On 21/04/2017 12:05, Alexander Graf wrote:
> 
> 
> On 21.04.17 12:02, Paolo Bonzini wrote:
>>
>>
>> On 12/04/2017 18:29, Michael S. Tsirkin wrote:
>>> I don't really agree we do not need the PV flag. mwait on kvm is
>>> different from mwait on bare metal in that you are heavily penalized by
>>> scheduler for polling unless you configure the host just so.
>>> HLT lets you give up the host CPU if you know you won't need
>>> it for a long time.
>>>
>>> So while many people can get by with monitor cpuid (those that isolate
>>> host CPUs) and it's a valuable option to have, I think a PV flag is also
>>> a valuable option and can be set for more configurations.
>>>
>>> Guest has an idle driver calling mwait on short waits and halt on longer
>>> ones.  I'm in fact testing an idle driver using such a PV flag and will
>>> post when ready (after vacation ~3 weeks from now probably).
>>
>> For now I think I'm removing the PV flag, making this just an
>> optimization of commit 87c00572ba05aa8c ("kvm: x86: emulate
>> monitor and mwait instructions as nop").
>>
>> We can add it for 4.13 together with the idle driver.
> 
> I think that's a perfectly reasonable approach, yes. We can always add
> the PV flag with the driver.
> 
> Thanks a lot!

Queuing the patch for 4.12.

Paolo


Re: [PATCH v6] kvm: better MWAIT emulation for guests

2017-04-21 Thread Paolo Bonzini


On 12/04/2017 18:29, Michael S. Tsirkin wrote:
> I don't really agree we do not need the PV flag. mwait on kvm is
> different from mwait on bare metal in that you are heavily penalized by
> scheduler for polling unless you configure the host just so.
> HLT lets you give up the host CPU if you know you won't need
> it for a long time.
> 
> So while many people can get by with monitor cpuid (those that isolate
> host CPUs) and it's a valuable option to have, I think a PV flag is also
> a valuable option and can be set for more configurations.
> 
> Guest has an idle driver calling mwait on short waits and halt on longer
> ones.  I'm in fact testing an idle driver using such a PV flag and will
> post when ready (after vacation ~3 weeks from now probably).

For now I think I'm removing the PV flag, making this just an
optimization of commit 87c00572ba05aa8c ("kvm: x86: emulate
monitor and mwait instructions as nop").

We can add it for 4.13 together with the idle driver.

Paolo


Re: [PATCH 11/32] KVM: MIPS: Add VZ capability

2017-03-03 Thread Paolo Bonzini


On 03/03/2017 13:37, James Hogan wrote:
> Actually I think the way I had designed KVM_CAP_MIPS_VZ is fine. I had
> defined it as an enumeration rather than a mask because it isn't
> expected you'd have more than one hardware virtualisation type able to
> run on a particular core.
> 
> Whether T&E is still supported is I think better exposed by a new
> KVM_CAP_MIPS_TE capability, indicating whether T&E is exposed when
> KVM_CAP_MIPS_VZ is also set.
> 
> It would be set to 1 on new kernels whenever T&E is supported.
> 
> For compatibility with older kernels, userland would be expected to
> determine whether T&E is present by:
> check(KVM_CAP_MIPS_VZ) == 0 || check(KVM_CAP_MIPS_TE) != 0
> 
> Old userland that doesn't check KVM_CAP_MIPS_TE would just hit an EINVAL
> from KVM_CREATE_VM if T&E isn't supported.

That's okay.
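
For reference, a minimal sketch of the compatibility rule quoted above,
written in Python for brevity.  check_extension is a hypothetical stand-in
for the KVM_CHECK_EXTENSION ioctl, and the capability names are passed as
plain strings rather than the real constants; both are assumptions made
for illustration only.

```python
def te_available(check_extension):
    """Return True if trap & emulate (T&E) can be assumed present.

    check_extension is a hypothetical callable that returns 0 for a
    capability the kernel does not know or does not support.
    """
    vz = check_extension("KVM_CAP_MIPS_VZ")
    te = check_extension("KVM_CAP_MIPS_TE")
    # No VZ at all means an older, T&E-only kernel; otherwise T&E is
    # only present if the new capability says so.
    return vz == 0 or te != 0


def make_kernel(caps):
    """Model a kernel as a table of capability values (illustrative)."""
    return lambda cap: caps.get(cap, 0)


pre_vz_kernel = make_kernel({})
vz_only_kernel = make_kernel({"KVM_CAP_MIPS_VZ": 1})
vz_and_te_kernel = make_kernel({"KVM_CAP_MIPS_VZ": 1, "KVM_CAP_MIPS_TE": 1})
```

With these three toy kernels, only vz_only_kernel reports T&E as absent,
which is the case where old userland would see EINVAL from KVM_CREATE_VM.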

Paolo


Re: [RFC PATCH v4 19/28] swiotlb: Add warnings for use of bounce buffers with SME

2017-03-02 Thread Paolo Bonzini


On 17/02/2017 17:51, Tom Lendacky wrote:
> 
> It's meant just to notify the user about the condition. The user could
> then decide to use an alternative device that supports a greater DMA
> range (I can probably change it to a dev_warn_once() so that a device
> is identified).  It would be nice if I could issue this message once per
> device that experienced this.  I didn't see anything that would do
> that, though.

dev_warn_once would print once only, not once per device.  But if you
leave the dev_warn elsewhere, this can be just pr_warn_once.
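
To make the distinction concrete, here is a minimal sketch in Python (not
kernel code).  warn_once models the single per-call-site flag behind
dev_warn_once(); warn_once_per_device is a hypothetical helper, not an
existing kernel API, that keeps one flag per device instead.

```python
class WarnLog:
    """Toy model of 'warn once' versus 'warn once per device'."""

    def __init__(self):
        self.messages = []          # everything that would reach the log
        self._warned = False        # one flag for the whole call site
        self._warned_devs = set()   # hypothetical: one flag per device

    def warn_once(self, dev, msg):
        # Models dev_warn_once(): the first caller wins, whatever the device.
        if not self._warned:
            self._warned = True
            self.messages.append(f"{dev}: {msg}")

    def warn_once_per_device(self, dev, msg):
        # Hypothetical variant: each device gets to warn exactly once.
        if dev not in self._warned_devs:
            self._warned_devs.add(dev)
            self.messages.append(f"{dev}: {msg}")
```

Calling warn_once for two different devices logs only the first one, which
is why the device name in such a message says little about which devices
are affected.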

Paolo


Re: [PATCH 11/32] KVM: MIPS: Add VZ capability

2017-03-02 Thread Paolo Bonzini


On 02/03/2017 10:36, James Hogan wrote:
>  - KVM_VM_MIPS_DEFAULT = 2
> 
>This will provide the best available KVM implementation (even on
>older kernels), preferring hardware assisted virtualization over trap
>& emulate. The KVM_CAP_MIPS_VZ capability should always be checked
>against known values to determine what type of implementation was
>chosen.
> 
> This is designed to allow the desired implementation (T&E vs VZ) to be
> potentially chosen at runtime rather than being fixed in the kernel
> configuration.

Can the same kernel run on both TE and VZ?  If not, I'm not sure that
KVM_VM_MIPS_DEFAULT is a good idea.

Paolo


Re: [PATCH 4/4] KVM: MIPS: Implement console output hypercall

2017-02-06 Thread Paolo Bonzini


On 06/02/2017 11:46, James Hogan wrote:
> Documentation/virtual/kvm/api.txt seems to suggest that
> KVM_EXIT_HYPERCALL is obsolete. When it suggests using KVM_EXIT_MMIO,
> does it simply mean the guest should use MMIO to some virtio device of
> some sort rather than using hypercalls, or that the hypercall should
> somehow be munged into the mmio exit information?

The former.

But there are cases when using hypercalls is unavoidable.  In that case
the trend is to use other exit reasons than KVM_EXIT_HYPERCALL, such as
KVM_EXIT_PAPR_HCALL in PowerPC.  Feel free to add KVM_EXIT_MIPS_CONOUT
or something like that.

How would you find the character device to write to in QEMU?

Paolo


Re: [PATCH 0/5] kernel-doc tweaks and cleanup of rST vs. non-rST backends

2017-01-23 Thread Paolo Bonzini


On 23/01/2017 14:42, Markus Heiser wrote:
> 
> On 04.01.2017 at 23:06, Jonathan Corbet <cor...@lwn.net> wrote:
> 
>> On Mon,  2 Jan 2017 16:22:22 +0100
>> Paolo Bonzini <pbonz...@redhat.com> wrote:
>>
>>> these patches are the result of my experiments with using kernel-doc
>>> for QEMU's documentation.  Patches 1 and 2 should be relatively
>>> straightforward, as they are simple bugfixes.  Patches 3 to 5, instead,
>>> are making the docbook backend (and the others too) more consistent with
>>> the input and output of the rST backend.
>>>
>>> I am not sure what is the state of the kernel-doc non-rST backends;
>>> but there are still several books using the docbook workflow, so I'm
>>> trying my luck and sending the patches anyway. :)
>>
>> I've played with them a bit, and they don't seem to break things, so I'll
>> go ahead and apply them.
> 
> Hi Paolo !
> 
> Sorry for my late reply, I'm testing patch 2:
> 
>   https://www.mail-archive.com/linux-doc@vger.kernel.org/msg08503.html
> 
> but I can't find any changes in the reST output (not even in
> include/linux/log2.h, which you mentioned). Maybe I'm a bit blind today, so
> can you give me an example where the patch takes effect?

I found this with QEMU.  You need to test with an inline function which
has attributes with arguments.

Paolo


[PATCH 4/5] kernel-doc: make member highlighting available in all backends

2017-01-02 Thread Paolo Bonzini
Note that, in order to produce the correct Docbook markup, the "." or "->"
must be separated from the member name in the regex's captured fields.  For
consistency, this change is applied to $type_member and $type_member_func
too, not just to $type_member_xml.

List mode only prints the struct name, to avoid any undesired change in
the operation of docproc.
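
The effect of splitting the captured fields can be tried outside
kernel-doc.  Below is a minimal sketch in Python (the script itself is
Perl) that transliterates the old and new $type_member patterns from this
patch.  The Docbook tags used in docbook_member are an assumption chosen
for illustration, and real input would carry an XML-escaped "->".

```python
import re

# Old pattern: the separator and the member name share one capture group.
OLD_MEMBER = re.compile(r"&([_\w]+)((\.|->)[_\w]+)")
# New pattern: struct name, separator and member name are captured apart.
NEW_MEMBER = re.compile(r"&([_\w]+)(\.|->)([_\w]+)")

old_groups = OLD_MEMBER.match("&foo->bar").groups()
new_groups = NEW_MEMBER.match("&foo->bar").groups()


def docbook_member(text):
    """Wrap struct and member names in separate (assumed) Docbook tags."""
    return NEW_MEMBER.sub(
        r"<structname>\1</structname>\2<structfield>\3</structfield>", text
    )
```

With the old pattern the member name cannot be wrapped on its own, because
it only ever appears glued to the "." or "->" in front of it.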

Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
---
 scripts/kernel-doc | 30 +++---
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/scripts/kernel-doc b/scripts/kernel-doc
index e5b5daa147ea..88c3290b6056 100755
--- a/scripts/kernel-doc
+++ b/scripts/kernel-doc
@@ -199,12 +199,12 @@ EOF
 # 'funcname()' - function
 # '$ENVVAR' - environmental variable
 # '_name' - name of a structure (up to two words including 'struct')
+# '_name.member' - name of a structure member
 # '@parameter' - name of a parameter
 # '%CONST' - name of a constant.
 
 ## init lots of data
 
-
 my $errors = 0;
 my $warnings = 0;
 my $anon_struct_union = 0;
@@ -221,7 +221,8 @@ my $type_enum_full = '\&(enum)\s*([_\w]+)';
 my $type_struct_full = '\&(struct)\s*([_\w]+)';
 my $type_typedef_full = '\&(typedef)\s*([_\w]+)';
 my $type_union_full = '\&(union)\s*([_\w]+)';
-my $type_member = '\&([_\w]+)((\.|->)[_\w]+)';
+my $type_member = '\&([_\w]+)(\.|->)([_\w]+)';
+my $type_member_xml = '\([_\w]+)(\.|-\)([_\w]+)';
 my $type_member_func = $type_member . '\(\)';
 
 # Output conversion substitutions.
@@ -233,7 +234,8 @@ my @highlights_html = (
[$type_func, "\$1"],
[$type_struct_xml, "\$1"],
[$type_env, "\$1"],
-   [$type_param, "\$1"]
+   [$type_param, "\$1"],
+   [$type_member_xml, "\$1\$2\$3"]
   );
 my $local_lt = "lt:";
 my $local_gt = "gt:";
@@ -245,7 +247,8 @@ my @highlights_html5 = (
 [$type_func, "\$1"],
 [$type_struct_xml, "\$1"],
 [$type_env, "\$1"],
-[$type_param, "\$1]"]
+[$type_param, "\$1]"],
+[$type_member_xml, "\$1\$2\$3"]
   );
 my $blankline_html5 = $local_lt . "br /" . $local_gt;
 
@@ -256,7 +259,8 @@ my @highlights_xml = (
   [$type_struct_xml, "\$1"],
   [$type_param, "\$1"],
   [$type_func, "\$1"],
-  [$type_env, "\$1"]
+  [$type_env, "\$1"],
+  [$type_member_xml, "\$1\$2\$3"]
 );
 my $blankline_xml = $local_lt . "/para" . $local_gt . $local_lt . "para" . $local_gt . "\n";
 
@@ -266,7 +270,8 @@ my @highlights_gnome = (
 [$type_func, "\$1"],
 [$type_struct, "\$1"],
 [$type_env, "\$1"],
-[$type_param, "\$1" ]
+[$type_param, "\$1" ],
+[$type_member, "\$1\$2\$3"]
   );
 my $blankline_gnome = "\n";
 
@@ -275,7 +280,8 @@ my @highlights_man = (
   [$type_constant, "\$1"],
   [$type_func, "fB\$1fP"],
   [$type_struct, "fI\$1fP"],
-  [$type_param, "fI\$1fP"]
+  [$type_param, "fI\$1fP"],
+  [$type_member, "fI\$1\$2\$3fP"]
 );
 my $blankline_man = "";
 
@@ -284,7 +290,8 @@ my @highlights_text = (
[$type_constant, "\$1"],
[$type_func, "\$1"],
[$type_struct, "\$1"],
-   [$type_param, "\$1"]
+   [$type_param, "\$1"],
+   [$type_member, "\$1\$2\$3"]
  );
 my $blankline_text = "";
 
@@ -292,8 +299,8 @@ my $blankline_text = "";
 my @highlights_rst = (
[$type_constant, "``\$1``"],
# Note: need to escape () to avoid func matching later
-   [$type_member_func, "\\:c\\:type\\:`\$1\$2() <\$1>`"],
-   [$type_member, "\\:c\\:type\\:`\$1\$2 <\$1>`"],
+   [$type_member_func, "\\:c\\:type\\:`\$1\$2\$3(\

[PATCH 5/5] kernel-doc: make highlights more homogenous for the various backends

2017-01-02 Thread Paolo Bonzini
$type_struct_full and friends are only used by the restructuredText
backend, because it needs to separate enum/struct/typedef/union from
the name of the type.  However, $type_struct is *also* used by the rST
backend.  This is confusing.

This patch replaces $type_struct's use in the rST backend with a new
$type_fallback; it modifies $type_struct so that it can be used in the
rST backend; and creates regular expressions like $type_struct
for enum/typedef/union, for use in all backends.

Note that, compared to $type_*_full, in the new regexes $1 includes both
the "kind" and the name (before, $1 was pretty much a constant).
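
A minimal Python sketch of that difference (kernel-doc itself is Perl; the
two patterns are transliterated from this patch and the sample input is
made up):

```python
import re

# Old rST-only pattern: $1 is just the keyword, $2 is the type name.
OLD_STRUCT_FULL = re.compile(r"&(struct)\s*([_\w]+)")
# New pattern: $1 is keyword plus name, $2 is still the bare name.
NEW_STRUCT = re.compile(r"&(struct\s*([_\w]+))")

sample = "&struct kvm_vcpu"
old_groups = OLD_STRUCT_FULL.match(sample).groups()
new_groups = NEW_STRUCT.match(sample).groups()
```

Since $1 now holds the whole "struct kvm_vcpu", backends that only
highlight $1 can share the pattern with the rST backend, which still finds
the bare name in $2.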

Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
---
 scripts/kernel-doc | 68 +++---
 1 file changed, 50 insertions(+), 18 deletions(-)

diff --git a/scripts/kernel-doc b/scripts/kernel-doc
index 88c3290b6056..daf5e36055b7 100755
--- a/scripts/kernel-doc
+++ b/scripts/kernel-doc
@@ -214,15 +214,19 @@ my $type_constant = '\%([-_\w]+)';
 my $type_func = '(\w+)\(\)';
 my $type_param = '\@(\w+(\.\.\.)?)';
 my $type_fp_param = '\@(\w+)\(\)';  # Special RST handling for func ptr params
-my $type_struct = '\&((struct\s*)*[_\w]+)';
-my $type_struct_xml = '\\((struct\s*)*[_\w]+)';
 my $type_env = '(\$\w+)';
-my $type_enum_full = '\&(enum)\s*([_\w]+)';
-my $type_struct_full = '\&(struct)\s*([_\w]+)';
-my $type_typedef_full = '\&(typedef)\s*([_\w]+)';
-my $type_union_full = '\&(union)\s*([_\w]+)';
+my $type_enum = '\&(enum\s*([_\w]+))';
+my $type_struct = '\&(struct\s*([_\w]+))';
+my $type_typedef = '\&(typedef\s*([_\w]+))';
+my $type_union = '\&(union\s*([_\w]+))';
 my $type_member = '\&([_\w]+)(\.|->)([_\w]+)';
+my $type_fallback = '\&([_\w]+)';
+my $type_enum_xml = '\(enum\s*([_\w]+))';
+my $type_struct_xml = '\(struct\s*([_\w]+))';
+my $type_typedef_xml = '\(typedef\s*([_\w]+))';
+my $type_union_xml = '\(union\s*([_\w]+))';
 my $type_member_xml = '\([_\w]+)(\.|-\)([_\w]+)';
+my $type_fallback_xml = '\([_\w]+)';
 my $type_member_func = $type_member . '\(\)';
 
 # Output conversion substitutions.
@@ -232,10 +236,14 @@ my $type_member_func = $type_member . '\(\)';
 my @highlights_html = (
[$type_constant, "\$1"],
[$type_func, "\$1"],
+   [$type_enum_xml, "\$1"],
[$type_struct_xml, "\$1"],
+   [$type_typedef_xml, "\$1"],
+   [$type_union_xml, "\$1"],
[$type_env, "\$1"],
[$type_param, "\$1"],
-   [$type_member_xml, "\$1\$2\$3"]
+   [$type_member_xml, "\$1\$2\$3"],
+   [$type_fallback_xml, "\$1"]
   );
 my $local_lt = "lt:";
 my $local_gt = "gt:";
@@ -245,10 +253,14 @@ my $blankline_html = $local_lt . "p" . $local_gt; # was ""
 my @highlights_html5 = (
 [$type_constant, "\$1"],
 [$type_func, "\$1"],
+[$type_enum_xml, "\$1"],
 [$type_struct_xml, "\$1"],
+[$type_typedef_xml, "\$1"],
+[$type_union_xml, "\$1"],
 [$type_env, "\$1"],
 [$type_param, "\$1]"],
-[$type_member_xml, "\$1\$2\$3"]
+[$type_member_xml, "\$1\$2\$3"],
+[$type_fallback_xml, "\$1"]
   );
 my $blankline_html5 = $local_lt . "br /" . $local_gt;
 
@@ -256,11 +268,15 @@ my $blankline_html5 = $local_lt . "br /" . $local_gt;
 my @highlights_xml = (
   ["([^=])\\\"([^\\\"<]+)\\\"", "\$1\$2"],
   [$type_constant, "\$1"],
+  [$type_enum_xml, "\$1"],
   [$type_struct_xml, "\$1"],
+  [$type_typedef_xml, "\$1"],
+  [$type_union_xml, "\$1"],
   [$type_param, "\$1"],
   [$type_func, "\$1"],
   [$type_env, "\$1"],
-  [$type_member_xml, "\$1\$2\$3"]
+  [$type_member_xml, "\$1\$2\$3"],
+  [$type_fallback_xml, "\$1"]
 );
 my $blankline_xml = $local_lt . "/para" . $local_gt . $local_lt . "para" . $local_gt . "\n";
 
@@ -268,10 +284,14 @@ my $blankline_xml

[PATCH 1/5] kernel-doc: cleanup parameter type in function-typed arguments

2017-01-02 Thread Paolo Bonzini
A prototype like

/**
 * foo - sample definition
 * @bar: a parameter
 */
int foo(int (*bar)(int x,
   int y));

is currently producing

.. c:function:: int foo (int (*bar) (int x,int y)

   sample definition

**Parameters**

``int (*)(int x,int y) bar``
  a parameter

Collapse the spaces so that the output is nicer.

Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
---
 scripts/kernel-doc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/scripts/kernel-doc b/scripts/kernel-doc
index 030fc633acd4..c1ea91c2e497 100755
--- a/scripts/kernel-doc
+++ b/scripts/kernel-doc
@@ -2409,6 +2409,7 @@ sub push_parameter($$$) {
# "[blah" in a parameter string;
###$param =~ s/\s*//g;
push @parameterlist, $param;
+   $type =~ s/\s\s+/ /g;
$parametertypes{$param} = $type;
 }
 
-- 
2.9.3
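
The one-line substitution in this patch can be illustrated outside
kernel-doc.  A minimal Python sketch (the script itself is Perl;
collapse_spaces is a name made up for illustration) of what s/\s\s+/ /g
does to a parameter type that was split across lines:

```python
import re


def collapse_spaces(param_type):
    """Replace every run of two or more whitespace characters with one space."""
    return re.sub(r"\s\s+", " ", param_type)


# The continuation line of the prototype leaves a newline plus indentation
# inside the type string.
raw_type = "int (*)(int x,\n                   int y)"
clean_type = collapse_spaces(raw_type)
```

Note that a single whitespace character is left alone, so a lone newline
or tab would survive; only runs of two or more are collapsed.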




[PATCH 3/5] kernel-doc: include parameter type in docbook output

2017-01-02 Thread Paolo Bonzini
The restructuredText output includes both the parameter type and
the name for functions and function-typed members.  Do the same
for docbook.

Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
---
 scripts/kernel-doc | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/scripts/kernel-doc b/scripts/kernel-doc
index 265ea16cbe22..e5b5daa147ea 100755
--- a/scripts/kernel-doc
+++ b/scripts/kernel-doc
@@ -1131,8 +1131,9 @@ sub output_function_xml(%) {
foreach $parameter (@{$args{'parameterlist'}}) {
my $parameter_name = $parameter;
$parameter_name =~ s/\[.*//;
+   $type = $args{'parametertypes'}{$parameter};
 
-   print "  \n   $parameter\n";
+   print "  \n   $type $parameter\n";
print "   \n\n";
$lineprefix=" ";
output_highlight($args{'parameterdescs'}{$parameter_name});
@@ -1223,8 +1224,9 @@ sub output_struct_xml(%) {
 
   defined($args{'parameterdescs'}{$parameter_name}) || next;
   ($args{'parameterdescs'}{$parameter_name} ne $undescribed) || next;
+  $type = $args{'parametertypes'}{$parameter};
   print "";
-  print "  $parameter\n";
+  print "  $type $parameter\n";
   print "  \n";
   output_highlight($args{'parameterdescs'}{$parameter_name});
   print "  \n";
-- 
2.9.3




[PATCH 0/5] kernel-doc tweaks and cleanup of rST vs. non-rST backends

2017-01-02 Thread Paolo Bonzini
Hi,

these patches are the result of my experiments with using kernel-doc
for QEMU's documentation.  Patches 1 and 2 should be relatively
straightforward, as they are simple bugfixes.  Patches 3 to 5, instead,
are making the docbook backend (and the others too) more consistent with
the input and output of the rST backend.

I am not sure what is the state of the kernel-doc non-rST backends;
but there are still several books using the docbook workflow, so I'm
trying my luck and sending the patches anyway. :)

Paolo

Paolo Bonzini (5):
  kernel-doc: cleanup parameter type in function-typed arguments
  kernel-doc: strip attributes even if they have an argument
  kernel-doc: include parameter type in docbook output
  kernel-doc: make member highlighting available in all backends
  kernel-doc: make highlights more homogenous for the various backends

 scripts/kernel-doc | 99 --
 1 file changed, 74 insertions(+), 25 deletions(-)

-- 
2.9.3



Re: [RFC PATCH v1 00/18] x86: Secure Memory Encryption (AMD)

2016-05-09 Thread Paolo Bonzini


On 02/05/2016 20:31, Andy Lutomirski wrote:
> And did the SEV implementation remember to encrypt the guest register
> state?  Because, if not, everything of importance will leak out
> through the VMCB and/or GPRs.

No, it doesn't.  And SEV is very limited unless you paravirtualize
everything.

For example, the hypervisor needs to read some instruction bytes from
memory, and instruction bytes are always encrypted (15.34.5 in the APM).
 So you're pretty much restricted to IN/OUT operations (not even
INS/OUTS) on emulated (non-assigned) devices, paravirtualized MSRs, and
hypercalls.  These are the only operations that connect the guest and
the hypervisor, where the vmexit doesn't have the need to e.g. walk
guest page tables (also always encrypted).  It possibly can be made to
work once the guest boots, and a modern UEFI firmware probably can cope
with it too just like a kernel can, but you need to ensure that your
hardware has no memory BARs for example.  And I/O port space is not very
abundant.

Even in order to emulate I/O ports or RDMSR/WRMSR or process hypercalls,
the hypervisor needs to read the GPRs.  The VMCB doesn't store guest
GPRs, not even on SEV-enabled processors.  Accordingly, the hypervisor
has access to the guest GPRs on every exit.

In general, SEV provides mitigation only.  Even if the hypervisor cannot
write known plaintext directly to memory, an accomplice virtual machine
can e.g. use the network to spray the attacked VM's memory.  At least
it's not as easy as "disable NX under the guest's feet and redirect RIP"
(pte.nx is reserved if efer.nxe=0, all you get is a #PF).  But the
hypervisor can still disable SMEP and SMAP, it can use hardware
breakpoints to leak information through the registers, and it can do all
the other attacks you mentioned.  If AMD had rdrand/rdseed, it could
replace the output with not so random values, and so on.

It's surely better than nothing, but "encryption that really is nothing
more than mitigation" is pretty weird.  I'm waiting for cloud vendors to
sell this as the best thing since sliced bread, when in reality it's
just mitigation.  I wonder how wise it is to merge SEV in its current
state---and since security is not my specialty I am definitely looking
for advice on this.

Paolo

ps: I'm now reminded of this patch:

commit dab429a798a8ab3377136e09dda55ea75a41648d
Author: David Kaplan 
Date:   Mon Mar 2 13:43:37 2015 -0600

kvm: svm: make wbinvd faster

No need to re-decode WBINVD since we know what it is from the
intercept.

Signed-off-by: David Kaplan 
[extracted from larger unlrelated patch, forward ported,
 tested,style cleanup]
Signed-off-by: Joel Schopp 
Reviewed-by: Radim Krčmář 
Signed-off-by: Marcelo Tosatti 

and I wonder if the larger unlrelated patch had anything to do with SEV!