date:20191104

Re: [Xen-devel] [PATCH v2] x86/stackframe/32: repair 32-bit Xen PV

2019-11-04 Thread Jan Beulich

On 04.11.2019 23:44, Thomas Gleixner wrote:
> On Tue, 29 Oct 2019, Jan Beulich wrote:
> 
>> Once again RPL checks have been introduced which don't account for a
>> 32-bit kernel living in ring 1 when running in a PV Xen domain.
>>
>> The case in FIXUP_FRAME has been preventing boot; adjust BUG_IF_WRONG_CR3
>> as well just in case.
> 
> Either it's required and correct or it's not. Just in case is not helpful
> at all.

_If_ this macro sits on any path reachable when running PV under
Xen, then it's wrong. If any such use gets added down the road,
then it's latently wrong, which is bad enough imo, and hence
warrants the change even without analyzing whether there's
already an affected path.

>> Fixes: 3c88c692c287 ("x86/stackframe/32: Provide consistent pt_regs")
>> Signed-off-by: Jan Beulich 
> 
>> --- a/arch/x86/entry/entry_32.S
>> +++ b/arch/x86/entry/entry_32.S
>> @@ -48,6 +48,13 @@
>>  
>>  #include "calling.h"
>>  
>> +/*
>> + * When running on Xen PV, the actual %cs register's RPL in the kernel is 1,
>> + * not 0. If we need to distinguish between a %cs from kernel mode and a %cs
>> + * from user mode, we can do test $2 instead of test $3.
>> + */
>> +#define USER_SEGMENT_RPL_MASK 2
> 
> No. The define want's to be right next to the SEGMENT_RPL_MASK define
> including a less ASM focussed comment like this:
> 
> /*
>  * When running on Xen PV, the actual priviledge level of the kernel is 1,
>  * not 0. Testing the Requested Priviledge Level in a segment selector to
>  * determine whether the context is user mode or kernel mode with
>  * SEGMENT_RPL_MASK is wrong because the PV kernels priviledge level
>  * matches the 0x03 mask.
>  *
>  * Testing with USER_SEGMENT_RPL_MASK is valid for both native and Xen PV
>  * kernels because Priviledge Level 2 is never used.
>  */
> 
> Hmm?

I simply used almost exactly what Andy had suggested as a comment. He
also didn't object to the definition sitting here (it's not needed
after all outside of this file). Can the two of you please reach
agreement, for me to act upon?

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] Commit moratorium for getting a push

2019-11-04 Thread Jürgen Groß


Committers,

Please don't push any new patch to staging because we want to have a
push to master for 4.13-RC2.

Another email will be sent once the moratorium is lifted.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v1 03/10] KVM: Prepare kvm_is_reserved_pfn() for PG_reserved changes

2019-11-04 Thread Dan Williams

On Thu, Oct 24, 2019 at 5:11 AM David Hildenbrand  wrote:
>
> Right now, ZONE_DEVICE memory is always set PG_reserved. We want to
> change that.
>
> KVM has this weird use case that you can map anything from /dev/mem
> into the guest. pfn_valid() is not a reliable check whether the memmap
> was initialized and can be touched. pfn_to_online_page() makes sure
> that we have an initialized memmap (and don't have ZONE_DEVICE memory).
>
> Rewrite kvm_is_reserved_pfn() to make sure the function produces the
> same result once we stop setting ZONE_DEVICE pages PG_reserved.
>
> Cc: Paolo Bonzini 
> Cc: "Radim Krčmář" 
> Cc: Michal Hocko 
> Cc: Dan Williams 
> Cc: KarimAllah Ahmed 
> Signed-off-by: David Hildenbrand 
> ---
>  virt/kvm/kvm_main.c | 10 --
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index e9eb666eb6e8..9d18cc67d124 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -151,9 +151,15 @@ __weak int kvm_arch_mmu_notifier_invalidate_range(struct 
> kvm *kvm,
>
>  bool kvm_is_reserved_pfn(kvm_pfn_t pfn)
>  {
> -   if (pfn_valid(pfn))
> -   return PageReserved(pfn_to_page(pfn));
> +   struct page *page = pfn_to_online_page(pfn);
>
> +   /*
> +* We treat any pages that are not online (not managed by the buddy)
> +* as reserved - this includes ZONE_DEVICE pages and pages without
> +* a memmap (e.g., mapped via /dev/mem).
> +*/
> +   if (page)
> +   return PageReserved(page);
> return true;
>  }

So after this all the pfn_valid() usage in kvm_main.c is replaced with
pfn_to_online_page()? Looks correct to me.

However, I'm worried that kvm is taking reference on ZONE_DEVICE pages
through some other path resulting in this:

https://lore.kernel.org/linux-nvdimm/20190919154708.ga24...@angband.pl/

I'll see if this patch set modulates or maintains that failure mode.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 09/15] xen/gntdev: use mmu_range_notifier_insert

2019-11-04 Thread Jason Gunthorpe

On Mon, Nov 04, 2019 at 05:03:31PM -0500, Boris Ostrovsky wrote:
> On 10/28/19 4:10 PM, Jason Gunthorpe wrote:
> > @@ -445,17 +438,9 @@ static void gntdev_vma_close(struct vm_area_struct 
> > *vma)
> > struct gntdev_priv *priv = file->private_data;
> >  
> > pr_debug("gntdev_vma_close %p\n", vma);
> > -   if (use_ptemod) {
> > -   /* It is possible that an mmu notifier could be running
> > -* concurrently, so take priv->lock to ensure that the vma won't
> > -* vanishing during the unmap_grant_pages call, since we will
> > -* spin here until that completes. Such a concurrent call will
> > -* not do any unmapping, since that has been done prior to
> > -* closing the vma, but it may still iterate the unmap_ops list.
> > -*/
> > -   mutex_lock(&priv->lock);
> > +   if (use_ptemod && map->vma == vma) {
> 
> 
> Is it possible for map->vma not to be equal to vma?

It could be NULL at least if use_ptemod is not set.

Otherwise, I'm not sure, the confusing bit is that the map comes from
here:

map = gntdev_find_map_index(priv, index, count);

It looks like the intent is that the map->vma is always set to the
only vma that has the map as private_data.

So, I suppose it can be relaxed to a null test and a WARN_ON that it
hasn't changed?

Jason

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v1 02/10] KVM: x86/mmu: Prepare kvm_is_mmio_pfn() for PG_reserved changes

2019-11-04 Thread Dan Williams

On Thu, Oct 24, 2019 at 5:10 AM David Hildenbrand  wrote:
>
> Right now, ZONE_DEVICE memory is always set PG_reserved. We want to
> change that.
>
> KVM has this weird use case that you can map anything from /dev/mem
> into the guest. pfn_valid() is not a reliable check whether the memmap
> was initialized and can be touched. pfn_to_online_page() makes sure
> that we have an initialized memmap (and don't have ZONE_DEVICE memory).
>
> Rewrite kvm_is_mmio_pfn() to make sure the function produces the
> same result once we stop setting ZONE_DEVICE pages PG_reserved.
>
> Cc: Paolo Bonzini 
> Cc: "Radim Krčmář" 
> Cc: Sean Christopherson 
> Cc: Vitaly Kuznetsov 
> Cc: Wanpeng Li 
> Cc: Jim Mattson 
> Cc: Joerg Roedel 
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: Borislav Petkov 
> Cc: "H. Peter Anvin" 
> Cc: KarimAllah Ahmed 
> Cc: Michal Hocko 
> Cc: Dan Williams 
> Signed-off-by: David Hildenbrand 
> ---
>  arch/x86/kvm/mmu.c | 29 +
>  1 file changed, 17 insertions(+), 12 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 24c23c66b226..f03089a336de 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -2962,20 +2962,25 @@ static bool mmu_need_write_protect(struct kvm_vcpu 
> *vcpu, gfn_t gfn,
>
>  static bool kvm_is_mmio_pfn(kvm_pfn_t pfn)
>  {
> +   struct page *page = pfn_to_online_page(pfn);
> +
> +   /*
> +* ZONE_DEVICE pages are never online. Online pages that are reserved
> +* either indicate the zero page or MMIO pages.
> +*/
> +   if (page)
> +   return !is_zero_pfn(pfn) && PageReserved(pfn_to_page(pfn));
> +
> +   /*
> +* Anything with a valid (but not online) memmap could be ZONE_DEVICE.
> +* Treat only UC/UC-/WC pages as MMIO.

You might clarify that ZONE_DEVICE pages that belong to WB cacheable
pmem have the correct memtype established by devm_memremap_pages().

Other than that, feel free to add:

Reviewed-by: Dan Williams 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [libvirt test] 143589: regressions - FAIL

2019-11-04 Thread osstest service owner

flight 143589 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/143589/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-libvirt 12 guest-start  fail REGR. vs. 143023
 test-amd64-i386-libvirt  12 guest-start  fail REGR. vs. 143023
 test-amd64-i386-libvirt-xsm  12 guest-start  fail REGR. vs. 143023
 test-amd64-amd64-libvirt-xsm 12 guest-start  fail REGR. vs. 143023
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 debian-hvm-install fail 
REGR. vs. 143023
 test-amd64-amd64-libvirt-pair 21 guest-start/debian  fail REGR. vs. 143023
 test-amd64-i386-libvirt-pair 21 guest-start/debian   fail REGR. vs. 143023
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 debian-hvm-install fail 
REGR. vs. 143023
 test-arm64-arm64-libvirt-xsm 12 guest-start  fail REGR. vs. 143023
 test-arm64-arm64-libvirt 12 guest-start  fail REGR. vs. 143023
 test-arm64-arm64-libvirt-qcow2 10 debian-di-install  fail REGR. vs. 143023
 test-amd64-amd64-libvirt-vhd 10 debian-di-installfail REGR. vs. 143023
 test-armhf-armhf-libvirt 12 guest-start  fail REGR. vs. 143023
 test-armhf-armhf-libvirt-raw 10 debian-di-installfail REGR. vs. 143023

version targeted for testing:
 libvirt  73f91d659b07df8ab267fed1ea4949245a7b57af
baseline version:
 libvirt  2cff65e4c60ed7b3c0c6a97d526d1f8d52c0e919

Last test of basis   143023  2019-10-22 04:19:26 Z   13 days
Failing since143051  2019-10-23 04:18:57 Z   12 days   10 attempts
Testing same since   143589  2019-11-02 16:15:11 Z1 days1 attempts


People who touched revisions under test:
  Andrea Bolognani 
  Eric Blake 
  Jim Fehlig 
  Ján Tomko 
  Maya Rashish 
  Michal Privoznik 
  Pavel Hrdina 
  Peter Krempa 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-arm64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   fail
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmfail
 test-amd64-amd64-libvirt-xsm fail
 test-arm64-arm64-libvirt-xsm fail
 test-amd64-i386-libvirt-xsm  fail
 test-amd64-amd64-libvirt fail
 test-arm64-arm64-libvirt fail
 test-armhf-armhf-libvirt fail
 test-amd64-i386-libvirt  fail
 test-amd64-amd64-libvirt-pairfail
 test-amd64-i386-libvirt-pair fail
 test-arm64-arm64-libvirt-qcow2   fail
 test-armhf-armhf-libvirt-raw fail
 test-amd64-amd64-libvirt-vhd fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 1212 lines long.)

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v1 01/10] mm/memory_hotplug: Don't allow to online/offline memory blocks with holes

2019-11-04 Thread Dan Williams

On Thu, Oct 24, 2019 at 5:10 AM David Hildenbrand  wrote:
>
> Our onlining/offlining code is unnecessarily complicated. Only memory
> blocks added during boot can have holes (a range that is not
> IORESOURCE_SYSTEM_RAM). Hotplugged memory never has holes (e.g., see
> add_memory_resource()). All boot memory is alread online.

s/alread/already/

...also perhaps clarify "already online" by what point in time and why
that is relevant. For example a description of the difference between
the SetPageReserved() in the bootmem path and the one in the hotplug
path.

> Therefore, when we stop allowing to offline memory blocks with holes, we
> implicitly no longer have to deal with onlining memory blocks with holes.

Maybe an explicit reference of the code areas that deal with holes
would help to back up that assertion. Certainly it would have saved me
some time for the review.

> This allows to simplify the code. For example, we no longer have to
> worry about marking pages that fall into memory holes PG_reserved when
> onlining memory. We can stop setting pages PG_reserved.

...but not for bootmem, right?

>
> Offlining memory blocks added during boot is usually not guranteed to work

s/guranteed/guaranteed/

> either way (unmovable data might have easily ended up on that memory during
> boot). So stopping to do that should not really hurt (+ people are not
> even aware of a setup where that used to work

Maybe put a "Link: https://lkml.kernel.org/r/$msg_id"; to that discussion?

> and that the existing code
> still works correctly with memory holes). For the use case of offlining
> memory to unplug DIMMs, we should see no change. (holes on DIMMs would be
> weird).

However, less memory can be offlined than was theoretically allowed
previously, so I don't understand the "we should see no change"
comment. I still agree that's a price worth paying to get the code
cleanups and if someone screams we can look at adding it back, but the
fact that it was already fragile seems decent enough protection.

>
> Please note that hardware errors (PG_hwpoison) are not memory holes and
> not affected by this change when offlining.
>
> Cc: Andrew Morton 
> Cc: Michal Hocko 
> Cc: Oscar Salvador 
> Cc: Pavel Tatashin 
> Cc: Dan Williams 
> Cc: Anshuman Khandual 
> Signed-off-by: David Hildenbrand 
> ---
>  mm/memory_hotplug.c | 26 --
>  1 file changed, 24 insertions(+), 2 deletions(-)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 561371ead39a..8d81730cf036 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1447,10 +1447,19 @@ static void node_states_clear_node(int node, struct 
> memory_notify *arg)
> node_clear_state(node, N_MEMORY);
>  }
>
> +static int count_system_ram_pages_cb(unsigned long start_pfn,
> +unsigned long nr_pages, void *data)
> +{
> +   unsigned long *nr_system_ram_pages = data;
> +
> +   *nr_system_ram_pages += nr_pages;
> +   return 0;
> +}
> +
>  static int __ref __offline_pages(unsigned long start_pfn,
>   unsigned long end_pfn)
>  {
> -   unsigned long pfn, nr_pages;
> +   unsigned long pfn, nr_pages = 0;
> unsigned long offlined_pages = 0;
> int ret, node, nr_isolate_pageblock;
> unsigned long flags;
> @@ -1461,6 +1470,20 @@ static int __ref __offline_pages(unsigned long 
> start_pfn,
>
> mem_hotplug_begin();
>
> +   /*
> +* Don't allow to offline memory blocks that contain holes.
> +* Consecuently, memory blocks with holes can never get onlined

s/Consecuently/Consequently/

> +* (hotplugged memory has no holes and all boot memory is online).
> +* This allows to simplify the onlining/offlining code quite a lot.
> +*/

The last sentence of this comment makes sense in the context of this
patch, but I don't think it stands by itself in the code base after
the fact. The person reading the comment can't see the simplifications
because the code is already gone. I'd clarify it to talk about why it
is safe to not mess around with PG_Reserved in the hotplug path
because of this check.

After those clarifications you can add:

Reviewed-by: Dan Williams 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] Invalid guest state in Xen master, dom0 linux-5.3.6, domU windows 10

2019-11-04 Thread Andrew Cooper

On 05/11/2019 00:25, Andrew Cooper wrote:
> On 04/11/2019 23:42, Andrew Cooper wrote:
>> On 04/11/2019 23:20, Håkon Alstadheim wrote:
>>> (XEN) RFLAGS=0x0193 (0x0193)  DR7 = 0x0400
>>> 
>>> (XEN) *** Insn bytes from b8868f61d69a: 44 00 00 8c d0 9c 81 0c 24
>>> 00 01 00 00 9d 8e d0  9c 81 24 24 ff fe ff ff 9d c3 cc cc cc
>>> cc cc
>> Ok.  One question answered, several more WTF.
>>
>>  <.data>:
>>    0:    44 00 00     add    %r8b,(%rax)
>>    3:    8c d0        mov    %ss,%eax
>>    5:    9c       pushfq
>>    6:    81 0c 24 00 01 00 00     orl    $0x100,(%rsp)
>>    d:    9d       popfq 
>>    e:    8e d0        mov    %eax,%ss
>>   10:    f1       icebp 
>>   11:    9c       pushfq
>>   12:    81 24 24 ff fe ff ff     andl   $0xfeff,(%rsp)
>>   19:    9d       popfq 
>>   1a:    c3       retq  
>>   1b:    cc       int3  
>>   1c:    cc       int3  
>>   1d:    cc       int3  
>>   1e:    cc       int3  
>>   1f:    cc       int3  
>>
>>
>> This is a serious exercising of architectural corner cases, by layering
>> a single step exception on top of a MovSS-deferred ICEBP.
>>
>> Now I've looked closer, this isn't a CVE-2018-8897 exploit as no
>> breakpoints are configured in %dr7, so I'm going to revise my guess some
>> new debugger-detection in DRM software.
> I've reproduced the VMEntry failure you were seeing.  Now to figure out
> if there is sufficient control available to provide architectural
> behaviour to guest, because I'm not entirely certain there is, given
> some of ICEBP's extra magic properties.

So, this is just another case of an issue I've already tried fixing once
and haven't had time to fix in a way which doesn't break other pieces of
functionality.

I clearly need to dust off that series and get it working properly.

In the short term, this will work around your problem.

diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index f86af09898..10daaa6f33 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -522,8 +522,7 @@ static inline void hvm_invlpg(struct vcpu *v,
unsigned long linear)
 (X86_CR4_VMXE | X86_CR4_PAE | X86_CR4_MCE))
 
 /* These exceptions must always be intercepted. */
-#define HVM_TRAP_MASK ((1U << TRAP_debug)   | \
-   (1U << TRAP_alignment_check) | \
+#define HVM_TRAP_MASK ((1U << TRAP_alignment_check) | \
    (1U << TRAP_machine_check))
 
 static inline int hvm_cpu_up(void)

However, be aware that it will reintroduce
http://xenbits.xen.org/xsa/advisory-156.html so isn't recommended for
general use.  Seeing as this looks to be some DRM software, it isn't
likely to mount an attack like that, as it would livelock a native
system just as badly as it livelocks a virtualised system.

(Sadly, it looks like CVE-2015-8104 is the gift which keeps on giving us
new corner cases in VT-x when it comes to the handling of debug
exceptions.  I've already found several acknowledged by Intel, and one
which they are still trying to figure out how to fix.)

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] Invalid guest state in Xen master, dom0 linux-5.3.6, domU windows 10

2019-11-04 Thread Andrew Cooper

On 04/11/2019 23:42, Andrew Cooper wrote:
> On 04/11/2019 23:20, Håkon Alstadheim wrote:
>> (XEN) RFLAGS=0x0193 (0x0193)  DR7 = 0x0400
>> 
>> (XEN) *** Insn bytes from b8868f61d69a: 44 00 00 8c d0 9c 81 0c 24
>> 00 01 00 00 9d 8e d0  9c 81 24 24 ff fe ff ff 9d c3 cc cc cc
>> cc cc
> Ok.  One question answered, several more WTF.
>
>  <.data>:
>    0:    44 00 00     add    %r8b,(%rax)
>    3:    8c d0        mov    %ss,%eax
>    5:    9c       pushfq
>    6:    81 0c 24 00 01 00 00     orl    $0x100,(%rsp)
>    d:    9d       popfq 
>    e:    8e d0        mov    %eax,%ss
>   10:    f1       icebp 
>   11:    9c       pushfq
>   12:    81 24 24 ff fe ff ff     andl   $0xfeff,(%rsp)
>   19:    9d       popfq 
>   1a:    c3       retq  
>   1b:    cc       int3  
>   1c:    cc       int3  
>   1d:    cc       int3  
>   1e:    cc       int3  
>   1f:    cc       int3  
>
>
> This is a serious exercising of architectural corner cases, by layering
> a single step exception on top of a MovSS-deferred ICEBP.
>
> Now I've looked closer, this isn't a CVE-2018-8897 exploit as no
> breakpoints are configured in %dr7, so I'm going to revise my guess some
> new debugger-detection in DRM software.

I've reproduced the VMEntry failure you were seeing.  Now to figure out
if there is sufficient control available to provide architectural
behaviour to guest, because I'm not entirely certain there is, given
some of ICEBP's extra magic properties.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] Invalid guest state in Xen master, dom0 linux-5.3.6, domU windows 10

2019-11-04 Thread Andrew Cooper

On 04/11/2019 23:20, Håkon Alstadheim wrote:
>
> (XEN) RFLAGS=0x0193 (0x0193)  DR7 = 0x0400
> 
> (XEN) *** Insn bytes from b8868f61d69a: 44 00 00 8c d0 9c 81 0c 24
> 00 01 00 00 9d 8e d0  9c 81 24 24 ff fe ff ff 9d c3 cc cc cc
> cc cc

Ok.  One question answered, several more WTF.

 <.data>:
   0:    44 00 00     add    %r8b,(%rax)
   3:    8c d0        mov    %ss,%eax
   5:    9c       pushfq
   6:    81 0c 24 00 01 00 00     orl    $0x100,(%rsp)
   d:    9d       popfq 
   e:    8e d0        mov    %eax,%ss
  10:    f1       icebp 
  11:    9c       pushfq
  12:    81 24 24 ff fe ff ff     andl   $0xfeff,(%rsp)
  19:    9d       popfq 
  1a:    c3       retq  
  1b:    cc       int3  
  1c:    cc       int3  
  1d:    cc       int3  
  1e:    cc       int3  
  1f:    cc       int3  


This is a serious exercising of architectural corner cases, by layering
a single step exception on top of a MovSS-deferred ICEBP.

Now I've looked closer, this isn't a CVE-2018-8897 exploit as no
breakpoints are configured in %dr7, so I'm going to revise my guess some
new debugger-detection in DRM software.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [xen-4.11-testing test] 143586: regressions - FAIL

2019-11-04 Thread osstest service owner

flight 143586 xen-4.11-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/143586/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qcow2   19 guest-start/debian.repeat fail REGR. vs. 143158
 test-amd64-amd64-libvirt-vhd 17 guest-start/debian.repeat fail REGR. vs. 143158
 test-amd64-i386-xl-raw  19 guest-start/debian.repeat fail REGR. vs. 143158
 test-armhf-armhf-xl-vhd 15 guest-start/debian.repeat fail REGR. vs. 143158
 test-armhf-armhf-libvirt-raw 15 guest-start/debian.repeat fail REGR. vs. 143158

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install 
fail never pass
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install 
fail never pass
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 window

Re: [Xen-devel] [PATCH v2] x86/stackframe/32: repair 32-bit Xen PV

2019-11-04 Thread Thomas Gleixner

On Tue, 29 Oct 2019, Jan Beulich wrote:

> Once again RPL checks have been introduced which don't account for a
> 32-bit kernel living in ring 1 when running in a PV Xen domain.
>
> The case in FIXUP_FRAME has been preventing boot; adjust BUG_IF_WRONG_CR3
> as well just in case.

Either it's required and correct or it's not. Just in case is not helpful
at all.

> Fixes: 3c88c692c287 ("x86/stackframe/32: Provide consistent pt_regs")
> Signed-off-by: Jan Beulich 

> --- a/arch/x86/entry/entry_32.S
> +++ b/arch/x86/entry/entry_32.S
> @@ -48,6 +48,13 @@
>  
>  #include "calling.h"
>  
> +/*
> + * When running on Xen PV, the actual %cs register's RPL in the kernel is 1,
> + * not 0. If we need to distinguish between a %cs from kernel mode and a %cs
> + * from user mode, we can do test $2 instead of test $3.
> + */
> +#define USER_SEGMENT_RPL_MASK 2

No. The define want's to be right next to the SEGMENT_RPL_MASK define
including a less ASM focussed comment like this:

/*
 * When running on Xen PV, the actual priviledge level of the kernel is 1,
 * not 0. Testing the Requested Priviledge Level in a segment selector to
 * determine whether the context is user mode or kernel mode with
 * SEGMENT_RPL_MASK is wrong because the PV kernels priviledge level
 * matches the 0x03 mask.
 *
 * Testing with USER_SEGMENT_RPL_MASK is valid for both native and Xen PV
 * kernels because Priviledge Level 2 is never used.
 */

Hmm?

Thanks,

tglx

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v1 09/10] mm/memory_hotplug: Don't mark pages PG_reserved when initializing the memmap

2019-11-04 Thread Boris Ostrovsky

On 10/24/19 8:09 AM, David Hildenbrand wrote:
> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> index 4f2e78a5e4db..af69f057913a 100644
> --- a/drivers/xen/balloon.c
> +++ b/drivers/xen/balloon.c
> @@ -374,6 +374,13 @@ static void xen_online_page(struct page *page, unsigned 
> int order)
>   mutex_lock(&balloon_mutex);
>   for (i = 0; i < size; i++) {
>   p = pfn_to_page(start_pfn + i);
> + /*
> +  * TODO: The core used to mark the pages reserved. Most probably
> +  * we can stop doing that now. However, especially
> +  * alloc_xenballooned_pages() left PG_reserved set
> +  * on pages that can get mapped to user space.
> +  */
> + __SetPageReserved(p);

I suspect this is not needed. Pages can get into balloon either from
here or from non-hotplug path (e.g. decrease_reservation()) and so when
we get a page from the balloon we would get a random page that may or
may not have Reserved bit set.

-boris


>   balloon_append(p);
>   }
>   mutex_unlock(&balloon_mutex);
>


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 09/15] xen/gntdev: use mmu_range_notifier_insert

2019-11-04 Thread Boris Ostrovsky

On 10/28/19 4:10 PM, Jason Gunthorpe wrote:
> @@ -445,17 +438,9 @@ static void gntdev_vma_close(struct vm_area_struct *vma)
>   struct gntdev_priv *priv = file->private_data;
>  
>   pr_debug("gntdev_vma_close %p\n", vma);
> - if (use_ptemod) {
> - /* It is possible that an mmu notifier could be running
> -  * concurrently, so take priv->lock to ensure that the vma won't
> -  * vanishing during the unmap_grant_pages call, since we will
> -  * spin here until that completes. Such a concurrent call will
> -  * not do any unmapping, since that has been done prior to
> -  * closing the vma, but it may still iterate the unmap_ops list.
> -  */
> - mutex_lock(&priv->lock);
> + if (use_ptemod && map->vma == vma) {


Is it possible for map->vma not to be equal to vma?

-boris


> + mmu_range_notifier_remove(&map->notifier);
>   map->vma = NULL;
> - mutex_unlock(&priv->lock);
>   }
>   vma->vm_private_data = NULL;
>   gntdev_put_map(priv, map);
>


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 00/15] Consolidate the mmu notifier interval_tree and locking

2019-11-04 Thread Jason Gunthorpe

On Fri, Nov 01, 2019 at 01:54:45PM -0700, Ralph Campbell wrote:
> You can add my Tested-by for the mm and nouveau changes.
> IOW, patches 1-4, 10-11, and 15.
> 
> Tested-by: Ralph Campbell 

Got it, thanks

Jason

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] Invalid guest state in Xen master, dom0 linux-5.3.6, domU windows 10

2019-11-04 Thread Andrew Cooper

On 04/11/2019 15:40, Andrew Cooper wrote:
> On 04/11/2019 15:33, Håkon Alstadheim wrote:
>> Den 04.11.2019 14:31, skrev Andrew Cooper:
>>> On 03/11/2019 10:23, Håkon Alstadheim wrote:
>>>
 (XEN) [2019-11-02 14:09:46] d2v0 vmentry failure (reason 0x8021):
 Invalid guest state (0)
 (XEN) [2019-11-02 14:09:46] * VMCS Area **
 (XEN) [2019-11-02 14:09:46] *** Guest State ***
 (XEN) [2019-11-02 14:09:46] CR0: actual=0x80050031,
 shadow=0x80050031, gh_mask=
 (XEN) [2019-11-02 14:09:46] CR4: actual=0x00172678,
 shadow=0x00170678, gh_mask=ffe8f860
 (XEN) [2019-11-02 14:09:46] CR3 = 0x001aa002
 (XEN) [2019-11-02 14:09:46] RSP = 0x8c0f4dd71e98
 (0x8c0f4dd71e98)  RIP = 0xd18a040bb75e (0xd18a040bb75e)
 (XEN) [2019-11-02 14:09:46] RFLAGS=0x0187 (0x0187)  DR7 =
 0x0400
 (XEN) [2019-11-02 14:09:46] Sysenter RSP=
 CS:RIP=:
 (XEN) [2019-11-02 14:09:46]    sel  attr  limit   base
 (XEN) [2019-11-02 14:09:46]   CS: 0010 0209b  
 (XEN) [2019-11-02 14:09:46]   DS: 002b 0c0f3  
 (XEN) [2019-11-02 14:09:46]   SS: 0018 04093  
 (XEN) [2019-11-02 14:09:46]   ES: 002b 0c0f3  
 (XEN) [2019-11-02 14:09:46]   FS: 0053 040f3 3c00 
 (XEN) [2019-11-02 14:09:46]   GS: 002b 0c0f3  f8044ff8
 (XEN) [2019-11-02 14:09:46] GDTR:    0057 f80459c61fb0
 (XEN) [2019-11-02 14:09:46] LDTR:  1c000  
 (XEN) [2019-11-02 14:09:46] IDTR:    012f d18a014a0980
 (XEN) [2019-11-02 14:09:46]   TR: 0040 0008b 0067 f80459c6
 (XEN) [2019-11-02 14:09:46] EFER(VMCS) = 0x0d01  PAT =
 0x0007010600070106
 (XEN) [2019-11-02 14:09:46] PreemptionTimer = 0x  SM Base =
 0x
 (XEN) [2019-11-02 14:09:46] DebugCtl = 0x
 DebugExceptions = 0x
 (XEN) [2019-11-02 14:09:46] Interruptibility = 0002  ActivityState
 = 
 (XEN) [2019-11-02 14:09:46] InterruptStatus = 
 (XEN) [2019-11-02 14:09:46] *** Host State ***
 (XEN) [2019-11-02 14:09:46] RIP = 0x82d080341950
 (vmx_asm_vmexit_handler)  RSP = 0x83083ff0ff70
 (XEN) [2019-11-02 14:09:46] CS=e008 SS= DS= ES= FS=
 GS= TR=e040
 (XEN) [2019-11-02 14:09:46] FSBase=
 GSBase= TRBase=83083ff14000
 (XEN) [2019-11-02 14:09:46] GDTBase=83083ff03000
 IDTBase=83083ff07000
 (XEN) [2019-11-02 14:09:46] CR0=80050033 CR3=00054dbea000
 CR4=001526e0
 (XEN) [2019-11-02 14:09:46] Sysenter RSP=83083ff0ffa0
 CS:RIP=e008:82d080395440
 (XEN) [2019-11-02 14:09:46] EFER = 0x0d01  PAT =
 0x050100070406
 (XEN) [2019-11-02 14:09:46] *** Control State ***
 (XEN) [2019-11-02 14:09:46] PinBased=00bf CPUBased=b62065fa
 SecondaryExec=17eb
 (XEN) [2019-11-02 14:09:46] EntryControls=d3ff
 ExitControls=002fefff
 (XEN) [2019-11-02 14:09:46] ExceptionBitmap=00060002 PFECmask=
 PFECmatch=
 (XEN) [2019-11-02 14:09:46] VMEntry: intr_info=8501
 errcode= ilen=0001
 (XEN) [2019-11-02 14:09:46] VMExit: intr_info=8501
 errcode= ilen=0001
 (XEN) [2019-11-02 14:09:46] reason=8021
 qualification=
 (XEN) [2019-11-02 14:09:46] IDTVectoring: info=
 errcode=
 (XEN) [2019-11-02 14:09:46] TSC Offset = 0xf45ded46dd57  TSC
 Multiplier = 0x
 (XEN) [2019-11-02 14:09:46] TPR Threshold = 0x00  PostedIntrVec = 0xf5
 (XEN) [2019-11-02 14:09:46] EPT pointer = 0x00054e3a701e  EPTP
 index = 0x
 (XEN) [2019-11-02 14:09:46] PLE Gap=0080 Window=1000
 (XEN) [2019-11-02 14:09:46] Virtual processor ID = 0x5a02 VMfunc
 controls = 
 (XEN) [2019-11-02 14:09:46] **
 (XEN) [2019-11-02 14:09:46] domain_crash called from vmx.c:3335
 (XEN) [2019-11-02 14:09:46] Domain 2 (vcpu#0) crashed on cpu#2:
>>> Interruptibility = 0002 (Blocked by Mov SS) and VMEntry:
>>> intr_info=8501 (ICEBP)
>>>
>>> Dare I ask what you're running in your windows guest?  Unless it is a
>>> vulnerability test suite, I'm rather concerned.
>> Because I have pulled out all stops ? Well no particular reason. I've
>> asked my kids nicely not to poke any /more/ holes in the security on
>> the system. Probably should tighten up the ship. I have some conflict
>> going on between the hardware pci USB cards in the machine, so I
>> thought I'd see what would happen if I gave ASUS and whatever no-name
>> Tai

[Xen-devel] [qemu-mainline test] 143566: regressions - FAIL

2019-11-04 Thread osstest service owner

flight 143566 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/143566/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-pvshim   18 guest-localmigrate/x10   fail REGR. vs. 142915
 test-amd64-amd64-xl-qcow2   19 guest-start/debian.repeat fail REGR. vs. 142915
 test-amd64-amd64-libvirt-vhd 17 guest-start/debian.repeat fail REGR. vs. 142915
 test-armhf-armhf-xl-credit1   7 xen-boot fail REGR. vs. 142915
 test-amd64-i386-xl-raw  19 guest-start/debian.repeat fail REGR. vs. 142915
 test-armhf-armhf-libvirt-raw 15 guest-start/debian.repeat fail REGR. vs. 142915
 test-armhf-armhf-xl-vhd 15 guest-start/debian.repeat fail REGR. vs. 142915

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds 16 guest-localmigrate   fail REGR. vs. 142915

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-rtds   16 guest-start/debian.repeat fail blocked in 142915
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 142915
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 142915
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 142915
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 142915
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 142915
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass

version targeted for testing:
 qemuub7c9a7f353c0e260519bf735ff0d4aa01e72784b
baseline version:
 qemuue9d42461920f6f40f4d847a5ba18e90d095ed0b9

[Xen-devel] [OSSTEST PATCH] adhoc-revtuple-generator: Bisect over 5000 commits (really)

2019-11-04 Thread Ian Jackson

In e9b0653875b3 we changed one of the `1000' values to `5000'.  But
this magic number had been duplicated.  Urgh!

The result is that adhoc-revtuple-generator might generate a weirdly
truncated output which causes cs-bisection-stop to fail with messages
like this:
  *** not RelvUp at 3d40147282670d597b336be5599b5cc4c2ff7ddd  at 
./cs-bisection-step line 554.
  *** not RelvDown at 2fa3479cfadb0bb3fe694dbfd29f2350eb2570df  at 
./cs-bisection-step line 554.
  *** not RelvUp at 2fa3479cfadb0bb3fe694dbfd29f2350eb2570df  at 
./cs-bisection-step line 554.
  ...
  Use of uninitialized value in concatenation (.) or string at 
./cs-bisection-step line 747.
  Should test .
  BROKEN see earlier errors. at ./cs-bisection-step line 1454,  line 
10089.

Fix this by (i) plumbing the magic value we already edited properly
back to the (command-line controlled) global variable (ii) changing
the global variable from 1000 to 5000.

git-grep '\b1000\b'  still produces a fair amount of output but most
of it is timeouts, which is fair enough.  There is also a flight
count limit in sg-report-flight, which limits how far back it is
willing to look.  We don't want to change that here.

With this change, cs-bisection-step on the currently-failing freebsd
build job does this:
  Searching for interesting versions
  Result found: flight 141420 (pass), for basis pass
  Result found: flight 143397 (fail), for basis failure
  Need to reproduce basis pass (pass); had 1 already.
  Should test 2fa3479cfadb0bb3fe694dbfd29f2350eb2570df.

This looks plausible: it is picking up where it left off before the
basis pass fell over its horizon.

CC: Roger Pau Monné 
CC: Jürgen Groß 
Signed-off-by: Ian Jackson 
---
 adhoc-revtuple-generator | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/adhoc-revtuple-generator b/adhoc-revtuple-generator
index ac0f2463..c8d6f4ad 100755
--- a/adhoc-revtuple-generator
+++ b/adhoc-revtuple-generator
@@ -28,7 +28,7 @@ use Osstest;
 use Osstest::TestSupport;
 use Osstest::Executive;
 
-our $num= 1000;
+our $num= 5000;
 our $doupdate= 1;
 our $showrev= 0;
 
@@ -553,7 +553,7 @@ sub main () {
 my @trees_continuous;
 foreach my $tree (@trees) {
 my $gen= tree_get_gen($tree);
-my $count= 5000;
+my $count= $num;
 my $found= 0;
 my $top= undef;
 while ($count-- > 0) {
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] xen/vcpu: Sanitise VCPUOP_initialise call hierachy

2019-11-04 Thread Jan Beulich

On 31.10.2019 20:28, Andrew Cooper wrote:
> This code is especially tangled.  VCPUOP_initialise calls into
> arch_initialise_vcpu() which calls back into default_initialise_vcpu() which
> is common code.
> 
> This path is actually dead code on ARM, because VCPUOP_initialise is filtered
> out by do_arm_vcpu_op().
> 
> The only valid way to start a secondary CPU on ARM is via the PSCI interface.
> The same could in principle be said about INIT-SIPI-SIPI for x86 HVM, if HVM
> guests hadn't already interited a paravirt way of starting CPUs.
> 
> Either way, it is quite likely that no future architectures implemented in Xen
> are going to want to use a PV interface, as some standardised (v)CPU bringup
> mechanism will already exist.
> 
> Arrange the code in do_vcpu_op() to allow arch_initialise_vcpu() to be
> optional.  Opt in for x86, and opt out for ARM.
> 
> Deleting ARM's arch_initialise_vcpu() allows for default_initialise_vcpu() to
> be folded into its (now) sole x86 caller, which reduces the compiled code
> volume in all builds.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper 

I can see the merits of this, but I can also understand Julien's
reservations. Hence I guess whether to ack this will depend on the
discussion with him getting settled.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] xen/events: remove event handling recursion detection

2019-11-04 Thread Jürgen Groß


On 04.11.19 16:19, Jan Beulich wrote:

On 04.11.2019 16:09, Jürgen Groß wrote:

On 04.11.19 15:35, Jan Beulich wrote:

On 04.11.2019 14:58, Juergen Gross wrote:

__xen_evtchn_do_upcall() contains guards against being called
recursively. This mechanism was introduced in the early pvops times
(kernel 2.6.26) when there were still Xen versions around not honoring
disabled interrupts for sending events to pv guests.

This was changed in Xen 3.0, which is much older than any Xen version
supported by the kernel, so the recursion detection can be removed.


Would you mind pointing out which exact change(s) this was(were)?


Linux kernel: 229664bee6126e01f8662976a5fe2e79813b77c8
Xen: d8263e8dbaf5ef1445bee0662143a0fcb6d43466


Are you sure about the latter, touching only header files underneath
xen/, and there mostly public interface ones?


No, you are right, this was a false interpretation of mine.




It had always been my understanding that the recursion detection
was mainly to guard against drivers re-enabling interrupts
transiently in their handlers (which in turn may no longer be an
issue in modern Linux kernels).


This would have been doable with a simple bool. The more complex
xchg based logic was IMO for recursion detection at any point.


Well, the respective XenoLinux c/s 13098 has no mention of this, i.e.
it simply leaves open what the actual reason was:

"[LINUX] Disallow nested event delivery.

  This eliminates the risk of overflowing the kernel stack and is a
  reasonable policy given that we have no concept of priorities among
  event sources."


For XenoLinux it makes at least a little bit sense, as interrupts
were enabled during calls of some handlers AFAIK. The complexity is
rather strange, though, as the bool would have been much easier to
understand.

I'll adapt the commit message.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/3] x86/boot: Cache cpu_has_hypervisor very early on boot

2019-11-04 Thread Jan Beulich

On 04.11.2019 16:31, Andrew Cooper wrote:
> On 04/11/2019 13:32, Jan Beulich wrote:
>> On 01.11.2019 21:25, Andrew Cooper wrote:
>>> --- a/xen/arch/x86/boot/head.S
>>> +++ b/xen/arch/x86/boot/head.S
>>> @@ -630,6 +630,10 @@ trampoline_setup:
>>>  
>>>  1:
>>>  /* Interrogate CPU extended features via CPUID. */
>>> +mov $1, %eax
>>> +cpuid
>>> +mov %ecx, sym_fs(boot_cpu_data) + 
>>> CPUINFO_FEATURE_OFFSET(X86_FEATURE_HYPERVISOR)
>> I understand the ECX output is all we need right now. But wouldn't
>> it be better to then store EDX as well (and similarly ECX of
>> 0x8001 output)?
> 
> No - I don't think so.
> 
> I did debate applying an and mask for the features we only intend to be
> usable this early, to avoid getting buggy code which checks for
> unexpected features this early.

Indeed doing so would seem a good reason to not also store the EDX
value here.

>> Also, along the lines of a question back to Sergey on his
>> standalone patch, wouldn't it be better to take the opportunity
>> and check here that CPUID leaf 1 is actually valid?
> 
> There is no such thing as a 64-bit capable CPU without leaf 1.

About anything can be constructed under a hypervisor. But well, I
guess I'll stop mumbling on this aspect.

>> Of course the question on the (intended) effect of
>> "cpuid=no-hypervisor" also remains. As said before, I think this
>> should be honored by all of our code that possibly can (i.e. at
>> least everything following command line parsing).
> 
> There is no chance of making that work in a sane way - we use
> cpu_has_hypervisor for quite a few things before the command line gets
> parsed.

You said so the other day, but iirc when checking I wasn't able to
identify any such case, let alone "quite a few".

Anyway, feel free to add
Acked-by: Jan Beulich 

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] Invalid guest state in Xen master, dom0 linux-5.3.6, domU windows 10

2019-11-04 Thread Andrew Cooper

On 04/11/2019 15:33, Håkon Alstadheim wrote:
>
> Den 04.11.2019 14:31, skrev Andrew Cooper:
>> On 03/11/2019 10:23, Håkon Alstadheim wrote:
>>
>>> (XEN) [2019-11-02 14:09:46] d2v0 vmentry failure (reason 0x8021):
>>> Invalid guest state (0)
>>> (XEN) [2019-11-02 14:09:46] * VMCS Area **
>>> (XEN) [2019-11-02 14:09:46] *** Guest State ***
>>> (XEN) [2019-11-02 14:09:46] CR0: actual=0x80050031,
>>> shadow=0x80050031, gh_mask=
>>> (XEN) [2019-11-02 14:09:46] CR4: actual=0x00172678,
>>> shadow=0x00170678, gh_mask=ffe8f860
>>> (XEN) [2019-11-02 14:09:46] CR3 = 0x001aa002
>>> (XEN) [2019-11-02 14:09:46] RSP = 0x8c0f4dd71e98
>>> (0x8c0f4dd71e98)  RIP = 0xd18a040bb75e (0xd18a040bb75e)
>>> (XEN) [2019-11-02 14:09:46] RFLAGS=0x0187 (0x0187)  DR7 =
>>> 0x0400
>>> (XEN) [2019-11-02 14:09:46] Sysenter RSP=
>>> CS:RIP=:
>>> (XEN) [2019-11-02 14:09:46]    sel  attr  limit   base
>>> (XEN) [2019-11-02 14:09:46]   CS: 0010 0209b  
>>> (XEN) [2019-11-02 14:09:46]   DS: 002b 0c0f3  
>>> (XEN) [2019-11-02 14:09:46]   SS: 0018 04093  
>>> (XEN) [2019-11-02 14:09:46]   ES: 002b 0c0f3  
>>> (XEN) [2019-11-02 14:09:46]   FS: 0053 040f3 3c00 
>>> (XEN) [2019-11-02 14:09:46]   GS: 002b 0c0f3  f8044ff8
>>> (XEN) [2019-11-02 14:09:46] GDTR:    0057 f80459c61fb0
>>> (XEN) [2019-11-02 14:09:46] LDTR:  1c000  
>>> (XEN) [2019-11-02 14:09:46] IDTR:    012f d18a014a0980
>>> (XEN) [2019-11-02 14:09:46]   TR: 0040 0008b 0067 f80459c6
>>> (XEN) [2019-11-02 14:09:46] EFER(VMCS) = 0x0d01  PAT =
>>> 0x0007010600070106
>>> (XEN) [2019-11-02 14:09:46] PreemptionTimer = 0x  SM Base =
>>> 0x
>>> (XEN) [2019-11-02 14:09:46] DebugCtl = 0x
>>> DebugExceptions = 0x
>>> (XEN) [2019-11-02 14:09:46] Interruptibility = 0002  ActivityState
>>> = 
>>> (XEN) [2019-11-02 14:09:46] InterruptStatus = 
>>> (XEN) [2019-11-02 14:09:46] *** Host State ***
>>> (XEN) [2019-11-02 14:09:46] RIP = 0x82d080341950
>>> (vmx_asm_vmexit_handler)  RSP = 0x83083ff0ff70
>>> (XEN) [2019-11-02 14:09:46] CS=e008 SS= DS= ES= FS=
>>> GS= TR=e040
>>> (XEN) [2019-11-02 14:09:46] FSBase=
>>> GSBase= TRBase=83083ff14000
>>> (XEN) [2019-11-02 14:09:46] GDTBase=83083ff03000
>>> IDTBase=83083ff07000
>>> (XEN) [2019-11-02 14:09:46] CR0=80050033 CR3=00054dbea000
>>> CR4=001526e0
>>> (XEN) [2019-11-02 14:09:46] Sysenter RSP=83083ff0ffa0
>>> CS:RIP=e008:82d080395440
>>> (XEN) [2019-11-02 14:09:46] EFER = 0x0d01  PAT =
>>> 0x050100070406
>>> (XEN) [2019-11-02 14:09:46] *** Control State ***
>>> (XEN) [2019-11-02 14:09:46] PinBased=00bf CPUBased=b62065fa
>>> SecondaryExec=17eb
>>> (XEN) [2019-11-02 14:09:46] EntryControls=d3ff
>>> ExitControls=002fefff
>>> (XEN) [2019-11-02 14:09:46] ExceptionBitmap=00060002 PFECmask=
>>> PFECmatch=
>>> (XEN) [2019-11-02 14:09:46] VMEntry: intr_info=8501
>>> errcode= ilen=0001
>>> (XEN) [2019-11-02 14:09:46] VMExit: intr_info=8501
>>> errcode= ilen=0001
>>> (XEN) [2019-11-02 14:09:46] reason=8021
>>> qualification=
>>> (XEN) [2019-11-02 14:09:46] IDTVectoring: info=
>>> errcode=
>>> (XEN) [2019-11-02 14:09:46] TSC Offset = 0xf45ded46dd57  TSC
>>> Multiplier = 0x
>>> (XEN) [2019-11-02 14:09:46] TPR Threshold = 0x00  PostedIntrVec = 0xf5
>>> (XEN) [2019-11-02 14:09:46] EPT pointer = 0x00054e3a701e  EPTP
>>> index = 0x
>>> (XEN) [2019-11-02 14:09:46] PLE Gap=0080 Window=1000
>>> (XEN) [2019-11-02 14:09:46] Virtual processor ID = 0x5a02 VMfunc
>>> controls = 
>>> (XEN) [2019-11-02 14:09:46] **
>>> (XEN) [2019-11-02 14:09:46] domain_crash called from vmx.c:3335
>>> (XEN) [2019-11-02 14:09:46] Domain 2 (vcpu#0) crashed on cpu#2:
>> Interruptibility = 0002 (Blocked by Mov SS) and VMEntry:
>> intr_info=8501 (ICEBP)
>>
>> Dare I ask what you're running in your windows guest?  Unless it is a
>> vulnerability test suite, I'm rather concerned.
>
> Because I have pulled out all stops ? Well no particular reason. I've
> asked my kids nicely not to poke any /more/ holes in the security on
> the system. Probably should tighten up the ship. I have some conflict
> going on between the hardware pci USB cards in the machine, so I
> thought I'd see what would happen if I gave ASUS and whatever no-name
> Taiwanese I have in there free rein. Nothing gained as far as I can
> see, so I'll see about closing some of the more gaping holes. A

Re: [Xen-devel] [BUG] Invalid guest state in Xen master, dom0 linux-5.3.6, domU windows 10

2019-11-04 Thread Håkon Alstadheim



Den 04.11.2019 14:31, skrev Andrew Cooper:

On 03/11/2019 10:23, Håkon Alstadheim wrote:


(XEN) [2019-11-02 14:09:46] d2v0 vmentry failure (reason 0x8021):
Invalid guest state (0)
(XEN) [2019-11-02 14:09:46] * VMCS Area **
(XEN) [2019-11-02 14:09:46] *** Guest State ***
(XEN) [2019-11-02 14:09:46] CR0: actual=0x80050031,
shadow=0x80050031, gh_mask=
(XEN) [2019-11-02 14:09:46] CR4: actual=0x00172678,
shadow=0x00170678, gh_mask=ffe8f860
(XEN) [2019-11-02 14:09:46] CR3 = 0x001aa002
(XEN) [2019-11-02 14:09:46] RSP = 0x8c0f4dd71e98
(0x8c0f4dd71e98)  RIP = 0xd18a040bb75e (0xd18a040bb75e)
(XEN) [2019-11-02 14:09:46] RFLAGS=0x0187 (0x0187)  DR7 =
0x0400
(XEN) [2019-11-02 14:09:46] Sysenter RSP=
CS:RIP=:
(XEN) [2019-11-02 14:09:46]    sel  attr  limit   base
(XEN) [2019-11-02 14:09:46]   CS: 0010 0209b  
(XEN) [2019-11-02 14:09:46]   DS: 002b 0c0f3  
(XEN) [2019-11-02 14:09:46]   SS: 0018 04093  
(XEN) [2019-11-02 14:09:46]   ES: 002b 0c0f3  
(XEN) [2019-11-02 14:09:46]   FS: 0053 040f3 3c00 
(XEN) [2019-11-02 14:09:46]   GS: 002b 0c0f3  f8044ff8
(XEN) [2019-11-02 14:09:46] GDTR:    0057 f80459c61fb0
(XEN) [2019-11-02 14:09:46] LDTR:  1c000  
(XEN) [2019-11-02 14:09:46] IDTR:    012f d18a014a0980
(XEN) [2019-11-02 14:09:46]   TR: 0040 0008b 0067 f80459c6
(XEN) [2019-11-02 14:09:46] EFER(VMCS) = 0x0d01  PAT =
0x0007010600070106
(XEN) [2019-11-02 14:09:46] PreemptionTimer = 0x  SM Base =
0x
(XEN) [2019-11-02 14:09:46] DebugCtl = 0x
DebugExceptions = 0x
(XEN) [2019-11-02 14:09:46] Interruptibility = 0002  ActivityState
= 
(XEN) [2019-11-02 14:09:46] InterruptStatus = 
(XEN) [2019-11-02 14:09:46] *** Host State ***
(XEN) [2019-11-02 14:09:46] RIP = 0x82d080341950
(vmx_asm_vmexit_handler)  RSP = 0x83083ff0ff70
(XEN) [2019-11-02 14:09:46] CS=e008 SS= DS= ES= FS=
GS= TR=e040
(XEN) [2019-11-02 14:09:46] FSBase=
GSBase= TRBase=83083ff14000
(XEN) [2019-11-02 14:09:46] GDTBase=83083ff03000
IDTBase=83083ff07000
(XEN) [2019-11-02 14:09:46] CR0=80050033 CR3=00054dbea000
CR4=001526e0
(XEN) [2019-11-02 14:09:46] Sysenter RSP=83083ff0ffa0
CS:RIP=e008:82d080395440
(XEN) [2019-11-02 14:09:46] EFER = 0x0d01  PAT =
0x050100070406
(XEN) [2019-11-02 14:09:46] *** Control State ***
(XEN) [2019-11-02 14:09:46] PinBased=00bf CPUBased=b62065fa
SecondaryExec=17eb
(XEN) [2019-11-02 14:09:46] EntryControls=d3ff ExitControls=002fefff
(XEN) [2019-11-02 14:09:46] ExceptionBitmap=00060002 PFECmask=
PFECmatch=
(XEN) [2019-11-02 14:09:46] VMEntry: intr_info=8501
errcode= ilen=0001
(XEN) [2019-11-02 14:09:46] VMExit: intr_info=8501
errcode= ilen=0001
(XEN) [2019-11-02 14:09:46] reason=8021
qualification=
(XEN) [2019-11-02 14:09:46] IDTVectoring: info= errcode=
(XEN) [2019-11-02 14:09:46] TSC Offset = 0xf45ded46dd57  TSC
Multiplier = 0x
(XEN) [2019-11-02 14:09:46] TPR Threshold = 0x00  PostedIntrVec = 0xf5
(XEN) [2019-11-02 14:09:46] EPT pointer = 0x00054e3a701e  EPTP
index = 0x
(XEN) [2019-11-02 14:09:46] PLE Gap=0080 Window=1000
(XEN) [2019-11-02 14:09:46] Virtual processor ID = 0x5a02 VMfunc
controls = 
(XEN) [2019-11-02 14:09:46] **
(XEN) [2019-11-02 14:09:46] domain_crash called from vmx.c:3335
(XEN) [2019-11-02 14:09:46] Domain 2 (vcpu#0) crashed on cpu#2:

Interruptibility = 0002 (Blocked by Mov SS) and VMEntry:
intr_info=8501 (ICEBP)

Dare I ask what you're running in your windows guest?  Unless it is a
vulnerability test suite, I'm rather concerned.


Because I have pulled out all stops ? Well no particular reason. I've 
asked my kids nicely not to poke any /more/ holes in the security on the 
system. Probably should tighten up the ship. I have some conflict going 
on between the hardware pci USB cards in the machine, so I thought I'd 
see what would happen if I gave ASUS and whatever no-name Taiwanese I 
have in there free rein. Nothing gained as far as I can see, so I'll see 
about closing some of the more gaping holes. At least as far as getting 
rid of deprecation warnings go :-) .


I hope "they" never get serious about requiring a license to own a 
computer with Internet access. :-)




___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 1/3] x86/boot: Remove cached CPUID data from the trampoline

2019-11-04 Thread Jan Beulich

On 04.11.2019 16:22, Andrew Cooper wrote:
> On 04/11/2019 15:03, Jan Beulich wrote:
>> On 04.11.2019 15:59, Andrew Cooper wrote:
>>> On 04/11/2019 13:25, Jan Beulich wrote:
 On 01.11.2019 21:25, Andrew Cooper wrote:
> --- a/xen/arch/x86/cpu/intel.c
> +++ b/xen/arch/x86/cpu/intel.c
> @@ -270,6 +270,7 @@ static void early_init_intel(struct cpuinfo_x86 *c)
>   if (disable) {
>   wrmsrl(MSR_IA32_MISC_ENABLE, misc_enable & ~disable);
>   bootsym(trampoline_misc_enable_off) |= disable;
> + bootsym(trampoline_efer) |= EFER_NX;
>   }
 I'm fine with all other changes here, just this one concerns me:
 Before your change we latch a value into trampoline_misc_enable_off
 just to be used for subsequent IA32_MISC_ENABLE writes we do. This
 means that, on a hypervisor (like Xen itself) simply discarding
 guest writes to the MSR (which is "fine" with the described usage
 of ours up to now), with your change we'd now end up trying to set
 EFER.NX when the feature may not actually be enabled in
 IA32_MISC_ENABLE. Wouldn't such an EFER write be liable to #GP(0)?
 I.e. don't we need to read back the MSR value here, and verify
 we actually managed to clear the bit before actually OR-ing in
 EFER_NX?
>>> If this is a problem in practice, execution won't continue beyond the
>>> next if() condition just out of context, which set EFER.NX on the BSP
>>> with an unguarded WRMSR.
>> And how is this any good? This kind of crash is exactly what I'm
>> asking to avoid.
> 
> What is the point of working around a theoretical edge case of broken
> nesting under Xen which demonstrably doesn't exist in practice?

Well, so far nothing was said about this not being an actual problem.
I simply don't know whether hardware would refuse such an EFER write.
If it does, it would be appropriate for hypervisors to also refuse
it. I.e. if we don't do so right now, correcting the behavior would
trip the code here.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [XEN PATCH for-4.13] libxl: Fix setting vncpasswd to empty string

2019-11-04 Thread Anthony PERARD

Before 93dcc22, error from setting the vnc password to an empty
string, when QEMU wasn't expected a password, never prevented the creation
of a guest, and only logged an error message.

Reported-by: Roger Pau Monné 
Fixes: 93dcc22fe798c9fa5ce117f1ed6db0d8bd779020
Signed-off-by: Anthony PERARD 
---
 tools/libxl/libxl_dm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 7e52f0973172..8e0fb78bd2f3 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -2936,7 +2936,7 @@ static void device_model_postconfig_vnc(libxl__egc *egc,
 if (rc) goto out;
 }
 
-if (vnc && vnc->passwd) {
+if (vnc && vnc->passwd && vnc->passwd[0]) {
 qmp->callback = device_model_postconfig_vnc_passwd;
 libxl__qmp_param_add_string(gc, &args, "password", vnc->passwd);
 rc = libxl__ev_qmp_send(gc, qmp, "change-vnc-password", args);
-- 
Anthony PERARD


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/3] x86/boot: Cache cpu_has_hypervisor very early on boot

2019-11-04 Thread Andrew Cooper

On 04/11/2019 13:32, Jan Beulich wrote:
> On 01.11.2019 21:25, Andrew Cooper wrote:
>> --- a/xen/arch/x86/boot/head.S
>> +++ b/xen/arch/x86/boot/head.S
>> @@ -630,6 +630,10 @@ trampoline_setup:
>>  
>>  1:
>>  /* Interrogate CPU extended features via CPUID. */
>> +mov $1, %eax
>> +cpuid
>> +mov %ecx, sym_fs(boot_cpu_data) + 
>> CPUINFO_FEATURE_OFFSET(X86_FEATURE_HYPERVISOR)
> I understand the ECX output is all we need right now. But wouldn't
> it be better to then store EDX as well (and similarly ECX of
> 0x8001 output)?

No - I don't think so.

I did debate applying an and mask for the features we only intend to be
usable this early, to avoid getting buggy code which checks for
unexpected features this early.

> Also, along the lines of a question back to Sergey on his
> standalone patch, wouldn't it be better to take the opportunity
> and check here that CPUID leaf 1 is actually valid?

There is no such thing as a 64-bit capable CPU without leaf 1.

> Of course the question on the (intended) effect of
> "cpuid=no-hypervisor" also remains. As said before, I think this
> should be honored by all of our code that possibly can (i.e. at
> least everything following command line parsing).

There is no chance of making that work in a sane way - we use
cpu_has_hypervisor for quite a few things before the command line gets
parsed.

If you're booting under a hypervisor, your hypervisor ought to have a
way to turn that off.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 1/3] x86/boot: Remove cached CPUID data from the trampoline

2019-11-04 Thread Andrew Cooper

On 04/11/2019 15:03, Jan Beulich wrote:
> On 04.11.2019 15:59, Andrew Cooper wrote:
>> On 04/11/2019 13:25, Jan Beulich wrote:
>>> On 01.11.2019 21:25, Andrew Cooper wrote:
 --- a/xen/arch/x86/cpu/intel.c
 +++ b/xen/arch/x86/cpu/intel.c
 @@ -270,6 +270,7 @@ static void early_init_intel(struct cpuinfo_x86 *c)
if (disable) {
wrmsrl(MSR_IA32_MISC_ENABLE, misc_enable & ~disable);
bootsym(trampoline_misc_enable_off) |= disable;
 +  bootsym(trampoline_efer) |= EFER_NX;
}
>>> I'm fine with all other changes here, just this one concerns me:
>>> Before your change we latch a value into trampoline_misc_enable_off
>>> just to be used for subsequent IA32_MISC_ENABLE writes we do. This
>>> means that, on a hypervisor (like Xen itself) simply discarding
>>> guest writes to the MSR (which is "fine" with the described usage
>>> of ours up to now), with your change we'd now end up trying to set
>>> EFER.NX when the feature may not actually be enabled in
>>> IA32_MISC_ENABLE. Wouldn't such an EFER write be liable to #GP(0)?
>>> I.e. don't we need to read back the MSR value here, and verify
>>> we actually managed to clear the bit before actually OR-ing in
>>> EFER_NX?
>> If this is a problem in practice, execution won't continue beyond the
>> next if() condition just out of context, which set EFER.NX on the BSP
>> with an unguarded WRMSR.
> And how is this any good? This kind of crash is exactly what I'm
> asking to avoid.

What is the point of working around a theoretical edge case of broken
nesting under Xen which demonstrably doesn't exist in practice?

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] xen/events: remove event handling recursion detection

2019-11-04 Thread Jan Beulich

On 04.11.2019 16:09, Jürgen Groß wrote:
> On 04.11.19 15:35, Jan Beulich wrote:
>> On 04.11.2019 14:58, Juergen Gross wrote:
>>> __xen_evtchn_do_upcall() contains guards against being called
>>> recursively. This mechanism was introduced in the early pvops times
>>> (kernel 2.6.26) when there were still Xen versions around not honoring
>>> disabled interrupts for sending events to pv guests.
>>>
>>> This was changed in Xen 3.0, which is much older than any Xen version
>>> supported by the kernel, so the recursion detection can be removed.
>>
>> Would you mind pointing out which exact change(s) this was(were)?
> 
> Linux kernel: 229664bee6126e01f8662976a5fe2e79813b77c8
> Xen: d8263e8dbaf5ef1445bee0662143a0fcb6d43466

Are you sure about the latter, touching only header files underneath
xen/, and there mostly public interface ones?

>> It had always been my understanding that the recursion detection
>> was mainly to guard against drivers re-enabling interrupts
>> transiently in their handlers (which in turn may no longer be an
>> issue in modern Linux kernels).
> 
> This would have been doable with a simple bool. The more complex
> xchg based logic was IMO for recursion detection at any point.

Well, the respective XenoLinux c/s 13098 has no mention of this, i.e.
it simply leaves open what the actual reason was:

"[LINUX] Disallow nested event delivery.

 This eliminates the risk of overflowing the kernel stack and is a
 reasonable policy given that we have no concept of priorities among
 event sources."

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 2/3] x86/boot: Introduce the kernel_info.setup_type_max

2019-11-04 Thread Daniel Kiper

On Fri, Nov 01, 2019 at 01:55:57PM -0700, H. Peter Anvin wrote:
> On 2019-10-24 04:48, Daniel Kiper wrote:
> > This field contains maximal allowed type for setup_data.
> >
> > Now bump the setup_header version in arch/x86/boot/header.S.
>
> Please don't bump the protocol revision here, otherwise we would create
> a very odd pseudo-revision of the protocol: 2.15 without SETUP_INDIRECT
> support, should patch 3/3 end up getting reverted.
>
> (It is possible to detect, of course, but I feel pretty sure in saying
> that bootloaders won't get it right.)
>
> Other than that:
>
> Reviewed-by: H. Peter Anvin (Intel) 

I have just sent v5. Please take a look.

Daniel

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v5 0/3] x86/boot: Introduce the kernel_info et consortes

2019-11-04 Thread Daniel Kiper

Hi,

Due to very limited space in the setup_header this patch series introduces new
kernel_info struct which will be used to convey information from the kernel to
the bootloader. This way the boot protocol can be extended regardless of the
setup_header limitations. Additionally, the patch series introduces some
convenience features like the setup_indirect struct and the
kernel_info.setup_type_max field.

Daniel

 Documentation/x86/boot.rst | 174 
++
 arch/x86/boot/Makefile |   2 +-
 arch/x86/boot/compressed/Makefile  |   4 +-
 arch/x86/boot/compressed/kaslr.c   |  12 ++
 arch/x86/boot/compressed/kernel_info.S |  22 ++
 arch/x86/boot/header.S |   3 +-
 arch/x86/boot/tools/build.c|   5 +++
 arch/x86/include/uapi/asm/bootparam.h  |  16 +++-
 arch/x86/kernel/e820.c |  11 +
 arch/x86/kernel/kdebugfs.c |  20 +++--
 arch/x86/kernel/ksysfs.c   |  30 ++
 arch/x86/kernel/setup.c|   4 ++
 arch/x86/mm/ioremap.c  |  11 +
 13 files changed, 298 insertions(+), 16 deletions(-)

Daniel Kiper (3):
  x86/boot: Introduce the kernel_info
  x86/boot: Introduce the kernel_info.setup_type_max
  x86/boot: Introduce the setup_indirect


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v5 2/3] x86/boot: Introduce the kernel_info.setup_type_max

2019-11-04 Thread Daniel Kiper

This field contains maximal allowed type for setup_data.

This patch does not bump setup_header version in arch/x86/boot/header.S
because it will be followed by additional changes coming into the
Linux/x86 boot protocol.

Suggested-by: H. Peter Anvin (Intel) 
Signed-off-by: Daniel Kiper 
Reviewed-by: Konrad Rzeszutek Wilk 
Reviewed-by: Ross Philipson 
Reviewed-by: H. Peter Anvin (Intel) 
---
v5 - suggestions/fixes:
   - move incorrect references to the setup_indirect to the
 patch introducing it,
   - do not bump setup_header version in arch/x86/boot/header.S
 (suggested by H. Peter Anvin).
---
 Documentation/x86/boot.rst | 9 -
 arch/x86/boot/compressed/kernel_info.S | 5 +
 arch/x86/include/uapi/asm/bootparam.h  | 3 +++
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/Documentation/x86/boot.rst b/Documentation/x86/boot.rst
index c60fafda9427..1dad6eee8a5c 100644
--- a/Documentation/x86/boot.rst
+++ b/Documentation/x86/boot.rst
@@ -73,7 +73,7 @@ Protocol 2.14:BURNT BY INCORRECT COMMIT 
ae7e1238e68f2a472a125673ab506d49158c188
(x86/boot: Add ACPI RSDP address to setup_header)
DO NOT USE!!! ASSUME SAME AS 2.13.
 
-Protocol 2.15: (Kernel 5.5) Added the kernel_info.
+Protocol 2.15: (Kernel 5.5) Added the kernel_info and 
kernel_info.setup_type_max.
 =  
 
 .. note::
@@ -981,6 +981,13 @@ Offset/size:   0x0008/4
   This field contains the size of the kernel_info including kernel_info.header
   and kernel_info.kernel_info_var_len_data.
 
+   ==
+Field name:setup_type_max
+Offset/size:   0x0008/4
+   ==
+
+  This field contains maximal allowed type for setup_data.
+
 
 The Image Checksum
 ==
diff --git a/arch/x86/boot/compressed/kernel_info.S 
b/arch/x86/boot/compressed/kernel_info.S
index 8ea6f6e3feef..018dacbd753e 100644
--- a/arch/x86/boot/compressed/kernel_info.S
+++ b/arch/x86/boot/compressed/kernel_info.S
@@ -1,5 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 
+#include 
+
.section ".rodata.kernel_info", "a"
 
.global kernel_info
@@ -12,6 +14,9 @@ kernel_info:
/* Size total. */
.long   kernel_info_end - kernel_info
 
+   /* Maximal allowed type for setup_data. */
+   .long   SETUP_TYPE_MAX
+
 kernel_info_var_len_data:
/* Empty for time being... */
 kernel_info_end:
diff --git a/arch/x86/include/uapi/asm/bootparam.h 
b/arch/x86/include/uapi/asm/bootparam.h
index a1ebcd7a991c..dbb41128e5a0 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -11,6 +11,9 @@
 #define SETUP_APPLE_PROPERTIES 5
 #define SETUP_JAILHOUSE6
 
+/* max(SETUP_*) */
+#define SETUP_TYPE_MAX SETUP_JAILHOUSE
+
 /* ram_size flags */
 #define RAMDISK_IMAGE_START_MASK   0x07FF
 #define RAMDISK_PROMPT_FLAG0x8000
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v5 1/3] x86/boot: Introduce the kernel_info

2019-11-04 Thread Daniel Kiper

The relationships between the headers are analogous to the various data
sections:

  setup_header = .data
  boot_params/setup_data = .bss

What is missing from the above list? That's right:

  kernel_info = .rodata

We have been (ab)using .data for things that could go into .rodata or .bss for
a long time, for lack of alternatives and -- especially early on -- inertia.
Also, the BIOS stub is responsible for creating boot_params, so it isn't
available to a BIOS-based loader (setup_data is, though).

setup_header is permanently limited to 144 bytes due to the reach of the
2-byte jump field, which doubles as a length field for the structure, combined
with the size of the "hole" in struct boot_params that a protected-mode loader
or the BIOS stub has to copy it into. It is currently 119 bytes long, which
leaves us with 25 very precious bytes. This isn't something that can be fixed
without revising the boot protocol entirely, breaking backwards compatibility.

boot_params proper is limited to 4096 bytes, but can be arbitrarily extended
by adding setup_data entries. It cannot be used to communicate properties of
the kernel image, because it is .bss and has no image-provided content.

kernel_info solves this by providing an extensible place for information about
the kernel image. It is readonly, because the kernel cannot rely on a
bootloader copying its contents anywhere, but that is OK; if it becomes
necessary it can still contain data items that an enabled bootloader would be
expected to copy into a setup_data chunk.

This patch does not bump setup_header version in arch/x86/boot/header.S
because it will be followed by additional changes coming into the
Linux/x86 boot protocol.

Suggested-by: H. Peter Anvin (Intel) 
Signed-off-by: Daniel Kiper 
Reviewed-by: Konrad Rzeszutek Wilk 
Reviewed-by: Ross Philipson 
Reviewed-by: H. Peter Anvin (Intel) 
---
v4 - suggestions/fixes:
   - improve the documentation
 (suggested by Randy Dunlap and Konrad Rzeszutek Wilk).

v3 - suggestions/fixes:
   - split kernel_info data into fixed and variable sized regions,
 (suggested by H. Peter Anvin),
   - change kernel_info.header value to "LToP" (0x506f544c),
 (suggested by H. Peter Anvin),
   - improve the comments,
   - improve the documentation.

v2 - suggestions/fixes:
   - rename setup_header2 to kernel_info,
 (suggested by H. Peter Anvin),
   - change kernel_info.header value to "InfO" (0x4f666e49),
   - new kernel_info description in Documentation/x86/boot.rst,
 (suggested by H. Peter Anvin),
   - drop kernel_info_offset_update() as an overkill and
 update kernel_info offset directly from main(),
 (suggested by Eric Snowberg),
   - new commit message
 (suggested by H. Peter Anvin),
   - fix some commit message misspellings
 (suggested by Eric Snowberg).
---
 Documentation/x86/boot.rst | 126 +
 arch/x86/boot/Makefile |   2 +-
 arch/x86/boot/compressed/Makefile  |   4 +-
 arch/x86/boot/compressed/kernel_info.S |  17 +
 arch/x86/boot/header.S |   1 +
 arch/x86/boot/tools/build.c|   5 ++
 arch/x86/include/uapi/asm/bootparam.h  |   1 +
 7 files changed, 153 insertions(+), 3 deletions(-)
 create mode 100644 arch/x86/boot/compressed/kernel_info.S

diff --git a/Documentation/x86/boot.rst b/Documentation/x86/boot.rst
index 08a2f100c0e6..c60fafda9427 100644
--- a/Documentation/x86/boot.rst
+++ b/Documentation/x86/boot.rst
@@ -68,8 +68,25 @@ Protocol 2.12(Kernel 3.8) Added the xloadflags field 
and extension fields
 Protocol 2.13  (Kernel 3.14) Support 32- and 64-bit flags being set in
xloadflags to support booting a 64-bit kernel from 32-bit
EFI
+
+Protocol 2.14: BURNT BY INCORRECT COMMIT 
ae7e1238e68f2a472a125673ab506d49158c1889
+   (x86/boot: Add ACPI RSDP address to setup_header)
+   DO NOT USE!!! ASSUME SAME AS 2.13.
+
+Protocol 2.15: (Kernel 5.5) Added the kernel_info.
 =  
 
+.. note::
+ The protocol version number should be changed only if the setup header
+ is changed. There is no need to update the version number if boot_params
+ or kernel_info are changed. Additionally, it is recommended to use
+ xloadflags (in this case the protocol version number should not be
+ updated either) or kernel_info to communicate supported Linux kernel
+ features to the boot loader. Due to very limited space available in
+ the original setup header every update to it should be considered
+ with great care. Starting from the protocol 2.15 the primary way to
+ communicate things to the boot loader is the kernel_info.
+
 
 Memory Layout
 =
@@ -207,6 +224,7 @@ Offset/Size Proto   NameMeaning
 0258/8 2.10+   pref_addressPreferred loading 
address
 0260/4 2.10+   init_siz

[Xen-devel] [PATCH v5 3/3] x86/boot: Introduce the setup_indirect

2019-11-04 Thread Daniel Kiper

The setup_data is a bit awkward to use for extremely large data objects,
both because the setup_data header has to be adjacent to the data object
and because it has a 32-bit length field. However, it is important that
intermediate stages of the boot process have a way to identify which
chunks of memory are occupied by kernel data. Thus we introduce an uniform
way to specify such indirect data as setup_indirect struct and
SETUP_INDIRECT type.

And finally bump setup_header version in arch/x86/boot/header.S.

Suggested-by: H. Peter Anvin (Intel) 
Signed-off-by: Daniel Kiper 
Acked-by: Konrad Rzeszutek Wilk 
Reviewed-by: Ross Philipson 
Reviewed-by: H. Peter Anvin (Intel) 
---
v5 - suggestions/fixes:
   - bump setup_header version in arch/x86/boot/header.S
 (suggested by H. Peter Anvin).

v4 - suggestions/fixes:
   - change "Note:" to ".. note::".

v3 - suggestions/fixes:
   - add setup_indirect mapping/KASLR avoidance/etc. code
 (suggested by H. Peter Anvin),
   - the SETUP_INDIRECT sets most significant bit right now;
 this way it is possible to differentiate regular setup_data
 and setup_indirect objects in the debugfs filesystem.

v2 - suggestions/fixes:
   - add setup_indirect usage example
 (suggested by Eric Snowberg and Ross Philipson).
---
 Documentation/x86/boot.rst | 43 +-
 arch/x86/boot/compressed/kaslr.c   | 12 ++
 arch/x86/boot/compressed/kernel_info.S |  2 +-
 arch/x86/boot/header.S |  2 +-
 arch/x86/include/uapi/asm/bootparam.h  | 16 ++---
 arch/x86/kernel/e820.c | 11 +
 arch/x86/kernel/kdebugfs.c | 20 
 arch/x86/kernel/ksysfs.c   | 30 ++--
 arch/x86/kernel/setup.c|  4 
 arch/x86/mm/ioremap.c  | 11 +
 10 files changed, 134 insertions(+), 17 deletions(-)

diff --git a/Documentation/x86/boot.rst b/Documentation/x86/boot.rst
index 1dad6eee8a5c..38155ba8740f 100644
--- a/Documentation/x86/boot.rst
+++ b/Documentation/x86/boot.rst
@@ -827,6 +827,47 @@ Protocol:  2.09+
   sure to consider the case where the linked list already contains
   entries.
 
+  The setup_data is a bit awkward to use for extremely large data objects,
+  both because the setup_data header has to be adjacent to the data object
+  and because it has a 32-bit length field. However, it is important that
+  intermediate stages of the boot process have a way to identify which
+  chunks of memory are occupied by kernel data.
+
+  Thus setup_indirect struct and SETUP_INDIRECT type were introduced in
+  protocol 2.15.
+
+  struct setup_indirect {
+__u32 type;
+__u32 reserved;  /* Reserved, must be set to zero. */
+__u64 len;
+__u64 addr;
+  };
+
+  The type member is a SETUP_INDIRECT | SETUP_* type. However, it cannot be
+  SETUP_INDIRECT itself since making the setup_indirect a tree structure
+  could require a lot of stack space in something that needs to parse it
+  and stack space can be limited in boot contexts.
+
+  Let's give an example how to point to SETUP_E820_EXT data using 
setup_indirect.
+  In this case setup_data and setup_indirect will look like this:
+
+  struct setup_data {
+__u64 next = 0 or ;
+__u32 type = SETUP_INDIRECT;
+__u32 len = sizeof(setup_data);
+__u8 data[sizeof(setup_indirect)] = struct setup_indirect {
+  __u32 type = SETUP_INDIRECT | SETUP_E820_EXT;
+  __u32 reserved = 0;
+  __u64 len = ;
+  __u64 addr = ;
+}
+  }
+
+.. note::
+ SETUP_INDIRECT | SETUP_NONE objects cannot be properly distinguished
+ from SETUP_INDIRECT itself. So, this kind of objects cannot be provided
+ by the bootloaders.
+
    
 Field name:pref_address
 Type:  read (reloc)
@@ -986,7 +1027,7 @@ Field name:setup_type_max
 Offset/size:   0x0008/4
    ==
 
-  This field contains maximal allowed type for setup_data.
+  This field contains maximal allowed type for setup_data and setup_indirect 
structs.
 
 
 The Image Checksum
diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 2e53c056ba20..bb9bfef174ae 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -459,6 +459,18 @@ static bool mem_avoid_overlap(struct mem_vector *img,
is_overlapping = true;
}
 
+   if (ptr->type == SETUP_INDIRECT &&
+   ((struct setup_indirect *)ptr->data)->type != 
SETUP_INDIRECT) {
+   avoid.start = ((struct setup_indirect 
*)ptr->data)->addr;
+   avoid.size = ((struct setup_indirect *)ptr->data)->len;
+
+   if (mem_overlaps(img, &avoid) && (avoid.start < 
earliest)) {
+   *overlap = avoid;
+   earliest = overlap->start;
+   is_over

Re: [Xen-devel] [PATCH] xen/events: remove event handling recursion detection

2019-11-04 Thread Jürgen Groß


On 04.11.19 15:35, Jan Beulich wrote:

On 04.11.2019 14:58, Juergen Gross wrote:

__xen_evtchn_do_upcall() contains guards against being called
recursively. This mechanism was introduced in the early pvops times
(kernel 2.6.26) when there were still Xen versions around not honoring
disabled interrupts for sending events to pv guests.

This was changed in Xen 3.0, which is much older than any Xen version
supported by the kernel, so the recursion detection can be removed.


Would you mind pointing out which exact change(s) this was(were)?


Linux kernel: 229664bee6126e01f8662976a5fe2e79813b77c8
Xen: d8263e8dbaf5ef1445bee0662143a0fcb6d43466


It had always been my understanding that the recursion detection
was mainly to guard against drivers re-enabling interrupts
transiently in their handlers (which in turn may no longer be an
issue in modern Linux kernels).


This would have been doable with a simple bool. The more complex
xchg based logic was IMO for recursion detection at any point.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 1/3] x86/boot: Remove cached CPUID data from the trampoline

2019-11-04 Thread Jan Beulich

On 04.11.2019 15:59, Andrew Cooper wrote:
> On 04/11/2019 13:25, Jan Beulich wrote:
>> On 01.11.2019 21:25, Andrew Cooper wrote:
>>> --- a/xen/arch/x86/cpu/intel.c
>>> +++ b/xen/arch/x86/cpu/intel.c
>>> @@ -270,6 +270,7 @@ static void early_init_intel(struct cpuinfo_x86 *c)
>>> if (disable) {
>>> wrmsrl(MSR_IA32_MISC_ENABLE, misc_enable & ~disable);
>>> bootsym(trampoline_misc_enable_off) |= disable;
>>> +   bootsym(trampoline_efer) |= EFER_NX;
>>> }
>> I'm fine with all other changes here, just this one concerns me:
>> Before your change we latch a value into trampoline_misc_enable_off
>> just to be used for subsequent IA32_MISC_ENABLE writes we do. This
>> means that, on a hypervisor (like Xen itself) simply discarding
>> guest writes to the MSR (which is "fine" with the described usage
>> of ours up to now), with your change we'd now end up trying to set
>> EFER.NX when the feature may not actually be enabled in
>> IA32_MISC_ENABLE. Wouldn't such an EFER write be liable to #GP(0)?
>> I.e. don't we need to read back the MSR value here, and verify
>> we actually managed to clear the bit before actually OR-ing in
>> EFER_NX?
> 
> If this is a problem in practice, execution won't continue beyond the
> next if() condition just out of context, which set EFER.NX on the BSP
> with an unguarded WRMSR.

And how is this any good? This kind of crash is exactly what I'm
asking to avoid.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 1/3] x86/boot: Remove cached CPUID data from the trampoline

2019-11-04 Thread Andrew Cooper

On 04/11/2019 13:25, Jan Beulich wrote:
> On 01.11.2019 21:25, Andrew Cooper wrote:
>> --- a/xen/arch/x86/cpu/intel.c
>> +++ b/xen/arch/x86/cpu/intel.c
>> @@ -270,6 +270,7 @@ static void early_init_intel(struct cpuinfo_x86 *c)
>>  if (disable) {
>>  wrmsrl(MSR_IA32_MISC_ENABLE, misc_enable & ~disable);
>>  bootsym(trampoline_misc_enable_off) |= disable;
>> +bootsym(trampoline_efer) |= EFER_NX;
>>  }
> I'm fine with all other changes here, just this one concerns me:
> Before your change we latch a value into trampoline_misc_enable_off
> just to be used for subsequent IA32_MISC_ENABLE writes we do. This
> means that, on a hypervisor (like Xen itself) simply discarding
> guest writes to the MSR (which is "fine" with the described usage
> of ours up to now), with your change we'd now end up trying to set
> EFER.NX when the feature may not actually be enabled in
> IA32_MISC_ENABLE. Wouldn't such an EFER write be liable to #GP(0)?
> I.e. don't we need to read back the MSR value here, and verify
> we actually managed to clear the bit before actually OR-ing in
> EFER_NX?

If this is a problem in practice, execution won't continue beyond the
next if() condition just out of context, which set EFER.NX on the BSP
with an unguarded WRMSR.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v1 1/2] libxl: introduce new backend type VINPUT

2019-11-04 Thread Oleksandr Grytsov

On Mon, Oct 28, 2019 at 4:06 PM Oleksandr Grytsov  wrote:
>
> On Wed, Oct 16, 2019 at 4:26 PM Oleksandr Grytsov  wrote:
> >
> > On Fri, Oct 11, 2019 at 8:04 PM Ian Jackson  wrote:
> > >
> > > Oleksandr Grytsov writes ("Re: [PATCH v1 1/2] libxl: introduce new 
> > > backend type VINPUT"):
> > > > On Fri, Oct 11, 2019 at 5:58 PM Ian Jackson  
> > > > wrote:
> > > > > I think it was a48e00f14a2d "libxl: add backend type and id to vkb"
> > > > > which introduced what you are calling "user space" backends.  It
> > > > > appears that the vkb backend type enum was introduced there
> > > > > specifically to distinguish between these two situations.  For reasons
> > > > >
> > > > > Am I wrong ?  If not, why is this not working or not suitable ?
> > > >
> > > > You are right. It is not working in some cases due to QEMU_BACKEND 
> > > > macro.
> > > > QEMU_BACKEND treats both backends as QEMU. As result xl performs device
> > > > hotplug on add/remove device. Which is not expected in case "user
> > > > space" backend.
> > >
> > > Then perhaps this macro needs to be adjusted or called only
> > > conditionally or something ?
> >
> > I had an idea to move this macro to libxl__device_type and let device
> > itself decides
> > if it is qemu backend. But in case of VKBD it will read XS to determine 
> > backend
> > type. I guess it is ok.
> >
> > >
> > > > In this patch I propose solution similar to VUSB device. Where VUSB
> > > > used for frontend and depends on backend (kernel or QEMU) either
> > > > VUSB or QUSB used for backend device type e.g. use different backend
> > > > device type for different backends.
> > >
> > > I confess I don't quite follow all the VUSB stuff but I don't think it
> > > is necessarily a good model.
> >
> > If you don't mind to move QEMU_BACKEND macrto to libxl__device_type then
> > no need to add new device type at all.
> >
> > >
> > > > Introducing new backend device type for VKBD also allows to have
> > > > both backends (QEMU and non QEMU) run in same domain. Not sure if it
> > > > is useful scenario but with this proposal it is possible from
> > > > technical point of view.
> > >
> > > I don't understand why this is not possible simply by having a
> > > different backend type.
> > >
> > > > > I also don't understand why the "user space" kbd backend seems to be
> > > > > "linux" in the enum.
> > > >
> > > > I agree this is not so good name. But I don't know how to call
> > > > backends which are not running
> > > > inside QEMU in general.
> > >
> > > I think this would be useable on freebsd ?  "linux" definitely seems
> > > wrong.  I see it hasn't been in a release so it is not too late to
> > > rename it, subject to discussion with Juergen as RM.
> > >
> > > > > Where is the implementation of this user space
> > > > > backend ?
> > > >
> > > > We have extended kbd interface (kbdif.h) to support multi-touch events
> > > > as well. And we have
> > > > implemented own kbd backend https://github.com/xen-troops/displ_be/
> > > > It is integrated with display backend as both use wayland API.
> > >
> > > Great.
> > >
> > > > > Is it be controlled entirely through xenstore ?
> > > >
> > > > Yes it is controlled entirely through xenstore: lib xl creates
> > > > frontend/backend entries with
> > > > initial states and configuration.
> > >
> > > And your display backend in "troops" (is that the name of your
> > > project) checks whether the backend type is set to "linux", so that it
> > > knows to ignore ones that say "qemu" ?
> > >
> > > Maybe "linux" should be "troops"...
> > >
> >
> > It doesn't look as generic solution. If some user implements own backend
> > it should add new entry into backend type enum.
> > What about to have just string value instead of enum? In case QEMU
> > we don't have such entry at all but in case custom backend the user
> > can't put any string value here to be recognized by his backend.
> >
> > > Ian.
>
> ping

ping

Ian,

I'm waiting for your comments about following questions:

1. Move QEMU_BACKEND macro to libxl__device_type structure as function
and let the device to decide it has QEMU backend:

struct libxl__device_type {
...
device_qemu_backend_fn_t qemu_backend
}

In this case, introducing new device type for VKBD is not needed. The VKBD
device will check backend type XS entry to define which backend is running.

2. Use string type for VKBD backend_type field instead of enum. It will be
empty for QEMU and generic for "user space" backends.

Thanks.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] xen/events: remove event handling recursion detection

2019-11-04 Thread Jan Beulich

On 04.11.2019 14:58, Juergen Gross wrote:
> __xen_evtchn_do_upcall() contains guards against being called
> recursively. This mechanism was introduced in the early pvops times
> (kernel 2.6.26) when there were still Xen versions around not honoring
> disabled interrupts for sending events to pv guests.
> 
> This was changed in Xen 3.0, which is much older than any Xen version
> supported by the kernel, so the recursion detection can be removed.

Would you mind pointing out which exact change(s) this was(were)?
It had always been my understanding that the recursion detection
was mainly to guard against drivers re-enabling interrupts
transiently in their handlers (which in turn may no longer be an
issue in modern Linux kernels).

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH] xen/events: remove event handling recursion detection

2019-11-04 Thread Juergen Gross

__xen_evtchn_do_upcall() contains guards against being called
recursively. This mechanism was introduced in the early pvops times
(kernel 2.6.26) when there were still Xen versions around not honoring
disabled interrupts for sending events to pv guests.

This was changed in Xen 3.0, which is much older than any Xen version
supported by the kernel, so the recursion detection can be removed.

Signed-off-by: Juergen Gross 
---
 drivers/xen/events/events_base.c | 16 +++-
 1 file changed, 3 insertions(+), 13 deletions(-)

diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 6c8843968a52..33212c494afd 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -1213,31 +1213,21 @@ void xen_send_IPI_one(unsigned int cpu, enum ipi_vector 
vector)
notify_remote_via_irq(irq);
 }
 
-static DEFINE_PER_CPU(unsigned, xed_nesting_count);
-
 static void __xen_evtchn_do_upcall(void)
 {
struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu);
-   int cpu = get_cpu();
-   unsigned count;
+   int cpu = smp_processor_id();
 
do {
vcpu_info->evtchn_upcall_pending = 0;
 
-   if (__this_cpu_inc_return(xed_nesting_count) - 1)
-   goto out;
-
xen_evtchn_handle_events(cpu);
 
BUG_ON(!irqs_disabled());
 
-   count = __this_cpu_read(xed_nesting_count);
-   __this_cpu_write(xed_nesting_count, 0);
-   } while (count != 1 || vcpu_info->evtchn_upcall_pending);
-
-out:
+   rmb(); /* Hypervisor can set upcall pending. */
 
-   put_cpu();
+   } while (vcpu_info->evtchn_upcall_pending);
 }
 
 void xen_evtchn_do_upcall(struct pt_regs *regs)
-- 
2.16.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 3/3] x86/e820: fix 640k - 1M region reservation logic

2019-11-04 Thread Jan Beulich

On 01.11.2019 21:25, Andrew Cooper wrote:
> From: Sergey Dyasli 
> 
> Converting a guest from PV to PV-in-PVH makes the guest to have 384k
> less memory, which may confuse guest's balloon driver. This happens
> because Xen unconditionally reserves 640k - 1M region in E820 despite
> the fact that it's really a usable RAM in PVH boot mode.
> 
> Fix this by skipping region type change in virtualised environments,
> trusting whatever memory map our hypervisor has provided.
> 
> Signed-off-by: Sergey Dyasli 

Reviewed-by: Jan Beulich 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/3] x86/boot: Cache cpu_has_hypervisor very early on boot

2019-11-04 Thread Jan Beulich

On 01.11.2019 21:25, Andrew Cooper wrote:
> --- a/xen/arch/x86/boot/head.S
> +++ b/xen/arch/x86/boot/head.S
> @@ -630,6 +630,10 @@ trampoline_setup:
>  
>  1:
>  /* Interrogate CPU extended features via CPUID. */
> +mov $1, %eax
> +cpuid
> +mov %ecx, sym_fs(boot_cpu_data) + 
> CPUINFO_FEATURE_OFFSET(X86_FEATURE_HYPERVISOR)

I understand the ECX output is all we need right now. But wouldn't
it be better to then store EDX as well (and similarly ECX of
0x8001 output)?

Also, along the lines of a question back to Sergey on his
standalone patch, wouldn't it be better to take the opportunity
and check here that CPUID leaf 1 is actually valid?

Of course the question on the (intended) effect of
"cpuid=no-hypervisor" also remains. As said before, I think this
should be honored by all of our code that possibly can (i.e. at
least everything following command line parsing).

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] Invalid guest state in Xen master, dom0 linux-5.3.6, domU windows 10

2019-11-04 Thread Andrew Cooper

On 03/11/2019 10:23, Håkon Alstadheim wrote:

> (XEN) [2019-11-02 14:09:46] d2v0 vmentry failure (reason 0x8021):
> Invalid guest state (0)
> (XEN) [2019-11-02 14:09:46] * VMCS Area **
> (XEN) [2019-11-02 14:09:46] *** Guest State ***
> (XEN) [2019-11-02 14:09:46] CR0: actual=0x80050031,
> shadow=0x80050031, gh_mask=
> (XEN) [2019-11-02 14:09:46] CR4: actual=0x00172678,
> shadow=0x00170678, gh_mask=ffe8f860
> (XEN) [2019-11-02 14:09:46] CR3 = 0x001aa002
> (XEN) [2019-11-02 14:09:46] RSP = 0x8c0f4dd71e98
> (0x8c0f4dd71e98)  RIP = 0xd18a040bb75e (0xd18a040bb75e)
> (XEN) [2019-11-02 14:09:46] RFLAGS=0x0187 (0x0187)  DR7 =
> 0x0400
> (XEN) [2019-11-02 14:09:46] Sysenter RSP=
> CS:RIP=:
> (XEN) [2019-11-02 14:09:46]    sel  attr  limit   base
> (XEN) [2019-11-02 14:09:46]   CS: 0010 0209b  
> (XEN) [2019-11-02 14:09:46]   DS: 002b 0c0f3  
> (XEN) [2019-11-02 14:09:46]   SS: 0018 04093  
> (XEN) [2019-11-02 14:09:46]   ES: 002b 0c0f3  
> (XEN) [2019-11-02 14:09:46]   FS: 0053 040f3 3c00 
> (XEN) [2019-11-02 14:09:46]   GS: 002b 0c0f3  f8044ff8
> (XEN) [2019-11-02 14:09:46] GDTR:    0057 f80459c61fb0
> (XEN) [2019-11-02 14:09:46] LDTR:  1c000  
> (XEN) [2019-11-02 14:09:46] IDTR:    012f d18a014a0980
> (XEN) [2019-11-02 14:09:46]   TR: 0040 0008b 0067 f80459c6
> (XEN) [2019-11-02 14:09:46] EFER(VMCS) = 0x0d01  PAT =
> 0x0007010600070106
> (XEN) [2019-11-02 14:09:46] PreemptionTimer = 0x  SM Base =
> 0x
> (XEN) [2019-11-02 14:09:46] DebugCtl = 0x 
> DebugExceptions = 0x
> (XEN) [2019-11-02 14:09:46] Interruptibility = 0002  ActivityState
> = 
> (XEN) [2019-11-02 14:09:46] InterruptStatus = 
> (XEN) [2019-11-02 14:09:46] *** Host State ***
> (XEN) [2019-11-02 14:09:46] RIP = 0x82d080341950
> (vmx_asm_vmexit_handler)  RSP = 0x83083ff0ff70
> (XEN) [2019-11-02 14:09:46] CS=e008 SS= DS= ES= FS=
> GS= TR=e040
> (XEN) [2019-11-02 14:09:46] FSBase=
> GSBase= TRBase=83083ff14000
> (XEN) [2019-11-02 14:09:46] GDTBase=83083ff03000
> IDTBase=83083ff07000
> (XEN) [2019-11-02 14:09:46] CR0=80050033 CR3=00054dbea000
> CR4=001526e0
> (XEN) [2019-11-02 14:09:46] Sysenter RSP=83083ff0ffa0
> CS:RIP=e008:82d080395440
> (XEN) [2019-11-02 14:09:46] EFER = 0x0d01  PAT =
> 0x050100070406
> (XEN) [2019-11-02 14:09:46] *** Control State ***
> (XEN) [2019-11-02 14:09:46] PinBased=00bf CPUBased=b62065fa
> SecondaryExec=17eb
> (XEN) [2019-11-02 14:09:46] EntryControls=d3ff ExitControls=002fefff
> (XEN) [2019-11-02 14:09:46] ExceptionBitmap=00060002 PFECmask=
> PFECmatch=
> (XEN) [2019-11-02 14:09:46] VMEntry: intr_info=8501
> errcode= ilen=0001
> (XEN) [2019-11-02 14:09:46] VMExit: intr_info=8501
> errcode= ilen=0001
> (XEN) [2019-11-02 14:09:46] reason=8021
> qualification=
> (XEN) [2019-11-02 14:09:46] IDTVectoring: info= errcode=
> (XEN) [2019-11-02 14:09:46] TSC Offset = 0xf45ded46dd57  TSC
> Multiplier = 0x
> (XEN) [2019-11-02 14:09:46] TPR Threshold = 0x00  PostedIntrVec = 0xf5
> (XEN) [2019-11-02 14:09:46] EPT pointer = 0x00054e3a701e  EPTP
> index = 0x
> (XEN) [2019-11-02 14:09:46] PLE Gap=0080 Window=1000
> (XEN) [2019-11-02 14:09:46] Virtual processor ID = 0x5a02 VMfunc
> controls = 
> (XEN) [2019-11-02 14:09:46] **
> (XEN) [2019-11-02 14:09:46] domain_crash called from vmx.c:3335
> (XEN) [2019-11-02 14:09:46] Domain 2 (vcpu#0) crashed on cpu#2:

Interruptibility = 0002 (Blocked by Mov SS) and VMEntry:
intr_info=8501 (ICEBP)

Dare I ask what you're running in your windows guest?  Unless it is a
vulnerability test suite, I'm rather concerned.

Anyway - there are plenty of #DB injection corner cases, some of which
I'm still working through with Intel.  This is a new one I wasn't aware
of before, but its not surprising in hindsight.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 1/3] x86/boot: Remove cached CPUID data from the trampoline

2019-11-04 Thread Jan Beulich

On 01.11.2019 21:25, Andrew Cooper wrote:
> --- a/xen/arch/x86/cpu/intel.c
> +++ b/xen/arch/x86/cpu/intel.c
> @@ -270,6 +270,7 @@ static void early_init_intel(struct cpuinfo_x86 *c)
>   if (disable) {
>   wrmsrl(MSR_IA32_MISC_ENABLE, misc_enable & ~disable);
>   bootsym(trampoline_misc_enable_off) |= disable;
> + bootsym(trampoline_efer) |= EFER_NX;
>   }

I'm fine with all other changes here, just this one concerns me:
Before your change we latch a value into trampoline_misc_enable_off
just to be used for subsequent IA32_MISC_ENABLE writes we do. This
means that, on a hypervisor (like Xen itself) simply discarding
guest writes to the MSR (which is "fine" with the described usage
of ours up to now), with your change we'd now end up trying to set
EFER.NX when the feature may not actually be enabled in
IA32_MISC_ENABLE. Wouldn't such an EFER write be liable to #GP(0)?
I.e. don't we need to read back the MSR value here, and verify
we actually managed to clear the bit before actually OR-ing in
EFER_NX?

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/2] x86: explicitly disallow guest access to PPIN

2019-11-04 Thread Jan Beulich

On 01.11.2019 19:49, Andrew Cooper wrote:
> On 01/11/2019 14:29, Andrew Cooper wrote:
>> On 01/11/2019 14:00, Eslam Elnikety wrote:
>>> Thanks for this series, Jan.
>>>
>>> On 30.10.19 11:39, Jan Beulich wrote:
 To fulfill the "protected" in its name, don't let the real hardware
 values "shine through". Report a control register value expressing this.

 Signed-off-by: Jan Beulich 
 ---
 TBD: Do we want to permit Dom0 access?
>>> It would be nice to give an administrator a way to get PPIN outside
>>> the context of an MCE when needed.
>> I suppose this is a reasonable request.  We should expose it to the
>> hardware domain.
> 
> Actually on further thoughts, I'm going to backtrack slightly.
> 
> It is reasonable to give to dom0, but it is not reasonable to provide it
> by emulating the MSR interface.  The problem is that dom0's result is
> sensitive to where it happens to be scheduled.
> 
> The only sane way of letting dom0 have access is via a hypercall which
> returns "no PPIN" or all sockets information in one go, irrespective of
> which socket the current vcpu happens to be executing on.
> 
> This leaves us back in the (substantially easier) position of not having
> to virtualise the MSR interface to begin with.

Right, hence my question whether to make this a new sysctl subop,
or whether to permit reading of this one MSR via XENPF_resource_op
(or yet some other mechanism). Personally I'd go the
XENPF_resource_op route.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.13 v2] passthrough: simplify locking and logging

2019-11-04 Thread Durrant, Paul

> -Original Message-
> From: Anthony PERARD 
> Sent: 04 November 2019 12:14
> To: Durrant, Paul 
> Cc: Andrew Cooper ; xen-
> de...@lists.xenproject.org; jgr...@suse.com; Igor Druzhinin
> ; jbeul...@suse.com
> Subject: Re: [Xen-devel] [PATCH for-4.13 v2] passthrough: simplify locking
> and logging
> 
> On Mon, Nov 04, 2019 at 11:13:48AM +, Durrant, Paul wrote:
> > > -Original Message-
> > > From: Andrew Cooper 
> > > Sent: 04 November 2019 11:06
> > > To: Durrant, Paul ; xen-
> de...@lists.xenproject.org
> > > Cc: Igor Druzhinin ; jgr...@suse.com;
> > > jbeul...@suse.com
> > > Subject: Re: [Xen-devel] [PATCH for-4.13 v2] passthrough: simplify
> locking
> > > and logging
> > >
> > > On 04/11/2019 08:31, Durrant, Paul wrote:
> > > >> -Original Message-
> > > >> From: Igor Druzhinin 
> > > >> Sent: 01 November 2019 19:28
> > > >> To: xen-devel@lists.xenproject.org
> > > >> Cc: Durrant, Paul ; jbeul...@suse.com;
> > > >> jgr...@suse.com
> > > >> Subject: [PATCH for-4.13 v2] passthrough: simplify locking and
> logging
> > > >>
> > > >> From: Paul Durrant 
> > > >>
> > > >>
> > > >> Signed-off-by: Paul Durrant 
> > > >> ---
> > > >>
> > > >>
> > > >> v2: updated Paul's email address
> > >
> > > This was work you did at Citrix, yes?
> > >
> > > > Reviewed-by: Paul Durrant 
> > >
> > > SoB and R-by?
> >
> > I did do the work while I was at Citrix, but surely the SoB must be
> updated since the patch is only now being posted?
> 
> I don't think it matters when a patch is publicly posted, the SoB
> shouldn't change.
> Also, Igor, I think you need to add your own SoB to the patch. This would
> be because of (b) or (c) of the "Developer's Certificate of Origin 1.1"
> [1].
> 
> > As for the R-b, why should that be historic?
> 
> I think he meant that reviewing your own work is a bit weird. On the
> other hand, it is possible to have both a SoB and a R-b from the same
> persone, if the original patch has been modified.

I was merely reviewing the change of email address and verifying that it was 
the patch I wrote :-)

 Paul

> 
> 
> [1]:
> Developer's Certificate of Origin 1.1
> 
> By making a contribution to this project, I certify that:
> 
> (a) The contribution was created in whole or in part by me and I
> have the right to submit it under the open source license
> indicated in the file; or
> 
> (b) The contribution is based upon previous work that, to the best
> of my knowledge, is covered under an appropriate open source
> license and I have the right under that license to submit that
> work with modifications, whether created in whole or in part
> by me, under the same open source license (unless I am
> permitted to submit under a different license), as indicated
> in the file; or
> 
> (c) The contribution was provided directly to me by some other
> person who certified (a), (b) or (c) and I have not modified
> it.
> 
> (d) I understand and agree that this project and the contribution
> are public and that a record of the contribution (including all
> personal information I submit with it, including my sign-off) is
> maintained indefinitely and may be redistributed consistent with
> this project or the open source license(s) involved.
> 
> 
> Cheers,
> 
> --
> Anthony PERARD

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.13 v2] passthrough: simplify locking and logging

2019-11-04 Thread Anthony PERARD

On Mon, Nov 04, 2019 at 11:13:48AM +, Durrant, Paul wrote:
> > -Original Message-
> > From: Andrew Cooper 
> > Sent: 04 November 2019 11:06
> > To: Durrant, Paul ; xen-devel@lists.xenproject.org
> > Cc: Igor Druzhinin ; jgr...@suse.com;
> > jbeul...@suse.com
> > Subject: Re: [Xen-devel] [PATCH for-4.13 v2] passthrough: simplify locking
> > and logging
> > 
> > On 04/11/2019 08:31, Durrant, Paul wrote:
> > >> -Original Message-
> > >> From: Igor Druzhinin 
> > >> Sent: 01 November 2019 19:28
> > >> To: xen-devel@lists.xenproject.org
> > >> Cc: Durrant, Paul ; jbeul...@suse.com;
> > >> jgr...@suse.com
> > >> Subject: [PATCH for-4.13 v2] passthrough: simplify locking and logging
> > >>
> > >> From: Paul Durrant 
> > >>
> > >>
> > >> Signed-off-by: Paul Durrant 
> > >> ---
> > >>
> > >>
> > >> v2: updated Paul's email address
> > 
> > This was work you did at Citrix, yes?
> > 
> > > Reviewed-by: Paul Durrant 
> > 
> > SoB and R-by?
> 
> I did do the work while I was at Citrix, but surely the SoB must be updated 
> since the patch is only now being posted?

I don't think it matters when a patch is publicly posted, the SoB
shouldn't change.
Also, Igor, I think you need to add your own SoB to the patch. This would
be because of (b) or (c) of the "Developer's Certificate of Origin 1.1" [1].

> As for the R-b, why should that be historic?

I think he meant that reviewing your own work is a bit weird. On the
other hand, it is possible to have both a SoB and a R-b from the same
persone, if the original patch has been modified.

[1]:
Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
have the right to submit it under the open source license
indicated in the file; or

(b) The contribution is based upon previous work that, to the best
of my knowledge, is covered under an appropriate open source
license and I have the right under that license to submit that
work with modifications, whether created in whole or in part
by me, under the same open source license (unless I am
permitted to submit under a different license), as indicated
in the file; or

(c) The contribution was provided directly to me by some other
person who certified (a), (b) or (c) and I have not modified
it.

(d) I understand and agree that this project and the contribution
are public and that a record of the contribution (including all
personal information I submit with it, including my sign-off) is
maintained indefinitely and may be redistributed consistent with
this project or the open source license(s) involved.

Cheers,

-- 
Anthony PERARD

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [xen-unstable test] 143563: regressions - FAIL

2019-11-04 Thread osstest service owner

flight 143563 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/143563/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-shadow18 guest-localmigrate/x10   fail REGR. vs. 142750
 test-amd64-amd64-xl-qcow2   19 guest-start/debian.repeat fail REGR. vs. 142750
 test-amd64-amd64-libvirt-vhd 17 guest-start/debian.repeat fail REGR. vs. 142750
 test-amd64-i386-xl-raw  19 guest-start/debian.repeat fail REGR. vs. 142750
 test-armhf-armhf-xl-vhd 15 guest-start/debian.repeat fail REGR. vs. 142750
 test-armhf-armhf-libvirt-raw 15 guest-start/debian.repeat fail REGR. vs. 142750

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds 12 guest-start  fail REGR. vs. 142750

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 142750
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 142750
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 142750
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 142750
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 142750
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 142750
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 142750
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 142750
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 142750
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass
 test-amd64-amd64

Re: [Xen-devel] [osstest test] 143493: regressions - FAIL

2019-11-04 Thread Ian Jackson

osstest service owner writes ("[osstest test] 143493: regressions - FAIL"):
> flight 143493 osstest real [real]
> http://logs.test-lab.xenproject.org/osstest/logs/143493/
> 
> Regressions :-(
> 
> Tests which did not succeed and are blocking,
> including tests which could not be run:
>  test-amd64-amd64-pair10 xen-boot/src_hostfail REGR. vs. 
> 143392

I don't know what this is.  Linux fails to boot under Xen.  The last
message is
  random: crng init done
But it doesn't seem at all probable that this is anything to do
with osstest.

>  test-amd64-amd64-xl-pvshim   18 guest-localmigrate/x10   fail REGR. vs. 
> 143392

This is the known pvshim timekeeping problem.

  there's a time jump in the pvshim, which likely
 triggers the watchdog:
   
http://logs.test-lab.xenproject.org/osstest/logs/143493/test-amd64-amd64-xl-pvshim/pinot1---var-log-xen-console-guest-debian.guest.osstest--incoming.log
   
So following irc discussion I have force pushed this.  I have also
pushed the bootloader console change (for the benefit of arm) to
osstest pretest, but in the meantime when we have flights where that's
the only problem we can force push the relevant branch.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] Community Call: Call for Agenda Items and call details for Nov 7, 15:00 - 16:00 UTC

2019-11-04 Thread Lars Kurth

Dear community members,
 
please send me agenda items for next week’s community call. A draft agenda is 
at https://cryptpad.fr/pad/#/2/pad/edit/SkeU+Z5J9WIIU9ZsXlojiXcQ/
Please add agenda items to the document or reply to this e-mail
Note that I am on PTO today and tomorrow
 
Last month’s minutes are at 
https://cryptpad.fr/pad/#/2/pad/edit/4FGEw81flPUiivkjkuvQJ-CK/
 
Best Regards
Lars

## Meeting time (please double check the times
15:00 - 16:00 UTC
07:00 - 08:00 PST (San Francisco) - sorry for the early time slot. If this is a 
problem, let's discuss at the call
10:00 - 11:00 EST (New York)
15:00 - 16:00 FMT (London)
16:00 - 17:00 CET (Berlin)
23:00 - 22:00 CST (Beijing)
Further International meeting times: 
https://www.timeanddate.com/worldclock/meetingdetails.html?year=2018&month=11&day=7&hour=15&min=0&sec=0&p1=224&p2=24&p3=179&p4=136&p5=37&p6=33

## Dial in details
Web: https://www.gotomeet.me/larskurth

You can also dial in using your phone.
Access Code: 906-886-965

China (Toll Free): 4008 811084
Germany: +49 692 5736 7317
Poland (Toll Free): 00 800 1124759
United Kingdom: +44 330 221 0088
United States: +1 (571) 317-3129

More phone numbers
Australia: +61 2 9087 3604
Austria: +43 7 2081 5427
Argentina (Toll Free): 0 800 444 3375
Bahrain (Toll Free): 800 81 111
Belarus (Toll Free): 8 820 0011 0400
Belgium: +32 28 93 7018
Brazil (Toll Free): 0 800 047 4906
Bulgaria (Toll Free): 00800 120 4417
Canada: +1 (647) 497-9391
Chile (Toll Free): 800 395 150
Colombia (Toll Free): 01 800 518 4483
Czech Republic (Toll Free): 800 500448
Denmark: +45 32 72 03 82
Finland: +358 923 17 0568
France: +33 170 950 594
Greece (Toll Free): 00 800 4414 3838
Hong Kong (Toll Free): 30713169
Hungary (Toll Free): (06) 80 986 255
Iceland (Toll Free): 800 7204
India (Toll Free): 18002669272
Indonesia (Toll Free): 007 803 020 5375
Ireland: +353 15 360 728
Israel (Toll Free): 1 809 454 830
Italy: +39 0 247 92 13 01
Japan (Toll Free): 0 120 663 800
Korea, Republic of (Toll Free): 00798 14 207 4914
Luxembourg (Toll Free): 800 85158
Malaysia (Toll Free): 1 800 81 6854
Mexico (Toll Free): 01 800 522 1133
Netherlands: +31 207 941 377
New Zealand: +64 9 280 6302
Norway: +47 21 93 37 51
Panama (Toll Free): 00 800 226 7928
Peru (Toll Free): 0 800 77023
Philippines (Toll Free): 1 800 1110 1661
Portugal (Toll Free): 800 819 575
Romania (Toll Free): 0 800 410 029
Russian Federation (Toll Free): 8 800 100 6203
Saudi Arabia (Toll Free): 800 844 3633
Singapore (Toll Free): 18007231323
South Africa (Toll Free): 0 800 555 447
Spain: +34 932 75 2004
Sweden: +46 853 527 827
Switzerland: +41 225 4599 78
Taiwan (Toll Free): 0 800 666 854
Thailand (Toll Free): 001 800 011 023
Turkey (Toll Free): 00 800 4488 23683
Ukraine (Toll Free): 0 800 50 1733
United Arab Emirates (Toll Free): 800 044 40439
Uruguay (Toll Free): 0004 019 1018
Viet Nam (Toll Free): 122 80 481

First GoToMeeting? Let's do a quick system check:
https://link.gotomeeting.com/system-check


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.13 v2] passthrough: simplify locking and logging

2019-11-04 Thread Durrant, Paul

> -Original Message-
> From: Andrew Cooper 
> Sent: 04 November 2019 11:06
> To: Durrant, Paul ; xen-devel@lists.xenproject.org
> Cc: Igor Druzhinin ; jgr...@suse.com;
> jbeul...@suse.com
> Subject: Re: [Xen-devel] [PATCH for-4.13 v2] passthrough: simplify locking
> and logging
> 
> On 04/11/2019 08:31, Durrant, Paul wrote:
> >> -Original Message-
> >> From: Igor Druzhinin 
> >> Sent: 01 November 2019 19:28
> >> To: xen-devel@lists.xenproject.org
> >> Cc: Durrant, Paul ; jbeul...@suse.com;
> >> jgr...@suse.com
> >> Subject: [PATCH for-4.13 v2] passthrough: simplify locking and logging
> >>
> >> From: Paul Durrant 
> >>
> >> Dropping the pcidevs lock between calling device_assigned() and
> >> assign_device() means that the latter has to do the same check as the
> >> former for no obvious gain. Also, since long running operations under
> >> pcidevs lock already drop the lock and return -ERESTART periodically
> there
> >> is little point in immediately failing an assignment operation with
> >> -ERESTART just because the pcidevs lock could not be acquired (for the
> >> second time, having already blocked on acquiring the lock in
> >> device_assigned()).
> >>
> >> This patch instead acquires the lock once for assignment (or test
> assign)
> >> operations directly in iommu_do_pci_domctl() and thus can remove the
> >> duplicate domain ownership check in assign_device(). Whilst in the
> >> neighbourhood, the patch also removes some debug logging from
> >> assign_device() and deassign_device() and replaces it with proper error
> >> logging, which allows error logging in iommu_do_pci_domctl() to be
> >> removed. Also, since device_assigned() can tell the difference between
> a
> >> guest assigned device and a non-existent one, log the actual error
> >> condition rather then being ambiguous for the sake a few extra lines of
> >> code.
> >>
> >> Signed-off-by: Paul Durrant 
> >> ---
> >>
> >> This is XSA-302 followup and contains some changes important for
> >> XenServer.
> >> Juergen, could this be considered for 4.13 inclusion?
> >>
> >> v2: updated Paul's email address
> 
> This was work you did at Citrix, yes?
> 
> > Reviewed-by: Paul Durrant 
> 
> SoB and R-by?

I did do the work while I was at Citrix, but surely the SoB must be updated 
since the patch is only now being posted? As for the R-b, why should that be 
historic?

  Paul

> 
> ~Andrew
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.13 v2] passthrough: simplify locking and logging

2019-11-04 Thread Andrew Cooper

On 04/11/2019 08:31, Durrant, Paul wrote:
>> -Original Message-
>> From: Igor Druzhinin 
>> Sent: 01 November 2019 19:28
>> To: xen-devel@lists.xenproject.org
>> Cc: Durrant, Paul ; jbeul...@suse.com;
>> jgr...@suse.com
>> Subject: [PATCH for-4.13 v2] passthrough: simplify locking and logging
>>
>> From: Paul Durrant 
>>
>> Dropping the pcidevs lock between calling device_assigned() and
>> assign_device() means that the latter has to do the same check as the
>> former for no obvious gain. Also, since long running operations under
>> pcidevs lock already drop the lock and return -ERESTART periodically there
>> is little point in immediately failing an assignment operation with
>> -ERESTART just because the pcidevs lock could not be acquired (for the
>> second time, having already blocked on acquiring the lock in
>> device_assigned()).
>>
>> This patch instead acquires the lock once for assignment (or test assign)
>> operations directly in iommu_do_pci_domctl() and thus can remove the
>> duplicate domain ownership check in assign_device(). Whilst in the
>> neighbourhood, the patch also removes some debug logging from
>> assign_device() and deassign_device() and replaces it with proper error
>> logging, which allows error logging in iommu_do_pci_domctl() to be
>> removed. Also, since device_assigned() can tell the difference between a
>> guest assigned device and a non-existent one, log the actual error
>> condition rather then being ambiguous for the sake a few extra lines of
>> code.
>>
>> Signed-off-by: Paul Durrant 
>> ---
>>
>> This is XSA-302 followup and contains some changes important for
>> XenServer.
>> Juergen, could this be considered for 4.13 inclusion?
>>
>> v2: updated Paul's email address

This was work you did at Citrix, yes?

> Reviewed-by: Paul Durrant 

SoB and R-by?

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/2] x86: explicitly disallow guest access to PPIN

2019-11-04 Thread Jan Beulich

On 01.11.2019 19:35, Andrew Cooper wrote:
> On 30/10/2019 12:02, Jan Beulich wrote:
>> On 30.10.2019 12:43, Andrew Cooper wrote:
>>> On 30/10/2019 10:39, Jan Beulich wrote:
 To fulfill the "protected" in its name, don't let the real hardware
 values "shine through". Report a control register value expressing this.

 Signed-off-by: Jan Beulich 
 ---
 TBD: Do we want to permit Dom0 access?
>>> I would recommend reordering the two patches and putting this one first
>>> (along with the enumeration details, along with a pair of feature
>>> strings in xen-cpuid).  This patch at least wants backporting.
>> Well, the reason for this ordering is because this way Dom0
>> doesn't transiently lose all access.
> 
> Nothing pre-existing can be used reliably by dom0 because of the
> raz/write-discard behaviour.

Why "raz" - default behavior for "un-enumerated" MSRs is to hand
out the raw hardware value. I agree Dom0 can't _enable_ the PPIN
MSR (due to the write-discard default behavior), but on systems
where the firmware does the enabling it could still have read the
values.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] x86/passthrough: fix migration of MSI when using posted interrupts

2019-11-04 Thread Roger Pau Monné

On Sat, Nov 02, 2019 at 07:48:06AM +, Tian, Kevin wrote:
> > From: Roger Pau Monné [mailto:roger@citrix.com]
> > Sent: Thursday, October 31, 2019 11:23 PM
> > 
> > On Thu, Oct 31, 2019 at 07:52:23AM -0700, Joe Jin wrote:
> > > On 10/31/19 1:01 AM, Jan Beulich wrote:
> > > > On 30.10.2019 19:01, Joe Jin wrote:
> > > >> On 10/30/19 10:23 AM, Roger Pau Monné wrote:
> > > >>> On Wed, Oct 30, 2019 at 09:38:16AM -0700, Joe Jin wrote:
> > >  On 10/30/19 1:24 AM, Roger Pau Monné wrote:
> > > > Can you try to add the following debug patch on top of the existing
> > > > one and report the output that you get on the Xen console?
> > > 
> > >  Applied debug patch and run the test again, not of any log printed,
> > >  attached Xen log on serial console, seems pi_update_irte() not been
> > >  called for iommu_intpost was false.
> > > >>>
> > > >>> I have to admit I'm lost at this point. Does it mean the original
> > > >>> issue had nothing to do with posted interrupts?
> > > >>
> > > >> Looks when inject irq by vlapic_set_irq(), it checked by
> > > >> hvm_funcs.deliver_posted_intr rather than iommu_intpost:
> > > >>
> > > >>  176 if ( hvm_funcs.deliver_posted_intr )
> > > >>  177 hvm_funcs.deliver_posted_intr(target, vec);
> > > >>
> > > >> And deliver_posted_intr() would be there, when vmx enabled:
> > > >>
> > > >> (XEN) HVM: VMX enabled
> > > >> (XEN) HVM: Hardware Assisted Paging (HAP) detected
> > > >> (XEN) HVM: HAP page sizes: 4kB, 2MB, 1GB
> > > >
> > > > I can't see the connection. start_vmx() has
> > > >
> > > > if ( cpu_has_vmx_posted_intr_processing )
> > > > {
> > > > alloc_direct_apic_vector(&posted_intr_vector,
> > pi_notification_interrupt);
> > > > if ( iommu_intpost )
> > > > alloc_direct_apic_vector(&pi_wakeup_vector,
> > pi_wakeup_interrupt);
> > > >
> > > > vmx_function_table.deliver_posted_intr = 
> > > > vmx_deliver_posted_intr;
> > > > vmx_function_table.sync_pir_to_irr = vmx_sync_pir_to_irr;
> > > > vmx_function_table.test_pir= vmx_test_pir;
> > > > }
> > > >
> > > > i.e. the hook is present only when posted interrupts are
> > > > available in general. I.e. also with just CPU-side posted
> > > > interrupts, yes, which gets confirmed by your "apicv=0"
> > > > test. Yet with just CPU-side posted interrupts I'm
> > > > struggling again to understand your original problem
> > > > description, and the need to fiddle with IOMMU side code.
> > >
> > > Yes, on my test env, cpu_has_vmx_posted_intr_processing == true &&
> > iommu_intpost == false,
> > > with this, posted interrupts been enabled.
> > 
> > I'm still quite lost. My reading of the Intel VT-d spec is that the
> > posted interrupt descriptor (which contains the PIRR) is used in
> > conjunction with a posted interrupt remapping entry in the iommu, so
> > that interrupts get recorded in the PIRR and later synced by the
> > hypervisor into the vlapic IRR when resuming the virtual CPU.
> 
> there are two parts. Intel first implements CPU posted interrupt,
> which allows one CPU to post IPI into non-root context in another
> CPU through posted interrupt descriptor. Later VT-d posted 
> interrupt comes, which use interrupt remapping entry and the
> same posted interrupt descriptor (using more fields) to convert
> a device interrupt into a posted interrupt. The posting process is
> same on the dest CPU, regardless of whether it's from another CPU
> or a device. 

Thanks for the description.

So the problem reported by Jin happens when using CPU posted
interrupts but not VT-d posted interrupts, in which case there
shouldn't be a need to sync PIRR with IRR when interrupts from a
passthrough device are reconfigured, because interrupts from that
device shouldn't end up signaled in PIRR because VT-d posted
interrupts is not being used.

Do interrupts from passthrough devices end up signaled in the posted
interrupt descriptor PIRR field when not using VT-d posted
interrupts but using CPU posted interrupts?

From my reading of your description above when using CPU posted
interrupts only the vectors signaled in the PIRR field should belong
to IPIs from other vCPUs?

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.13 v2] passthrough: simplify locking and logging

2019-11-04 Thread Durrant, Paul

> -Original Message-
> From: Igor Druzhinin 
> Sent: 01 November 2019 19:28
> To: xen-devel@lists.xenproject.org
> Cc: Durrant, Paul ; jbeul...@suse.com;
> jgr...@suse.com
> Subject: [PATCH for-4.13 v2] passthrough: simplify locking and logging
> 
> From: Paul Durrant 
> 
> Dropping the pcidevs lock between calling device_assigned() and
> assign_device() means that the latter has to do the same check as the
> former for no obvious gain. Also, since long running operations under
> pcidevs lock already drop the lock and return -ERESTART periodically there
> is little point in immediately failing an assignment operation with
> -ERESTART just because the pcidevs lock could not be acquired (for the
> second time, having already blocked on acquiring the lock in
> device_assigned()).
> 
> This patch instead acquires the lock once for assignment (or test assign)
> operations directly in iommu_do_pci_domctl() and thus can remove the
> duplicate domain ownership check in assign_device(). Whilst in the
> neighbourhood, the patch also removes some debug logging from
> assign_device() and deassign_device() and replaces it with proper error
> logging, which allows error logging in iommu_do_pci_domctl() to be
> removed. Also, since device_assigned() can tell the difference between a
> guest assigned device and a non-existent one, log the actual error
> condition rather then being ambiguous for the sake a few extra lines of
> code.
> 
> Signed-off-by: Paul Durrant 
> ---
> 
> This is XSA-302 followup and contains some changes important for
> XenServer.
> Juergen, could this be considered for 4.13 inclusion?
> 
> v2: updated Paul's email address

Reviewed-by: Paul Durrant 

> 
> ---
>  xen/drivers/passthrough/pci.c | 101 ++---
> -
>  1 file changed, 42 insertions(+), 59 deletions(-)
> 
> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
> index e64666d..ea0770d 100644
> --- a/xen/drivers/passthrough/pci.c
> +++ b/xen/drivers/passthrough/pci.c
> @@ -932,30 +932,27 @@ static int deassign_device(struct domain *d,
> uint16_t seg, uint8_t bus,
>  break;
>  ret = hd->platform_ops->reassign_device(d, target, devfn,
>  pci_to_dev(pdev));
> -if ( !ret )
> -continue;
> -
> -printk(XENLOG_G_ERR "%pd: deassign %04x:%02x:%02x.%u failed
> (%d)\n",
> -   d, seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn), ret);
> -return ret;
> +if ( ret )
> +goto out;
>  }
> 
>  devfn = pdev->devfn;
>  ret = hd->platform_ops->reassign_device(d, target, devfn,
>  pci_to_dev(pdev));
>  if ( ret )
> -{
> -dprintk(XENLOG_G_ERR,
> -"%pd: deassign device (%04x:%02x:%02x.%u) failed\n",
> -d, seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
> -return ret;
> -}
> +goto out;
> 
>  if ( pdev->domain == hardware_domain  )
>  pdev->quarantine = false;
> 
>  pdev->fault.count = 0;
> 
> +out:
> +if ( ret )
> +printk(XENLOG_G_ERR
> +   "%pd: deassign device (%04x:%02x:%02x.%u) failed (%d)\n",
> d,
> +   seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn), ret);
> +
>  return ret;
>  }
> 
> @@ -976,10 +973,7 @@ int pci_release_devices(struct domain *d)
>  {
>  bus = pdev->bus;
>  devfn = pdev->devfn;
> -if ( deassign_device(d, pdev->seg, bus, devfn) )
> -printk("domain %d: deassign device (%04x:%02x:%02x.%u)
> failed!\n",
> -   d->domain_id, pdev->seg, bus,
> -   PCI_SLOT(devfn), PCI_FUNC(devfn));
> +deassign_device(d, pdev->seg, bus, devfn);
>  }
>  pcidevs_unlock();
> 
> @@ -1534,8 +1528,7 @@ static int device_assigned(u16 seg, u8 bus, u8
> devfn)
>  struct pci_dev *pdev;
>  int rc = 0;
> 
> -pcidevs_lock();
> -
> +ASSERT(pcidevs_locked());
>  pdev = pci_get_pdev(seg, bus, devfn);
> 
>  if ( !pdev )
> @@ -1549,11 +1542,10 @@ static int device_assigned(u16 seg, u8 bus, u8
> devfn)
>pdev->domain != dom_io )
>  rc = -EBUSY;
> 
> -pcidevs_unlock();
> -
>  return rc;
>  }
> 
> +/* caller should hold the pcidevs_lock */
>  static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, u32
> flag)
>  {
>  const struct domain_iommu *hd = dom_iommu(d);
> @@ -1571,23 +1563,11 @@ static int assign_device(struct domain *d, u16
> seg, u8 bus, u8 devfn, u32 flag)
>vm_event_check_ring(d->vm_event_paging)) )
>  return -EXDEV;
> 
> -if ( !pcidevs_trylock() )
> -return -ERESTART;
> -
> +/* device_assigned() should already have cleared the device for
> assignment */
> +ASSERT(pcidevs_locked());
>  pdev = pci_get_pdev(seg, bus, devfn);
> -
> -rc = -ENODEV;
> -if ( !pdev )
> -goto done;
> -
> -rc = 0;
> -

Re: [Xen-devel] [PATCH 2/2] x86: explicitly disallow guest access to PPIN

2019-11-04 Thread Jan Beulich

On 01.11.2019 15:00, Eslam Elnikety wrote:
> On 30.10.19 11:39, Jan Beulich wrote:
>> @@ -237,6 +239,18 @@ int guest_rdmsr(struct vcpu *v, uint32_t
>>  ARRAY_SIZE(msrs->dr_mask))];
>>   break;
>>   
>> +case MSR_PPIN_CTL:
>> +if ( d->arch.cpuid->x86_vendor != X86_VENDOR_INTEL )
>> +goto gp_fault;
>> +*val = PPIN_LOCKOUT;
>> +break;
>> +
>> +case MSR_AMD_PPIN_CTL:
>> +if ( !cp->extd.amd_ppin )
>> +goto gp_fault;
>> +*val = PPIN_LOCKOUT;
>> +break;
>> +
> 
> nit: It is not clear to me why you use "d->arch.cpuid->.." (and not 
> "cp->..") in the first if condition.

Simple oversight; corrected for v2.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/2] x86: explicitly disallow guest access to PPIN

2019-11-04 Thread Jan Beulich

On 01.11.2019 15:29, Andrew Cooper wrote:
> On 01/11/2019 14:00, Eslam Elnikety wrote:
>> Thanks for this series, Jan.
>>
>> On 30.10.19 11:39, Jan Beulich wrote:
>>> To fulfill the "protected" in its name, don't let the real hardware
>>> values "shine through". Report a control register value expressing this.
>>>
>>> Signed-off-by: Jan Beulich 
>>> ---
>>> TBD: Do we want to permit Dom0 access?
>>
>> It would be nice to give an administrator a way to get PPIN outside
>> the context of an MCE when needed.
> 
> I suppose this is a reasonable request.  We should expose it to the
> hardware domain.

Via (new) sysctl (or platform op) or by allowing direct MSR read access?
(If the former, I'd want to make this addition a separate patch.)

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

56 matches

Mail list logo