Re: [PATCH v2 2/3] x86: fix setup of brk area

2022-06-29 Thread Juergen Gross

On 29.06.22 19:14, Josh Poimboeuf wrote:

Hi Juergen,

It helps to actually Cc the person who broke it ;-)

On Thu, Jun 23, 2022 at 11:46:07AM +0200, Juergen Gross wrote:

Commit e32683c6f7d2 ("x86/mm: Fix RESERVE_BRK() for older binutils")
put the brk area into the .bss..brk section (placed directly behind
.bss),


Hm? It didn't actually do that.

For individual translation units, it did rename the section from
".brk_reservation" to ".bss..brk".  But then during linking it's still
placed in .brk in vmlinux, just like before.


Sorry, I misread the patch commit message and was fooled by the fact that
bisection clearly pointed at this patch to have introduced the problem.

I only discovered later that the main issue was the added "NOLOAD"
attribute.


causing it not to be cleared initially. As the brk area is used
to allocate early page tables, these might contain garbage in not
explicitly written entries.

This is especially a problem for Xen PV guests, as the hypervisor will
validate page tables (check for writable page tables and hypervisor
private bits) before accepting them to be used. There have been reports
of early crashes of PV guests due to illegal page table contents.

Fix that by letting clear_bss() clear the brk area, too.


While it does make sense to clear the brk area, I don't understand how
my patch broke this.  How was it getting cleared before?


It seemed to have worked by chance. The Xen hypervisor is clearing all
alloc-only sections when loading a kernel (this will "fix" the dom0
case reliably together with patch 3 of this series).

Grub might do the clearing, too (for the PV domU case), but I haven't
verified that by code inspection.

I'll drop the "Fixes:" tag.


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [PATCH v9 3/3] xsm: refactor flask sid alloc and domain check

2022-06-29 Thread Jan Beulich
Just a two nits - while the change looks plausible, I'm afraid I'm
not qualified to properly review it.

On 30.06.2022 04:21, Daniel P. Smith wrote:
> The function flask_domain_alloc_security() is where a default sid should be
> assigned to a domain under construction. For reasons unknown, the initial
> domain would be assigned unlabeled_t and then fixed up under
> flask_domain_create().  With the introduction of xenboot_t it is now possible
> to distinguish when the hypervisor is in the boot state.
> 
> This commit looks to correct this by using a check to see if the hypervisor is
> under the xenboot_t context in flask_domain_alloc_security(). If it is, then 
> it

While (or maybe because) I'm not a native speaker, the use of "looks"
reads ambiguous to me. I think you mean it in the sense of e.g. "aims",
but at first I read it in the sense of "seems", which made me think
you're not certain whether it actually does.

> will inspect the domain's is_privileged field, and select the appropriate
> default label, dom0_t or domU_t, for the domain. The logic for
> flask_domain_create() was changed to allow the incoming sid to override the
> default label.
> 
> The base policy was adjusted to allow the idle domain under the xenboot_t
> context to be able to construct domains of both types, dom0 and domU.
> 
> Signed-off-by: Daniel P. Smith 
> ---
>  tools/flask/policy/modules/dom0.te |  3 +++
>  tools/flask/policy/modules/domU.te |  3 +++
>  xen/xsm/flask/hooks.c  | 34 ++
>  3 files changed, 26 insertions(+), 14 deletions(-)
> 
> diff --git a/tools/flask/policy/modules/dom0.te 
> b/tools/flask/policy/modules/dom0.te
> index 0a63ce15b6..2022bb9636 100644
> --- a/tools/flask/policy/modules/dom0.te
> +++ b/tools/flask/policy/modules/dom0.te
> @@ -75,3 +75,6 @@ admin_device(dom0_t, ioport_t)
>  admin_device(dom0_t, iomem_t)
>  
>  domain_comms(dom0_t, dom0_t)
> +
> +# Allow they hypervisor to build domains of type dom0_t

Since it repeats ...

> +xen_build_domain(dom0_t)
> diff --git a/tools/flask/policy/modules/domU.te 
> b/tools/flask/policy/modules/domU.te
> index b77df29d56..73fc90c3c6 100644
> --- a/tools/flask/policy/modules/domU.te
> +++ b/tools/flask/policy/modules/domU.te
> @@ -13,6 +13,9 @@ domain_comms(domU_t, domU_t)
>  migrate_domain_out(dom0_t, domU_t)
>  domain_self_comms(domU_t)
>  
> +# Allow they hypervisor to build domains of type domU_t
> +xen_build_domain(domU_t)

... here - s/they/the/ in both places?

Jan



Re: [PATCH RESEND v10 2/2] x86/xen: Allow per-domain usage of hardware virtualized APIC

2022-06-29 Thread Jan Beulich
On 30.06.2022 05:25, Tian, Kevin wrote:
>> From: Jane Malalane 
>> Sent: Wednesday, June 29, 2022 9:56 PM
>>
>> Introduce a new per-domain creation x86 specific flag to
>> select whether hardware assisted virtualization should be used for
>> x{2}APIC.
>>
>> A per-domain option is added to xl in order to select the usage of
>> x{2}APIC hardware assisted virtualization, as well as a global
>> configuration option.
>>
>> Having all APIC interaction exit to Xen for emulation is slow and can
>> induce much overhead. Hardware can speed up x{2}APIC by decoding the
>> APIC access and providing a VM exit with a more specific exit reason
>> than a regular EPT fault or by altogether avoiding a VM exit.
> 
> Above is obvious and could be removed. 
> 
> I think the key is just the next paragraph for why we
> want this per-domain control.

Indeed, but the paragraph above sets the context. It might be possible
to shorten it, but ...

> Apart from that:
> 
> Reviewed-by: Kevin Tian 
> 
>>
>> On the other hand, being able to disable x{2}APIC hardware assisted
>> virtualization can be useful for testing and debugging purposes.

... I think it is desirable for this sentence to start with "Otoh" or
alike.

Jan



Re: Reg. Tee init fail...

2022-06-29 Thread Juergen Gross

On 30.06.22 05:32, SK, SivaSangeetha (Siva Sangeetha) wrote:

[AMD Official Use Only - General]

+team

-Original Message-
From: Stefano Stabellini 
Sent: Thursday, June 30, 2022 1:34 AM
To: Julien Grall 
Cc: SK, SivaSangeetha (Siva Sangeetha) ; 
xen-devel@lists.xenproject.org; Stefano Stabellini ; Bertrand Marquis 
; Volodymyr Babchuk ; 
jgr...@suse.com; boris.ostrov...@oracle.com
Subject: Re: Reg. Tee init fail...

Adding Juergen and Boris because this is a Linux/x86 issue.


As you can see from this Linux driver:
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Flatest%2Fsource%2Fdrivers%2Fcrypto%2Fccp%2Ftee-dev.c%23L132&data=05%7C01%7CSivaSangeetha.SK%40amd.com%7Ce962a907794f4917a80b08da5a0a7b3b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637921298315828104%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=NxmMUckiDRGLv3qLJrhZKBt2zNTuomEZqYJdV74tXxA%3D&reserved=0

Linux as dom0 on x86 is trying to communicate with firmware (TEE). Linux is calling __pa to pass a physical 
address to firmware. However, __pa returns a "fake" address not an mfn. I imagine that a quick 
workaround would be to call "virt_to_machine" instead of "__pa" in tee-dev.c.

Normally, if this was a device, the "right fix" would be to use 
swiotlb-xen:xen_swiotlb_map_page to get back a real physical address.

However, xen_swiotlb_map_page is meant to be used as part of the dma_ops API 
and takes a struct device *dev as input parameter. Maybe xen_swiotlb_map_page 
can be used for tee-dev as well?


Basically tee-dev would need to call dma_map_page before passing addresses to 
firmware, and dma_unmap_page when it is done. E.g.:


   cmd_buffer = dma_map_page(dev, virt_to_page(cmd),
 cmd & ~PAGE_MASK,
 ring_size,
 DMA_TO_DEVICE);


Juergen, Boris,
what do you think?


Yes, I think using the DMA interface is the correct way to handle that.

BTW, I did a similar fix for the dcdbas driver recently:

https://lore.kernel.org/r/20220318150950.16843-1-jgr...@suse.com


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


[linux-linus test] 171411: regressions - FAIL

2022-06-29 Thread osstest service owner
flight 171411 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/171411/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-dom0pvh-xl-intel  8 xen-bootfail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-ws16-amd64  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemut-win7-amd64  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 8 xen-boot fail REGR. vs. 
171277
 test-amd64-amd64-pygrub   8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-freebsd11-amd64  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-dom0pvh-xl-amd  8 xen-boot  fail REGR. vs. 171277
 test-amd64-amd64-qemuu-nested-amd  8 xen-bootfail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-win7-amd64  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemut-ws16-amd64  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-xsm   8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemut-debianhvm-i386-xsm  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-libvirt  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-ovmf-amd64  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-libvirt-qcow2  8 xen-boot   fail REGR. vs. 171277
 test-amd64-amd64-xl-credit2   8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-credit1   8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl   8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  8 xen-bootfail REGR. vs. 171277
 test-amd64-amd64-xl-vhd   8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow 8 xen-boot fail REGR. vs. 
171277
 test-amd64-amd64-xl-multivcpu  8 xen-bootfail REGR. vs. 171277
 test-amd64-amd64-libvirt-pair 12 xen-boot/src_host   fail REGR. vs. 171277
 test-amd64-amd64-libvirt-pair 13 xen-boot/dst_host   fail REGR. vs. 171277
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 8 xen-boot fail REGR. 
vs. 171277
 test-amd64-amd64-xl-shadow8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-libvirt-raw  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-qemuu-nested-intel  8 xen-boot  fail REGR. vs. 171277
 test-amd64-amd64-libvirt-xsm  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 8 xen-boot fail REGR. 
vs. 171277
 test-amd64-amd64-xl-pvhv2-intel  8 xen-boot  fail REGR. vs. 171277
 test-amd64-amd64-freebsd12-amd64  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-pvshim8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-pvhv2-amd  8 xen-bootfail REGR. vs. 171277
 test-amd64-amd64-examine  8 reboot   fail REGR. vs. 171277
 test-amd64-amd64-examine-uefi  8 reboot  fail REGR. vs. 171277
 test-amd64-amd64-examine-bios  8 reboot  fail REGR. vs. 171277
 test-amd64-coresched-amd64-xl  8 xen-bootfail REGR. vs. 171277
 test-amd64-amd64-xl-qemut-debianhvm-amd64  8 xen-bootfail REGR. vs. 171277
 test-amd64-amd64-pair12 xen-boot/src_hostfail REGR. vs. 171277
 test-amd64-amd64-pair13 xen-boot/dst_hostfail REGR. vs. 171277

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds  8 xen-boot fail REGR. vs. 171277

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 171277
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 171277
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 171277
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-armhf-armhf-x

RE: [PATCH v3] xen/arm: avoid overflow when setting vtimer in context switch

2022-06-29 Thread Jiamei Xie
Hi,

> -Original Message-
> From: Jiamei Xie 
> Sent: 2022年6月30日 9:54
> To: xen-devel@lists.xenproject.org
> Cc: Jiamei Xie ; Stefano Stabellini
> ; Julien Grall ; Bertrand Marquis
> ; Volodymyr Babchuk
> ; Wei Chen 
> Subject: [PATCH v3] xen/arm: avoid overflow when setting vtimer in context
> switch
> 
> virt_vtimer_save is calculating the new time for the vtimer in:
> "v->arch.virt_timer.cval + v->domain->arch.virt_timer_base.offset
> - boot_count".
> In this formula, "cval + offset" might cause uint64_t overflow.
> Changing it to "ticks_to_ns(v->domain->arch.virt_timer_base.offset -
> boot_count) + ticks_to_ns(v->arch.virt_timer.cval)" can avoid overflow,
> and "ticks_to_ns(arch.virt_timer_base.offset - boot_count)" will be
> always the same, which has been caculated in domain_vtimer_init.
> Introduce a new field virt_timer_base.nanoseconds to store this value
> for arm in struct arch_domain, so we can use it directly.
> 
> Signed-off-by: Jiamei Xie 
> Change-Id: Ib80cee51eaf844661e6f92154a0339ad2a652f9b

I am sorry I forget to remove the Change-Id.

> ---
> was "xen/arm: avoid vtimer flip-flop transition in context switch".
> v3 changes:
> -re-write commit message
> -store nanoseconds in virt_timer_base instead of adding a new structure
> -assign to nanoseconds first, then seconds
> ---
>  xen/arch/arm/include/asm/domain.h | 1 +
>  xen/arch/arm/vtimer.c | 9 ++---
>  2 files changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/arch/arm/include/asm/domain.h
> b/xen/arch/arm/include/asm/domain.h
> index ed63c2b6f9..cd9ce19b4b 100644
> --- a/xen/arch/arm/include/asm/domain.h
> +++ b/xen/arch/arm/include/asm/domain.h
> @@ -71,6 +71,7 @@ struct arch_domain
> 
>  struct {
>  uint64_t offset;
> +s_time_t nanoseconds;
>  } virt_timer_base;
> 
>  struct vgic_dist vgic;
> diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c
> index 6b78fea77d..aeaea78e4c 100644
> --- a/xen/arch/arm/vtimer.c
> +++ b/xen/arch/arm/vtimer.c
> @@ -63,7 +63,9 @@ static void virt_timer_expired(void *data)
>  int domain_vtimer_init(struct domain *d, struct xen_arch_domainconfig
> *config)
>  {
>  d->arch.virt_timer_base.offset = get_cycles();
> -d->time_offset.seconds = ticks_to_ns(d->arch.virt_timer_base.offset -
> boot_count);
> +d->arch.virt_timer_base.nanoseconds =
> +ticks_to_ns(d->arch.virt_timer_base.offset - boot_count);
> +d->time_offset.seconds = d->arch.virt_timer_base.nanoseconds;
>  do_div(d->time_offset.seconds, 10);
> 
>  config->clock_frequency = timer_dt_clock_frequency;
> @@ -144,8 +146,9 @@ void virt_timer_save(struct vcpu *v)
>  if ( (v->arch.virt_timer.ctl & CNTx_CTL_ENABLE) &&
>   !(v->arch.virt_timer.ctl & CNTx_CTL_MASK))
>  {
> -set_timer(&v->arch.virt_timer.timer, 
> ticks_to_ns(v->arch.virt_timer.cval
> +
> -  v->domain->arch.virt_timer_base.offset - boot_count));
> +set_timer(&v->arch.virt_timer.timer,
> +  v->domain->arch.virt_timer_base.nanoseconds +
> +  ticks_to_ns(v->arch.virt_timer.cval));
>  }
>  }
> 
> --
> 2.25.1




RE: Reg. Tee init fail...

2022-06-29 Thread SK, SivaSangeetha (Siva Sangeetha)
[AMD Official Use Only - General]

+team

-Original Message-
From: Stefano Stabellini  
Sent: Thursday, June 30, 2022 1:34 AM
To: Julien Grall 
Cc: SK, SivaSangeetha (Siva Sangeetha) ; 
xen-devel@lists.xenproject.org; Stefano Stabellini ; 
Bertrand Marquis ; Volodymyr Babchuk 
; jgr...@suse.com; boris.ostrov...@oracle.com
Subject: Re: Reg. Tee init fail...

Adding Juergen and Boris because this is a Linux/x86 issue.


As you can see from this Linux driver:
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Flatest%2Fsource%2Fdrivers%2Fcrypto%2Fccp%2Ftee-dev.c%23L132&data=05%7C01%7CSivaSangeetha.SK%40amd.com%7Ce962a907794f4917a80b08da5a0a7b3b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637921298315828104%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=NxmMUckiDRGLv3qLJrhZKBt2zNTuomEZqYJdV74tXxA%3D&reserved=0

Linux as dom0 on x86 is trying to communicate with firmware (TEE). Linux is 
calling __pa to pass a physical address to firmware. However, __pa returns a 
"fake" address not an mfn. I imagine that a quick workaround would be to call 
"virt_to_machine" instead of "__pa" in tee-dev.c.

Normally, if this was a device, the "right fix" would be to use 
swiotlb-xen:xen_swiotlb_map_page to get back a real physical address.

However, xen_swiotlb_map_page is meant to be used as part of the dma_ops API 
and takes a struct device *dev as input parameter. Maybe xen_swiotlb_map_page 
can be used for tee-dev as well?


Basically tee-dev would need to call dma_map_page before passing addresses to 
firmware, and dma_unmap_page when it is done. E.g.:


  cmd_buffer = dma_map_page(dev, virt_to_page(cmd),
cmd & ~PAGE_MASK,
ring_size,
DMA_TO_DEVICE);


Juergen, Boris,
what do you think?



On Fri, 24 Jun 2022, Julien Grall wrote:
> Hi,
> 
> (moving the discussion to xen-devel as I think it is more appropriate)
> 
> On 24/06/2022 10:53, SK, SivaSangeetha (Siva Sangeetha) wrote:
> > [AMD Official Use Only - General]
> 
> Not clear what this means.
> 
> > 
> > Hi Xen team,
> > 
> > In TEE driver, We allocate a ring buffer, get its physical address 
> > from
> > __pa() macro, pass the physical address to secure processor for 
> > mapping it and using in secure processor side.
> > 
> > Source:
> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fel
> > ixir.bootlin.com%2Flinux%2Flatest%2Fsource%2Fdrivers%2Fcrypto%2Fccp%
> > 2Ftee-dev.c%23L132&data=05%7C01%7CSivaSangeetha.SK%40amd.com%7Ce
> > 962a907794f4917a80b08da5a0a7b3b%7C3dd8961fe4884e608e11a82d994e183d%7
> > C0%7C0%7C637921298315828104%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAw
> > MDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&a
> > mp;sdata=NxmMUckiDRGLv3qLJrhZKBt2zNTuomEZqYJdV74tXxA%3D&reserved
> > =0
> > 
> > This works good natively in Dom0 on the target.
> > When we boot the same Dom0 kernel, with Xen hypervisor enabled, ring 
> > init fails.
> 
> Do you have any error message or error code?
> 
> > 
> > 
> > We suspect that the address passed to secure processor, is not same 
> > when xen is enabled, and when xen is enabled, some level of address 
> > translation might be required to get exact physical address.
> 
> If you are using Xen upstream, Dom0 will be mapped with IPA == PA. So 
> there should be no need for translation.
> 
> Can you provide more details on your setup (version of Xen, Linux...)?
> 
> Cheers,
> 
> --
> Julien Grall
> 



RE: [PATCH RESEND v10 2/2] x86/xen: Allow per-domain usage of hardware virtualized APIC

2022-06-29 Thread Tian, Kevin
> From: Jane Malalane 
> Sent: Wednesday, June 29, 2022 9:56 PM
> 
> Introduce a new per-domain creation x86 specific flag to
> select whether hardware assisted virtualization should be used for
> x{2}APIC.
> 
> A per-domain option is added to xl in order to select the usage of
> x{2}APIC hardware assisted virtualization, as well as a global
> configuration option.
> 
> Having all APIC interaction exit to Xen for emulation is slow and can
> induce much overhead. Hardware can speed up x{2}APIC by decoding the
> APIC access and providing a VM exit with a more specific exit reason
> than a regular EPT fault or by altogether avoiding a VM exit.

Above is obvious and could be removed. 

I think the key is just the next paragraph for why we
want this per-domain control.

Apart from that:

Reviewed-by: Kevin Tian 

> 
> On the other hand, being able to disable x{2}APIC hardware assisted
> virtualization can be useful for testing and debugging purposes.
> 
> Note:
> 
> - vmx_install_vlapic_mapping doesn't require modifications regardless
> of whether the guest has "Virtualize APIC accesses" enabled or not,
> i.e., setting the APIC_ACCESS_ADDR VMCS field is fine so long as
> virtualize_apic_accesses is supported by the CPU.
> 
> - Both per-domain and global assisted_x{2}apic options are not part of
> the migration stream, unless explicitly set in the respective
> configuration files. Default settings of assisted_x{2}apic done
> internally by the toolstack, based on host capabilities at create
> time, are not migrated.
> 
> Suggested-by: Andrew Cooper 
> Signed-off-by: Jane Malalane 
> Acked-by: Christian Lindig 
> Reviewed-by: "Roger Pau Monné" 
> Reviewed-by: Anthony PERARD 
> ---
> CC: Wei Liu 
> CC: Anthony PERARD 
> CC: George Dunlap 
> CC: Nick Rosbrook 
> CC: Juergen Gross 
> CC: Christian Lindig 
> CC: David Scott 
> CC: Jan Beulich 
> CC: Andrew Cooper 
> CC: "Roger Pau Monné" 
> CC: Jun Nakajima 
> CC: Kevin Tian 
> 
> v10:
>  * Improve commit message note on migration
> 
> v9:
>  * Fix style issues
>  * Fix exit() logic for assisted_x{2}apic parsing
>  * Add and use XEN_X86_MISC_FLAGS_MAX for ABI checking instead of
>using XEN_X86_ASSISTED_X2APIC directly
>  * Expand commit message to mention migration is safe
> 
> v8:
>  * Widen assisted_x{2}apic parsing to PVH guests in
>parse_config_data()
> 
> v7:
>  * Fix void return in libxl__arch_domain_build_info_setdefault
>  * Fix style issues
>  * Use EINVAL when rejecting assisted_x{2}apic for PV guests and
>ENODEV otherwise, when assisted_x{2}apic isn't supported
>  * Define has_assisted_x{2}apic macros for when !CONFIG_HVM
>  * Replace "EPT" fault reference with "p2m" fault since the former is
>Intel-specific
> 
> v6:
>  * Use ENODEV instead of EINVAL when rejecting assisted_x{2}apic
>for PV guests
>  * Move has_assisted_x{2}apic macros out of an Intel specific header
>  * Remove references to Intel specific features in documentation
> 
> v5:
>  * Revert v4 changes in vmx_vlapic_msr_changed(), preserving the use of
>the has_assisted_x{2}apic macros
>  * Following changes in assisted_x{2}apic_available definitions in
>patch 1, retighten conditionals for setting
>XEN_HVM_CPUID_APIC_ACCESS_VIRT and XEN_HVM_CPUID_X2APIC_VIRT
> in
>cpuid_hypervisor_leaves()
> 
> v4:
>  * Add has_assisted_x{2}apic macros and use them where appropriate
>  * Replace CPU checks with per-domain assisted_x{2}apic control
>options in vmx_vlapic_msr_changed() and cpuid_hypervisor_leaves(),
>following edits to assisted_x{2}apic_available definitions in
>patch 1
>Note: new assisted_x{2}apic_available definitions make later
>cpu_has_vmx_apic_reg_virt and cpu_has_vmx_virtual_intr_delivery
>checks redundant in vmx_vlapic_msr_changed()
> 
> v3:
>  * Change info in xl.cfg to better express reality and fix
>capitalization of x{2}apic
>  * Move "physinfo" variable definition to the beggining of
>libxl__domain_build_info_setdefault()
>  * Reposition brackets in if statement to match libxl coding style
>  * Shorten logic in libxl__arch_domain_build_info_setdefault()
>  * Correct dprintk message in arch_sanitise_domain_config()
>  * Make appropriate changes in vmx_vlapic_msr_changed() and
>cpuid_hypervisor_leaves() for amended "assisted_x2apic" bit
>  * Remove unneeded parantheses
> 
> v2:
>  * Add a LIBXL_HAVE_ASSISTED_APIC macro
>  * Pass xcpyshinfo as a pointer in libxl__arch_get_physinfo
>  * Add a return statement in now "int"
>libxl__arch_domain_build_info_setdefault()
>  * Preserve libxl__arch_domain_build_info_setdefault 's location in
>libxl_create.c
>  * Correct x{2}apic default setting logic in
>libxl__arch_domain_prepare_config()
>  * Correct logic for parsing assisted_x{2}apic host/guest options in
>xl_parse.c and initialize them to -1 in xl.c
>  * Use guest options directly in vmx_vlapic_msr_changed
>  * Fix indentation of bool assisted_x{2}apic in struct hvm_domain
>  * Add a change in xenctrl_stub

[PATCH v9 3/3] xsm: refactor flask sid alloc and domain check

2022-06-29 Thread Daniel P. Smith
The function flask_domain_alloc_security() is where a default sid should be
assigned to a domain under construction. For reasons unknown, the initial
domain would be assigned unlabeled_t and then fixed up under
flask_domain_create().  With the introduction of xenboot_t it is now possible
to distinguish when the hypervisor is in the boot state.

This commit looks to correct this by using a check to see if the hypervisor is
under the xenboot_t context in flask_domain_alloc_security(). If it is, then it
will inspect the domain's is_privileged field, and select the appropriate
default label, dom0_t or domU_t, for the domain. The logic for
flask_domain_create() was changed to allow the incoming sid to override the
default label.

The base policy was adjusted to allow the idle domain under the xenboot_t
context to be able to construct domains of both types, dom0 and domU.

Signed-off-by: Daniel P. Smith 
---
 tools/flask/policy/modules/dom0.te |  3 +++
 tools/flask/policy/modules/domU.te |  3 +++
 xen/xsm/flask/hooks.c  | 34 ++
 3 files changed, 26 insertions(+), 14 deletions(-)

diff --git a/tools/flask/policy/modules/dom0.te 
b/tools/flask/policy/modules/dom0.te
index 0a63ce15b6..2022bb9636 100644
--- a/tools/flask/policy/modules/dom0.te
+++ b/tools/flask/policy/modules/dom0.te
@@ -75,3 +75,6 @@ admin_device(dom0_t, ioport_t)
 admin_device(dom0_t, iomem_t)
 
 domain_comms(dom0_t, dom0_t)
+
+# Allow they hypervisor to build domains of type dom0_t
+xen_build_domain(dom0_t)
diff --git a/tools/flask/policy/modules/domU.te 
b/tools/flask/policy/modules/domU.te
index b77df29d56..73fc90c3c6 100644
--- a/tools/flask/policy/modules/domU.te
+++ b/tools/flask/policy/modules/domU.te
@@ -13,6 +13,9 @@ domain_comms(domU_t, domU_t)
 migrate_domain_out(dom0_t, domU_t)
 domain_self_comms(domU_t)
 
+# Allow they hypervisor to build domains of type domU_t
+xen_build_domain(domU_t)
+
 # Device model for domU_t.  You can define distinct types for device models for
 # domains of other types, or add more make_device_model lines for this type.
 declare_domain(dm_dom_t)
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 8c9cd0f297..caa0ae7d4c 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -182,7 +182,15 @@ static int cf_check flask_domain_alloc_security(struct 
domain *d)
 dsec->sid = SECINITSID_DOMIO;
 break;
 default:
-dsec->sid = SECINITSID_UNLABELED;
+if ( domain_sid(current->domain) == SECINITSID_XENBOOT )
+{
+if ( d->is_privileged )
+dsec->sid = SECINITSID_DOM0;
+else
+dsec->sid = SECINITSID_DOMU;
+}
+else
+dsec->sid = SECINITSID_UNLABELED;
 }
 
 dsec->self_sid = dsec->sid;
@@ -548,23 +556,21 @@ static int cf_check flask_domain_create(struct domain *d, 
uint32_t ssidref)
 {
 int rc;
 struct domain_security_struct *dsec = d->ssid;
-static int dom0_created = 0;
 
-if ( is_idle_domain(current->domain) && !dom0_created )
-{
-dsec->sid = SECINITSID_DOM0;
-dom0_created = 1;
-}
-else
+/*
+ * If domain has not already been labeled or a valid new label is provided,
+ * then use the provided label, otherwise use the existing label.
+ */
+if ( dsec->sid == SECINITSID_UNLABELED || ssidref > 0 )
 {
-rc = avc_current_has_perm(ssidref, SECCLASS_DOMAIN,
-  DOMAIN__CREATE, NULL);
-if ( rc )
-return rc;
-
 dsec->sid = ssidref;
+dsec->self_sid = dsec->sid;
 }
-dsec->self_sid = dsec->sid;
+
+rc = avc_current_has_perm(dsec->sid, SECCLASS_DOMAIN,
+  DOMAIN__CREATE, NULL);
+if ( rc )
+return rc;
 
 rc = security_transition_sid(dsec->sid, dsec->sid, SECCLASS_DOMAIN,
  &dsec->self_sid);
-- 
2.20.1




[PATCH v9 2/3] flask: implement xsm_set_system_active

2022-06-29 Thread Daniel P. Smith
This commit implements full support for starting the idle domain privileged by
introducing a new flask label xenboot_t which the idle domain is labeled with
at creation.  It then provides the implementation for the XSM hook
xsm_set_system_active to relabel the idle domain to the existing xen_t flask
label.

In the reference flask policy a new macro, xen_build_domain(target), is
introduced for creating policies for dom0less/hyperlaunch allowing the
hypervisor to create and assign the necessary resources for domain
construction.

Signed-off-by: Daniel P. Smith 
Reviewed-by: Jason Andryuk 
Reviewed-by: Luca Fancellu 
Tested-by: Luca Fancellu 
Reviewed-by: Rahul Singh 
Tested-by: Rahul Singh 
---
 tools/flask/policy/modules/xen.if  | 7 +++
 tools/flask/policy/modules/xen.te  | 1 +
 tools/flask/policy/policy/initial_sids | 1 +
 xen/xsm/flask/hooks.c  | 9 -
 xen/xsm/flask/policy/initial_sids  | 1 +
 5 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/tools/flask/policy/modules/xen.if 
b/tools/flask/policy/modules/xen.if
index 5e2aa472b6..424daab6a0 100644
--- a/tools/flask/policy/modules/xen.if
+++ b/tools/flask/policy/modules/xen.if
@@ -62,6 +62,13 @@ define(`create_domain_common', `
setparam altp2mhvm altp2mhvm_op dm };
 ')
 
+# xen_build_domain(target)
+#   Allow a domain to be created at boot by the hypervisor
+define(`xen_build_domain', `
+   allow xenboot_t $1:domain create;
+   allow xenboot_t $1_channel:event create;
+')
+
 # create_domain(priv, target)
 #   Allow a domain to be created directly
 define(`create_domain', `
diff --git a/tools/flask/policy/modules/xen.te 
b/tools/flask/policy/modules/xen.te
index 3dbf93d2b8..de98206fdd 100644
--- a/tools/flask/policy/modules/xen.te
+++ b/tools/flask/policy/modules/xen.te
@@ -24,6 +24,7 @@ attribute mls_priv;
 

 
 # The hypervisor itself
+type xenboot_t, xen_type, mls_priv;
 type xen_t, xen_type, mls_priv;
 
 # Domain 0
diff --git a/tools/flask/policy/policy/initial_sids 
b/tools/flask/policy/policy/initial_sids
index 6b7b7eff21..ec729d3ba3 100644
--- a/tools/flask/policy/policy/initial_sids
+++ b/tools/flask/policy/policy/initial_sids
@@ -2,6 +2,7 @@
 # objects created before the policy is loaded or for objects that do not have a
 # label defined in some other manner.
 
+sid xenboot gen_context(system_u:system_r:xenboot_t,s0)
 sid xen gen_context(system_u:system_r:xen_t,s0)
 sid dom0 gen_context(system_u:system_r:dom0_t,s0)
 sid domxen gen_context(system_u:system_r:domxen_t,s0)
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index c97c44f803..8c9cd0f297 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -173,7 +173,7 @@ static int cf_check flask_domain_alloc_security(struct 
domain *d)
 switch ( d->domain_id )
 {
 case DOMID_IDLE:
-dsec->sid = SECINITSID_XEN;
+dsec->sid = SECINITSID_XENBOOT;
 break;
 case DOMID_XEN:
 dsec->sid = SECINITSID_DOMXEN;
@@ -193,9 +193,14 @@ static int cf_check flask_domain_alloc_security(struct 
domain *d)
 
 static int cf_check flask_set_system_active(void)
 {
+struct domain_security_struct *dsec;
 struct domain *d = current->domain;
 
+dsec = d->ssid;
+
 ASSERT(d->is_privileged);
+ASSERT(dsec->sid == SECINITSID_XENBOOT);
+ASSERT(dsec->self_sid == SECINITSID_XENBOOT);
 
 if ( d->domain_id != DOMID_IDLE )
 {
@@ -210,6 +215,8 @@ static int cf_check flask_set_system_active(void)
  */
 d->is_privileged = false;
 
+dsec->self_sid = dsec->sid = SECINITSID_XEN;
+
 return 0;
 }
 
diff --git a/xen/xsm/flask/policy/initial_sids 
b/xen/xsm/flask/policy/initial_sids
index 7eca70d339..e8b55b8368 100644
--- a/xen/xsm/flask/policy/initial_sids
+++ b/xen/xsm/flask/policy/initial_sids
@@ -3,6 +3,7 @@
 #
 # Define initial security identifiers 
 #
+sid xenboot
 sid xen
 sid dom0
 sid domio
-- 
2.20.1




[PATCH v9 1/3] xsm: create idle domain privileged and demote after setup

2022-06-29 Thread Daniel P. Smith
There are new capabilities, dom0less and hyperlaunch, that introduce internal
hypervisor logic, which needs to make resource allocation calls that are
protected by XSM access checks. The need for these resource allocations are
necessary for dom0less and hyperlaunch when they are constructing the initial
domain(s).  This creates an issue as a subset of the hypervisor code is
executed under a system domain, the idle domain, that is represented by a
per-CPU non-privileged struct domain. To enable these new capabilities to
function correctly but in a controlled manner, this commit changes the idle
system domain to be created as a privileged domain under the default policy and
demoted before transitioning to running. A new XSM hook,
xsm_set_system_active(), is introduced to allow each XSM policy type to demote
the idle domain appropriately for that policy type. In the case of SILO, it
inherits the default policy's hook for xsm_set_system_active().

For flask, a stub is added to ensure that flask policy system will function
correctly with this patch until flask is extended with support for starting the
idle domain privileged and properly demoting it on the call to
xsm_set_system_active().

Signed-off-by: Daniel P. Smith 
Reviewed-by: Jason Andryuk 
Reviewed-by: Luca Fancellu 
Acked-by: Julien Grall  # arm
Reviewed-by: Rahul Singh 
Tested-by: Rahul Singh 
---
 xen/arch/arm/setup.c|  3 +++
 xen/arch/x86/setup.c|  4 
 xen/common/sched/core.c |  7 ++-
 xen/include/xsm/dummy.h | 17 +
 xen/include/xsm/xsm.h   |  6 ++
 xen/xsm/dummy.c |  1 +
 xen/xsm/flask/hooks.c   | 23 +++
 7 files changed, 60 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 577c54e6fb..85ff956ec2 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -1063,6 +1063,9 @@ void __init start_xen(unsigned long boot_phys_offset,
 /* Hide UART from DOM0 if we're using it */
 serial_endboot();
 
+if ( (rc = xsm_set_system_active()) != 0 )
+panic("xsm: unable to switch to SYSTEM_ACTIVE privilege: %d\n", rc);
+
 system_state = SYS_STATE_active;
 
 for_each_domain( d )
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 53a73010e0..f08b07b8de 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -619,6 +619,10 @@ static void noreturn init_done(void)
 {
 void *va;
 unsigned long start, end;
+int err;
+
+if ( (err = xsm_set_system_active()) != 0 )
+panic("xsm: unable to switch to SYSTEM_ACTIVE privilege: %d\n", err);
 
 system_state = SYS_STATE_active;
 
diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index 8c73489654..250207038e 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -3033,7 +3033,12 @@ void __init scheduler_init(void)
 sched_ratelimit_us = SCHED_DEFAULT_RATELIMIT_US;
 }
 
-idle_domain = domain_create(DOMID_IDLE, NULL, 0);
+/*
+ * The idle dom is created privileged to ensure unrestricted access during
+ * setup and will be demoted by xsm_set_system_active() when setup is
+ * complete.
+ */
+idle_domain = domain_create(DOMID_IDLE, NULL, CDF_privileged);
 BUG_ON(IS_ERR(idle_domain));
 BUG_ON(nr_cpu_ids > ARRAY_SIZE(idle_vcpu));
 idle_domain->vcpu = idle_vcpu;
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index 58afc1d589..77f27e7163 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -101,6 +101,23 @@ static always_inline int xsm_default_action(
 }
 }
 
+static XSM_INLINE int cf_check xsm_set_system_active(void)
+{
+struct domain *d = current->domain;
+
+ASSERT(d->is_privileged);
+
+if ( d->domain_id != DOMID_IDLE )
+{
+printk("%s: should only be called by idle domain\n", __func__);
+return -EPERM;
+}
+
+d->is_privileged = false;
+
+return 0;
+}
+
 static XSM_INLINE void cf_check xsm_security_domaininfo(
 struct domain *d, struct xen_domctl_getdomaininfo *info)
 {
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index 3e2b7fe3db..8dad03fd3d 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -52,6 +52,7 @@ typedef enum xsm_default xsm_default_t;
  * !!! WARNING !!!
  */
 struct xsm_ops {
+int (*set_system_active)(void);
 void (*security_domaininfo)(struct domain *d,
 struct xen_domctl_getdomaininfo *info);
 int (*domain_create)(struct domain *d, uint32_t ssidref);
@@ -208,6 +209,11 @@ extern struct xsm_ops xsm_ops;
 
 #ifndef XSM_NO_WRAPPERS
 
+static inline int xsm_set_system_active(void)
+{
+return alternative_call(xsm_ops.set_system_active);
+}
+
 static inline void xsm_security_domaininfo(
 struct domain *d, struct xen_domctl_getdomaininfo *info)
 {
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index 8c044ef615..e6ffa948f7 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -14,6 +14,7 @@
 #include 
 
 static c

[PATCH v9 0/3] Adds starting the idle domain privileged

2022-06-29 Thread Daniel P. Smith
This series makes it so that the idle domain is started privileged under the
default policy, which the SILO policy inherits, and under the flask policy. It
then introduces a new one-way XSM hook, xsm_transition_running, that is hooked
by an XSM policy to transition the idle domain to its running privilege level.

Patch 3 is an important one, as first it addresses the issue raised under an
RFC late last year by Jason Andryuk regarding the awkward entanglement of
flask_domain_alloc_security() and flask_domain_create(). Second, it helps
articulate why it is that the hypervisor should go through the access control
checks, even when it is doing the action itself. The issue at hand is not that
the hypervisor could be influenced to go around these check. The issue is these
checks provides a configurable way to express the execution flow that the
hypervisor should enforce. Specifically with this change, it is now possible
for an owner of a dom0less or hyperlaunch system to express a policy where the
hypervisor will enforce that no dom0 will be constructed, regardless of what
boot construction details were provided to it. Likewise, an owner that does not
want to see dom0less or hyperlaunch to be used can enforce that the hypervisor
will only construct a dom0 domain. This can all be accomplished without the
need to rebuild the hypervisor with these features enabled or disabled.

Changes in v9:
- added missing Rb/Tb to patch 1
- corrected the flask policy macro in patch 2 to allow domain create
- added patch 3 to address allowing the hypervisor create more than 1 domain

Changes in v8:
- adjusted panic messages in arm and x86 setup.c to be less than 80cols
- fixed comment line that went over 80col
- added line in patch #1 commit message to clarify the need is for domain
  creation

Changes in v7:
- adjusted error message in default and flask xsm_set_system_active hooks
- merged panic messages in arm and x86 setup.c to a single line

Changes in v6:
- readded the setting of is_privileged in flask_set_system_active()
- clarified comment on is_privileged in flask_set_system_active()
- added ASSERT on is_privileged and self_sid in flask_set_system_active()
- fixed err code returned on Arm for xsm_set_system_active() panic message

Changes in v5:
- dropped setting is_privileged in flask_set_system_active()
- added err code returned by xsm_set_system_active() to panic message

Changes in v4:
- reworded patch 1 commit messaged
- fixed whitespace to coding style
- fixed comment to coding style

Changes in v3:
- renamed *_transition_running() to *_set_system_active()
- changed the XSM hook set_system_active() from void to int return
- added ASSERT check for the expected privilege level each XSM policy expected
- replaced a check against is_privileged in each arch with checking the return
  value from the call to xsm_set_system_active()

Changes in v2:
- renamed flask_domain_runtime_security() to flask_transition_running()
- added the missed assignment of self_sid

Daniel P. Smith (3):
  xsm: create idle domain privileged and demote after setup
  flask: implement xsm_set_system_active
  xsm: refactor flask sid alloc and domain check

 tools/flask/policy/modules/dom0.te |  3 ++
 tools/flask/policy/modules/domU.te |  3 ++
 tools/flask/policy/modules/xen.if  |  7 +++
 tools/flask/policy/modules/xen.te  |  1 +
 tools/flask/policy/policy/initial_sids |  1 +
 xen/arch/arm/setup.c   |  3 ++
 xen/arch/x86/setup.c   |  4 ++
 xen/common/sched/core.c|  7 ++-
 xen/include/xsm/dummy.h| 17 +++
 xen/include/xsm/xsm.h  |  6 +++
 xen/xsm/dummy.c|  1 +
 xen/xsm/flask/hooks.c  | 66 --
 xen/xsm/flask/policy/initial_sids  |  1 +
 13 files changed, 104 insertions(+), 16 deletions(-)

-- 
2.20.1




RE: [PATCH RESEND v10 1/2] xen+tools: Report Interrupt Controller Virtualization capabilities on x86

2022-06-29 Thread Tian, Kevin
> From: Jane Malalane 
> Sent: Wednesday, June 29, 2022 11:17 PM
> 
> On 29/06/2022 15:26, Jan Beulich wrote:
> > On 29.06.2022 15:55, Jane Malalane wrote:
> >> Add XEN_SYSCTL_PHYSCAP_X86_ASSISTED_XAPIC and
> >> XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X2APIC to report accelerated xAPIC
> and
> >> x2APIC, on x86 hardware. This is so that xAPIC and x2APIC virtualization
> >> can subsequently be enabled on a per-domain basis.
> >> No such features are currently implemented on AMD hardware.
> >>
> >> HW assisted xAPIC virtualization will be reported if HW, at the
> >> minimum, supports virtualize_apic_accesses as this feature alone means
> >> that an access to the APIC page will cause an APIC-access VM exit. An
> >> APIC-access VM exit provides a VMM with information about the access
> >> causing the VM exit, unlike a regular EPT fault, thus simplifying some
> >> internal handling.
> >>
> >> HW assisted x2APIC virtualization will be reported if HW supports
> >> virtualize_x2apic_mode and, at least, either apic_reg_virt or
> >> virtual_intr_delivery. This also means that
> >> sysctl follows the conditionals in vmx_vlapic_msr_changed().
> >>
> >> For that purpose, also add an arch-specific "capabilities" parameter
> >> to struct xen_sysctl_physinfo.
> >>
> >> Note that this interface is intended to be compatible with AMD so that
> >> AVIC support can be introduced in a future patch. Unlike Intel that
> >> has multiple controls for APIC Virtualization, AMD has one global
> >> 'AVIC Enable' control bit, so fine-graining of APIC virtualization
> >> control cannot be done on a common interface.
> >>
> >> Suggested-by: Andrew Cooper 
> >> Signed-off-by: Jane Malalane 
> >> Reviewed-by: "Roger Pau Monné" 
> >> Reviewed-by: Jan Beulich 
> >> Reviewed-by: Anthony PERARD 
> >
> > Could you please clarify whether you did drop Kevin's R-b (which, a
> > little unhelpfully, he provided in reply to v9 a week after you had
> > posted v10) because of ...
> >
> >> v10:
> >>   * Make assisted_x{2}apic_available conditional upon _vmx_cpu_up()
> >
> > ... this, requiring him to re-offer the tag? Until told otherwise I
> > will assume so.
> 
> It wasn't intentional but yes, that is right. There was a change, albeit
> minor, in vmx from v9 to v10 so I do require Kevin Tian or Jun Nakajima
> to review it.
> 

Reviewed-by: Kevin Tian 


RE: [PATCH] x86/ept: fix shattering of special pages

2022-06-29 Thread Tian, Kevin
> From: Roger Pau Monné 
> Sent: Wednesday, June 29, 2022 5:11 PM
> 
> On Wed, Jun 29, 2022 at 08:41:43AM +, Tian, Kevin wrote:
> > > From: Roger Pau Monne 
> > > Sent: Monday, June 27, 2022 6:01 PM
> > >
> > > The current logic in epte_get_entry_emt() will split any page marked
> > > as special with order greater than zero, without checking whether the
> > > super page is all special.
> > >
> > > Fix this by only splitting the page only if it's not all marked as
> > > special, in order to prevent unneeded super page shuttering.
> > >
> > > Fixes: ca24b2ffdb ('x86/hvm: set 'ipat' in EPT for special pages')
> > > Signed-off-by: Roger Pau Monné 
> > > ---
> > > Cc: Paul Durrant 
> > > ---
> > > It would seem weird to me to have a super page entry in EPT with
> > > ranges marked as special and not the full page.  I guess it's better
> > > to be safe, but I don't see an scenario where we could end up in that
> > > situation.
> > >
> > > I've been trying to find a clarification in the original patch
> > > submission about how it's possible to have such super page EPT entry,
> > > but haven't been able to find any justification.
> > >
> > > Might be nice to expand the commit message as to why it's possible to
> > > have such mixed super page entries that would need splitting.
> >
> > Here is what I dig out.
> >
> > When reviewing v1 of adding special page check, Jan suggested
> > that APIC access page was also covered hence the old logic for APIC
> > access page can be removed. [1]
> 
> But the APIC access page is always added using set_mmio_p2m_entry(),
> which will unconditionally create an entry for it in the EPT, hence
> there's no explicit need to check for it's presence inside of higher
> order pages.
> 
> > Then when reviewing v2 he found that the order check in removed
> > logic was not carried to the new check on special page. [2]
> >
> > The original order check in old APIC access logic came from:
> >
> > commit 126018f2acd5416434747423e61a4690108b9dc9
> > Author: Jan Beulich 
> > Date:   Fri May 2 10:48:48 2014 +0200
> >
> > x86/EPT: consider page order when checking for APIC MFN
> >
> > This was overlooked in 3d90d6e6 ("x86/EPT: split super pages upon
> > mismatching memory types").
> >
> > Signed-off-by: Jan Beulich 
> > Acked-by: Kevin Tian 
> > Reviewed-by: Tim Deegan 
> >
> > I suppose Jan may actually find such mixed super page entry case
> > to bring this fix in.
> 
> Hm, I guess theoretically it could be possible for contiguous entries
> to be coalesced (if we ever implement that for p2m page tables), and
> hence result in super pages being created from smaller entries.
> 
> It that case it would be less clear to assert that special pages
> cannot be coalesced with other contiguous entries.

With Jan's confirmation I'm fine with this change too. Just below...

> >
> > Did you actually observe an impact w/o this fix?
> 
> Yes, the original change has caused a performance regression in some
> GPU pass through workloads for XenServer.  I think it's reasonable to
> avoid super page shattering if the resulting pages would all end up
> using ipat and WRBACK cache attribute, as there's no reason for the
> split in the first case.
> 

... I'd appreciate mentioning the regression case in the commit msg.

With that,

Reviewed-by: Kevin Tian 


[linux-5.4 test] 171408: regressions - FAIL

2022-06-29 Thread osstest service owner
flight 171408 linux-5.4 real [real]
flight 171414 linux-5.4 real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/171408/
http://logs.test-lab.xenproject.org/osstest/logs/171414/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl-multivcpu 18 guest-start/debian.repeat fail REGR. vs. 
171352

Tests which are failing intermittently (not blocking):
 test-arm64-arm64-xl-thunderx  8 xen-boot fail in 171400 pass in 171408
 test-armhf-armhf-xl-multivcpu 14 guest-start fail in 171400 pass in 171408
 test-armhf-armhf-xl-credit1  14 guest-startfail pass in 171400
 test-armhf-armhf-xl-vhd  13 guest-startfail pass in 171400
 test-armhf-armhf-libvirt-raw 17 guest-start/debian.repeat  fail pass in 171400

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail blocked in 
171352
 test-armhf-armhf-xl-credit1 18 guest-start/debian.repeat fail in 171400 like 
171352
 test-armhf-armhf-xl-credit1 15 migrate-support-check fail in 171400 never pass
 test-armhf-armhf-xl-credit1 16 saverestore-support-check fail in 171400 never 
pass
 test-armhf-armhf-xl-vhd 14 migrate-support-check fail in 171400 never pass
 test-armhf-armhf-xl-vhd 15 saverestore-support-check fail in 171400 never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 171352
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 171352
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 171352
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 171352
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 171352
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 171352
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 171352
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 171352
 test-armhf-armhf-xl-rtds 18 guest-start/debian.repeatfail  like 171352
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 171352
 test-armhf-armhf-xl-credit2  14 guest-start  fail  like 171352
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 171352
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 171352
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail  

[PATCH v3] xen/arm: avoid overflow when setting vtimer in context switch

2022-06-29 Thread Jiamei Xie
virt_vtimer_save is calculating the new time for the vtimer in:
"v->arch.virt_timer.cval + v->domain->arch.virt_timer_base.offset
- boot_count".
In this formula, "cval + offset" might cause uint64_t overflow.
Changing it to "ticks_to_ns(v->domain->arch.virt_timer_base.offset -
boot_count) + ticks_to_ns(v->arch.virt_timer.cval)" can avoid overflow,
and "ticks_to_ns(arch.virt_timer_base.offset - boot_count)" will be
always the same, which has been caculated in domain_vtimer_init.
Introduce a new field virt_timer_base.nanoseconds to store this value
for arm in struct arch_domain, so we can use it directly.

Signed-off-by: Jiamei Xie 
Change-Id: Ib80cee51eaf844661e6f92154a0339ad2a652f9b
---
was "xen/arm: avoid vtimer flip-flop transition in context switch".
v3 changes:
-re-write commit message
-store nanoseconds in virt_timer_base instead of adding a new structure
-assign to nanoseconds first, then seconds
---
 xen/arch/arm/include/asm/domain.h | 1 +
 xen/arch/arm/vtimer.c | 9 ++---
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/include/asm/domain.h 
b/xen/arch/arm/include/asm/domain.h
index ed63c2b6f9..cd9ce19b4b 100644
--- a/xen/arch/arm/include/asm/domain.h
+++ b/xen/arch/arm/include/asm/domain.h
@@ -71,6 +71,7 @@ struct arch_domain
 
 struct {
 uint64_t offset;
+s_time_t nanoseconds;
 } virt_timer_base;
 
 struct vgic_dist vgic;
diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c
index 6b78fea77d..aeaea78e4c 100644
--- a/xen/arch/arm/vtimer.c
+++ b/xen/arch/arm/vtimer.c
@@ -63,7 +63,9 @@ static void virt_timer_expired(void *data)
 int domain_vtimer_init(struct domain *d, struct xen_arch_domainconfig *config)
 {
 d->arch.virt_timer_base.offset = get_cycles();
-d->time_offset.seconds = ticks_to_ns(d->arch.virt_timer_base.offset - 
boot_count);
+d->arch.virt_timer_base.nanoseconds =
+ticks_to_ns(d->arch.virt_timer_base.offset - boot_count);
+d->time_offset.seconds = d->arch.virt_timer_base.nanoseconds;
 do_div(d->time_offset.seconds, 10);
 
 config->clock_frequency = timer_dt_clock_frequency;
@@ -144,8 +146,9 @@ void virt_timer_save(struct vcpu *v)
 if ( (v->arch.virt_timer.ctl & CNTx_CTL_ENABLE) &&
  !(v->arch.virt_timer.ctl & CNTx_CTL_MASK))
 {
-set_timer(&v->arch.virt_timer.timer, 
ticks_to_ns(v->arch.virt_timer.cval +
-  v->domain->arch.virt_timer_base.offset - boot_count));
+set_timer(&v->arch.virt_timer.timer,
+  v->domain->arch.virt_timer_base.nanoseconds +
+  ticks_to_ns(v->arch.virt_timer.cval));
 }
 }
 
-- 
2.25.1




Re: [PATCH V1 2/2] xen/grant-table: Use unpopulated contiguous pages instead of real RAM ones

2022-06-29 Thread Stefano Stabellini
On Mon, 20 Jun 2022, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko 
> 
> Depends on CONFIG_XEN_UNPOPULATED_ALLOC. If enabled then unpopulated
> contiguous pages will be allocated for grant mapping into instead of
> ballooning out real RAM pages.
> 
> Also fallback to allocate DMAable pages (balloon out real RAM pages)
> if we failed to allocate unpopulated contiguous pages. Use recently
> introduced is_xen_unpopulated_page() in gnttab_dma_free_pages() to know
> what API to use for freeing pages.
> 
> Signed-off-by: Oleksandr Tyshchenko 
> ---
> Please note, I haven't re-checked yet the use-case where the xen-swiotlb
> is involved (proposed by Stefano):
> https://lore.kernel.org/xen-devel/alpine.DEB.2.22.394.2206031348230.2783803@ubuntu-linux-20-04-desktop/
> I will re-check that for next version and add corresponding comment
> in the code.

Great. The patch looks good so far.


> Changes RFC -> V1:
>- update commit subject/description
>- rework to avoid introducing alternative implementation
>  of gnttab_dma_alloc(free)_pages(), use IS_ENABLED()
>- implement a fallback to real RAM pages if we failed to allocate
>  unpopulated contiguous pages (resolve initial TODO)
>- update according to the API renaming (s/dma/contiguous)
> ---
>  drivers/xen/grant-table.c | 24 
>  1 file changed, 24 insertions(+)
> 
> diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
> index 738029d..15e426b 100644
> --- a/drivers/xen/grant-table.c
> +++ b/drivers/xen/grant-table.c
> @@ -1047,6 +1047,23 @@ int gnttab_dma_alloc_pages(struct 
> gnttab_dma_alloc_args *args)
>   size_t size;
>   int i, ret;
>  
> + if (IS_ENABLED(CONFIG_XEN_UNPOPULATED_ALLOC)) {
> + ret = xen_alloc_unpopulated_contiguous_pages(args->dev, 
> args->nr_pages,
> + args->pages);
> + if (ret < 0)
> + goto fallback;
> +
> + ret = gnttab_pages_set_private(args->nr_pages, args->pages);
> + if (ret < 0)
> + goto fail;
> +
> + args->vaddr = page_to_virt(args->pages[0]);
> + args->dev_bus_addr = page_to_phys(args->pages[0]);
> +
> + return ret;
> + }
> +
> +fallback:
>   size = args->nr_pages << PAGE_SHIFT;
>   if (args->coherent)
>   args->vaddr = dma_alloc_coherent(args->dev, size,
> @@ -1103,6 +1120,13 @@ int gnttab_dma_free_pages(struct gnttab_dma_alloc_args 
> *args)
>  
>   gnttab_pages_clear_private(args->nr_pages, args->pages);
>  
> + if (IS_ENABLED(CONFIG_XEN_UNPOPULATED_ALLOC) &&
> + is_xen_unpopulated_page(args->pages[0])) {
> + xen_free_unpopulated_contiguous_pages(args->dev, args->nr_pages,
> + args->pages);
> + return 0;
> + }
> +
>   for (i = 0; i < args->nr_pages; i++)
>   args->frames[i] = page_to_xen_pfn(args->pages[i]);
>  
> -- 
> 2.7.4
> 



Re: [PATCH V1 1/2] xen/unpopulated-alloc: Introduce helpers for contiguous allocations

2022-06-29 Thread Stefano Stabellini
On Mon, 20 Jun 2022, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko 
> 
> Add ability to allocate unpopulated contiguous pages suitable for
> grant mapping into. This is going to be used by userspace PV backends
> for grant mappings in gnttab code (see gnttab_dma_alloc_pages()).
> 
> Patch also changes the allocation mechanism for unpopulated pages.
> Instead of using page_list and page->zone_device_data to manage
> hot-plugged in fill_pool() (former fill_list()) pages, reuse genpool
> subsystem to do the job for us.
> 
> Please note that even for non-contiguous allocations we always try
> to allocate single contiguous chunk in alloc_unpopulated_pages()
> instead of allocating memory page-by-page. Although it leads to less
> efficient resource utilization, it is faster. Taking into the account
> that on both x86 and Arm the unpopulated memory resource is arbitrarily
> large (it is not backed by real memory) this is not going to be a problem.
> 
> Signed-off-by: Oleksandr Tyshchenko 
> ---
> I am still thinking how we can optimize free_unpopulated_pages()
> to avoid freeing memory page-by-page for non-contiguous allocations:
> 1. We could update users to allocate/free contiguous pages even when
>continuity is not strictly required. But besides a need to alter
>a few places, this also requires having a valid struct device
>pointer in hand (maybe instead of passing *dev we could pass
>max_addr? With that we could drop DMA_BIT_MASK).
> 2. Almost all users of unpopulated pages (except gnttab_page_cache_shrink()
>in grant-table.c) retain initially allocated pages[i] array, so it
>passed in free_unpopulated_pages() absolutely unmodified since
>being allocated.
>We could update free_unpopulated_pages() to always try to free memory
>by a single chuck (previously make sure that chunk is in a pool using
>gen_pool_has_addr()) and update gnttab_page_cache_shrink() to not pass
>pages[i] array with mixed pages in it when dealing with unpopulated
>pages. This doesn't require altering a few places.
> 
> Any thoughts?
 
I think it would be better to change the callers, because it would be an
obvious explicit change, compared to try to be smarter in
free_unpopulated_pages.


> Changes RFC -> V1:
>- update commit subject/description
>- rework to avoid code duplication (resolve initial TODO)
>- rename API according to new naming scheme (s/dma/contiguous),
>  also rename some local stuff
>- drop the page_list & friends entirely and use unpopulated_pool for all
>  (contiguous and non-contiguous) allocations
>- fix build on x86 by inclusion of 
>- introduce is_xen_unpopulated_page()
>- share the implementation for xen_alloc_unpopulated_contiguous_pages()
>  and xen_alloc_unpopulated_pages()
> ---
>  drivers/xen/unpopulated-alloc.c | 188 
> +---
>  include/xen/xen.h   |  20 +
>  2 files changed, 158 insertions(+), 50 deletions(-)
> 
> diff --git a/drivers/xen/unpopulated-alloc.c b/drivers/xen/unpopulated-alloc.c
> index a39f2d3..3988480d 100644
> --- a/drivers/xen/unpopulated-alloc.c
> +++ b/drivers/xen/unpopulated-alloc.c
> @@ -1,5 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0
> +#include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -12,9 +14,8 @@
>  #include 
>  #include 
>  
> -static DEFINE_MUTEX(list_lock);
> -static struct page *page_list;
> -static unsigned int list_count;
> +static DEFINE_MUTEX(pool_lock);
> +static struct gen_pool *unpopulated_pool;
>  
>  static struct resource *target_resource;
>  
> @@ -31,12 +32,12 @@ int __weak __init arch_xen_unpopulated_init(struct 
> resource **res)
>   return 0;
>  }
>  
> -static int fill_list(unsigned int nr_pages)
> +static int fill_pool(unsigned int nr_pages)
>  {
>   struct dev_pagemap *pgmap;
>   struct resource *res, *tmp_res = NULL;
>   void *vaddr;
> - unsigned int i, alloc_pages = round_up(nr_pages, PAGES_PER_SECTION);
> + unsigned int alloc_pages = round_up(nr_pages, PAGES_PER_SECTION);
>   struct range mhp_range;
>   int ret;
>  
> @@ -106,6 +107,7 @@ static int fill_list(unsigned int nr_pages)
>   * conflict with any devices.
>   */
>   if (!xen_feature(XENFEAT_auto_translated_physmap)) {
> + unsigned int i;
>   xen_pfn_t pfn = PFN_DOWN(res->start);
>  
>   for (i = 0; i < alloc_pages; i++) {
> @@ -125,16 +127,17 @@ static int fill_list(unsigned int nr_pages)
>   goto err_memremap;
>   }
>  
> - for (i = 0; i < alloc_pages; i++) {
> - struct page *pg = virt_to_page(vaddr + PAGE_SIZE * i);
> -
> - pg->zone_device_data = page_list;
> - page_list = pg;
> - list_count++;
> + ret = gen_pool_add_virt(unpopulated_pool, (unsigned long)vaddr, 
> res->start,
> + alloc_pages * PAGE_SIZE, NUMA_NO_NODE);
> + if (ret) {
> +   

Re: Reg. Tee init fail...

2022-06-29 Thread Boris Ostrovsky



On 6/29/22 4:03 PM, Stefano Stabellini wrote:

Adding Juergen and Boris because this is a Linux/x86 issue.


As you can see from this Linux driver:
https://elixir.bootlin.com/linux/latest/source/drivers/crypto/ccp/tee-dev.c#L132

Linux as dom0 on x86 is trying to communicate with firmware (TEE). Linux
is calling __pa to pass a physical address to firmware. However, __pa
returns a "fake" address not an mfn. I imagine that a quick workaround
would be to call "virt_to_machine" instead of "__pa" in tee-dev.c.



It's probably worth a try but it seems we may need to OR the result with C-bit 
(i.e. sme_me_mask). Or (for testing purposes) run with TSME on, I think C-bit 
is not set then.


-boris



Normally, if this was a device, the "right fix" would be to use
swiotlb-xen:xen_swiotlb_map_page to get back a real physical address.

However, xen_swiotlb_map_page is meant to be used as part of the dma_ops
API and takes a struct device *dev as input parameter. Maybe
xen_swiotlb_map_page can be used for tee-dev as well?


Basically tee-dev would need to call dma_map_page before passing
addresses to firmware, and dma_unmap_page when it is done. E.g.:


   cmd_buffer = dma_map_page(dev, virt_to_page(cmd),
 cmd & ~PAGE_MASK,
 ring_size,
 DMA_TO_DEVICE);


Juergen, Boris,
what do you think?



On Fri, 24 Jun 2022, Julien Grall wrote:

Hi,

(moving the discussion to xen-devel as I think it is more appropriate)

On 24/06/2022 10:53, SK, SivaSangeetha (Siva Sangeetha) wrote:

[AMD Official Use Only - General]

Not clear what this means.


Hi Xen team,

In TEE driver, We allocate a ring buffer, get its physical address from
__pa() macro, pass the physical address to secure processor for mapping it
and using in secure processor side.

Source:
https://elixir.bootlin.com/linux/latest/source/drivers/crypto/ccp/tee-dev.c#L132

This works good natively in Dom0 on the target.
When we boot the same Dom0 kernel, with Xen hypervisor enabled, ring init
fails.

Do you have any error message or error code?



We suspect that the address passed to secure processor, is not same when xen
is enabled, and when xen is enabled, some level of address translation might
be required to get exact physical address.

If you are using Xen upstream, Dom0 will be mapped with IPA == PA. So there
should be no need for translation.

Can you provide more details on your setup (version of Xen, Linux...)?

Cheers,

--
Julien Grall





[linux-linus test] 171404: regressions - FAIL

2022-06-29 Thread osstest service owner
flight 171404 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/171404/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-dom0pvh-xl-intel  8 xen-bootfail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-ws16-amd64  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemut-win7-amd64  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 8 xen-boot fail REGR. vs. 
171277
 test-amd64-amd64-pygrub   8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-freebsd11-amd64  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-dom0pvh-xl-amd  8 xen-boot  fail REGR. vs. 171277
 test-amd64-amd64-qemuu-nested-amd  8 xen-bootfail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-win7-amd64  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemut-ws16-amd64  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-xsm   8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemut-debianhvm-i386-xsm  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-libvirt  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-ovmf-amd64  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-libvirt-qcow2  8 xen-boot   fail REGR. vs. 171277
 test-amd64-amd64-xl-credit2   8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-credit1   8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl   8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  8 xen-bootfail REGR. vs. 171277
 test-amd64-amd64-xl-vhd   8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow 8 xen-boot fail REGR. vs. 
171277
 test-amd64-amd64-xl-multivcpu  8 xen-bootfail REGR. vs. 171277
 test-amd64-amd64-libvirt-pair 12 xen-boot/src_host   fail REGR. vs. 171277
 test-amd64-amd64-libvirt-pair 13 xen-boot/dst_host   fail REGR. vs. 171277
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 8 xen-boot fail REGR. 
vs. 171277
 test-amd64-amd64-xl-shadow8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-libvirt-raw  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-qemuu-nested-intel  8 xen-boot  fail REGR. vs. 171277
 test-amd64-amd64-libvirt-xsm  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 8 xen-boot fail REGR. 
vs. 171277
 test-amd64-amd64-xl-pvhv2-intel  8 xen-boot  fail REGR. vs. 171277
 test-amd64-amd64-freebsd12-amd64  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-pvshim8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-pvhv2-amd  8 xen-bootfail REGR. vs. 171277
 test-amd64-amd64-examine  8 reboot   fail REGR. vs. 171277
 test-amd64-amd64-examine-uefi  8 reboot  fail REGR. vs. 171277
 test-amd64-amd64-examine-bios  8 reboot  fail REGR. vs. 171277
 test-amd64-amd64-xl-qemut-debianhvm-amd64  8 xen-bootfail REGR. vs. 171277
 test-amd64-coresched-amd64-xl  8 xen-bootfail REGR. vs. 171277
 test-amd64-amd64-pair12 xen-boot/src_hostfail REGR. vs. 171277
 test-amd64-amd64-pair13 xen-boot/dst_hostfail REGR. vs. 171277

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-xl-rtds 17 guest-stop fail pass in 171389

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds  8 xen-boot fail REGR. vs. 171277

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 171277
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 171277
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 171277
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-su

[xen-unstable test] 171402: tolerable FAIL - PUSHED

2022-06-29 Thread osstest service owner
flight 171402 xen-unstable real [real]
flight 171409 xen-unstable real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/171402/
http://logs.test-lab.xenproject.org/osstest/logs/171409/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-i386-xl-pvshim13 debian-fixupfail pass in 171409-retest

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-pvshim14 guest-start  fail in 171409 never pass
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 171387
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 171387
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 171387
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 171387
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 171387
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 171387
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 171387
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 171387
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 171387
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 171387
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 171387
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 171387
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   ne

Re: [PATCH 2/2] uboot-script-gen: do not enable direct mapping by default

2022-06-29 Thread Stefano Stabellini
On Wed, 29 Jun 2022, Ayan Kumar Halder wrote:
> Hi Stefano/Xenia,
> 
> On 29/06/2022 18:01, xenia wrote:
> > Hi Stefano,
> > 
> > On 6/29/22 03:28, Stefano Stabellini wrote:
> > > On Sun, 26 Jun 2022, Xenia Ragiadakou wrote:
> > > > To be inline with XEN, do not enable direct mapping automatically for
> > > > all
> > > > statically allocated domains.
> > > > 
> > > > Signed-off-by: Xenia Ragiadakou 
> > > Actually I don't know about this one. I think it is OK that ImageBuilder
> > > defaults are different from Xen defaults. This is a case where I think
> > > it would be good to enable DOMU_DIRECT_MAP by default when
> > > DOMU_STATIC_MEM is specified.
> > Just realized that I forgot to add [ImageBuilder] tag to the patches. Sorry
> > about that.
> 
> @Stefano, why do you wish the Imagebuilder's behaviour to differ from Xen ? Is
> there any use-case that helps.

As background, ImageBuilder is meant to be very simple to use especially
for the most common configurations. In fact, I think ImageBuilder
doesn't necessarely have to support all the options that Xen offers,
only the most common and important.

If someone wants an esoteric option, they can always edit the generated
boot.source and make any necessary changes. I make sure to explain that
editing boot.source is always a possibility in all the talks I gave
about ImageBuilder.

Now to answer the specific question. I am positive that the most common
configuration for people that wants static memory is to have direct_map.
That is because the two go hand-in-hand in configuration where the IOMMU
is not used. So I think that from an ImageBuilder perspective direct_map
should default to enabled when static memory is requested. It can always
be disabled, both using DOMU_DIRECT_MAP, or editing boot.source.


> > I cc Ayan, since the change was suggested by him.
> > I have no strong preference on the default value.
> > 
> > Xenia
> > 
> > > > ---
> > > >   README.md    | 4 ++--
> > > >   scripts/uboot-script-gen | 8 ++--
> > > >   2 files changed, 4 insertions(+), 8 deletions(-)
> > > > 
> > > > diff --git a/README.md b/README.md
> > > > index cb15ca5..03e437b 100644
> > > > --- a/README.md
> > > > +++ b/README.md
> > > > @@ -169,8 +169,8 @@ Where:
> > > >     if specified, indicates the host physical address regions
> > > >     [baseaddr, baseaddr + size) to be reserved to the VM for static
> > > > allocation.
> > > >   -- DOMU_DIRECT_MAP[number] can be set to 1 or 0.
> > > > -  If set to 1, the VM is direct mapped. The default is 1.
> > > > +- DOMU_DIRECT_MAP[number] if set to 1, enables direct mapping.
> > > > +  By default, direct mapping is disabled.
> > > >     This is only applicable when DOMU_STATIC_MEM is specified.
> > > >     - LINUX is optional but specifies the Linux kernel for when Xen is
> > > > NOT
> > > > diff --git a/scripts/uboot-script-gen b/scripts/uboot-script-gen
> > > > index 085e29f..66ce6f7 100755
> > > > --- a/scripts/uboot-script-gen
> > > > +++ b/scripts/uboot-script-gen
> > > > @@ -52,7 +52,7 @@ function dt_set()
> > > >   echo "fdt set $path $var $array" >> $UBOOT_SOURCE
> > > >   elif test $data_type = "bool"
> > > >   then
> > > > -    if test "$data" -eq 1
> > > > +    if test "$data" == "1"
> > > >   then
> > > >   echo "fdt set $path $var" >> $UBOOT_SOURCE
> > > >   fi
> > > > @@ -74,7 +74,7 @@ function dt_set()
> > > >   fdtput $FDTEDIT -p -t s $path $var $data
> > > >   elif test $data_type = "bool"
> > > >   then
> > > > -    if test "$data" -eq 1
> > > > +    if test "$data" == "1"
> > > >   then
> > > >   fdtput $FDTEDIT -p $path $var
> > > >   fi
> > > > @@ -491,10 +491,6 @@ function xen_config()
> > > >   then
> > > >   DOMU_CMD[$i]="console=ttyAMA0"
> > > >   fi
> > > > -    if test -z "${DOMU_DIRECT_MAP[$i]}"
> > > > -    then
> > > > - DOMU_DIRECT_MAP[$i]=1
> > > > -    fi
> > > >   i=$(( $i + 1 ))
> > > >   done
> > > >   }
> > > > -- 
> > > > 2.34.1
> > > > 
> 

Re: Reg. Tee init fail...

2022-06-29 Thread Stefano Stabellini
Adding Juergen and Boris because this is a Linux/x86 issue.


As you can see from this Linux driver:
https://elixir.bootlin.com/linux/latest/source/drivers/crypto/ccp/tee-dev.c#L132

Linux as dom0 on x86 is trying to communicate with firmware (TEE). Linux
is calling __pa to pass a physical address to firmware. However, __pa
returns a "fake" address not an mfn. I imagine that a quick workaround
would be to call "virt_to_machine" instead of "__pa" in tee-dev.c.

Normally, if this was a device, the "right fix" would be to use
swiotlb-xen:xen_swiotlb_map_page to get back a real physical address.

However, xen_swiotlb_map_page is meant to be used as part of the dma_ops
API and takes a struct device *dev as input parameter. Maybe
xen_swiotlb_map_page can be used for tee-dev as well?


Basically tee-dev would need to call dma_map_page before passing
addresses to firmware, and dma_unmap_page when it is done. E.g.:


  cmd_buffer = dma_map_page(dev, virt_to_page(cmd),
cmd & ~PAGE_MASK,
ring_size,
DMA_TO_DEVICE);


Juergen, Boris,
what do you think?



On Fri, 24 Jun 2022, Julien Grall wrote:
> Hi,
> 
> (moving the discussion to xen-devel as I think it is more appropriate)
> 
> On 24/06/2022 10:53, SK, SivaSangeetha (Siva Sangeetha) wrote:
> > [AMD Official Use Only - General]
> 
> Not clear what this means.
> 
> > 
> > Hi Xen team,
> > 
> > In TEE driver, We allocate a ring buffer, get its physical address from
> > __pa() macro, pass the physical address to secure processor for mapping it
> > and using in secure processor side.
> > 
> > Source:
> > https://elixir.bootlin.com/linux/latest/source/drivers/crypto/ccp/tee-dev.c#L132
> > 
> > This works good natively in Dom0 on the target.
> > When we boot the same Dom0 kernel, with Xen hypervisor enabled, ring init
> > fails.
> 
> Do you have any error message or error code?
> 
> > 
> > 
> > We suspect that the address passed to secure processor, is not same when xen
> > is enabled, and when xen is enabled, some level of address translation might
> > be required to get exact physical address.
> 
> If you are using Xen upstream, Dom0 will be mapped with IPA == PA. So there
> should be no need for translation.
> 
> Can you provide more details on your setup (version of Xen, Linux...)?
> 
> Cheers,
> 
> -- 
> Julien Grall
> 



Re: [PATCH 2/2] uboot-script-gen: do not enable direct mapping by default

2022-06-29 Thread Ayan Kumar Halder

Hi Stefano/Xenia,

On 29/06/2022 18:01, xenia wrote:

Hi Stefano,

On 6/29/22 03:28, Stefano Stabellini wrote:

On Sun, 26 Jun 2022, Xenia Ragiadakou wrote:
To be inline with XEN, do not enable direct mapping automatically 
for all

statically allocated domains.

Signed-off-by: Xenia Ragiadakou 

Actually I don't know about this one. I think it is OK that ImageBuilder
defaults are different from Xen defaults. This is a case where I think
it would be good to enable DOMU_DIRECT_MAP by default when
DOMU_STATIC_MEM is specified.
Just realized that I forgot to add [ImageBuilder] tag to the patches. 
Sorry about that.


@Stefano, why do you wish the Imagebuilder's behaviour to differ from 
Xen ? Is there any use-case that helps.


- Ayan



I cc Ayan, since the change was suggested by him.
I have no strong preference on the default value.

Xenia


---
  README.md    | 4 ++--
  scripts/uboot-script-gen | 8 ++--
  2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/README.md b/README.md
index cb15ca5..03e437b 100644
--- a/README.md
+++ b/README.md
@@ -169,8 +169,8 @@ Where:
    if specified, indicates the host physical address regions
    [baseaddr, baseaddr + size) to be reserved to the VM for static 
allocation.

  -- DOMU_DIRECT_MAP[number] can be set to 1 or 0.
-  If set to 1, the VM is direct mapped. The default is 1.
+- DOMU_DIRECT_MAP[number] if set to 1, enables direct mapping.
+  By default, direct mapping is disabled.
    This is only applicable when DOMU_STATIC_MEM is specified.
    - LINUX is optional but specifies the Linux kernel for when Xen 
is NOT

diff --git a/scripts/uboot-script-gen b/scripts/uboot-script-gen
index 085e29f..66ce6f7 100755
--- a/scripts/uboot-script-gen
+++ b/scripts/uboot-script-gen
@@ -52,7 +52,7 @@ function dt_set()
  echo "fdt set $path $var $array" >> $UBOOT_SOURCE
  elif test $data_type = "bool"
  then
-    if test "$data" -eq 1
+    if test "$data" == "1"
  then
  echo "fdt set $path $var" >> $UBOOT_SOURCE
  fi
@@ -74,7 +74,7 @@ function dt_set()
  fdtput $FDTEDIT -p -t s $path $var $data
  elif test $data_type = "bool"
  then
-    if test "$data" -eq 1
+    if test "$data" == "1"
  then
  fdtput $FDTEDIT -p $path $var
  fi
@@ -491,10 +491,6 @@ function xen_config()
  then
  DOMU_CMD[$i]="console=ttyAMA0"
  fi
-    if test -z "${DOMU_DIRECT_MAP[$i]}"
-    then
- DOMU_DIRECT_MAP[$i]=1
-    fi
  i=$(( $i + 1 ))
  done
  }
--
2.34.1





[PATCH v2 1/2] x86/spec-ctrl: Only adjust MSR_SPEC_CTRL for idle with legacy IBRS

2022-06-29 Thread Andrew Cooper
Back at the time of the original Spectre-v2 fixes, it was recommended to clear
MSR_SPEC_CTRL when going idle.  This is because of the side effects on the
sibling thread caused by the microcode IBRS and STIBP implementations which
were retrofitted to existing CPUs.

However, there are no relevant cross-thread impacts for the hardware
IBRS/STIBP implementations, so this logic should not be used on Intel CPUs
supporting eIBRS, or any AMD CPUs; doing so only adds unnecessary latency to
the idle path.

Furthermore, there's no point playing with MSR_SPEC_CTRL in the idle paths if
SMT is disabled for other reasons.

Fixes: 8d03080d2a33 ("x86/spec-ctrl: Cease using thunk=lfence on AMD")
Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 

v2:
 * New
---
 xen/arch/x86/include/asm/cpufeatures.h |  2 +-
 xen/arch/x86/include/asm/spec_ctrl.h   |  5 +++--
 xen/arch/x86/spec_ctrl.c   | 10 --
 3 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/include/asm/cpufeatures.h 
b/xen/arch/x86/include/asm/cpufeatures.h
index bd45a144ee78..493d338a085e 100644
--- a/xen/arch/x86/include/asm/cpufeatures.h
+++ b/xen/arch/x86/include/asm/cpufeatures.h
@@ -33,7 +33,7 @@ XEN_CPUFEATURE(SC_MSR_HVM,X86_SYNTH(17)) /* 
MSR_SPEC_CTRL used by Xen fo
 XEN_CPUFEATURE(SC_RSB_PV, X86_SYNTH(18)) /* RSB overwrite needed for 
PV */
 XEN_CPUFEATURE(SC_RSB_HVM,X86_SYNTH(19)) /* RSB overwrite needed for 
HVM */
 XEN_CPUFEATURE(XEN_SELFSNOOP, X86_SYNTH(20)) /* SELFSNOOP gets used by Xen 
itself */
-XEN_CPUFEATURE(SC_MSR_IDLE,   X86_SYNTH(21)) /* (SC_MSR_PV || SC_MSR_HVM) 
&& default_xen_spec_ctrl */
+XEN_CPUFEATURE(SC_MSR_IDLE,   X86_SYNTH(21)) /* Clear MSR_SPEC_CTRL on 
idle */
 XEN_CPUFEATURE(XEN_LBR,   X86_SYNTH(22)) /* Xen uses MSR_DEBUGCTL.LBR 
*/
 /* Bits 23,24 unused. */
 XEN_CPUFEATURE(SC_VERW_IDLE,  X86_SYNTH(25)) /* VERW used by Xen for idle 
*/
diff --git a/xen/arch/x86/include/asm/spec_ctrl.h 
b/xen/arch/x86/include/asm/spec_ctrl.h
index 751355f471f4..7e83e0179fb9 100644
--- a/xen/arch/x86/include/asm/spec_ctrl.h
+++ b/xen/arch/x86/include/asm/spec_ctrl.h
@@ -78,7 +78,8 @@ static always_inline void spec_ctrl_enter_idle(struct 
cpu_info *info)
 uint32_t val = 0;
 
 /*
- * Branch Target Injection:
+ * It is recommended in some cases to clear MSR_SPEC_CTRL when going idle,
+ * to avoid impacting sibling threads.
  *
  * Latch the new shadow value, then enable shadowing, then update the MSR.
  * There are no SMP issues here; only local processor ordering concerns.
@@ -114,7 +115,7 @@ static always_inline void spec_ctrl_exit_idle(struct 
cpu_info *info)
 uint32_t val = info->xen_spec_ctrl;
 
 /*
- * Branch Target Injection:
+ * Restore MSR_SPEC_CTRL on exit from idle.
  *
  * Disable shadowing before updating the MSR.  There are no SMP issues
  * here; only local processor ordering concerns.
diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c
index 1f275ad1fb5d..57f4fcb21398 100644
--- a/xen/arch/x86/spec_ctrl.c
+++ b/xen/arch/x86/spec_ctrl.c
@@ -1151,8 +1151,14 @@ void __init init_speculation_mitigations(void)
 /* (Re)init BSP state now that default_spec_ctrl_flags has been 
calculated. */
 init_shadow_spec_ctrl_state();
 
-/* If Xen is using any MSR_SPEC_CTRL settings, adjust the idle path. */
-if ( default_xen_spec_ctrl )
+/*
+ * For microcoded IBRS only (i.e. Intel, pre eIBRS), it is recommended to
+ * clear MSR_SPEC_CTRL before going idle, to avoid impacting sibling
+ * threads.  Activate this if SMT is enabled, and Xen is using a non-zero
+ * MSR_SPEC_CTRL setting.
+ */
+if ( boot_cpu_has(X86_FEATURE_IBRSB) && !(caps & ARCH_CAPS_IBRS_ALL) &&
+ hw_smt_enabled && default_xen_spec_ctrl )
 setup_force_cpu_cap(X86_FEATURE_SC_MSR_IDLE);
 
 xpti_init_default(caps);
-- 
2.11.0




[PATCH v2 2/2] x86/spec-ctrl: Knobs for STIBP and PSFD, and follow hardware STIBP hint

2022-06-29 Thread Andrew Cooper
STIBP and PSFD are slightly weird bits, because they're both implied by other
bits in MSR_SPEC_CTRL.  Add fine grain controls for them, and take the
implications into account when setting IBRS/SSBD.

Rearrange the IBPB text/variables/logic to keep all the MSR_SPEC_CTRL bits
together, for consistency.

However, AMD have a hardware hint CPUID bit recommending that STIBP be set
unilaterally.  This is advertised on Zen3, so follow the recommendation.
Furthermore, in such cases, set STIBP behind the guest's back for now.  This
has negligible overhead for the guest, but saves a WRMSR on vmentry.  This is
the only default change.

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 

v2:
 * Tweak comments/logic per suggestion.
 * Also set STIBP behind the guest's back to improve the vmentry path.
---
 docs/misc/xen-command-line.pandoc | 21 ++---
 xen/arch/x86/hvm/svm/vmcb.c   |  9 ++
 xen/arch/x86/spec_ctrl.c  | 65 +--
 3 files changed, 81 insertions(+), 14 deletions(-)

diff --git a/docs/misc/xen-command-line.pandoc 
b/docs/misc/xen-command-line.pandoc
index a92b7d228cae..da18172e50c5 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -2258,8 +2258,9 @@ By default SSBD will be mitigated at runtime (i.e 
`ssbd=runtime`).
 
 ### spec-ctrl (x86)
 > `= List of [ , xen=, {pv,hvm,msr-sc,rsb,md-clear}=,
->  bti-thunk=retpoline|lfence|jmp, {ibrs,ibpb,ssbd,eager-fpu,
->  l1d-flush,branch-harden,srb-lock,unpriv-mmio}= ]`
+>  bti-thunk=retpoline|lfence|jmp, {ibrs,ibpb,ssbd,psfd,
+>  eager-fpu,l1d-flush,branch-harden,srb-lock,
+>  unpriv-mmio}= ]`
 
 Controls for speculative execution sidechannel mitigations.  By default, Xen
 will pick the most appropriate mitigations based on compiled in support,
@@ -2309,9 +2310,10 @@ On hardware supporting IBRS (Indirect Branch Restricted 
Speculation), the
 If Xen is not using IBRS itself, functionality is still set up so IBRS can be
 virtualised for guests.
 
-On hardware supporting IBPB (Indirect Branch Prediction Barrier), the `ibpb=`
-option can be used to force (the default) or prevent Xen from issuing branch
-prediction barriers on vcpu context switches.
+On hardware supporting STIBP (Single Thread Indirect Branch Predictors), the
+`stibp=` option can be used to force or prevent Xen using the feature itself.
+By default, Xen will use STIBP when IBRS is in use (IBRS implies STIBP), and
+when hardware hints recommend using it as a blanket setting.
 
 On hardware supporting SSBD (Speculative Store Bypass Disable), the `ssbd=`
 option can be used to force or prevent Xen using the feature itself.  On AMD
@@ -2319,6 +2321,15 @@ hardware, this is a global option applied at boot, and 
not virtualised for
 guest use.  On Intel hardware, the feature is virtualised for guests,
 independently of Xen's choice of setting.
 
+On hardware supporting PSFD (Predictive Store Forwarding Disable), the `psfd=`
+option can be used to force or prevent Xen using the feature itself.  By
+default, Xen will not use PSFD.  PSFD is implied by SSBD, and SSBD is off by
+default.
+
+On hardware supporting IBPB (Indirect Branch Prediction Barrier), the `ibpb=`
+option can be used to force (the default) or prevent Xen from issuing branch
+prediction barriers on vcpu context switches.
+
 On all hardware, the `eager-fpu=` option can be used to force or prevent Xen
 from using fully eager FPU context switches.  This is currently implemented as
 a global control.  By default, Xen will choose to use fully eager context
diff --git a/xen/arch/x86/hvm/svm/vmcb.c b/xen/arch/x86/hvm/svm/vmcb.c
index 958309657799..0fc57dfd71cf 100644
--- a/xen/arch/x86/hvm/svm/vmcb.c
+++ b/xen/arch/x86/hvm/svm/vmcb.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 
 struct vmcb_struct *alloc_vmcb(void)
 {
@@ -176,6 +177,14 @@ static int construct_vmcb(struct vcpu *v)
 vmcb->_pause_filter_thresh = SVM_PAUSETHRESH_INIT;
 }
 
+/*
+ * When default_xen_spec_ctrl simply SPEC_CTRL_STIBP, default this behind
+ * the back of the VM too.  Our SMT topology isn't accurate, the overhead
+ * is neglegable, and doing this saves a WRMSR on the vmentry path.
+ */
+if ( default_xen_spec_ctrl == SPEC_CTRL_STIBP )
+v->arch.msrs->spec_ctrl.raw = SPEC_CTRL_STIBP;
+
 return 0;
 }
 
diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c
index 57f4fcb21398..efed24933d91 100644
--- a/xen/arch/x86/spec_ctrl.c
+++ b/xen/arch/x86/spec_ctrl.c
@@ -48,9 +48,13 @@ static enum ind_thunk {
 THUNK_LFENCE,
 THUNK_JMP,
 } opt_thunk __initdata = THUNK_DEFAULT;
+
 static int8_t __initdata opt_ibrs = -1;
+int8_t __initdata opt_stibp = -1;
+bool __read_mostly opt_ssbd;
+int8_t __initdata opt_psfd = -1;
+
 bool __read_mostly opt_ibpb = true;
-bool __read_mostly opt_ssbd = false;
 int8_t 

[PATCH v2 0/2] x86/spec-ctrl: Minor fixes

2022-06-29 Thread Andrew Cooper
Patch 2 has been posted before, but ages ago and has been rebased over
subseqent XSAs.  Patch 1 is new.

Andrew Cooper (2):
  x86/spec-ctrl: Only adjust MSR_SPEC_CTRL for idle with legacy IBRS
  x86/spec-ctrl: Knobs for STIBP and PSFD, and follow hardware STIBP hint

 docs/misc/xen-command-line.pandoc  | 21 +++---
 xen/arch/x86/hvm/svm/vmcb.c|  9 
 xen/arch/x86/include/asm/cpufeatures.h |  2 +-
 xen/arch/x86/include/asm/spec_ctrl.h   |  5 ++-
 xen/arch/x86/spec_ctrl.c   | 75 +-
 5 files changed, 93 insertions(+), 19 deletions(-)

-- 
2.11.0




Re: R: R: Crash when using xencov

2022-06-29 Thread Julien Grall

(moving the discussion to xen-devel)

On 28/06/2022 16:32, Carmine Cesarano wrote:

Hi,


Hello,

Please refrain to top-post and/or post images on the ML. If you need to 
share an image, then please upload them somewhere else.



I made two further attempts, first by compiling xen and xen-tools with gcc 10 
and second with gcc 7, getting the same problem.

By running “xencov reset”, with with both compilers, the line of code 
associated with the crash is:


Discussing with Andrew Cooper on IRC, it looks like the problem is 
because Xen and GCC disagrees on the format. There are newer format that 
Xen doesn't understand.


If you are interested to support GCOV on your setup, then I would 
suggest to look at the documentation and/or look at what Linux is doing 
for newer compiler.




   *   /xen/xen/common/coverage/gcc_4_7.c:123
By running “xencov read”, I get two different behaviors with the two compilers:

   *   /xen/xen/common/coverage/gcc_4_7.c:165   [GCC 11]
   *   /xen/xen/common/coverage/gcov.c:131  [GCC 7]

Attached are the logs captured with a serial port.

Cheers,

Carmine Cesarano
Da: Julien Grall
Inviato: lunedì 27 giugno 2022 14:42
A: Carmine Cesarano
Oggetto: Re: R: Crash when using xencov

Hello,

You seemed to have removed xen-users from the CC list. Please keep it in
CC unless the discussion needs to private.

Also, please avoid top-posting.

On 27/06/2022 13:36, Carmine Cesarano wrote:

Yes, i mean stable-4.16. Below the logs after running "xencov reset". The situation for 
"xencov read" is similar.

(XEN) [ Xen-4.16.2-pre  x86_64  debug=y gcov=y  Not tainted ]
(XEN) CPU:0
(XEN) RIP:e008:[] gcov_info_reset+0x87/0xa9
(XEN) RFLAGS: 00010202   CONTEXT: hypervisor (d0v0)
(XEN) rax:    rbx: 82d04056bdc0   rcx: 000c000b
(XEN) rdx:    rsi: 0001   rdi: 82d04056bdc0
(XEN) rbp: 83023a7e7cb0   rsp: 83023a7e7c88   r8:  7fff
(XEN) r9:  deadbeefdeadf00d   r10:    r11: 
(XEN) r12: 0001   r13: 82d04056be28   r14: 
(XEN) r15: 82d04056bdc0   cr0: 80050033   cr4: 00172620
(XEN) cr3: 00017ea0b000   cr2: 
(XEN) fsb: 7fc0fb0ca200   gsb: 88807b40   gss: 
(XEN) ds:    es:    fs:    gs:    ss: e010   cs: e008
(XEN) Xen code around  (gcov_info_reset+0x87/0xa9):
(XEN)  1d 44 89 f0 49 8b 57 70 <4c> 8b 24 c2 49 83 c4 18 48 83 05 a6 81 4c 00 01
(XEN) Xen stack trace from rsp=83023a7e7c88:
(XEN)82d04056bdc0 0001 82d04070f180 0001
(XEN) 83023a7e7cc8 82d040257a6a 83023a7e7db0
(XEN)83023a7e7ce8 82d040257547 83023a7e7fff 83023a7e7fff
(XEN)83023a7e7e58 82d040255d5f 83023a7e7d68 82d0403b5e8b
(XEN)0017d5b2  83023a6b5000 
(XEN)7fc0fb348010 80017ea0e127 0202 82d040399fd8
(XEN)5a40 83023a7e7d68 0206 82e002fab640
(XEN)83023a7e7e58 82d0403bb29d 83023a69f000 3a7e7fff
(XEN)00017ea0f067  0017d5b2 0017d5b2
(XEN)00140014 0002  
(XEN)   
(XEN)   
(XEN)   
(XEN) 83023a7e7ef8 0001 83023a69f000
(XEN)deadbeefdeadf00d 82d04025579d 83023a7e7ee8 82d040387f62
(XEN)7fc0fb348010 deadbeefdeadf00d deadbeefdeadf00d deadbeefdeadf00d
(XEN)deadbeefdeadf00d 83023a7e7fff 82d0403b2c99 83023a7e7eb8
(XEN)82d04038214c 83023a69f000 83023a7e7ed8 83023a69f000
(XEN)   
(XEN)7cfdc58180e7 82d0404392ae  88800f484c00
(XEN) Xen call trace:
(XEN)[] R gcov_info_reset+0x87/0xa9


Thanks! There are multiple versions of gcov_info_reset() in the tree.
The one used will depend on the compiler you are using.

Can you use addr2line (or gdb) to find out the file and line of code
associated with the crash?

For addr2line you could do:

addr2line -e xen-syms 0x82d040257bd2


Cheers,

--
Julien Grall



[linux-5.4 test] 171400: regressions - FAIL

2022-06-29 Thread osstest service owner
flight 171400 linux-5.4 real [real]
flight 171406 linux-5.4 real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/171400/
http://logs.test-lab.xenproject.org/osstest/logs/171406/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl-multivcpu 14 guest-start fail REGR. vs. 171352

Tests which are failing intermittently (not blocking):
 test-arm64-arm64-xl-thunderx  8 xen-bootfail pass in 171406-retest

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail blocked in 
171352
 test-arm64-arm64-xl-thunderx 15 migrate-support-check fail in 171406 never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail in 171406 never 
pass
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 171352
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 171352
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 171352
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 171352
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 171352
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 171352
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 171352
 test-armhf-armhf-xl-credit1  18 guest-start/debian.repeatfail  like 171352
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 171352
 test-armhf-armhf-xl-rtds 18 guest-start/debian.repeatfail  like 171352
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 171352
 test-armhf-armhf-xl-credit2  14 guest-start  fail  like 171352
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 171352
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 171352
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never 

Re: [XEN PATCH v3 25/25] tools: Remove -Werror everywhere else

2022-06-29 Thread Stefano Stabellini
On Wed, 29 Jun 2022, Luca Fancellu wrote:
> + CC: Stefano Stabellini
> 
> > On 24 Jun 2022, at 17:04, Anthony PERARD  wrote:
> > 
> > Patch "tools: Add -Werror by default to all tools/" have added
> > "-Werror" to CFLAGS in tools/Rules.mk, remove it from every other
> > makefiles as it is now duplicated.
> > 
> > Signed-off-by: Anthony PERARD 
> 
> Hi Anthony,
> 
> I will try to review the serie when I manage to have some time, in the mean 
> time I can say the whole
> serie builds fine in my Yocto setup on arm64 and x86_64, I’ve tried also the 
> tool stack to
> create/destroy/console guests and no problem so far.
> 
> The only problem I have is building for arm32 because, I think, this patch 
> does a great job and it
> discovers a problem here:

That reminds me that we only have arm32 Xen hypervisor builds in
gitlab-ci, we don't have any arm32 Xen tools builds. I'll add it to my
TODO but if someone (not necessarily Luca) has some spare time it could
be a nice project. It could be done with Yocto by adding a Yocto build
container to automation/build/.

Re: [PATCH] xen: arm: Don't use stop_cpu() in halt_this_cpu()

2022-06-29 Thread Stefano Stabellini
On Wed, 29 Jun 2022, Julien Grall wrote:
> On 28/06/2022 23:56, Stefano Stabellini wrote:
> > > The advantage of the panic() is it will remind us that some needs to be
> > > fixed.
> > > With a warning (or WARN()) people will tend to ignore it.
> > 
> > I know that this specific code path (cpu off) is probably not super
> > relevant for what I am about to say, but as we move closer to safety
> > certifiability we need to get away from using "panic" and BUG_ON as a
> > reminder that more work is needed to have a fully correct implementation
> > of something.
> 
> I don't think we have many places at runtime using BUG_ON()/panic(). They are
> often used because we think Xen would not be able to recover if the condition
> is hit.
> 
> I am happy to remove them, but this should not be at the expense to introduce
> other potential weird bugs.
> 
> > 
> > I also see your point and agree that ASSERT is not acceptable for
> > external input but from my point of view panic is the same (slightly
> > worse because it doesn't go away in production builds).
> 
> I think it depends on your target. Would you be happy if Xen continue to run
> with potentially a fatal flaw?

Actually, this is an excellent question. I don't know what is the
expected behavior from a safety perspective in case of serious errors.
How the error should be reported and whether continuing or not is
recommended. I'll try to find out more information.



Re: [PATCH v2 0/3] x86: fix brk area initialization

2022-06-29 Thread Boris Ostrovsky



On 6/29/22 10:10 AM, Juergen Gross wrote:

On 23.06.22 11:46, Juergen Gross wrote:

The brk area needs to be zeroed initially, like the .bss section.
At the same time its memory should be covered by the ELF program
headers.

Juergen Gross (3):
   x86/xen: use clear_bss() for Xen PV guests
   x86: fix setup of brk area
   x86: fix .brk attribute in linker script

  arch/x86/include/asm/setup.h  |  3 +++
  arch/x86/kernel/head64.c  |  4 +++-
  arch/x86/kernel/vmlinux.lds.S |  2 +-
  arch/x86/xen/enlighten_pv.c   |  8 ++--
  arch/x86/xen/xen-head.S   | 10 +-
  5 files changed, 14 insertions(+), 13 deletions(-)



Could I please have some feedback? This series is fixing a major
regression regarding running as Xen PV guest (depending on kernel
configuration system will crash very early).

#regzbot ^introduced e32683c6f7d2




I don't think you need this for Xen bits as Jan had already reviewed it but in 
case you do


Reviewed-by: Boris Ostrovsky 




Re: [PATCH v2 3/3] x86: fix .brk attribute in linker script

2022-06-29 Thread Josh Poimboeuf
On Thu, Jun 23, 2022 at 11:46:08AM +0200, Juergen Gross wrote:
> Commit e32683c6f7d2 ("x86/mm: Fix RESERVE_BRK() for older binutils")
> added the "NOLOAD" attribute to the .brk section as a "failsafe"
> measure.
> 
> Unfortunately this leads to the linker no longer covering the .brk
> section in a program header, resulting in the kernel loader not knowing
> that the memory for the .brk section must be reserved.
> 
> This has led to crashes when loading the kernel as PV dom0 under Xen,
> but other scenarios could be hit by the same problem (e.g. in case an
> uncompressed kernel is used and the initrd is placed directly behind
> it).
> 
> So drop the "NOLOAD" attribute. This has been verified to correctly
> cover the .brk section by a program header of the resulting ELF file.
> 
> Fixes: e32683c6f7d2 ("x86/mm: Fix RESERVE_BRK() for older binutils")
> Signed-off-by: Juergen Gross 

Reviewed-by: Josh Poimboeuf 

-- 
Josh



Re: [PATCH v2 2/3] x86: fix setup of brk area

2022-06-29 Thread Josh Poimboeuf
Hi Juergen,

It helps to actually Cc the person who broke it ;-)

On Thu, Jun 23, 2022 at 11:46:07AM +0200, Juergen Gross wrote:
> Commit e32683c6f7d2 ("x86/mm: Fix RESERVE_BRK() for older binutils")
> put the brk area into the .bss..brk section (placed directly behind
> .bss),

Hm? It didn't actually do that.

For individual translation units, it did rename the section from
".brk_reservation" to ".bss..brk".  But then during linking it's still
placed in .brk in vmlinux, just like before.

> causing it not to be cleared initially. As the brk area is used
> to allocate early page tables, these might contain garbage in not
> explicitly written entries.
> 
> This is especially a problem for Xen PV guests, as the hypervisor will
> validate page tables (check for writable page tables and hypervisor
> private bits) before accepting them to be used. There have been reports
> of early crashes of PV guests due to illegal page table contents.
> 
> Fix that by letting clear_bss() clear the brk area, too.

While it does make sense to clear the brk area, I don't understand how
my patch broke this.  How was it getting cleared before?

-- 
Josh



[PATCH v5] xen/pass-through: merge emulated bits correctly

2022-06-29 Thread Chuck Zmudzinski
In xen_pt_config_reg_init(), there is an error in the merging of the
emulated data with the host value. With the current Qemu, instead of
merging the emulated bits with the host bits as defined by emu_mask,
the emulated bits are merged with the host bits as defined by the
inverse of emu_mask. In some cases, depending on the data in the
registers on the host, the way the registers are setup, and the
initial values of the emulated bits, the end result will be that
the register is initialized with the wrong value.

To correct this error, use the XEN_PT_MERGE_VALUE macro to help ensure
the merge is done correctly.

This correction is needed to resolve Qemu project issue #1061, which
describes the failure of Xen HVM Linux guests to boot in certain
configurations with passed through PCI devices, that is, when this error
disables instead of enables the PCI_STATUS_CAP_LIST bit of the
PCI_STATUS register of a passed through PCI device, which in turn
disables the MSI-X capability of the device in Linux guests with the end
result being that the Linux guest never completes the boot process.

Fixes: 2e87512eccf3 ("xen/pt: Sync up the dev.config and data values")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1061
Buglink: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=988333

Signed-off-by: Chuck Zmudzinski 
Reviewed-by: Anthony PERARD 
---
v2: Edit the commit message to more accurately describe the cause
of the error.

v3: * Add Reviewed-By: Anthony Perard 
* Add qemu-sta...@nongnu.org to recipients to indicate the patch
  may be suitable for backport to Qemu stable

v4: * Add Fixed commit subject to Fixes: 2e87512eccf3

Sorry for the extra noise with v4 (I thought the Fixed commit subject
would be automatically added).

v5: * Coding style fix: move block comment leading /* and trailing */
  to separate lines

Again, sorry for the noise, but the style of the comment was wrong
before v5.

Thank you, Anthony, again, for taking the time to review this patch.

 hw/xen/xen_pt_config_init.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
index cad4aeba84..4758514ddf 100644
--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -1965,11 +1965,12 @@ static void 
xen_pt_config_reg_init(XenPCIPassthroughState *s,
 
 if ((data & host_mask) != (val & host_mask)) {
 uint32_t new_val;
-
-/* Mask out host (including past size). */
-new_val = val & host_mask;
-/* Merge emulated ones (excluding the non-emulated ones). */
-new_val |= data & host_mask;
+/*
+ * Merge the emulated bits (data) with the host bits (val)
+ * and mask out the bits past size to enable restoration
+ * of the proper value for logging below.
+ */
+new_val = XEN_PT_MERGE_VALUE(val, data, host_mask) & size_mask;
 /* Leave intact host and emulated values past the size - even 
though
  * we do not care as we write per reg->size granularity, but for 
the
  * logging below lets have the proper value. */
-- 
2.36.1




Re: [PATCH 2/2] uboot-script-gen: do not enable direct mapping by default

2022-06-29 Thread xenia

Hi Stefano,

On 6/29/22 03:28, Stefano Stabellini wrote:

On Sun, 26 Jun 2022, Xenia Ragiadakou wrote:

To be inline with XEN, do not enable direct mapping automatically for all
statically allocated domains.

Signed-off-by: Xenia Ragiadakou 

Actually I don't know about this one. I think it is OK that ImageBuilder
defaults are different from Xen defaults. This is a case where I think
it would be good to enable DOMU_DIRECT_MAP by default when
DOMU_STATIC_MEM is specified.
Just realized that I forgot to add [ImageBuilder] tag to the patches. 
Sorry about that.


I cc Ayan, since the change was suggested by him.
I have no strong preference on the default value.

Xenia


---
  README.md| 4 ++--
  scripts/uboot-script-gen | 8 ++--
  2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/README.md b/README.md
index cb15ca5..03e437b 100644
--- a/README.md
+++ b/README.md
@@ -169,8 +169,8 @@ Where:
if specified, indicates the host physical address regions
[baseaddr, baseaddr + size) to be reserved to the VM for static allocation.
  
-- DOMU_DIRECT_MAP[number] can be set to 1 or 0.

-  If set to 1, the VM is direct mapped. The default is 1.
+- DOMU_DIRECT_MAP[number] if set to 1, enables direct mapping.
+  By default, direct mapping is disabled.
This is only applicable when DOMU_STATIC_MEM is specified.
  
  - LINUX is optional but specifies the Linux kernel for when Xen is NOT

diff --git a/scripts/uboot-script-gen b/scripts/uboot-script-gen
index 085e29f..66ce6f7 100755
--- a/scripts/uboot-script-gen
+++ b/scripts/uboot-script-gen
@@ -52,7 +52,7 @@ function dt_set()
  echo "fdt set $path $var $array" >> $UBOOT_SOURCE
  elif test $data_type = "bool"
  then
-if test "$data" -eq 1
+if test "$data" == "1"
  then
  echo "fdt set $path $var" >> $UBOOT_SOURCE
  fi
@@ -74,7 +74,7 @@ function dt_set()
  fdtput $FDTEDIT -p -t s $path $var $data
  elif test $data_type = "bool"
  then
-if test "$data" -eq 1
+if test "$data" == "1"
  then
  fdtput $FDTEDIT -p $path $var
  fi
@@ -491,10 +491,6 @@ function xen_config()
  then
  DOMU_CMD[$i]="console=ttyAMA0"
  fi
-if test -z "${DOMU_DIRECT_MAP[$i]}"
-then
- DOMU_DIRECT_MAP[$i]=1
-fi
  i=$(( $i + 1 ))
  done
  }
--
2.34.1





Re: [PATCH 2/2] console/serial: bump buffer from 16K to 32K

2022-06-29 Thread Jan Beulich
On 29.06.2022 18:19, Roger Pau Monné wrote:
> On Wed, Jun 29, 2022 at 06:03:34PM +0200, Jan Beulich wrote:
>> On 29.06.2022 17:23, Roger Pau Monné wrote:
>>> On Thu, Jun 23, 2022 at 03:32:30PM +0200, Jan Beulich wrote:
 On 23.06.2022 11:08, Roger Pau Monne wrote:
> Testing on a Kaby Lake box with 8 CPUs leads to the serial buffer
> being filled halfway during dom0 boot, and thus a non-trivial chunk of
> Linux boot messages are dropped.
>
> Increasing the buffer to 32K does fix the issue and Linux boot
> messages are no longer dropped.  There's no justification either on
> why 16K was chosen, and hence bumping to 32K in order to cope with
> current systems generating output faster does seem appropriate to have
> a better user experience with the provided defaults.

 Just to record what was part of an earlier discussion: I'm not going
 to nak such a change, but I think the justification is insufficient:
 On this same basis someone else could come a few days later and bump
 to 64k, then 128k, etc.
>>>
>>> Indeed, and that would be fine IMO.  We should aim to provide defaults
>>> that work fine for most situations, and here I don't see what drawback
>>> it has to increase the default buffer size from 16kiB to 32kiB, and
>>> I would be fine with increasing to 128kiB if that's required for some
>>> use case, albeit I have a hard time seeing how we could fill that
>>> buffer.
>>>
>>> If I can ask, what kind of justification you would see fit for
>>> granting an increase to the default buffer size?
>>
>> Making plausible that for a majority of contemporary systems the buffer
>> is not large enough would be one aspect. But then there's imo always
>> going to be an issue: What if non-Linux Dom0 would be far more chatty?
>> What if Linux, down the road, was made less verbose (by default)? What
>> if people expect large enough a buffer to also suffice when Linux runs
>> in e.g. ignore_loglevel mode? We simply can't fit all use cases and at
>> the same time also not go overboard with the default size. That's why
>> there's a way to control this via command line option.
> 
> Maybe I should clarify that the current buffer size is not enough on
> this system with the default Linux log level. I think we can expect
> someone that changes the default Linux log level to also consider
> changing the Xen buffer size.  OTOH when using the default Linux log
> level the default Xen serial buffer should be enough.
> 
> I haven't tested with FreeBSD on that system, other systems I have
> seem to be fine when booting FreeBSD with the default Xen serial
> buffer size.
> 
> I think we should exercise caution if someone proposed to increase to
> 1M for example, but I don't see why so much controversy for going from
> 16K to 32K, it's IMO a reasonable value and has proven to prevent
> serial log loss when using the default Linux log level.

But see - that's exactly my point. Where do you draw the line between
"we should accept" and "exercise caution"? Is it 256k? Or 512k?

Certainly I'm also aware of the common "memory is cheap" attitude,
but that doesn't make me like it. That's both because of having
started with 64k total (or maybe even less; too long ago meanwhile),
and because of my general observation that everything only ever
(fast) growing makes many things slower than they would need to be.

As to controversy - I did make clear before that I don't mean to nak
the change. But given my views, you'll need to find someone else to
ack it despite being aware of my opinion.

Jan



Re: [PATCH] x86/ept: fix shattering of special pages

2022-06-29 Thread Jan Beulich
On 27.06.2022 12:01, Roger Pau Monne wrote:
> The current logic in epte_get_entry_emt() will split any page marked
> as special with order greater than zero, without checking whether the
> super page is all special.
> 
> Fix this by only splitting the page only if it's not all marked as
> special, in order to prevent unneeded super page shuttering.
> 
> Fixes: ca24b2ffdb ('x86/hvm: set 'ipat' in EPT for special pages')
> Signed-off-by: Roger Pau Monné 
> ---
> Cc: Paul Durrant 
> ---
> It would seem weird to me to have a super page entry in EPT with
> ranges marked as special and not the full page.  I guess it's better
> to be safe, but I don't see an scenario where we could end up in that
> situation.
> 
> I've been trying to find a clarification in the original patch
> submission about how it's possible to have such super page EPT entry,
> but haven't been able to find any justification.

I think the loop there was added "just in case", whereas in reality
it was only single special pages that would ever be mapped. So ...

> Might be nice to expand the commit message as to why it's possible to
> have such mixed super page entries that would need splitting.

... there may not be anything to add. I don't mind this being re-done,
hence
Reviewed-by: Jan Beulich 
Yet I'm not sure I understand what case this actually fixes (and
hence why you did add a Fixes: tag) - are there cases where non-
order-0 special pages are mapped in reality?

As to the Fixes: tag - I think we mean to follow Linux there and
hence want 12-digit hashes to be specified. See also
docs/process/sending-patches.pandoc. (But no need to resend just
for this.)

Jan



Re: [PATCH 2/2] console/serial: bump buffer from 16K to 32K

2022-06-29 Thread Roger Pau Monné
On Wed, Jun 29, 2022 at 06:03:34PM +0200, Jan Beulich wrote:
> On 29.06.2022 17:23, Roger Pau Monné wrote:
> > On Thu, Jun 23, 2022 at 03:32:30PM +0200, Jan Beulich wrote:
> >> On 23.06.2022 11:08, Roger Pau Monne wrote:
> >>> Testing on a Kaby Lake box with 8 CPUs leads to the serial buffer
> >>> being filled halfway during dom0 boot, and thus a non-trivial chunk of
> >>> Linux boot messages are dropped.
> >>>
> >>> Increasing the buffer to 32K does fix the issue and Linux boot
> >>> messages are no longer dropped.  There's no justification either on
> >>> why 16K was chosen, and hence bumping to 32K in order to cope with
> >>> current systems generating output faster does seem appropriate to have
> >>> a better user experience with the provided defaults.
> >>
> >> Just to record what was part of an earlier discussion: I'm not going
> >> to nak such a change, but I think the justification is insufficient:
> >> On this same basis someone else could come a few days later and bump
> >> to 64k, then 128k, etc.
> > 
> > Indeed, and that would be fine IMO.  We should aim to provide defaults
> > that work fine for most situations, and here I don't see what drawback
> > it has to increase the default buffer size from 16kiB to 32kiB, and
> > I would be fine with increasing to 128kiB if that's required for some
> > use case, albeit I have a hard time seeing how we could fill that
> > buffer.
> > 
> > If I can ask, what kind of justification you would see fit for
> > granting an increase to the default buffer size?
> 
> Making plausible that for a majority of contemporary systems the buffer
> is not large enough would be one aspect. But then there's imo always
> going to be an issue: What if non-Linux Dom0 would be far more chatty?
> What if Linux, down the road, was made less verbose (by default)? What
> if people expect large enough a buffer to also suffice when Linux runs
> in e.g. ignore_loglevel mode? We simply can't fit all use cases and at
> the same time also not go overboard with the default size. That's why
> there's a way to control this via command line option.

Maybe I should clarify that the current buffer size is not enough on
this system with the default Linux log level. I think we can expect
someone that changes the default Linux log level to also consider
changing the Xen buffer size.  OTOH when using the default Linux log
level the default Xen serial buffer should be enough.

I haven't tested with FreeBSD on that system, other systems I have
seem to be fine when booting FreeBSD with the default Xen serial
buffer size.

I think we should exercise caution if someone proposed to increase to
1M for example, but I don't see why so much controversy for going from
16K to 32K, it's IMO a reasonable value and has proven to prevent
serial log loss when using the default Linux log level.

Thanks, Roger.



Re: [PATCH 2/2] console/serial: bump buffer from 16K to 32K

2022-06-29 Thread Jan Beulich
On 29.06.2022 17:23, Roger Pau Monné wrote:
> On Thu, Jun 23, 2022 at 03:32:30PM +0200, Jan Beulich wrote:
>> On 23.06.2022 11:08, Roger Pau Monne wrote:
>>> Testing on a Kaby Lake box with 8 CPUs leads to the serial buffer
>>> being filled halfway during dom0 boot, and thus a non-trivial chunk of
>>> Linux boot messages are dropped.
>>>
>>> Increasing the buffer to 32K does fix the issue and Linux boot
>>> messages are no longer dropped.  There's no justification either on
>>> why 16K was chosen, and hence bumping to 32K in order to cope with
>>> current systems generating output faster does seem appropriate to have
>>> a better user experience with the provided defaults.
>>
>> Just to record what was part of an earlier discussion: I'm not going
>> to nak such a change, but I think the justification is insufficient:
>> On this same basis someone else could come a few days later and bump
>> to 64k, then 128k, etc.
> 
> Indeed, and that would be fine IMO.  We should aim to provide defaults
> that work fine for most situations, and here I don't see what drawback
> it has to increase the default buffer size from 16kiB to 32kiB, and
> I would be fine with increasing to 128kiB if that's required for some
> use case, albeit I have a hard time seeing how we could fill that
> buffer.
> 
> If I can ask, what kind of justification you would see fit for
> granting an increase to the default buffer size?

Making plausible that for a majority of contemporary systems the buffer
is not large enough would be one aspect. But then there's imo always
going to be an issue: What if non-Linux Dom0 would be far more chatty?
What if Linux, down the road, was made less verbose (by default)? What
if people expect large enough a buffer to also suffice when Linux runs
in e.g. ignore_loglevel mode? We simply can't fit all use cases and at
the same time also not go overboard with the default size. That's why
there's a way to control this via command line option.

Jan



Re: [PATCH RESEND v10 1/2] xen+tools: Report Interrupt Controller Virtualization capabilities on x86

2022-06-29 Thread Jane Malalane
On 29/06/2022 15:49, Christian Lindig wrote:
> 
> 
> On 29 Jun 2022, at 14:55, Jane Malalane 
> mailto:jane.malal...@citrix.com>> wrote:
> 
> + physinfo = caml_alloc_tuple(11);
> Store_field(physinfo, 0, Val_int(c_physinfo.threads_per_core));
> Store_field(physinfo, 1, Val_int(c_physinfo.cores_per_socket));
> Store_field(physinfo, 2, Val_int(c_physinfo.nr_cpus));
> @@ -749,6 +749,17 @@ CAMLprim value stub_xc_physinfo(value xch)
> Store_field(physinfo, 8, cap_list);
> Store_field(physinfo, 9, Val_int(c_physinfo.max_cpu_id + 1));
> 
> +#if defined(__i386__) || defined(__x86_64__)
> + /*
> +  * arch_capabilities: physinfo_arch_cap_flag list;
> +  */
> + arch_cap_list = c_bitmap_to_ocaml_list
> + /* ! physinfo_arch_cap_flag CAP_ none */
> + /* ! XEN_SYSCTL_PHYSCAP_ XEN_SYSCTL_PHYSCAP_X86_MAX max */
> + (c_physinfo.arch_capabilities);
> + Store_field(physinfo, 10, arch_cap_list);
> +#endif
> +
> CAMLreturn(physinfo);
> }
> 
> I this extending the tuple but only defining a value on x86? Does this not 
> lead to undefined fields on other architectures?

You're right, it's missing a definition, I will send a new version - 
will just give some time for more eventual comments from others on the 
series overall.

Thank you,

Jane

Re: [PATCH 2/2] console/serial: bump buffer from 16K to 32K

2022-06-29 Thread Roger Pau Monné
On Thu, Jun 23, 2022 at 03:32:30PM +0200, Jan Beulich wrote:
> On 23.06.2022 11:08, Roger Pau Monne wrote:
> > Testing on a Kaby Lake box with 8 CPUs leads to the serial buffer
> > being filled halfway during dom0 boot, and thus a non-trivial chunk of
> > Linux boot messages are dropped.
> > 
> > Increasing the buffer to 32K does fix the issue and Linux boot
> > messages are no longer dropped.  There's no justification either on
> > why 16K was chosen, and hence bumping to 32K in order to cope with
> > current systems generating output faster does seem appropriate to have
> > a better user experience with the provided defaults.
> 
> Just to record what was part of an earlier discussion: I'm not going
> to nak such a change, but I think the justification is insufficient:
> On this same basis someone else could come a few days later and bump
> to 64k, then 128k, etc.

Indeed, and that would be fine IMO.  We should aim to provide defaults
that work fine for most situations, and here I don't see what drawback
it has to increase the default buffer size from 16kiB to 32kiB, and
I would be fine with increasing to 128kiB if that's required for some
use case, albeit I have a hard time seeing how we could fill that
buffer.

If I can ask, what kind of justification you would see fit for
granting an increase to the default buffer size?

Thanks, Roger.



Re: [PATCH RESEND v10 1/2] xen+tools: Report Interrupt Controller Virtualization capabilities on x86

2022-06-29 Thread Jane Malalane
On 29/06/2022 15:26, Jan Beulich wrote:
> On 29.06.2022 15:55, Jane Malalane wrote:
>> Add XEN_SYSCTL_PHYSCAP_X86_ASSISTED_XAPIC and
>> XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X2APIC to report accelerated xAPIC and
>> x2APIC, on x86 hardware. This is so that xAPIC and x2APIC virtualization
>> can subsequently be enabled on a per-domain basis.
>> No such features are currently implemented on AMD hardware.
>>
>> HW assisted xAPIC virtualization will be reported if HW, at the
>> minimum, supports virtualize_apic_accesses as this feature alone means
>> that an access to the APIC page will cause an APIC-access VM exit. An
>> APIC-access VM exit provides a VMM with information about the access
>> causing the VM exit, unlike a regular EPT fault, thus simplifying some
>> internal handling.
>>
>> HW assisted x2APIC virtualization will be reported if HW supports
>> virtualize_x2apic_mode and, at least, either apic_reg_virt or
>> virtual_intr_delivery. This also means that
>> sysctl follows the conditionals in vmx_vlapic_msr_changed().
>>
>> For that purpose, also add an arch-specific "capabilities" parameter
>> to struct xen_sysctl_physinfo.
>>
>> Note that this interface is intended to be compatible with AMD so that
>> AVIC support can be introduced in a future patch. Unlike Intel that
>> has multiple controls for APIC Virtualization, AMD has one global
>> 'AVIC Enable' control bit, so fine-graining of APIC virtualization
>> control cannot be done on a common interface.
>>
>> Suggested-by: Andrew Cooper 
>> Signed-off-by: Jane Malalane 
>> Reviewed-by: "Roger Pau Monné" 
>> Reviewed-by: Jan Beulich 
>> Reviewed-by: Anthony PERARD 
> 
> Could you please clarify whether you did drop Kevin's R-b (which, a
> little unhelpfully, he provided in reply to v9 a week after you had
> posted v10) because of ...
> 
>> v10:
>>   * Make assisted_x{2}apic_available conditional upon _vmx_cpu_up()
> 
> ... this, requiring him to re-offer the tag? Until told otherwise I
> will assume so.

It wasn't intentional but yes, that is right. There was a change, albeit 
minor, in vmx from v9 to v10 so I do require Kevin Tian or Jun Nakajima 
to review it.

Thank you,

Jane.<>

Re: [PATCH] xen/arm: smmu-v3: Fix MISRA C 2012 Rule 1.3 violations

2022-06-29 Thread Julien Grall

On 29/06/2022 15:10, Bertrand Marquis wrote:

Hi,


Hi Bertrand,


In fact the patch was committed before we started this discussion as Rahul R-b 
was enough.


It was probably merged a bit too fast. When there are multiple 
maintainers responsible for the code, I tend to prefer to wait a bit 
just in case there are extra comments.



But I would still be interested to have other maintainers view on this.


Rahul and you are the official maintainers for that code. I think 
Stefano and I are only CCed because we maintain the Arm code 
(get_maintainers.pl doesn't automatically remove maintainers from upper 
directory).





On 29 Jun 2022, at 10:03, Bertrand Marquis  wrote:

Hi Xenia,


On 29 Jun 2022, at 09:55, xenia  wrote:

Hi Bertrand,

On 6/29/22 10:24, Bertrand Marquis wrote:

Hi Xenia,


On 28 Jun 2022, at 16:08, Xenia Ragiadakou  wrote:

The expression 1 << 31 produces undefined behaviour because the type of integer
constant 1 is (signed) int and the result of shifting 1 by 31 bits is not
representable in the (signed) int type.
Change the type of 1 to unsigned int by adding the U suffix.

Signed-off-by: Xenia Ragiadakou 
---
Q_OVERFLOW_FLAG has already been fixed in upstream kernel code.
For GBPA_UPDATE I will submit a patch.

xen/drivers/passthrough/arm/smmu-v3.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/drivers/passthrough/arm/smmu-v3.c 
b/xen/drivers/passthrough/arm/smmu-v3.c
index 1e857f915a..f2562acc38 100644
--- a/xen/drivers/passthrough/arm/smmu-v3.c
+++ b/xen/drivers/passthrough/arm/smmu-v3.c
@@ -338,7 +338,7 @@ static int platform_get_irq_byname_optional(struct device 
*dev,
#define CR2_E2H (1 << 0)

#define ARM_SMMU_GBPA   0x44
-#define GBPA_UPDATE(1 << 31)
+#define GBPA_UPDATE(1U << 31)
#define GBPA_ABORT  (1 << 20)

#define ARM_SMMU_IRQ_CTRL   0x50
@@ -410,7 +410,7 @@ static int platform_get_irq_byname_optional(struct device 
*dev,

#define Q_IDX(llq, p)   ((p) & ((1 << (llq)->max_n_shift) - 1))
#define Q_WRP(llq, p)   ((p) & (1 << (llq)->max_n_shift))

Could also make sense to fix those 2 to be coherent.

According to the spec, the maximum value that max_n_shift can take is 19.
Hence, 1 << (llq)->max_n_shift cannot produce undefined behavior.


I agree with that but my preference would be to not rely on deep analysis on 
the code for those kind of cases and use 1U whenever possible.

What other maintainers think on this ?


In general, I prefer if we use 1U (or 1UL/1ULL) so it is clearer that 
the code is correct and consistent (I always find odd when we use 1U << 
31 but 1 << 20).


In this case, even if you use 1U, it is not really clear whether 
max_n_shift can be greater than 31. That said, I would not suggest to 
use ULL unless this is strictly necessary.






Personally, I have no problem to submit another patch that adds U/UL suffixes 
to all shifted integer constants in the file :) but ...
It seems that this driver has been ported from linux and this file still uses 
linux coding style, probably because deviations will reduce its maintainability.
Adding a U suffix to those two might be considered an unjustified deviation.


This can be solved by sending patch to Linux as well. This will also 
help Linux to reduce the number MISRA violations (I guess SMMUv3 will be 
part of the safety certification scope).




At this stage the SMMU code is starting to deviate a lot from Linux so it will 
not be possible to update it easily to sync with latest linux code.
And the decision should be are we fixing it or should we fully skip this file 
saying that it is coming from linux and should not be fixed.
We started to have a discussion during the FuSa meeting on this but we need to 
come up with a conclusion per file.

On this one I would say keeping it with linux code style and near from the 
linux one is not needed.


I will address both point separately.

In general, we don't want to mix coding style within a file. As the file 
started with the Linux one, then we should keep it like that. Otherwise, 
you will end up with a mix and it will be difficult for the contributor 
to know how to write new code. That said, I would not necessarily 
consider using "1 << ..." as part of the code style we want to keep.


This brings to the second part (i.e. keeping the code near Linux). Linux 
has a much larger user/developper base than us. Therefore there are more 
chance for them to find bugs and fix them. By diverging, then we could 
end up to add new bugs and also increase the maintainance.


I have tried both way with the SMMUv{1,2} driver. The first attempt was 
to fully adapt the code to Xen (coding style...). But this made 
difficult to keep track of bugs. A few months we removed it completely 
and then tried to keep as close as to Linux. Since then Linux reworked 
the IOMMU subsystem, so port needs to be adapted. It is

Re: [PATCH RESEND v10 1/2] xen+tools: Report Interrupt Controller Virtualization capabilities on x86

2022-06-29 Thread Christian Lindig


On 29 Jun 2022, at 14:55, Jane Malalane 
mailto:jane.malal...@citrix.com>> wrote:

+ physinfo = caml_alloc_tuple(11);
Store_field(physinfo, 0, Val_int(c_physinfo.threads_per_core));
Store_field(physinfo, 1, Val_int(c_physinfo.cores_per_socket));
Store_field(physinfo, 2, Val_int(c_physinfo.nr_cpus));
@@ -749,6 +749,17 @@ CAMLprim value stub_xc_physinfo(value xch)
Store_field(physinfo, 8, cap_list);
Store_field(physinfo, 9, Val_int(c_physinfo.max_cpu_id + 1));

+#if defined(__i386__) || defined(__x86_64__)
+ /*
+  * arch_capabilities: physinfo_arch_cap_flag list;
+  */
+ arch_cap_list = c_bitmap_to_ocaml_list
+ /* ! physinfo_arch_cap_flag CAP_ none */
+ /* ! XEN_SYSCTL_PHYSCAP_ XEN_SYSCTL_PHYSCAP_X86_MAX max */
+ (c_physinfo.arch_capabilities);
+ Store_field(physinfo, 10, arch_cap_list);
+#endif
+
CAMLreturn(physinfo);
}

I this extending the tuple but only defining a value on x86? Does this not lead 
to undefined fields on other architectures?

type physinfo = {
  threads_per_core : int;
  cores_per_socket : int;
@@ -124,6 +128,7 @@ type physinfo = {
  scrub_pages  : nativeint;
  capabilities : physinfo_cap_flag list;
  max_nr_cpus  : int; (** compile-time max possible number of nr_cpus *)
+  arch_capabilities : physinfo_arch_cap_flag list;
}
type version = { major : int; minor : int; extra : string

Here the record is extended but it looks to me like the new field is kept 
undefined on non x86 architectures. If the field still has a defined content, 
it would be good to explain why.

— C


Re: [PATCH v10 0/2] xen: Report and use hardware APIC virtualization capabilities

2022-06-29 Thread Jan Beulich
On 29.06.2022 13:09, Jan Beulich wrote:
> On 29.06.2022 12:43, Anthony PERARD wrote:
>> On Thu, Jun 23, 2022 at 09:23:27AM +0200, Jan Beulich wrote:
>>> On 13.04.2022 13:21, Jane Malalane wrote:
 Jane Malalane (2):
   xen+tools: Report Interrupt Controller Virtualization capabilities on
 x86
   x86/xen: Allow per-domain usage of hardware virtualized APIC

  docs/man/xl.cfg.5.pod.in  | 15 ++
  docs/man/xl.conf.5.pod.in | 12 +++
  tools/golang/xenlight/helpers.gen.go  | 16 ++
  tools/golang/xenlight/types.gen.go|  4 
  tools/include/libxl.h | 14 +
  tools/libs/light/libxl.c  |  3 +++
  tools/libs/light/libxl_arch.h |  9 ++--
  tools/libs/light/libxl_arm.c  | 14 ++---
  tools/libs/light/libxl_create.c   | 22 
  tools/libs/light/libxl_types.idl  |  4 
  tools/libs/light/libxl_x86.c  | 39 
 +--
  tools/ocaml/libs/xc/xenctrl.ml|  7 +++
  tools/ocaml/libs/xc/xenctrl.mli   |  7 +++
  tools/ocaml/libs/xc/xenctrl_stubs.c   | 17 ---
  tools/xl/xl.c |  8 +++
  tools/xl/xl.h |  2 ++
  tools/xl/xl_info.c|  6 --
  tools/xl/xl_parse.c   | 19 +
  xen/arch/x86/domain.c | 29 +-
  xen/arch/x86/hvm/hvm.c|  3 +++
  xen/arch/x86/hvm/vmx/vmcs.c   | 11 ++
  xen/arch/x86/hvm/vmx/vmx.c| 13 
  xen/arch/x86/include/asm/hvm/domain.h |  6 ++
  xen/arch/x86/include/asm/hvm/hvm.h| 10 +
  xen/arch/x86/sysctl.c |  4 
  xen/arch/x86/traps.c  |  5 +++--
  xen/include/public/arch-x86/xen.h |  5 +
  xen/include/public/sysctl.h   | 11 +-
  28 files changed, 281 insertions(+), 34 deletions(-)

>>>
>>> Just FYI: It's been over two months that v10 has been pending. There
>>> are still missing acks. You may want to ping the respective maintainers
>>> for this to make progress.
>>
>> Are you looking for a ack for the "docs/man" changes? If so, I guess
>> I'll have to make it more explicit next time that a review for "tools"
>> also mean review of the changes in their respective man pages.
> 
> No, the docs changes (being clearly tools docs) are fine.
> 
>> Or are you looking for a ack for the "golang" changes? Those changes are
>> automatically generated by a tool already in our repository.
> 
> Indeed it's Go (where I think an ack is still required, no matter
> if the changes are generated ones [which I wasn't even aware of, I
> have to confess]) and ...
> 
>> Or is it an "ocaml" ack for the first patch? Unfortunately, the
>> maintainers haven't been CCed, I guess that could be an issue.
> 
> ... OCaml which I was after.

Oh and actually for at least patch 2 also VMX. For patch 1 I've sent a
separate reply to the resent v10.

Jan



Re: [PATCH RESEND v10 1/2] xen+tools: Report Interrupt Controller Virtualization capabilities on x86

2022-06-29 Thread Jan Beulich
On 29.06.2022 15:55, Jane Malalane wrote:
> Add XEN_SYSCTL_PHYSCAP_X86_ASSISTED_XAPIC and
> XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X2APIC to report accelerated xAPIC and
> x2APIC, on x86 hardware. This is so that xAPIC and x2APIC virtualization
> can subsequently be enabled on a per-domain basis.
> No such features are currently implemented on AMD hardware.
> 
> HW assisted xAPIC virtualization will be reported if HW, at the
> minimum, supports virtualize_apic_accesses as this feature alone means
> that an access to the APIC page will cause an APIC-access VM exit. An
> APIC-access VM exit provides a VMM with information about the access
> causing the VM exit, unlike a regular EPT fault, thus simplifying some
> internal handling.
> 
> HW assisted x2APIC virtualization will be reported if HW supports
> virtualize_x2apic_mode and, at least, either apic_reg_virt or
> virtual_intr_delivery. This also means that
> sysctl follows the conditionals in vmx_vlapic_msr_changed().
> 
> For that purpose, also add an arch-specific "capabilities" parameter
> to struct xen_sysctl_physinfo.
> 
> Note that this interface is intended to be compatible with AMD so that
> AVIC support can be introduced in a future patch. Unlike Intel that
> has multiple controls for APIC Virtualization, AMD has one global
> 'AVIC Enable' control bit, so fine-graining of APIC virtualization
> control cannot be done on a common interface.
> 
> Suggested-by: Andrew Cooper 
> Signed-off-by: Jane Malalane 
> Reviewed-by: "Roger Pau Monné" 
> Reviewed-by: Jan Beulich 
> Reviewed-by: Anthony PERARD 

Could you please clarify whether you did drop Kevin's R-b (which, a
little unhelpfully, he provided in reply to v9 a week after you had
posted v10) because of ...

> v10:
>  * Make assisted_x{2}apic_available conditional upon _vmx_cpu_up()

... this, requiring him to re-offer the tag? Until told otherwise I
will assume so.

Jan



Re: [PATCH] xen/arm: smmu-v3: Fix MISRA C 2012 Rule 1.3 violations

2022-06-29 Thread Bertrand Marquis
Hi,

In fact the patch was committed before we started this discussion as Rahul R-b 
was enough.
But I would still be interested to have other maintainers view on this.

> On 29 Jun 2022, at 10:03, Bertrand Marquis  wrote:
> 
> Hi Xenia,
> 
>> On 29 Jun 2022, at 09:55, xenia  wrote:
>> 
>> Hi Bertrand,
>> 
>> On 6/29/22 10:24, Bertrand Marquis wrote:
>>> Hi Xenia,
>>> 
 On 28 Jun 2022, at 16:08, Xenia Ragiadakou  wrote:
 
 The expression 1 << 31 produces undefined behaviour because the type of 
 integer
 constant 1 is (signed) int and the result of shifting 1 by 31 bits is not
 representable in the (signed) int type.
 Change the type of 1 to unsigned int by adding the U suffix.
 
 Signed-off-by: Xenia Ragiadakou 
 ---
 Q_OVERFLOW_FLAG has already been fixed in upstream kernel code.
 For GBPA_UPDATE I will submit a patch.
 
 xen/drivers/passthrough/arm/smmu-v3.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
 
 diff --git a/xen/drivers/passthrough/arm/smmu-v3.c 
 b/xen/drivers/passthrough/arm/smmu-v3.c
 index 1e857f915a..f2562acc38 100644
 --- a/xen/drivers/passthrough/arm/smmu-v3.c
 +++ b/xen/drivers/passthrough/arm/smmu-v3.c
 @@ -338,7 +338,7 @@ static int platform_get_irq_byname_optional(struct 
 device *dev,
 #define CR2_E2H(1 << 0)
 
 #define ARM_SMMU_GBPA  0x44
 -#define GBPA_UPDATE   (1 << 31)
 +#define GBPA_UPDATE   (1U << 31)
 #define GBPA_ABORT (1 << 20)
 
 #define ARM_SMMU_IRQ_CTRL  0x50
 @@ -410,7 +410,7 @@ static int platform_get_irq_byname_optional(struct 
 device *dev,
 
 #define Q_IDX(llq, p)  ((p) & ((1 << 
 (llq)->max_n_shift) - 1))
 #define Q_WRP(llq, p)  ((p) & (1 << 
 (llq)->max_n_shift))
>>> Could also make sense to fix those 2 to be coherent.
>> According to the spec, the maximum value that max_n_shift can take is 19.
>> Hence, 1 << (llq)->max_n_shift cannot produce undefined behavior.
> 
> I agree with that but my preference would be to not rely on deep analysis on 
> the code for those kind of cases and use 1U whenever possible.
> 
> What other maintainers think on this ?
> 
>> 
>> Personally, I have no problem to submit another patch that adds U/UL 
>> suffixes to all shifted integer constants in the file :) but ...
>> It seems that this driver has been ported from linux and this file still 
>> uses linux coding style, probably because deviations will reduce its 
>> maintainability.
>> Adding a U suffix to those two might be considered an unjustified deviation.
> 
> At this stage the SMMU code is starting to deviate a lot from Linux so it 
> will not be possible to update it easily to sync with latest linux code.
> And the decision should be are we fixing it or should we fully skip this file 
> saying that it is coming from linux and should not be fixed.
> We started to have a discussion during the FuSa meeting on this but we need 
> to come up with a conclusion per file.
> 
> On this one I would say keeping it with linux code style and near from the 
> linux one is not needed.
> Rahul, Julien, Stefano: what do you think here ?
> 
> Cheers
> Bertrand

Cheers
Bertrand




Re: [PATCH v2 0/3] x86: fix brk area initialization

2022-06-29 Thread Juergen Gross

On 23.06.22 11:46, Juergen Gross wrote:

The brk area needs to be zeroed initially, like the .bss section.
At the same time its memory should be covered by the ELF program
headers.

Juergen Gross (3):
   x86/xen: use clear_bss() for Xen PV guests
   x86: fix setup of brk area
   x86: fix .brk attribute in linker script

  arch/x86/include/asm/setup.h  |  3 +++
  arch/x86/kernel/head64.c  |  4 +++-
  arch/x86/kernel/vmlinux.lds.S |  2 +-
  arch/x86/xen/enlighten_pv.c   |  8 ++--
  arch/x86/xen/xen-head.S   | 10 +-
  5 files changed, 14 insertions(+), 13 deletions(-)



Could I please have some feedback? This series is fixing a major
regression regarding running as Xen PV guest (depending on kernel
configuration system will crash very early).

#regzbot ^introduced e32683c6f7d2


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


[PATCH RESEND v10 2/2] x86/xen: Allow per-domain usage of hardware virtualized APIC

2022-06-29 Thread Jane Malalane
Introduce a new per-domain creation x86 specific flag to
select whether hardware assisted virtualization should be used for
x{2}APIC.

A per-domain option is added to xl in order to select the usage of
x{2}APIC hardware assisted virtualization, as well as a global
configuration option.

Having all APIC interaction exit to Xen for emulation is slow and can
induce much overhead. Hardware can speed up x{2}APIC by decoding the
APIC access and providing a VM exit with a more specific exit reason
than a regular EPT fault or by altogether avoiding a VM exit.

On the other hand, being able to disable x{2}APIC hardware assisted
virtualization can be useful for testing and debugging purposes.

Note:

- vmx_install_vlapic_mapping doesn't require modifications regardless
of whether the guest has "Virtualize APIC accesses" enabled or not,
i.e., setting the APIC_ACCESS_ADDR VMCS field is fine so long as
virtualize_apic_accesses is supported by the CPU.

- Both per-domain and global assisted_x{2}apic options are not part of
the migration stream, unless explicitly set in the respective
configuration files. Default settings of assisted_x{2}apic done
internally by the toolstack, based on host capabilities at create
time, are not migrated.

Suggested-by: Andrew Cooper 
Signed-off-by: Jane Malalane 
Acked-by: Christian Lindig 
Reviewed-by: "Roger Pau Monné" 
Reviewed-by: Anthony PERARD 
---
CC: Wei Liu 
CC: Anthony PERARD 
CC: George Dunlap 
CC: Nick Rosbrook 
CC: Juergen Gross 
CC: Christian Lindig 
CC: David Scott 
CC: Jan Beulich 
CC: Andrew Cooper 
CC: "Roger Pau Monné" 
CC: Jun Nakajima 
CC: Kevin Tian 

v10:
 * Improve commit message note on migration

v9:
 * Fix style issues
 * Fix exit() logic for assisted_x{2}apic parsing
 * Add and use XEN_X86_MISC_FLAGS_MAX for ABI checking instead of
   using XEN_X86_ASSISTED_X2APIC directly
 * Expand commit message to mention migration is safe

v8:
 * Widen assisted_x{2}apic parsing to PVH guests in
   parse_config_data()

v7:
 * Fix void return in libxl__arch_domain_build_info_setdefault
 * Fix style issues
 * Use EINVAL when rejecting assisted_x{2}apic for PV guests and
   ENODEV otherwise, when assisted_x{2}apic isn't supported
 * Define has_assisted_x{2}apic macros for when !CONFIG_HVM
 * Replace "EPT" fault reference with "p2m" fault since the former is
   Intel-specific

v6:
 * Use ENODEV instead of EINVAL when rejecting assisted_x{2}apic
   for PV guests
 * Move has_assisted_x{2}apic macros out of an Intel specific header
 * Remove references to Intel specific features in documentation

v5:
 * Revert v4 changes in vmx_vlapic_msr_changed(), preserving the use of
   the has_assisted_x{2}apic macros
 * Following changes in assisted_x{2}apic_available definitions in
   patch 1, retighten conditionals for setting
   XEN_HVM_CPUID_APIC_ACCESS_VIRT and XEN_HVM_CPUID_X2APIC_VIRT in
   cpuid_hypervisor_leaves()

v4:
 * Add has_assisted_x{2}apic macros and use them where appropriate
 * Replace CPU checks with per-domain assisted_x{2}apic control
   options in vmx_vlapic_msr_changed() and cpuid_hypervisor_leaves(),
   following edits to assisted_x{2}apic_available definitions in
   patch 1
   Note: new assisted_x{2}apic_available definitions make later
   cpu_has_vmx_apic_reg_virt and cpu_has_vmx_virtual_intr_delivery
   checks redundant in vmx_vlapic_msr_changed()

v3:
 * Change info in xl.cfg to better express reality and fix
   capitalization of x{2}apic
 * Move "physinfo" variable definition to the beggining of
   libxl__domain_build_info_setdefault()
 * Reposition brackets in if statement to match libxl coding style
 * Shorten logic in libxl__arch_domain_build_info_setdefault()
 * Correct dprintk message in arch_sanitise_domain_config()
 * Make appropriate changes in vmx_vlapic_msr_changed() and
   cpuid_hypervisor_leaves() for amended "assisted_x2apic" bit
 * Remove unneeded parantheses

v2:
 * Add a LIBXL_HAVE_ASSISTED_APIC macro
 * Pass xcpyshinfo as a pointer in libxl__arch_get_physinfo
 * Add a return statement in now "int"
   libxl__arch_domain_build_info_setdefault()
 * Preserve libxl__arch_domain_build_info_setdefault 's location in
   libxl_create.c
 * Correct x{2}apic default setting logic in
   libxl__arch_domain_prepare_config()
 * Correct logic for parsing assisted_x{2}apic host/guest options in
   xl_parse.c and initialize them to -1 in xl.c
 * Use guest options directly in vmx_vlapic_msr_changed
 * Fix indentation of bool assisted_x{2}apic in struct hvm_domain
 * Add a change in xenctrl_stubs.c to pass xenctrl ABI checks
---
 docs/man/xl.cfg.5.pod.in  | 15 +++
 docs/man/xl.conf.5.pod.in | 12 
 tools/golang/xenlight/helpers.gen.go  | 12 
 tools/golang/xenlight/types.gen.go|  2 ++
 tools/include/libxl.h |  7 +++
 tools/libs/light/libxl_arch.h |  5 +++--
 tools/libs/light/libxl_arm.c  |  9 ++---
 tools/libs/light/libxl_create.c   | 22 +-
 to

[PATCH RESEND v10 1/2] xen+tools: Report Interrupt Controller Virtualization capabilities on x86

2022-06-29 Thread Jane Malalane
Add XEN_SYSCTL_PHYSCAP_X86_ASSISTED_XAPIC and
XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X2APIC to report accelerated xAPIC and
x2APIC, on x86 hardware. This is so that xAPIC and x2APIC virtualization
can subsequently be enabled on a per-domain basis.
No such features are currently implemented on AMD hardware.

HW assisted xAPIC virtualization will be reported if HW, at the
minimum, supports virtualize_apic_accesses as this feature alone means
that an access to the APIC page will cause an APIC-access VM exit. An
APIC-access VM exit provides a VMM with information about the access
causing the VM exit, unlike a regular EPT fault, thus simplifying some
internal handling.

HW assisted x2APIC virtualization will be reported if HW supports
virtualize_x2apic_mode and, at least, either apic_reg_virt or
virtual_intr_delivery. This also means that
sysctl follows the conditionals in vmx_vlapic_msr_changed().

For that purpose, also add an arch-specific "capabilities" parameter
to struct xen_sysctl_physinfo.

Note that this interface is intended to be compatible with AMD so that
AVIC support can be introduced in a future patch. Unlike Intel that
has multiple controls for APIC Virtualization, AMD has one global
'AVIC Enable' control bit, so fine-graining of APIC virtualization
control cannot be done on a common interface.

Suggested-by: Andrew Cooper 
Signed-off-by: Jane Malalane 
Reviewed-by: "Roger Pau Monné" 
Reviewed-by: Jan Beulich 
Reviewed-by: Anthony PERARD 
---
CC: George Dunlap 
CC: Nick Rosbrook 
CC: Wei Liu 
CC: Anthony PERARD 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Julien Grall 
CC: Stefano Stabellini 
CC: Juergen Gross 
CC: Christian Lindig 
CC: David Scott 
CC: "Roger Pau Monné" 
CC: Jun Nakajima 
CC: Kevin Tian 

v10:
 * Make assisted_x{2}apic_available conditional upon _vmx_cpu_up()

v9:
 * Move assisted_x{2}apic_available to vmx_vmcs_init() so they get
   declared at boot time, after vmx_secondary_exec_control is set

v8:
 * Improve commit message

v7:
 * Make sure assisted_x{2}apic_available evaluates to false, to ensure
   Xen builds, when !CONFIG_HVM
 * Fix coding style issues

v6:
 * Limit abi check to x86
 * Fix coding style issue

v5:
 * Have assisted_xapic_available solely depend on
   cpu_has_vmx_virtualize_apic_accesses and assisted_x2apic_available
   depend on cpu_has_vmx_virtualize_x2apic_mode and
   cpu_has_vmx_apic_reg_virt OR cpu_has_vmx_virtual_intr_delivery

v4:
 * Fallback to the original v2/v1 conditions for setting
   assisted_xapic_available and assisted_x2apic_available so that in
   the future APIC virtualization can be exposed on AMD hardware
   since fine-graining of "AVIC" is not supported, i.e., AMD solely
   uses "AVIC Enable". This also means that sysctl mimics what's
   exposed in CPUID

v3:
 * Define XEN_SYSCTL_PHYSCAP_ARCH_MAX for ABI checking and actually
   set "arch_capbilities", via a call to c_bitmap_to_ocaml_list()
 * Have assisted_x2apic_available only depend on
   cpu_has_vmx_virtualize_x2apic_mode

v2:
 * Use one macro LIBXL_HAVE_PHYSINFO_ASSISTED_APIC instead of two
 * Pass xcpyshinfo as a pointer in libxl__arch_get_physinfo
 * Set assisted_x{2}apic_available to be conditional upon "bsp" and
   annotate it with __ro_after_init
 * Change XEN_SYSCTL_PHYSCAP_ARCH_ASSISTED_X{2}APIC to
   _X86_ASSISTED_X{2}APIC
 * Keep XEN_SYSCTL_PHYSCAP_X86_ASSISTED_X{2}APIC contained within
   sysctl.h
 * Fix padding introduced in struct xen_sysctl_physinfo and bump
   XEN_SYSCTL_INTERFACE_VERSION
---
 tools/golang/xenlight/helpers.gen.go |  4 
 tools/golang/xenlight/types.gen.go   |  2 ++
 tools/include/libxl.h|  7 +++
 tools/libs/light/libxl.c |  3 +++
 tools/libs/light/libxl_arch.h|  4 
 tools/libs/light/libxl_arm.c |  5 +
 tools/libs/light/libxl_types.idl |  2 ++
 tools/libs/light/libxl_x86.c | 11 +++
 tools/ocaml/libs/xc/xenctrl.ml   |  5 +
 tools/ocaml/libs/xc/xenctrl.mli  |  5 +
 tools/ocaml/libs/xc/xenctrl_stubs.c  | 15 +--
 tools/xl/xl_info.c   |  6 --
 xen/arch/x86/hvm/hvm.c   |  3 +++
 xen/arch/x86/hvm/vmx/vmcs.c  |  7 +++
 xen/arch/x86/include/asm/hvm/hvm.h   |  5 +
 xen/arch/x86/sysctl.c|  4 
 xen/include/public/sysctl.h  | 11 ++-
 17 files changed, 94 insertions(+), 5 deletions(-)

diff --git a/tools/golang/xenlight/helpers.gen.go 
b/tools/golang/xenlight/helpers.gen.go
index b746ff1081..dd4e6c9f14 100644
--- a/tools/golang/xenlight/helpers.gen.go
+++ b/tools/golang/xenlight/helpers.gen.go
@@ -3373,6 +3373,8 @@ x.CapVmtrace = bool(xc.cap_vmtrace)
 x.CapVpmu = bool(xc.cap_vpmu)
 x.CapGnttabV1 = bool(xc.cap_gnttab_v1)
 x.CapGnttabV2 = bool(xc.cap_gnttab_v2)
+x.CapAssistedXapic = bool(xc.cap_assisted_xapic)
+x.CapAssistedX2Apic = bool(xc.cap_assisted_x2apic)
 
  return nil}
 
@@ -3407,6 +3409,8 @@ xc.cap_vmtrace = C.bool(x.CapVmtrace)
 xc.cap_vpmu = C.bool(x.CapVpmu)
 xc.cap_gnttab_v1 = C.b

[PATCH RESEND v10 0/2] xen: Report and use hardware APIC virtualization capabilities

2022-06-29 Thread Jane Malalane
Jane Malalane (2):
  xen+tools: Report Interrupt Controller Virtualization capabilities on
x86
  x86/xen: Allow per-domain usage of hardware virtualized APIC

 docs/man/xl.cfg.5.pod.in  | 15 ++
 docs/man/xl.conf.5.pod.in | 12 +++
 tools/golang/xenlight/helpers.gen.go  | 16 ++
 tools/golang/xenlight/types.gen.go|  4 
 tools/include/libxl.h | 14 +
 tools/libs/light/libxl.c  |  3 +++
 tools/libs/light/libxl_arch.h |  9 ++--
 tools/libs/light/libxl_arm.c  | 14 ++---
 tools/libs/light/libxl_create.c   | 22 
 tools/libs/light/libxl_types.idl  |  4 
 tools/libs/light/libxl_x86.c  | 39 +--
 tools/ocaml/libs/xc/xenctrl.ml|  7 +++
 tools/ocaml/libs/xc/xenctrl.mli   |  7 +++
 tools/ocaml/libs/xc/xenctrl_stubs.c   | 17 ---
 tools/xl/xl.c |  8 +++
 tools/xl/xl.h |  2 ++
 tools/xl/xl_info.c|  6 --
 tools/xl/xl_parse.c   | 19 +
 xen/arch/x86/domain.c | 29 +-
 xen/arch/x86/hvm/hvm.c|  3 +++
 xen/arch/x86/hvm/vmx/vmcs.c   | 11 ++
 xen/arch/x86/hvm/vmx/vmx.c| 13 
 xen/arch/x86/include/asm/hvm/domain.h |  6 ++
 xen/arch/x86/include/asm/hvm/hvm.h| 10 +
 xen/arch/x86/sysctl.c |  4 
 xen/arch/x86/traps.c  |  5 +++--
 xen/include/public/arch-x86/xen.h |  5 +
 xen/include/public/sysctl.h   | 11 +-
 28 files changed, 281 insertions(+), 34 deletions(-)

-- 
2.11.0




[PATCH v2] docs/misra: Add instructions for cppcheck

2022-06-29 Thread Luca Fancellu
Add instructions on how to build cppcheck, the version currently used
and an example to use the cppcheck integration to run the analysis on
the Xen codebase

Signed-off-by: Luca Fancellu 
---
Changes in v2:
- typo fixes, removed build command line, rephrasing (Julien)
---
 docs/misra/cppcheck.txt | 64 +
 1 file changed, 64 insertions(+)
 create mode 100644 docs/misra/cppcheck.txt

diff --git a/docs/misra/cppcheck.txt b/docs/misra/cppcheck.txt
new file mode 100644
index ..25d8c3050b72
--- /dev/null
+++ b/docs/misra/cppcheck.txt
@@ -0,0 +1,64 @@
+Cppcheck for Xen static and MISRA analysis
+==
+
+Xen can be analysed for both static analysis problems and MISRA violation using
+cppcheck, the open source tool allows the creation of a report with all the
+findings. Xen has introduced the support in the Makefile so it's very easy to
+use and in this document we can see how.
+
+The minimum version required for cppcheck is 2.7. Note that at the time of
+writing (June 2022), the version 2.8 is known to be broken [1].
+
+Install cppcheck on the system
+==
+
+Cppcheck can be retrieved from the github repository or by downloading the
+tarball, the version tested so far is the 2.7:
+
+ - https://github.com/danmar/cppcheck/tree/2.7
+ - https://github.com/danmar/cppcheck/archive/2.7.tar.gz
+
+To compile and install it, the complete command line can be found in readme.md,
+section "GNU make", please add the "install" target to that line and use every
+argument as it is in the documentation of the tool, so that every Xen developer
+following this page can reproduce the same findings.
+
+This will compile and install cppcheck in /usr/bin and all the cppcheck config
+files and addons will be installed in /usr/share/cppcheck folder, please modify
+that path in FILESDIR if it's not convinient for your system.
+
+If you don't want to overwrite a possible cppcheck binary installed in your
+system, you can omit the "install" target and FILESDIR, cppcheck will be just
+compiled and the binaries will be available in the same folder.
+If you choose to do that, later in this page it's explained how to use a local
+installation of cppcheck for the Xen analysis.
+
+Dependencies are listed in the readme.md of the project repository.
+
+Use cppcheck to analyse Xen
+===
+
+Using cppcheck integration is very simple, it requires few steps:
+
+ 1) Compile Xen
+ 2) call the cppcheck make target to generate a report in xml format:
+make CPPCHECK_MISRA=y cppcheck
+ 3) call the cppcheck-html make target to generate a report in xml and html
+format:
+make CPPCHECK_MISRA=y cppcheck-html
+
+In case the cppcheck binaries are not in the PATH, CPPCHECK and
+CPPCHECK_HTMLREPORT variables can be overridden with the full path to the
+binaries:
+
+make -C xen \
+CPPCHECK=/path/to/cppcheck \
+CPPCHECK_HTMLREPORT=/path/to/cppcheck-htmlreport \
+CPPCHECK_MISRA=y \
+cppcheck-html
+
+The output is by default in a folder named cppcheck-htmlreport, but the name
+can be changed by passing it in the CPPCHECK_HTMLREPORT_OUTDIR variable.
+
+
+[1] 
https://sourceforge.net/p/cppcheck/discussion/general/thread/bfc3ab6c41/?limit=25
-- 
2.17.1




[PATCH] tools/helpers: fix snprintf argument in init-dom0less.c

2022-06-29 Thread Luca Fancellu
Fix snprintf argument in init-dom0less.c because two instances of
the function are using libxl_dominfo struct members that are uint64_t
types, so change "%lu" to "%"PRIu64 to handle it properly when
building on arm32 and arm64.

Signed-off-by: Luca Fancellu 
---
 tools/helpers/init-dom0less.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/helpers/init-dom0less.c b/tools/helpers/init-dom0less.c
index 4c90dd6a0c8f..fee93459c4b9 100644
--- a/tools/helpers/init-dom0less.c
+++ b/tools/helpers/init-dom0less.c
@@ -5,6 +5,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -138,10 +139,10 @@ static int create_xenstore(struct xs_handle *xsh,
   "vm/" LIBXL_UUID_FMT, LIBXL_UUID_BYTES(uuid));
 if (rc < 0 || rc >= STR_MAX_LENGTH)
 return rc;
-rc = snprintf(max_memkb_str, STR_MAX_LENGTH, "%lu", info->max_memkb);
+rc = snprintf(max_memkb_str, STR_MAX_LENGTH, "%"PRIu64, info->max_memkb);
 if (rc < 0 || rc >= STR_MAX_LENGTH)
 return rc;
-rc = snprintf(target_memkb_str, STR_MAX_LENGTH, "%lu", 
info->current_memkb);
+rc = snprintf(target_memkb_str, STR_MAX_LENGTH, "%"PRIu64, 
info->current_memkb);
 if (rc < 0 || rc >= STR_MAX_LENGTH)
 return rc;
 rc = snprintf(ring_ref_str, STR_MAX_LENGTH, "%lld",
-- 
2.17.1




[qemu-mainline test] 171393: tolerable FAIL - PUSHED

2022-06-29 Thread osstest service owner
flight 171393 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/171393/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 171376
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 171376
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 171376
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 171376
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 171376
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 171376
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 171376
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 171376
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass

version targeted for testing:
 qemuu2a8835cb45371a1f05c9c5899741d66685290f28
baseline version:
 qemuu29f6db75667f44f3f01ba5037dacaf9ebd9328da

Last test of basis   171376  2022-06-27 23:09:50 Z1 days
Testing same since   171380  2022-06-28 08:41:09 Z1 days3 attempts


People who touched revisions under test:
  Alex Bennée 
  David Hildenbrand 
  Jagannathan Rama

[libvirt test] 171396: regressions - FAIL

2022-06-29 Thread osstest service owner
flight 171396 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/171396/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64-libvirt   6 libvirt-buildfail REGR. vs. 151777
 build-amd64-libvirt   6 libvirt-buildfail REGR. vs. 151777
 build-i386-libvirt6 libvirt-buildfail REGR. vs. 151777
 build-armhf-libvirt   6 libvirt-buildfail REGR. vs. 151777

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-raw   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-raw  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   blocked  n/a

version targeted for testing:
 libvirt  0dd1fdae2e9f8a3e2ca8cd4c2aa5f475146d9623
baseline version:
 libvirt  2c846fa6bcc11929c9fb857a22430fb9945654ad

Last test of basis   151777  2020-07-10 04:19:19 Z  719 days
Failing since151818  2020-07-11 04:18:52 Z  718 days  700 attempts
Testing same since   171396  2022-06-29 04:20:22 Z0 days1 attempts


People who touched revisions under test:
Adolfo Jayme Barrientos 
  Aleksandr Alekseev 
  Aleksei Zakharov 
  Amneesh Singh 
  Andika Triwidada 
  Andrea Bolognani 
  Andrew Melnychenko 
  Ani Sinha 
  Balázs Meskó 
  Barrett Schonefeld 
  Bastian Germann 
  Bastien Orivel 
  BiaoXiang Ye 
  Bihong Yu 
  Binfeng Wu 
  Bjoern Walk 
  Boris Fiuczynski 
  Brad Laue 
  Brian Turek 
  Bruno Haible 
  Chris Mayo 
  Christian Borntraeger 
  Christian Ehrhardt 
  Christian Kirbach 
  Christian Schoenebeck 
  Christophe Fergeau 
  Claudio Fontana 
  Cole Robinson 
  Collin Walling 
  Cornelia Huck 
  Cédric Bosdonnat 
  Côme Borsoi 
  Daniel Henrique Barboza 
  Daniel Letai 
  Daniel P. Berrange 
  Daniel P. Berrangé 
  Didik Supriadi 
  dinglimin 
  Divya Garg 
  Dmitrii Shcherbakov 
  Dmytro Linkin 
  Eiichi Tsukata 
  Emilio Herrera 
  Eric Farman 
  Erik Skultety 
  Fabian Affolter 
  Fabian Freyer 
  Fabiano Fidêncio 
  Fangge Jin 
  Farhan Ali 
  Fedora Weblate Translation 
  Florian Schmidt 
  Franck Ridel 
  Gavi Teitz 
  gongwei 
  Guoyi Tu
  Göran Uddeborg 
  Halil Pasic 
  Han Han 
  Hao Wang 
  Haonan Wang 
  Hela Basa 
  Helmut Grohne 
  Hiroki Narukawa 
  Hyman Huang(黄勇) 
  Ian Wienand 
  Ioanna Alifieraki 
  Ivan Teterevkov 
  Jakob Meng 
  Jamie Strandboge 
  Jamie Strandboge 
  Jan Kuparinen 
  jason lee 
  Jean-Baptiste Holcroft 
  Jia Zhou 
  Jianan Gao 
  Jim Fehlig 
  Jin Yan 
  Jing Qi 
  Jinsheng Zhang 
  Jiri Denemark 
  Joachim Falk 
  John Ferlan 
  John Levon 
  John Levon 
  Jonathan Watt 
  Jonathon Jongsma 
  Julio Faracco 
  Justin Gatzen 
  Ján Tomko 
  Kashyap Chamarthy 
  Kevin Locke 
  Kim InSoo 
  Koichi Murase 
  Kristina Hanicova 
  Laine Stump 
  Laszlo Ersek 
  Lee Yarwood 
  Lei Yang 
  Lena Voytek 
  Liang Yan 
  Liang Yan 
  Liao Pingfang 
  Lin Ma 
  Lin Ma 
  Lin Ma 
  Liu Yiding 
  Lubomir Rintel 
  Luke Yue 
  Luyao Zhong 
  luzhipeng 
  Marc Hartmayer 
  Marc-André Lureau 
  Marek Marczykowski-Górecki 
  Mark Mielke 
  Markus Schade 
  Martin Kletzander 
  Martin Pitt 
  Masayoshi Mizuma 
  Matej Cepl 
  Matt Coleman 
  Matt Coleman 
  Mauro Matteo Cascella 
  Max Goodhart 
  Maxim Nestratov 
  Meina Li 
  Michal Privoznik 
  Michał Smyk 
  Milo Casagrande 
  Moshe Levi 
  Moteen Shah 
  Moteen Shah 
  Muha Aliss 
  Nathan 
  Neal Gompa 
  Nick Chevsky 
  Nick Shyrokovskiy 
  Nickys Music Group 
  Nico Pache 
  Nicolas Lécureuil 
  Nicolas Lécureuil 
  Nikolay Shirokovskiy 
  Nikolay Shirokovskiy 
  Nikolay Shirokovskiy 
  Niteesh Dubey 
  Olaf Hering 
  Olesya Gerasimenko 
  Or Ozeri 
  Orion Poplawski 
  Pany 
  Paolo Bonzini 
  Patrick Magauran 
  Paulo de Rezende Pinatti 
  Pavel Hrdina 
  Peng Lia

Re: [PATCH v10 0/2] xen: Report and use hardware APIC virtualization capabilities

2022-06-29 Thread Jan Beulich
On 29.06.2022 12:43, Anthony PERARD wrote:
> On Thu, Jun 23, 2022 at 09:23:27AM +0200, Jan Beulich wrote:
>> On 13.04.2022 13:21, Jane Malalane wrote:
>>> Jane Malalane (2):
>>>   xen+tools: Report Interrupt Controller Virtualization capabilities on
>>> x86
>>>   x86/xen: Allow per-domain usage of hardware virtualized APIC
>>>
>>>  docs/man/xl.cfg.5.pod.in  | 15 ++
>>>  docs/man/xl.conf.5.pod.in | 12 +++
>>>  tools/golang/xenlight/helpers.gen.go  | 16 ++
>>>  tools/golang/xenlight/types.gen.go|  4 
>>>  tools/include/libxl.h | 14 +
>>>  tools/libs/light/libxl.c  |  3 +++
>>>  tools/libs/light/libxl_arch.h |  9 ++--
>>>  tools/libs/light/libxl_arm.c  | 14 ++---
>>>  tools/libs/light/libxl_create.c   | 22 
>>>  tools/libs/light/libxl_types.idl  |  4 
>>>  tools/libs/light/libxl_x86.c  | 39 
>>> +--
>>>  tools/ocaml/libs/xc/xenctrl.ml|  7 +++
>>>  tools/ocaml/libs/xc/xenctrl.mli   |  7 +++
>>>  tools/ocaml/libs/xc/xenctrl_stubs.c   | 17 ---
>>>  tools/xl/xl.c |  8 +++
>>>  tools/xl/xl.h |  2 ++
>>>  tools/xl/xl_info.c|  6 --
>>>  tools/xl/xl_parse.c   | 19 +
>>>  xen/arch/x86/domain.c | 29 +-
>>>  xen/arch/x86/hvm/hvm.c|  3 +++
>>>  xen/arch/x86/hvm/vmx/vmcs.c   | 11 ++
>>>  xen/arch/x86/hvm/vmx/vmx.c| 13 
>>>  xen/arch/x86/include/asm/hvm/domain.h |  6 ++
>>>  xen/arch/x86/include/asm/hvm/hvm.h| 10 +
>>>  xen/arch/x86/sysctl.c |  4 
>>>  xen/arch/x86/traps.c  |  5 +++--
>>>  xen/include/public/arch-x86/xen.h |  5 +
>>>  xen/include/public/sysctl.h   | 11 +-
>>>  28 files changed, 281 insertions(+), 34 deletions(-)
>>>
>>
>> Just FYI: It's been over two months that v10 has been pending. There
>> are still missing acks. You may want to ping the respective maintainers
>> for this to make progress.
> 
> Are you looking for a ack for the "docs/man" changes? If so, I guess
> I'll have to make it more explicit next time that a review for "tools"
> also mean review of the changes in their respective man pages.

No, the docs changes (being clearly tools docs) are fine.

> Or are you looking for a ack for the "golang" changes? Those changes are
> automatically generated by a tool already in our repository.

Indeed it's Go (where I think an ack is still required, no matter
if the changes are generated ones [which I wasn't even aware of, I
have to confess]) and ...

> Or is it an "ocaml" ack for the first patch? Unfortunately, the
> maintainers haven't been CCed, I guess that could be an issue.

... OCaml which I was after.

Jan



Re: [PATCH 3/7] xen/common: Use unsigned int instead of plain unsigned

2022-06-29 Thread Julien Grall

Hi,

On 29/06/2022 11:46, Michal Orzel wrote:

On 27.06.2022 15:15, Michal Orzel wrote:

This is just for the style and consistency reasons as the former is
being used more often than the latter.

Signed-off-by: Michal Orzel 


It looks like this change was forgotten when merging other patches from the 
series.


I noticed the same and was going to commit it yesterday night. However, 
it is technically missing an ack/review for trace.c (this is maintained 
by George).


The change is small and likely not controversial. So I guess we could do 
without George's review. That said, I would like to give him a chance to 
answer (I will commit it on Friday if there are no answer).


Cheers,

--
Julien Grall



Re: [PATCH v7 7/9] xen/arm: unpopulate memory when domain is static

2022-06-29 Thread Julien Grall

Hi Jan,

On 29/06/2022 07:19, Jan Beulich wrote:

On 29.06.2022 08:08, Penny Zheng wrote:

From: Jan Beulich 
Sent: Wednesday, June 29, 2022 1:56 PM

On 29.06.2022 05:12, Penny Zheng wrote:

From: Julien Grall 
Sent: Monday, June 27, 2022 6:19 PM

On 27/06/2022 11:03, Penny Zheng wrote:

-Original Message-

put_static_pages, that is, adding pages to the reserved list, is
only for freeing static pages on runtime. In static page
initialization stage, I also use free_statimem_pages, and in which
stage, I think the domain has not been constructed at all. So I
prefer the freeing of staticmem pages is split into two parts:
free_staticmem_pages and put_static_pages


AFAIU, all the pages would have to be allocated via
acquire_domstatic_pages(). This call requires the domain to be
allocated and setup for handling memory.

Therefore, I think the split is unnecessary. This would also have the
advantage to remove one loop. Admittly, this is not important when
the order 0, but it would become a problem for larger order (you may
have to pull the struct page_info multiple time in the cache).



How about this:
I create a new func free_domstatic_page, and it will be like:
"
static void free_domstatic_page(struct domain *d, struct page_info
*page) {
 unsigned int i;
 bool need_scrub;

 /* NB. May recursively lock from relinquish_memory(). */
 spin_lock_recursive(&d->page_alloc_lock);

 arch_free_heap_page(d, page);

 /*
  * Normally we expect a domain to clear pages before freeing them,
  * if it cares about the secrecy of their contents. However, after
  * a domain has died we assume responsibility for erasure. We do
  * scrub regardless if option scrub_domheap is set.
  */
 need_scrub = d->is_dying || scrub_debug || opt_scrub_domheap;

 free_staticmem_pages(page, 1, need_scrub);

 /* Add page on the resv_page_list *after* it has been freed. */
 put_static_page(d, page);

 drop_dom_ref = !domain_adjust_tot_pages(d, -1);

 spin_unlock_recursive(&d->page_alloc_lock);

 if ( drop_dom_ref )
 put_domain(d);
}
"

In free_domheap_pages, we just call free_domstatic_page:

"
@@ -2430,6 +2430,9 @@ void free_domheap_pages(struct page_info *pg,
unsigned int order)

  ASSERT_ALLOC_CONTEXT();

+if ( unlikely(pg->count_info & PGC_static) )
+return free_domstatic_page(d, pg);
+
  if ( unlikely(is_xen_heap_page(pg)) )
  {
  /* NB. May recursively lock from relinquish_memory(). */ @@
-2673,6 +2676,38 @@ void free_staticmem_pages(struct page_info *pg,
unsigned long nr_mfns, "

Then the split could be avoided and we could save the loop as much as

possible.

Any suggestion?


Looks reasonable at the first glance (will need to see it in proper context for 
a
final opinion), provided e.g. Xen heap pages can never be static.


If you don't prefer let free_domheap_pages to call free_domstatic_page, then, 
maybe
the if-array should happen at put_page
"
@@ -1622,6 +1622,8 @@ void put_page(struct page_info *page)

  if ( unlikely((nx & PGC_count_mask) == 0) )
  {
+if ( unlikely(page->count_info & PGC_static) )


At a first glance, this would likely need to be tested against 'nx'.


+free_domstatic_page(page);
  free_domheap_page(page);
  }
  }
"
Wdyt now?


Personally I'd prefer this variant, but we'll have to see what Julien or
the other Arm maintainers think.


I think this is fine so long we are not expecting more places where 
free_domheap_page() may need to be replaced with free_domstatic_page().


I can't think of any at the moment, but I would like Penny to confirm 
what Arm plans to do with static memory.


Cheers,

--
Julien Grall



Re: [PATCH 3/7] xen/common: Use unsigned int instead of plain unsigned

2022-06-29 Thread Michal Orzel
On 27.06.2022 15:15, Michal Orzel wrote:
> This is just for the style and consistency reasons as the former is
> being used more often than the latter.
> 
> Signed-off-by: Michal Orzel 

It looks like this change was forgotten when merging other patches from the 
series.



Re: [PATCH v10 0/2] xen: Report and use hardware APIC virtualization capabilities

2022-06-29 Thread Anthony PERARD
On Thu, Jun 23, 2022 at 09:23:27AM +0200, Jan Beulich wrote:
> On 13.04.2022 13:21, Jane Malalane wrote:
> > Jane Malalane (2):
> >   xen+tools: Report Interrupt Controller Virtualization capabilities on
> > x86
> >   x86/xen: Allow per-domain usage of hardware virtualized APIC
> > 
> >  docs/man/xl.cfg.5.pod.in  | 15 ++
> >  docs/man/xl.conf.5.pod.in | 12 +++
> >  tools/golang/xenlight/helpers.gen.go  | 16 ++
> >  tools/golang/xenlight/types.gen.go|  4 
> >  tools/include/libxl.h | 14 +
> >  tools/libs/light/libxl.c  |  3 +++
> >  tools/libs/light/libxl_arch.h |  9 ++--
> >  tools/libs/light/libxl_arm.c  | 14 ++---
> >  tools/libs/light/libxl_create.c   | 22 
> >  tools/libs/light/libxl_types.idl  |  4 
> >  tools/libs/light/libxl_x86.c  | 39 
> > +--
> >  tools/ocaml/libs/xc/xenctrl.ml|  7 +++
> >  tools/ocaml/libs/xc/xenctrl.mli   |  7 +++
> >  tools/ocaml/libs/xc/xenctrl_stubs.c   | 17 ---
> >  tools/xl/xl.c |  8 +++
> >  tools/xl/xl.h |  2 ++
> >  tools/xl/xl_info.c|  6 --
> >  tools/xl/xl_parse.c   | 19 +
> >  xen/arch/x86/domain.c | 29 +-
> >  xen/arch/x86/hvm/hvm.c|  3 +++
> >  xen/arch/x86/hvm/vmx/vmcs.c   | 11 ++
> >  xen/arch/x86/hvm/vmx/vmx.c| 13 
> >  xen/arch/x86/include/asm/hvm/domain.h |  6 ++
> >  xen/arch/x86/include/asm/hvm/hvm.h| 10 +
> >  xen/arch/x86/sysctl.c |  4 
> >  xen/arch/x86/traps.c  |  5 +++--
> >  xen/include/public/arch-x86/xen.h |  5 +
> >  xen/include/public/sysctl.h   | 11 +-
> >  28 files changed, 281 insertions(+), 34 deletions(-)
> > 
> 
> Just FYI: It's been over two months that v10 has been pending. There
> are still missing acks. You may want to ping the respective maintainers
> for this to make progress.

Hi Jan,

Are you looking for a ack for the "docs/man" changes? If so, I guess
I'll have to make it more explicit next time that a review for "tools"
also mean review of the changes in their respective man pages.

Or are you looking for a ack for the "golang" changes? Those changes are
automatically generated by a tool already in our repository.

Or is it an "ocaml" ack for the first patch? Unfortunately, the
maintainers haven't been CCed, I guess that could be an issue.

Cheers,

-- 
Anthony PERARD



Re: [PATCH v5 2/8] xen/arm: allocate static shared memory to the default owner dom_io

2022-06-29 Thread Julien Grall




On 29/06/2022 08:13, Penny Zheng wrote:

Hi Julien


Hi Penny,




-Original Message-
From: Julien Grall 
Sent: Saturday, June 25, 2022 2:22 AM
To: Penny Zheng ; xen-devel@lists.xenproject.org
Cc: Wei Chen ; Stefano Stabellini
; Bertrand Marquis ;
Volodymyr Babchuk ; Andrew Cooper
; George Dunlap ;
Jan Beulich ; Wei Liu 
Subject: Re: [PATCH v5 2/8] xen/arm: allocate static shared memory to the
default owner dom_io

Hi Penny,

On 20/06/2022 06:11, Penny Zheng wrote:

From: Penny Zheng 

This commit introduces process_shm to cope with static shared memory
in domain construction.

DOMID_IO will be the default owner of memory pre-shared among

multiple

domains at boot time, when no explicit owner is specified.


The document in patch #1 suggest the page will be shared with dom_shared.
But here you say "DOMID_IO".

Which one is correct?



I’ll fix the documentation, DOM_IO is the last call.



This commit only considers allocating static shared memory to dom_io
when owner domain is not explicitly defined in device tree, all the
left, including the "borrower" code path, the "explicit owner" code
path, shall be introduced later in the following patches.

Signed-off-by: Penny Zheng 
Reviewed-by: Stefano Stabellini 
---
v5 change:
- refine in-code comment
---
v4 change:
- no changes
---
v3 change:
- refine in-code comment
---
v2 change:
- instead of introducing a new system domain, reuse the existing
dom_io
- make dom_io a non-auto-translated domain, then no need to create P2M
for it
- change dom_io definition and make it wider to support static shm
here too
- introduce is_shm_allocated_to_domio to check whether static shm is
allocated yet, instead of using shm_mask bitmap
- add in-code comment
---
   xen/arch/arm/domain_build.c | 132

+++-

   xen/common/domain.c |   3 +
   2 files changed, 134 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 7ddd16c26d..91a5ace851 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -527,6 +527,10 @@ static bool __init

append_static_memory_to_bank(struct domain *d,

   return true;
   }

+/*
+ * If cell is NULL, pbase and psize should hold valid values.
+ * Otherwise, cell will be populated together with pbase and psize.
+ */
   static mfn_t __init acquire_static_memory_bank(struct domain *d,
  const __be32 **cell,
  u32 addr_cells, u32
size_cells, @@ -535,7 +539,8 @@ static mfn_t __init

acquire_static_memory_bank(struct domain *d,

   mfn_t smfn;
   int res;

-device_tree_get_reg(cell, addr_cells, size_cells, pbase, psize);
+if ( cell )
+device_tree_get_reg(cell, addr_cells, size_cells, pbase,
+ psize);


I think this is a bit of a hack. To me it sounds like this should be moved out 
to
a separate helper. This will also make the interface of
acquire_shared_memory_bank() less questionable (see below).



Ok,  I'll try to not reuse acquire_static_memory_bank in
acquire_shared_memory_bank.


I am OK with that so long it doesn't involve too much duplication.


   ASSERT(IS_ALIGNED(*pbase, PAGE_SIZE) && IS_ALIGNED(*psize,
PAGE_SIZE));


In the context of your series, who is checking that both psize and pbase are
suitably aligned?



Actually, the whole parsing process is redundant for the static shared memory.
I've already parsed it and checked it before in process_shm.


I looked at process_shm() and couldn't find any code that would check 
pbase and psize are suitable aligned (ASSERT() doesn't count).





+return true;
+}
+
+static mfn_t __init acquire_shared_memory_bank(struct domain *d,
+   u32 addr_cells, u32 size_cells,
+   paddr_t *pbase,
+paddr_t *psize)


There is something that doesn't add-up in this interface. The use of pointer
implies that pbase and psize may be modified by the function. So...



Just like you points out before, it's a bit hacky to use 
acquire_static_memory_bank,
and the pointer used here is also due to it. Internal parsing process of 
acquire_static_memory_bank
needs pointer to deliver the result.

I’ll rewrite acquire_shared_memory, and it will be like:
"
static mfn_t __init acquire_shared_memory_bank(struct domain *d,
paddr_t pbase, paddr_t psize)
{
 mfn_t smfn;
 unsigned long nr_pfns;
 int res;

 /*
  * Pages of statically shared memory shall be included
  * in domain_tot_pages().
  */
 nr_pfns = PFN_DOWN(psize);
 if ( d->max_page + nr_pfns > UINT_MAX )


On Arm32, this check would always be true a 32-bit unsigned value is 
always below UINT_MAX. On arm64, you might get away because nr_pfns is 
unsigned long (so I think the type promotion works). But this is fragile.


I would suggest to use the following check:


Re: [XEN PATCH v3 25/25] tools: Remove -Werror everywhere else

2022-06-29 Thread Juergen Gross

On 24.06.22 18:04, Anthony PERARD wrote:

Patch "tools: Add -Werror by default to all tools/" have added
"-Werror" to CFLAGS in tools/Rules.mk, remove it from every other
makefiles as it is now duplicated.

Signed-off-by: Anthony PERARD 


Reviewed-by: Juergen Gross 


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [XEN PATCH v3 19/25] tools: Introduce $(xenlibs-ldlibs, ) macro

2022-06-29 Thread Juergen Gross

On 24.06.22 18:04, Anthony PERARD wrote:

This can be used when linking against multiple in-tree Xen libraries,
and avoid duplicated flags. It can be used instead of multiple
$(LDLIBS_libxen*).

For now, replace the open-coding in libs.mk.

The macro $(xenlibs-libs, ) will be useful later when only the path to
the libraries is wanted (e.g. for checking for dependencies).

Signed-off-by: Anthony PERARD 


Reviewed-by: Juergen Gross 


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [XEN PATCH v3 18/25] tools: Introduce $(xenlibs-rpath,..) to replace $(SHDEPS_lib*)

2022-06-29 Thread Juergen Gross

On 24.06.22 18:04, Anthony PERARD wrote:

This patch introduce a new macro $(xenlibs-dependencies,) to generate
a list of all the xen library that a library is list against, and they
are listed only once. We use the side effect of $(sort ) which remove
duplicates.

This is used by another macro $(xenlibs-rpath,) which is to replace
$(SHDEPS_libxen*).

In libs.mk, we don't need to $(sort ) SHLIB_lib* anymore as this was used
to remove duplicates and they are no more duplicates.

Signed-off-by: Anthony PERARD 


Reviewed-by: Juergen Gross 


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [XEN PATCH v3 17/25] libs/libs.mk: Rework target headers.chk dependencies

2022-06-29 Thread Juergen Gross

On 24.06.22 18:04, Anthony PERARD wrote:

There is no need to call the "headers.chk" target when it isn't
wanted, so it never need to be .PHONY.

Also, there is no more reason to separate the prerequisites from the
recipe.

Signed-off-by: Anthony PERARD 


Reviewed-by: Juergen Gross 


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [XEN PATCH v3 16/25] libs/libs.mk: Remove the need for $(PKG_CONFIG_INST)

2022-06-29 Thread Juergen Gross

On 24.06.22 18:04, Anthony PERARD wrote:

We can simply use $(PKG_CONFIG) to set the parameters, and add it to
$(TARGETS) as necessary.

Signed-off-by: Anthony PERARD 


Reviewed-by: Juergen Gross 


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [XEN PATCH v3 15/25] libs/libs.mk: Rename $(LIB) to $(TARGETS)

2022-06-29 Thread Juergen Gross

On 24.06.22 18:04, Anthony PERARD wrote:

Signed-off-by: Anthony PERARD 


Reviewed-by: Juergen Gross 


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [XEN PATCH v3 13/25] tools/libs/util: cleanup Makefile

2022-06-29 Thread Juergen Gross

On 24.06.22 18:04, Anthony PERARD wrote:

Remove -I. from CFLAGS, it isn't necessary.

Removed $(AUTOSRCS), it isn't used.

Signed-off-by: Anthony PERARD 


Reviewed-by: Juergen Gross 


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [PATCH v5 1/8] xen/arm: introduce static shared memory

2022-06-29 Thread Julien Grall




On 29/06/2022 06:38, Penny Zheng wrote:

Hi Julien


Hi Penny,




-Original Message-
From: Julien Grall 
Sent: Saturday, June 25, 2022 1:55 AM
To: Penny Zheng ; xen-devel@lists.xenproject.org
Cc: Wei Chen ; Stefano Stabellini
; Bertrand Marquis ;
Volodymyr Babchuk 
Subject: Re: [PATCH v5 1/8] xen/arm: introduce static shared memory

Hi Penny,

On 20/06/2022 06:11, Penny Zheng wrote:

From: Penny Zheng 

This patch serie introduces a new feature: setting up static


Typo: s/serie/series/


shared memory on a dom0less system, through device tree configuration.

This commit parses shared memory node at boot-time, and reserve it in
bootinfo.reserved_mem to avoid other use.

This commits proposes a new Kconfig CONFIG_STATIC_SHM to wrap
static-shm-related codes, and this option depends on static memory(
CONFIG_STATIC_MEMORY). That's because that later we want to reuse a
few helpers, guarded with CONFIG_STATIC_MEMORY, like
acquire_staticmem_pages, etc, on static shared memory.

Signed-off-by: Penny Zheng 
Reviewed-by: Stefano Stabellini 
---
v5 change:
- no change
---
v4 change:
- nit fix on doc
---
v3 change:
- make nr_shm_domain unsigned int
---
v2 change:
- document refinement
- remove bitmap and use the iteration to check
- add a new field nr_shm_domain to keep the number of shared domain
---
   docs/misc/arm/device-tree/booting.txt | 120

++

   xen/arch/arm/Kconfig  |   6 ++
   xen/arch/arm/bootfdt.c|  68 +++
   xen/arch/arm/include/asm/setup.h  |   3 +
   4 files changed, 197 insertions(+)

diff --git a/docs/misc/arm/device-tree/booting.txt
b/docs/misc/arm/device-tree/booting.txt
index 98253414b8..6467bc5a28 100644
--- a/docs/misc/arm/device-tree/booting.txt
+++ b/docs/misc/arm/device-tree/booting.txt
@@ -378,3 +378,123 @@ device-tree:

   This will reserve a 512MB region starting at the host physical address
   0x3000 to be exclusively used by DomU1.
+
+Static Shared Memory
+
+
+The static shared memory device tree nodes allow users to statically
+set up shared memory on dom0less system, enabling domains to do
+shm-based communication.
+
+- compatible
+
+"xen,domain-shared-memory-v1"
+
+- xen,shm-id
+
+An 8-bit integer that represents the unique identifier of the shared

memory

+region. The maximum identifier shall be "xen,shm-id = <0xff>".
+
+- xen,shared-mem
+
+An array takes a physical address, which is the base address of the
+shared memory region in host physical address space, a size, and a

guest

+physical address, as the target address of the mapping. The number of

cells

+for the host address (and size) is the same as the guest pseudo-physical
+address and they are inherited from the parent node.


Sorry for jump in the discussion late. But as this is going to be a stable ABI, 
I
would to make sure the interface is going to be easily extendable.

AFAIU, with your proposal the host physical address is mandatory. I would
expect that some user may want to share memory but don't care about the
exact location in memory. So I think it would be good to make it optional in
the binding.

I think this wants to be done now because it would be difficult to change the
binding afterwards (the host physical address is the first set of cells).

The Xen doesn't need to handle the optional case.



Sure, I'll make "the host physical address" optional here, and right now, with 
no actual
code implementation. I'll make up it later in free time~

The user case you mentioned here is that we let xen to allocate an arbitrary 
static shared
memory region, so size and guest physical address are still mandatory, right?


That's correct.

Cheers,

--
Julien Grall



Re: [PATCH] docs/misra: Add instructions for cppcheck

2022-06-29 Thread Luca Fancellu



> On 29 Jun 2022, at 11:16, Julien Grall  wrote:
> 
> Hi Luca,
> 
> On 28/06/2022 16:23, Luca Fancellu wrote:
>>> On 24 Jun 2022, at 18:25, Julien Grall  wrote:
>>> On 24/06/2022 14:34, Luca Fancellu wrote:
> On 24 Jun 2022, at 13:17, Julien Grall  wrote:
>> Sorry for the late reply, this would be my changes, would you agree on them?
> 
> They LGTM.

Thanks, I will send V2 soon.

> 
> Cheers,
> 
> -- 
> Julien Grall




Re: [PATCH] docs/misra: Add instructions for cppcheck

2022-06-29 Thread Julien Grall

Hi Luca,

On 28/06/2022 16:23, Luca Fancellu wrote:

On 24 Jun 2022, at 18:25, Julien Grall  wrote:
On 24/06/2022 14:34, Luca Fancellu wrote:

On 24 Jun 2022, at 13:17, Julien Grall  wrote:

Sorry for the late reply, this would be my changes, would you agree on them?


They LGTM.

Cheers,

--
Julien Grall



Re: [XEN PATCH v3 12/25] .gitignore: Cleanup ignores of tools/libs/*/{headers.chk,*.pc}

2022-06-29 Thread Juergen Gross

On 24.06.22 18:04, Anthony PERARD wrote:

Signed-off-by: Anthony PERARD 


Reviewed-by: Juergen Gross 


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [RFC PATCH 1/2] xen/memory : Add stats_table resource type

2022-06-29 Thread Matias Ezequiel Vara Larsen
On Fri, Jun 17, 2022 at 08:54:34PM +, George Dunlap wrote:
> Preface: It looks like this may be one of your first hypervisor patches.  So 
> thank you!  FYI there’s a lot that needs fixing up here; please don’t read a 
> tone of annoyance into it, I’m just trying to tell you what needs fixing in 
> the most efficient manner. :-)
> 

Thanks for the comments :) I have addressed some of them already in the 
response to the
cover letter.

> > On 17 May 2022, at 15:33, Matias Ezequiel Vara Larsen 
> >  wrote:
> > 
> > Allow to map vcpu stats using acquire_resource().
> 
> This needs a lot more expansion in terms of what it is that you’re doing in 
> this patch and why.
> 

Got it. I'll improve the commit message in v1.

> > 
> > Signed-off-by: Matias Ezequiel Vara Larsen 
> > ---
> > xen/common/domain.c | 42 +
> > xen/common/memory.c | 29 +
> > xen/common/sched/core.c |  5 +
> > xen/include/public/memory.h |  1 +
> > xen/include/xen/sched.h |  5 +
> > 5 files changed, 82 insertions(+)
> > 
> > diff --git a/xen/common/domain.c b/xen/common/domain.c
> > index 17cc32fde3..ddd9f88874 100644
> > --- a/xen/common/domain.c
> > +++ b/xen/common/domain.c
> > @@ -132,6 +132,42 @@ static void vcpu_info_reset(struct vcpu *v)
> > v->vcpu_info_mfn = INVALID_MFN;
> > }
> > 
> > +static void stats_free_buffer(struct vcpu * v)
> > +{
> > +struct page_info *pg = v->stats.pg;
> > +
> > +if ( !pg )
> > +return;
> > +
> > +v->stats.va = NULL;
> > +
> > +if ( v->stats.va )
> > +unmap_domain_page_global(v->stats.va);
> > +
> > +v->stats.va = NULL;
> 
> Looks like you meant to delete the first `va->stats.va = NULL` but forgot?
> 

Apologies for this, I completely missed.

> > +
> > +free_domheap_page(pg);
> 
> Pretty sure this will crash.
> 
> Unfortunately page allocation and freeing is somewhat complicated and 
> requires a bit of boilerplate.  You can look at the vmtrace_alloc_buffer() 
> code for a template, but the general sequence you want is as follows:
> 
> * On the allocate side:
> 
> 1. Allocate the page
> 
>pg = alloc_domheap_page(d, MEMF_no_refcount);
> 
> This will allocate a page with the PGC_allocated bit set and a single 
> reference count.  (If you pass a page with PGC_allocated set to 
> free_domheap_page(), it will crash; which is why I said the above.)
> 
> 2.  Grab a general reference count for the vcpu struct’s reference to it; as 
> well as a writable type count, so that it doesn’t get used as a special page
> 
> if (get_page_and_type(pg, d, PGT_writable_page)) {
>   put_page_alloc_ref(p);
>   /* failure path */
> }
> 
> * On the free side, don’t call free_domheap_pages() directly.  Rather, drop 
> the allocation, then drop your own type count, thus:
> 
> v->stats.va  = NULL;
> 
> put_page_alloc_ref(pg);
> put_page_and_type(pg);
> 
> The issue here is that we can’t free the page until all references have 
> dropped; and the whole point of this exercise is to allow guest userspace in 
> dom0 to gain a reference to the page.  We can’t actually free the page until 
> *all* references are gone, including the userspace one in dom0.  The 
> put_page() which brings the reference count to 0 will automatically free the 
> page.
> 

Thanks for the explanation. For some reason, this is not crashing. I will
reimplement the allocation, releasing, and then update the documentation that I
proposed at
https://lists.xenproject.org/archives/html/xen-devel/2022-05/msg01963.html. The
idea of this reference document is to guide someone that would like to export a 
new
resource by relying on the acquire-resource interface. 

> 
> > +}
> > +
> > +static int stats_alloc_buffer(struct vcpu *v)
> > +{
> > +struct domain *d = v->domain;
> > +struct page_info *pg;
> > +
> > +pg = alloc_domheap_page(d, MEMF_no_refcount);
> > +
> > +if ( !pg )
> > +return -ENOMEM;
> > +
> > +v->stats.va = __map_domain_page_global(pg);
> > +if ( !v->stats.va )
> > +return -ENOMEM;
> > +
> > +v->stats.pg = pg;
> > +clear_page(v->stats.va);
> > +return 0;
> > +}
> 
> The other thing to say about this is that the memory is being allocated 
> unconditionally, even if nobody is planning to read it.  The vast majority of 
> Xen users will not be running xcp-rrd, so it will be pointless overheard.
> 
> At a basic level, you want to follow suit with the vmtrace buffers, which are 
> only allocated if the proper domctl flag is set on domain creation.  (We 
> could consider instead, or in addition, having a global Xen command-line 
> parameter which would enable this feature for all domains.)
>

I agree. I will add a domctl flag on domain creation to enable the allocation of
these buffers.
 
> > +
> > static void vmtrace_free_buffer(struct vcpu *v)
> > {
> > const struct domain *d = v->domain;
> > @@ -203,6 +239,9 @@ static int vmtrace_alloc_buffer(struct

[linux-linus test] 171389: regressions - FAIL

2022-06-29 Thread osstest service owner
flight 171389 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/171389/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-dom0pvh-xl-intel  8 xen-bootfail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-ws16-amd64  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemut-win7-amd64  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 8 xen-boot fail REGR. vs. 
171277
 test-amd64-amd64-pygrub   8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-freebsd11-amd64  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-dom0pvh-xl-amd  8 xen-boot  fail REGR. vs. 171277
 test-amd64-amd64-qemuu-nested-amd  8 xen-bootfail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-win7-amd64  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemut-ws16-amd64  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-xsm   8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemut-debianhvm-i386-xsm  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-ovmf-amd64  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-credit2   8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-libvirt  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-credit1   8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-libvirt-qcow2  8 xen-boot   fail REGR. vs. 171277
 test-amd64-amd64-xl   8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  8 xen-bootfail REGR. vs. 171277
 test-amd64-amd64-xl-vhd   8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow 8 xen-boot fail REGR. vs. 
171277
 test-amd64-amd64-xl-multivcpu  8 xen-bootfail REGR. vs. 171277
 test-amd64-amd64-libvirt-pair 12 xen-boot/src_host   fail REGR. vs. 171277
 test-amd64-amd64-libvirt-pair 13 xen-boot/dst_host   fail REGR. vs. 171277
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 8 xen-boot fail REGR. 
vs. 171277
 test-amd64-amd64-xl-shadow8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-libvirt-raw  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-qemuu-nested-intel  8 xen-boot  fail REGR. vs. 171277
 test-amd64-amd64-libvirt-xsm  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 8 xen-boot fail REGR. 
vs. 171277
 test-amd64-amd64-xl-pvhv2-intel  8 xen-boot  fail REGR. vs. 171277
 test-amd64-amd64-freebsd12-amd64  8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-pvshim8 xen-boot fail REGR. vs. 171277
 test-amd64-amd64-xl-pvhv2-amd  8 xen-bootfail REGR. vs. 171277
 test-amd64-amd64-examine  8 reboot   fail REGR. vs. 171277
 test-amd64-amd64-examine-uefi  8 reboot  fail REGR. vs. 171277
 test-amd64-amd64-examine-bios  8 reboot  fail REGR. vs. 171277
 test-amd64-amd64-xl-qemut-debianhvm-amd64  8 xen-bootfail REGR. vs. 171277
 test-amd64-coresched-amd64-xl  8 xen-bootfail REGR. vs. 171277
 test-amd64-amd64-pair12 xen-boot/src_hostfail REGR. vs. 171277
 test-amd64-amd64-pair13 xen-boot/dst_hostfail REGR. vs. 171277

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds  8 xen-boot fail REGR. vs. 171277

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 171277
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 171277
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 171277
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-arm64-arm64-x

Re: [PATCH] x86/ept: fix shattering of special pages

2022-06-29 Thread Roger Pau Monné
On Wed, Jun 29, 2022 at 08:41:43AM +, Tian, Kevin wrote:
> > From: Roger Pau Monne 
> > Sent: Monday, June 27, 2022 6:01 PM
> > 
> > The current logic in epte_get_entry_emt() will split any page marked
> > as special with order greater than zero, without checking whether the
> > super page is all special.
> > 
> > Fix this by only splitting the page only if it's not all marked as
> > special, in order to prevent unneeded super page shuttering.
> > 
> > Fixes: ca24b2ffdb ('x86/hvm: set 'ipat' in EPT for special pages')
> > Signed-off-by: Roger Pau Monné 
> > ---
> > Cc: Paul Durrant 
> > ---
> > It would seem weird to me to have a super page entry in EPT with
> > ranges marked as special and not the full page.  I guess it's better
> > to be safe, but I don't see an scenario where we could end up in that
> > situation.
> > 
> > I've been trying to find a clarification in the original patch
> > submission about how it's possible to have such super page EPT entry,
> > but haven't been able to find any justification.
> > 
> > Might be nice to expand the commit message as to why it's possible to
> > have such mixed super page entries that would need splitting.
> 
> Here is what I dig out.
> 
> When reviewing v1 of adding special page check, Jan suggested
> that APIC access page was also covered hence the old logic for APIC
> access page can be removed. [1]

But the APIC access page is always added using set_mmio_p2m_entry(),
which will unconditionally create an entry for it in the EPT, hence
there's no explicit need to check for it's presence inside of higher
order pages.

> Then when reviewing v2 he found that the order check in removed
> logic was not carried to the new check on special page. [2] 
> 
> The original order check in old APIC access logic came from:
> 
> commit 126018f2acd5416434747423e61a4690108b9dc9
> Author: Jan Beulich 
> Date:   Fri May 2 10:48:48 2014 +0200
> 
> x86/EPT: consider page order when checking for APIC MFN
> 
> This was overlooked in 3d90d6e6 ("x86/EPT: split super pages upon
> mismatching memory types").
> 
> Signed-off-by: Jan Beulich 
> Acked-by: Kevin Tian 
> Reviewed-by: Tim Deegan 
> 
> I suppose Jan may actually find such mixed super page entry case
> to bring this fix in.

Hm, I guess theoretically it could be possible for contiguous entries
to be coalesced (if we ever implement that for p2m page tables), and
hence result in super pages being created from smaller entries.

It that case it would be less clear to assert that special pages
cannot be coalesced with other contiguous entries.

> Not sure whether Jan still remember the history.
> 
> [1] https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg01648.html
> [2] https://lore.kernel.org/all/a4856c33-8bb0-4afa-cc71-3af4c229b...@suse.com/
> 
> > ---
> >  xen/arch/x86/mm/p2m-ept.c | 20 +++-
> >  1 file changed, 11 insertions(+), 9 deletions(-)
> > 
> > diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
> > index b04ca6dbe8..b4919bad51 100644
> > --- a/xen/arch/x86/mm/p2m-ept.c
> > +++ b/xen/arch/x86/mm/p2m-ept.c
> > @@ -491,7 +491,7 @@ int epte_get_entry_emt(struct domain *d, gfn_t gfn,
> > mfn_t mfn,
> >  {
> >  int gmtrr_mtype, hmtrr_mtype;
> >  struct vcpu *v = current;
> > -unsigned long i;
> > +unsigned long i, special_pgs;
> > 
> >  *ipat = false;
> > 
> > @@ -525,15 +525,17 @@ int epte_get_entry_emt(struct domain *d, gfn_t
> > gfn, mfn_t mfn,
> >  return MTRR_TYPE_WRBACK;
> >  }
> > 
> > -for ( i = 0; i < (1ul << order); i++ )
> > -{
> > +for ( special_pgs = i = 0; i < (1ul << order); i++ )
> >  if ( is_special_page(mfn_to_page(mfn_add(mfn, i))) )
> > -{
> > -if ( order )
> > -return -1;
> > -*ipat = true;
> > -return MTRR_TYPE_WRBACK;
> > -}
> > +special_pgs++;
> > +
> > +if ( special_pgs )
> > +{
> > +if ( special_pgs != (1ul << order) )
> > +return -1;
> > +
> > +*ipat = true;
> > +return MTRR_TYPE_WRBACK;
> 
> Did you actually observe an impact w/o this fix? 

Yes, the original change has caused a performance regression in some
GPU pass through workloads for XenServer.  I think it's reasonable to
avoid super page shattering if the resulting pages would all end up
using ipat and WRBACK cache attribute, as there's no reason for the
split in the first case.

Thanks, Roger.



Re: [PATCH] xen/arm: smmu-v3: Fix MISRA C 2012 Rule 1.3 violations

2022-06-29 Thread Bertrand Marquis
Hi Xenia,

> On 29 Jun 2022, at 09:55, xenia  wrote:
> 
> Hi Bertrand,
> 
> On 6/29/22 10:24, Bertrand Marquis wrote:
>> Hi Xenia,
>> 
>>> On 28 Jun 2022, at 16:08, Xenia Ragiadakou  wrote:
>>> 
>>> The expression 1 << 31 produces undefined behaviour because the type of 
>>> integer
>>> constant 1 is (signed) int and the result of shifting 1 by 31 bits is not
>>> representable in the (signed) int type.
>>> Change the type of 1 to unsigned int by adding the U suffix.
>>> 
>>> Signed-off-by: Xenia Ragiadakou 
>>> ---
>>> Q_OVERFLOW_FLAG has already been fixed in upstream kernel code.
>>> For GBPA_UPDATE I will submit a patch.
>>> 
>>> xen/drivers/passthrough/arm/smmu-v3.c | 4 ++--
>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>> 
>>> diff --git a/xen/drivers/passthrough/arm/smmu-v3.c 
>>> b/xen/drivers/passthrough/arm/smmu-v3.c
>>> index 1e857f915a..f2562acc38 100644
>>> --- a/xen/drivers/passthrough/arm/smmu-v3.c
>>> +++ b/xen/drivers/passthrough/arm/smmu-v3.c
>>> @@ -338,7 +338,7 @@ static int platform_get_irq_byname_optional(struct 
>>> device *dev,
>>> #define CR2_E2H (1 << 0)
>>> 
>>> #define ARM_SMMU_GBPA   0x44
>>> -#define GBPA_UPDATE(1 << 31)
>>> +#define GBPA_UPDATE(1U << 31)
>>> #define GBPA_ABORT  (1 << 20)
>>> 
>>> #define ARM_SMMU_IRQ_CTRL   0x50
>>> @@ -410,7 +410,7 @@ static int platform_get_irq_byname_optional(struct 
>>> device *dev,
>>> 
>>> #define Q_IDX(llq, p)   ((p) & ((1 << 
>>> (llq)->max_n_shift) - 1))
>>> #define Q_WRP(llq, p)   ((p) & (1 << 
>>> (llq)->max_n_shift))
>> Could also make sense to fix those 2 to be coherent.
> According to the spec, the maximum value that max_n_shift can take is 19.
> Hence, 1 << (llq)->max_n_shift cannot produce undefined behavior.

I agree with that but my preference would be to not rely on deep analysis on 
the code for those kind of cases and use 1U whenever possible.

What other maintainers think on this ?

> 
> Personally, I have no problem to submit another patch that adds U/UL suffixes 
> to all shifted integer constants in the file :) but ...
> It seems that this driver has been ported from linux and this file still uses 
> linux coding style, probably because deviations will reduce its 
> maintainability.
> Adding a U suffix to those two might be considered an unjustified deviation.

At this stage the SMMU code is starting to deviate a lot from Linux so it will 
not be possible to update it easily to sync with latest linux code.
And the decision should be are we fixing it or should we fully skip this file 
saying that it is coming from linux and should not be fixed.
We started to have a discussion during the FuSa meeting on this but we need to 
come up with a conclusion per file.

On this one I would say keeping it with linux code style and near from the 
linux one is not needed.
Rahul, Julien, Stefano: what do you think here ?

Cheers
Bertrand

>>> -#define Q_OVERFLOW_FLAG(1 << 31)
>>> +#define Q_OVERFLOW_FLAG(1U << 31)
>>> #define Q_OVF(p)((p) & Q_OVERFLOW_FLAG)
>>> #define Q_ENT(q, p) ((q)->base +\
>>>  Q_IDX(&((q)->llq), p) *\
>> Cheers
>> Bertrand




Re: [XEN PATCH v3 25/25] tools: Remove -Werror everywhere else

2022-06-29 Thread Luca Fancellu

+ CC: Stefano Stabellini

> On 24 Jun 2022, at 17:04, Anthony PERARD  wrote:
> 
> Patch "tools: Add -Werror by default to all tools/" have added
> "-Werror" to CFLAGS in tools/Rules.mk, remove it from every other
> makefiles as it is now duplicated.
> 
> Signed-off-by: Anthony PERARD 

Hi Anthony,

I will try to review the serie when I manage to have some time, in the mean 
time I can say the whole
serie builds fine in my Yocto setup on arm64 and x86_64, I’ve tried also the 
tool stack to
create/destroy/console guests and no problem so far.

The only problem I have is building for arm32 because, I think, this patch does 
a great job and it
discovers a problem here:

arm-poky-linux-gnueabi-gcc  -mthumb -mfpu=neon -mfloat-abi=hard 
-mcpu=cortex-a15  
--sysroot=/data_sdc1/lucfan01/test_kirkstone_xen/build/xtp-qemu-arm32/tmp/work/cortexa15t2hf-neon-poky-linux-gnueabi/xen-tools/4.17+git1-r0/recipe-sysroot
   -marm -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall -Wstrict-prototypes 
-Wdeclaration-after-statement -Wno-unused-but-set-variable 
-Wno-unused-local-typedefs   -Werror -O2 -fomit-frame-pointer 
-D__XEN_INTERFACE_VERSION__=__XEN_LATEST_INTERFACE_VERSION__ -MMD -MP -MF 
.init-dom0less.o.d -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE 
-D_LARGEFILE64_SOURCE  -mthumb -mfpu=neon -mfloat-abi=hard -mcpu=cortex-a15 
-fstack-protector-strong  -O2 -D_FORTIFY_SOURCE=2 -Wformat -Wformat-security 
-Werror=format-security  -O2 -pipe -g -feliminate-unused-debug-types 
-fmacro-prefix-map=/data_sdc1/lucfan01/test_kirkstone_xen/build/xtp-qemu-arm32/tmp/work/cortexa15t2hf-neon-poky-linux-gnueabi/xen-tools/4.17+git1-r0=/usr/src/debug/xen-tools/4.17+git1-r0
  
-fdebug-prefix-map=/data_sdc1/lucfan01/test_kirkstone_xen/build/xtp-qemu-arm32/tmp/work/cortexa15t2hf-neon-poky-linux-gnueabi/xen-tools/4.17+git1-r0=/usr/src/debug/xen-tools/4.17+git1-r0
  
-fdebug-prefix-map=/data_sdc1/lucfan01/test_kirkstone_xen/build/xtp-qemu-arm32/tmp/work/cortexa15t2hf-neon-poky-linux-gnueabi/xen-tools/4.17+git1-r0/recipe-sysroot=
  
-fdebug-prefix-map=/data_sdc1/lucfan01/test_kirkstone_xen/build/xtp-qemu-arm32/tmp/work/cortexa15t2hf-neon-poky-linux-gnueabi/xen-tools/4.17+git1-r0/recipe-sysroot-native=
  
-I/data_sdc1/lucfan01/test_kirkstone_xen/build/xtp-qemu-arm32/tmp/work/cortexa15t2hf-neon-poky-linux-gnueabi/xen-tools/4.17+git1-r0/local-xen/xen/tools/helpers/../../tools/include
 
-I/data_sdc1/lucfan01/test_kirkstone_xen/build/xtp-qemu-arm32/tmp/work/cortexa15t2hf-neon-poky-linux-gnueabi/xen-tools/4.17+git1-r0/local-xen/xen/tools/helpers/../../tools/include
 
-I/data_sdc1/lucfan01/test_kirkstone_xen/build/xtp-qemu-arm32/tmp/work/cortexa15t2hf-neon-poky-linux-gnueabi/xen-tools/4.17+git1-r0/local-xen/xen/tools/helpers/../../tools/include
 
-I/data_sdc1/lucfan01/test_kirkstone_xen/build/xtp-qemu-arm32/tmp/work/cortexa15t2hf-neon-poky-linux-gnueabi/xen-tools/4.17+git1-r0/local-xen/xen/tools/helpers/../../tools/include
 -D__XEN_TOOLS__ 
-I/data_sdc1/lucfan01/test_kirkstone_xen/build/xtp-qemu-arm32/tmp/work/cortexa15t2hf-neon-poky-linux-gnueabi/xen-tools/4.17+git1-r0/local-xen/xen/tools/helpers/../../tools/include
 -D__XEN_TOOLS__ 
-I/data_sdc1/lucfan01/test_kirkstone_xen/build/xtp-qemu-arm32/tmp/work/cortexa15t2hf-neon-poky-linux-gnueabi/xen-tools/4.17+git1-r0/local-xen/xen/tools/helpers/../../tools/include
  -c -o init-dom0less.o init-dom0less.c 
init-dom0less.c: In function 'create_xenstore':
init-dom0less.c:141:53: error: format '%lu' expects argument of type 'long 
unsigned int', but argument 4 has type 'uint64_t' {aka 'long long unsigned 
int'} [-Werror=format=]
  141 | rc = snprintf(max_memkb_str, STR_MAX_LENGTH, "%lu", 
info->max_memkb);
  |   ~~^   ~~~
  | |   |
  | |   uint64_t 
{aka long long unsigned int}
  | long unsigned int
  |   %llu
init-dom0less.c:144:56: error: format '%lu' expects argument of type 'long 
unsigned int', but argument 4 has type 'uint64_t' {aka 'long long unsigned 
int'} [-Werror=format=]
  144 | rc = snprintf(target_memkb_str, STR_MAX_LENGTH, "%lu", 
info->current_memkb);
  |  ~~^   
~~~
  ||   |
  ||   uint64_t 
{aka long long unsigned int}
  |long unsigned int
  |  %llu
  

Won’t be too difficult to fix, if I have time I will do it, otherwise if 
someone wants to do it’s fine for me.

Cheers,
Luca




RE: [PATCH v6 02/12] IOMMU/x86: new command line option to suppress use of superpage mappings

2022-06-29 Thread Tian, Kevin
> From: Roger Pau Monné 
> Sent: Tuesday, June 28, 2022 8:52 PM
> 
> On Thu, Jun 09, 2022 at 12:17:23PM +0200, Jan Beulich wrote:
> > Before actually enabling their use, provide a means to suppress it in
> > case of problems. Note that using the option can also affect the sharing
> > of page tables in the VT-d / EPT combination: If EPT would use large
> > page mappings but the option is in effect, page table sharing would be
> > suppressed (to properly fulfill the admin request).
> >
> > Requested-by: Roger Pau Monné 
> > Signed-off-by: Jan Beulich 
> > ---
> > v6: New.
> >
> > --- a/docs/misc/xen-command-line.pandoc
> > +++ b/docs/misc/xen-command-line.pandoc
> > @@ -1405,7 +1405,7 @@ detection of systems known to misbehave
> >
> >  ### iommu
> >  = List of [ , verbose, debug, force, required, 
> > quarantine[=scratch-
> page],
> > -sharept, intremap, intpost, crash-disable,
> > +sharept, superpages, intremap, intpost, crash-disable,
> >  snoop, qinval, igfx, amd-iommu-perdev-intremap,
> >  dom0-{passthrough,strict} ]
> >
> > @@ -1481,6 +1481,12 @@ boolean (e.g. `iommu=no`) can override t
> >
> >  This option is ignored on ARM, and the pagetables are always shared.
> >
> > +*   The `superpages` boolean controls whether superpage mappings may
> be used
> > +in IOMMU page tables.  If using this option is necessary to fix an 
> > issue,
> > +please report a bug.
> > +
> > +This option is only valid on x86.
> > +
> >  *   The `intremap` boolean controls the Interrupt Remapping sub-feature,
> and
> >  is active by default on compatible hardware.  On x86 systems, the first
> >  generation of IOMMUs only supported DMA remapping, and Interrupt
> Remapping
> > --- a/xen/arch/x86/include/asm/iommu.h
> > +++ b/xen/arch/x86/include/asm/iommu.h
> > @@ -132,7 +132,7 @@ extern bool untrusted_msi;
> >  int pi_update_irte(const struct pi_desc *pi_desc, const struct pirq *pirq,
> > const uint8_t gvec);
> >
> > -extern bool iommu_non_coherent;
> > +extern bool iommu_non_coherent, iommu_superpages;
> >
> >  static inline void iommu_sync_cache(const void *addr, unsigned int size)
> >  {
> > --- a/xen/drivers/passthrough/iommu.c
> > +++ b/xen/drivers/passthrough/iommu.c
> > @@ -88,6 +88,8 @@ static int __init cf_check parse_iommu_p
> >  iommu_igfx = val;
> >  else if ( (val = parse_boolean("qinval", s, ss)) >= 0 )
> >  iommu_qinval = val;
> > +else if ( (val = parse_boolean("superpages", s, ss)) >= 0 )
> > +iommu_superpages = val;
> >  #endif
> >  else if ( (val = parse_boolean("verbose", s, ss)) >= 0 )
> >  iommu_verbose = val;
> > --- a/xen/drivers/passthrough/vtd/iommu.c
> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> > @@ -2213,7 +2213,8 @@ static bool __init vtd_ept_page_compatib
> >  if ( rdmsr_safe(MSR_IA32_VMX_EPT_VPID_CAP, ept_cap) != 0 )
> >  return false;
> >
> > -return (ept_has_2mb(ept_cap) && opt_hap_2mb) <=
> cap_sps_2mb(vtd_cap) &&
> > +return iommu_superpages &&
> > +   (ept_has_2mb(ept_cap) && opt_hap_2mb) <=
> cap_sps_2mb(vtd_cap) &&
> > (ept_has_1gb(ept_cap) && opt_hap_1gb) <= cap_sps_1gb(vtd_cap);
> 
> Isn't this too strict in requesting iommu superpages to be enabled
> regardless of whether EPT has superpage support?
> 
> Shouldn't this instead be:
> 
> return iommu_superpages ? ((ept_has_2mb(ept_cap) && opt_hap_2mb) <=
> cap_sps_2mb(vtd_cap) &&
>(ept_has_1gb(ept_cap) && opt_hap_1gb) <=
> cap_sps_1gb(vtd_cap))
> : !((ept_has_2mb(ept_cap) && opt_hap_2mb) ||
> (ept_has_1gb(ept_cap) && opt_hap_1gb));
> 
> I think we want to introduce some local variables to store EPT
> superpage availability, as the lines are too long.
> 

Or to be pair with ept side checks:

return (ept_has_2mb(ept_cap) && opt_hap_2mb) <=
   (cap_sps_2mb(vtd_cap) && iommu_superpages) &&
   (ept_has_1gb(ept_cap) && opt_hap_1gb) <=
   (cap_sps_1gb(vtd_cap) && iommu_superpages);


Re: [PATCH] xen/arm: smmu-v3: Fix MISRA C 2012 Rule 1.3 violations

2022-06-29 Thread xenia

Hi Bertrand,

On 6/29/22 10:24, Bertrand Marquis wrote:

Hi Xenia,


On 28 Jun 2022, at 16:08, Xenia Ragiadakou  wrote:

The expression 1 << 31 produces undefined behaviour because the type of integer
constant 1 is (signed) int and the result of shifting 1 by 31 bits is not
representable in the (signed) int type.
Change the type of 1 to unsigned int by adding the U suffix.

Signed-off-by: Xenia Ragiadakou 
---
Q_OVERFLOW_FLAG has already been fixed in upstream kernel code.
For GBPA_UPDATE I will submit a patch.

xen/drivers/passthrough/arm/smmu-v3.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/drivers/passthrough/arm/smmu-v3.c 
b/xen/drivers/passthrough/arm/smmu-v3.c
index 1e857f915a..f2562acc38 100644
--- a/xen/drivers/passthrough/arm/smmu-v3.c
+++ b/xen/drivers/passthrough/arm/smmu-v3.c
@@ -338,7 +338,7 @@ static int platform_get_irq_byname_optional(struct device 
*dev,
#define CR2_E2H (1 << 0)

#define ARM_SMMU_GBPA   0x44
-#define GBPA_UPDATE(1 << 31)
+#define GBPA_UPDATE(1U << 31)
#define GBPA_ABORT  (1 << 20)

#define ARM_SMMU_IRQ_CTRL   0x50
@@ -410,7 +410,7 @@ static int platform_get_irq_byname_optional(struct device 
*dev,

#define Q_IDX(llq, p)   ((p) & ((1 << (llq)->max_n_shift) - 1))
#define Q_WRP(llq, p)   ((p) & (1 << (llq)->max_n_shift))

Could also make sense to fix those 2 to be coherent.

According to the spec, the maximum value that max_n_shift can take is 19.
Hence, 1 << (llq)->max_n_shift cannot produce undefined behavior.

Personally, I have no problem to submit another patch that adds U/UL 
suffixes to all shifted integer constants in the file :) but ...
It seems that this driver has been ported from linux and this file still 
uses linux coding style, probably because deviations will reduce its 
maintainability.
Adding a U suffix to those two might be considered an unjustified 
deviation.

-#define Q_OVERFLOW_FLAG(1 << 31)
+#define Q_OVERFLOW_FLAG(1U << 31)
#define Q_OVF(p)((p) & Q_OVERFLOW_FLAG)
#define Q_ENT(q, p) ((q)->base + \
 Q_IDX(&((q)->llq), p) * \

Cheers
Bertrand





Re: [PATCH 1/6] x86/Kconfig: add selection of default x2APIC destination mode

2022-06-29 Thread Roger Pau Monné
On Thu, Jun 23, 2022 at 04:47:22PM +0200, Jan Beulich wrote:
> On 23.06.2022 10:24, Roger Pau Monne wrote:
> > Allow selecting the default x2APIC destination mode from Kconfig.
> > Note the default destination mode is still Logical (Cluster) mode.
> > 
> > Signed-off-by: Roger Pau Monné 
> > ---
> >  xen/arch/x86/Kconfig  | 29 +
> >  xen/arch/x86/genapic/x2apic.c |  6 --
> >  2 files changed, 33 insertions(+), 2 deletions(-)
> > 
> > diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
> > index 1e31edc99f..f560dc13f4 100644
> > --- a/xen/arch/x86/Kconfig
> > +++ b/xen/arch/x86/Kconfig
> > @@ -226,6 +226,35 @@ config XEN_ALIGN_2M
> >  
> >  endchoice
> >  
> > +choice
> > +   prompt "x2APIC default destination mode"
> 
> What's the point of using "choice" here, and not a single "bool"?

I think choice better reflects the purpose of the option, it's
selecting between two different modes.  It could be expressed with a
bool, but I think it's less clear.

> > +   default X2APIC_LOGICAL
> > +   ---help---
> 
> Nit: Please don't use ---help--- anymore - we're trying to phase out its
> use as Linux has dropped it altogether (and hence once we update our
> Kconfig, we'd like to change as few places as possible), leaving just
> "help".
> 
> One downside of "choice" (iirc) is that the individual sub-options' help
> text is inaccessible from at least the command line version of kconfig.

Hm, I usually use menuconfig when wanting to poke at options help.

I guess I could introduce a single X2APIC_PHYSICAL bool that starts
with default false and notes that otherwise the destination mode is
logical.

> > + Specify default destination mode for x2APIC.
> > +
> > + If unsure, choose "Logical".
> > +
> > +config X2APIC_LOGICAL
> > +   bool "Logical mode"
> > +   ---help---
> > + Use Logical Destination mode.
> > +
> > + When using this mode APICs are addressed using the Logical
> > + Destination mode, which allows for optimized IPI sending,
> > + but also reduces the amount of vectors available for external
> > + interrupts when compared to physical mode.
> > +
> > +config X2APIC_PHYS
> 
> X2APIC_PHYSICAL (to be in line with X2APIC_LOGICAL)?

Right, was about to expand it but did consider PHYS to be clear enough
(opposed to using LOG or LOGIC), will expand in next version.

Thanks, Roger.



[xen-unstable test] 171387: tolerable FAIL - PUSHED

2022-06-29 Thread osstest service owner
flight 171387 xen-unstable real [real]
flight 171401 xen-unstable real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/171387/
http://logs.test-lab.xenproject.org/osstest/logs/171401/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-dom0pvh-xl-intel 18 guest-localmigrate fail pass in 
171401-retest

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 171377
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 171377
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 171377
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 171377
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 171377
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 171377
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 171377
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 171377
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 171377
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 171377
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 171377
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 171377
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail

RE: [PATCH] x86/ept: fix shattering of special pages

2022-06-29 Thread Tian, Kevin
> From: Roger Pau Monne 
> Sent: Monday, June 27, 2022 6:01 PM
> 
> The current logic in epte_get_entry_emt() will split any page marked
> as special with order greater than zero, without checking whether the
> super page is all special.
> 
> Fix this by only splitting the page only if it's not all marked as
> special, in order to prevent unneeded super page shuttering.
> 
> Fixes: ca24b2ffdb ('x86/hvm: set 'ipat' in EPT for special pages')
> Signed-off-by: Roger Pau Monné 
> ---
> Cc: Paul Durrant 
> ---
> It would seem weird to me to have a super page entry in EPT with
> ranges marked as special and not the full page.  I guess it's better
> to be safe, but I don't see an scenario where we could end up in that
> situation.
> 
> I've been trying to find a clarification in the original patch
> submission about how it's possible to have such super page EPT entry,
> but haven't been able to find any justification.
> 
> Might be nice to expand the commit message as to why it's possible to
> have such mixed super page entries that would need splitting.

Here is what I dig out.

When reviewing v1 of adding special page check, Jan suggested
that APIC access page was also covered hence the old logic for APIC
access page can be removed. [1]

Then when reviewing v2 he found that the order check in removed
logic was not carried to the new check on special page. [2] 

The original order check in old APIC access logic came from:

commit 126018f2acd5416434747423e61a4690108b9dc9
Author: Jan Beulich 
Date:   Fri May 2 10:48:48 2014 +0200

x86/EPT: consider page order when checking for APIC MFN

This was overlooked in 3d90d6e6 ("x86/EPT: split super pages upon
mismatching memory types").

Signed-off-by: Jan Beulich 
Acked-by: Kevin Tian 
Reviewed-by: Tim Deegan 

I suppose Jan may actually find such mixed super page entry case
to bring this fix in.

Not sure whether Jan still remember the history.

[1] https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg01648.html
[2] https://lore.kernel.org/all/a4856c33-8bb0-4afa-cc71-3af4c229b...@suse.com/

> ---
>  xen/arch/x86/mm/p2m-ept.c | 20 +++-
>  1 file changed, 11 insertions(+), 9 deletions(-)
> 
> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
> index b04ca6dbe8..b4919bad51 100644
> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -491,7 +491,7 @@ int epte_get_entry_emt(struct domain *d, gfn_t gfn,
> mfn_t mfn,
>  {
>  int gmtrr_mtype, hmtrr_mtype;
>  struct vcpu *v = current;
> -unsigned long i;
> +unsigned long i, special_pgs;
> 
>  *ipat = false;
> 
> @@ -525,15 +525,17 @@ int epte_get_entry_emt(struct domain *d, gfn_t
> gfn, mfn_t mfn,
>  return MTRR_TYPE_WRBACK;
>  }
> 
> -for ( i = 0; i < (1ul << order); i++ )
> -{
> +for ( special_pgs = i = 0; i < (1ul << order); i++ )
>  if ( is_special_page(mfn_to_page(mfn_add(mfn, i))) )
> -{
> -if ( order )
> -return -1;
> -*ipat = true;
> -return MTRR_TYPE_WRBACK;
> -}
> +special_pgs++;
> +
> +if ( special_pgs )
> +{
> +if ( special_pgs != (1ul << order) )
> +return -1;
> +
> +*ipat = true;
> +return MTRR_TYPE_WRBACK;

Did you actually observe an impact w/o this fix? 

>  }
> 
>  switch ( type )
> --
> 2.36.1



RE: [PATCH v5 1/8] xen/arm: introduce static shared memory

2022-06-29 Thread Penny Zheng
Hi Julien

> -Original Message-
> From: Julien Grall 
> Sent: Saturday, June 25, 2022 3:26 AM
> To: Penny Zheng ; xen-devel@lists.xenproject.org
> Cc: Wei Chen ; Stefano Stabellini
> ; Bertrand Marquis ;
> Volodymyr Babchuk 
> Subject: Re: [PATCH v5 1/8] xen/arm: introduce static shared memory
> 
> Hi Penny,
> 
> I have looked at the code and I have further questions about the binding.
> 
> On 20/06/2022 06:11, Penny Zheng wrote:
> > ---
> >   docs/misc/arm/device-tree/booting.txt | 120
> ++
> >   xen/arch/arm/Kconfig  |   6 ++
> >   xen/arch/arm/bootfdt.c|  68 +++
> >   xen/arch/arm/include/asm/setup.h  |   3 +
> >   4 files changed, 197 insertions(+)
> >
> > diff --git a/docs/misc/arm/device-tree/booting.txt
> > b/docs/misc/arm/device-tree/booting.txt
> > index 98253414b8..6467bc5a28 100644
> > --- a/docs/misc/arm/device-tree/booting.txt
> > +++ b/docs/misc/arm/device-tree/booting.txt
> > @@ -378,3 +378,123 @@ device-tree:
> >
> >   This will reserve a 512MB region starting at the host physical address
> >   0x3000 to be exclusively used by DomU1.
> > +
> > +Static Shared Memory
> > +
> > +
> > +The static shared memory device tree nodes allow users to statically
> > +set up shared memory on dom0less system, enabling domains to do
> > +shm-based communication.
> > +
> > +- compatible
> > +
> > +"xen,domain-shared-memory-v1"
> > +
> > +- xen,shm-id
> > +
> > +An 8-bit integer that represents the unique identifier of the shared
> memory
> > +region. The maximum identifier shall be "xen,shm-id = <0xff>".
> 
> There is nothing in Xen that will ensure that xen,shm-id will match for all 
> the
> nodes using the same region.
> 

True, we actually do not use this field, adding it here to just be aligned with 
Linux.
I could add a check in the very beginning when we parse the device tree.
I'll give more details to explain in which code locates.

> I see you write it to the guest device-tree. However there is a mismatch of 
> the
> type: here you use an integer whereas the guest binding is using a string.
> 
> Cheers,
> 
> --
> Julien Grall


RE: [PATCH v5 1/8] xen/arm: introduce static shared memory

2022-06-29 Thread Penny Zheng
Hi Julien

> -Original Message-
> From: Julien Grall 
> Sent: Saturday, June 25, 2022 1:55 AM
> To: Penny Zheng ; xen-devel@lists.xenproject.org
> Cc: Wei Chen ; Stefano Stabellini
> ; Bertrand Marquis ;
> Volodymyr Babchuk 
> Subject: Re: [PATCH v5 1/8] xen/arm: introduce static shared memory
> 
> Hi Penny,
> 
> On 20/06/2022 06:11, Penny Zheng wrote:
> > From: Penny Zheng 
> >
> > This patch serie introduces a new feature: setting up static
> 
> Typo: s/serie/series/
> 
> > shared memory on a dom0less system, through device tree configuration.
> >
> > This commit parses shared memory node at boot-time, and reserve it in
> > bootinfo.reserved_mem to avoid other use.
> >
> > This commits proposes a new Kconfig CONFIG_STATIC_SHM to wrap
> > static-shm-related codes, and this option depends on static memory(
> > CONFIG_STATIC_MEMORY). That's because that later we want to reuse a
> > few helpers, guarded with CONFIG_STATIC_MEMORY, like
> > acquire_staticmem_pages, etc, on static shared memory.
> >
> > Signed-off-by: Penny Zheng 
> > Reviewed-by: Stefano Stabellini 
> > ---
> > v5 change:
> > - no change
> > ---
> > v4 change:
> > - nit fix on doc
> > ---
> > v3 change:
> > - make nr_shm_domain unsigned int
> > ---
> > v2 change:
> > - document refinement
> > - remove bitmap and use the iteration to check
> > - add a new field nr_shm_domain to keep the number of shared domain
> > ---
> >   docs/misc/arm/device-tree/booting.txt | 120
> ++
> >   xen/arch/arm/Kconfig  |   6 ++
> >   xen/arch/arm/bootfdt.c|  68 +++
> >   xen/arch/arm/include/asm/setup.h  |   3 +
> >   4 files changed, 197 insertions(+)
> >
> > diff --git a/docs/misc/arm/device-tree/booting.txt
> > b/docs/misc/arm/device-tree/booting.txt
> > index 98253414b8..6467bc5a28 100644
> > --- a/docs/misc/arm/device-tree/booting.txt
> > +++ b/docs/misc/arm/device-tree/booting.txt
> > @@ -378,3 +378,123 @@ device-tree:
> >
> >   This will reserve a 512MB region starting at the host physical address
> >   0x3000 to be exclusively used by DomU1.
> > +
> > +Static Shared Memory
> > +
> > +
> > +The static shared memory device tree nodes allow users to statically
> > +set up shared memory on dom0less system, enabling domains to do
> > +shm-based communication.
> > +
> > +- compatible
> > +
> > +"xen,domain-shared-memory-v1"
> > +
> > +- xen,shm-id
> > +
> > +An 8-bit integer that represents the unique identifier of the shared
> memory
> > +region. The maximum identifier shall be "xen,shm-id = <0xff>".
> > +
> > +- xen,shared-mem
> > +
> > +An array takes a physical address, which is the base address of the
> > +shared memory region in host physical address space, a size, and a
> guest
> > +physical address, as the target address of the mapping. The number of
> cells
> > +for the host address (and size) is the same as the guest 
> > pseudo-physical
> > +address and they are inherited from the parent node.
> 
> Sorry for jump in the discussion late. But as this is going to be a stable 
> ABI, I
> would to make sure the interface is going to be easily extendable.
> 
> AFAIU, with your proposal the host physical address is mandatory. I would
> expect that some user may want to share memory but don't care about the
> exact location in memory. So I think it would be good to make it optional in
> the binding.
> 
> I think this wants to be done now because it would be difficult to change the
> binding afterwards (the host physical address is the first set of cells).
> 
> The Xen doesn't need to handle the optional case.
> 
> [...]
> 
> > diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig index
> > be9eff0141..7321f47c0f 100644
> > --- a/xen/arch/arm/Kconfig
> > +++ b/xen/arch/arm/Kconfig
> > @@ -139,6 +139,12 @@ config TEE
> >
> >   source "arch/arm/tee/Kconfig"
> >
> > +config STATIC_SHM
> > +   bool "Statically shared memory on a dom0less system" if
> UNSUPPORTED
> 
> You also want to update SUPPORT.md.
> 
> > +   depends on STATIC_MEMORY
> > +   help
> > + This option enables statically shared memory on a dom0less system.
> > +
> >   endmenu
> >
> >   menu "ARM errata workaround via the alternative framework"
> > diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c index
> > ec81a45de9..38dcb05d5d 100644
> > --- a/xen/arch/arm/bootfdt.c
> > +++ b/xen/arch/arm/bootfdt.c
> > @@ -361,6 +361,70 @@ static int __init process_domain_node(const void
> *fdt, int node,
> >  size_cells, &bootinfo.reserved_mem, 
> > true);
> >   }
> >
> > +#ifdef CONFIG_STATIC_SHM
> > +static int __init process_shm_node(const void *fdt, int node,
> > +   u32 address_cells, u32 size_cells)
> > +{
> > +const struct fdt_property *prop;
> > +const __be32 *cell;
> > +paddr_t paddr, size;
> > +struct meminfo *mem = &bootinfo.reserved_mem;
> > +unsigned long i;
> 

Re: [PATCH] xen: arm: Don't use stop_cpu() in halt_this_cpu()

2022-06-29 Thread Julien Grall

Hi Stefano,

On 28/06/2022 23:56, Stefano Stabellini wrote:

The advantage of the panic() is it will remind us that some needs to be fixed.
With a warning (or WARN()) people will tend to ignore it.


I know that this specific code path (cpu off) is probably not super
relevant for what I am about to say, but as we move closer to safety
certifiability we need to get away from using "panic" and BUG_ON as a
reminder that more work is needed to have a fully correct implementation
of something.


I don't think we have many places at runtime using BUG_ON()/panic(). 
They are often used because we think Xen would not be able to recover if 
the condition is hit.


I am happy to remove them, but this should not be at the expense to 
introduce other potential weird bugs.




I also see your point and agree that ASSERT is not acceptable for
external input but from my point of view panic is the same (slightly
worse because it doesn't go away in production builds).


I think it depends on your target. Would you be happy if Xen continue to 
run with potentially a fatal flaw?




Julien if you are going to ack the patch feel free to go ahead.


I will do and commit it.

Cheers,

--
Julien Grall



RE: [PATCH v5 5/8] xen/arm: Add additional reference to owner domain when the owner is allocated

2022-06-29 Thread Penny Zheng
Hi Julien

> -Original Message-
> From: Julien Grall 
> Sent: Saturday, June 25, 2022 3:18 AM
> To: Penny Zheng ; xen-devel@lists.xenproject.org
> Cc: Wei Chen ; Stefano Stabellini
> ; Bertrand Marquis ;
> Volodymyr Babchuk 
> Subject: Re: [PATCH v5 5/8] xen/arm: Add additional reference to owner
> domain when the owner is allocated
> 
> Hi Penny,
> 
> On 20/06/2022 06:11, Penny Zheng wrote:
> > Borrower domain will fail to get a page ref using the owner domain
> > during allocation, when the owner is created after borrower.
> >
> > So here, we decide to get and add the right amount of reference, which
> > is the number of borrowers, when the owner is allocated.
> >
> > Signed-off-by: Penny Zheng 
> > Reviewed-by: Stefano Stabellini 
> > ---
> > v5 change:
> > - no change
> > ---
> > v4 changes:
> > - no change
> > ---
> > v3 change:
> > - printk rather than dprintk since it is a serious error
> > ---
> > v2 change:
> > - new commit
> > ---
> >   xen/arch/arm/domain_build.c | 62
> +
> >   1 file changed, 62 insertions(+)
> >
> > diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> > index d4fd64e2bd..650d18f5ef 100644
> > --- a/xen/arch/arm/domain_build.c
> > +++ b/xen/arch/arm/domain_build.c
> > @@ -799,6 +799,34 @@ static mfn_t __init
> > acquire_shared_memory_bank(struct domain *d,
> >
> >   }
> >
> > +static int __init acquire_nr_borrower_domain(struct domain *d,
> > + paddr_t pbase, paddr_t psize,
> > + unsigned long
> > +*nr_borrowers) {
> > +unsigned long bank;
> > +
> > +/* Iterate reserved memory to find requested shm bank. */
> > +for ( bank = 0 ; bank < bootinfo.reserved_mem.nr_banks; bank++ )
> > +{
> > +paddr_t bank_start = bootinfo.reserved_mem.bank[bank].start;
> > +paddr_t bank_size = bootinfo.reserved_mem.bank[bank].size;
> > +
> > +if ( pbase == bank_start && psize == bank_size )
> > +break;
> > +}
> > +
> > +if ( bank == bootinfo.reserved_mem.nr_banks )
> > +return -ENOENT;
> > +
> > +if ( d == dom_io )
> > +*nr_borrowers =
> bootinfo.reserved_mem.bank[bank].nr_shm_domain;
> > +else
> > +/* Exclude the owner domain itself. */
> NIT: I think this comment wants to be just above the 'if' and expanded to
> explain why the "dom_io" is not included. AFAIU, this is because "dom_io" is
> not described in the Device-Tree, so it would not be taken into account for
> nr_shm_domain.
> 
> > +*nr_borrowers =
> > + bootinfo.reserved_mem.bank[bank].nr_shm_domain - 1;
> 
> TBH, given the use here. I would have consider to not increment
> nr_shm_domain if the role was owner in parsing code. This is v5 now, so I
> would be OK with the comment above.
> 
> But I would suggest to consider it as a follow-up.
> 

LTM, it is not a big change, I'll try to include it in the next serie~

> > +
> > +return 0;
> > +}
> > +
> >   /*
> >* Func allocate_shared_memory is supposed to be only called
> >* from the owner.
> > @@ -810,6 +838,8 @@ static int __init allocate_shared_memory(struct
> domain *d,
> >   {
> >   mfn_t smfn;
> >   int ret = 0;
> > +unsigned long nr_pages, nr_borrowers, i;
> > +struct page_info *page;
> >
> >   dprintk(XENLOG_INFO,
> >   "Allocate static shared memory BANK
> > %#"PRIpaddr"-%#"PRIpaddr".\n", @@ -824,6 +854,7 @@ static int __init
> allocate_shared_memory(struct domain *d,
> >* DOMID_IO is the domain, like DOMID_XEN, that is not auto-
> translated.
> >* It sees RAM 1:1 and we do not need to create P2M mapping for it
> >*/
> > +nr_pages = PFN_DOWN(psize);
> >   if ( d != dom_io )
> >   {
> >   ret = guest_physmap_add_pages(d, gaddr_to_gfn(gbase), smfn,
> > PFN_DOWN(psize)); @@ -835,6 +866,37 @@ static int __init
> allocate_shared_memory(struct domain *d,
> >   }
> >   }
> >
> > +/*
> > + * Get the right amount of references per page, which is the number of
> > + * borrow domains.
> > + */
> > +ret = acquire_nr_borrower_domain(d, pbase, psize, &nr_borrowers);
> > +if ( ret )
> > +return ret;
> > +
> > +/*
> > + * Instead of let borrower domain get a page ref, we add as many
> 
> Typo: s/let/letting/
> 
> > + * additional reference as the number of borrowers when the owner
> > + * is allocated, since there is a chance that owner is created
> > + * after borrower.
> 
> What if the borrower is created first? Wouldn't this result to add pages in 
> the
> P2M without reference?
> 
> If yes, then I think this is worth an explaination.
> 

Yes, it is intended to be the way you said, and I'll add a comment to explain.

> > + */
> > +page = mfn_to_page(smfn);
> 
> Where do you validate the range [smfn, nr_pages]?
> 
> > +for ( i = 0; i < nr_pages; i++ )
> > +{
> > +if ( !get_page_

RE: [PATCH v5 3/8] xen/arm: allocate static shared memory to a specific owner domain

2022-06-29 Thread Penny Zheng
Hi Julien

> -Original Message-
> From: Julien Grall 
> Sent: Saturday, June 25, 2022 3:07 AM
> To: Penny Zheng ; xen-devel@lists.xenproject.org
> Cc: Wei Chen ; Stefano Stabellini
> ; Bertrand Marquis ;
> Volodymyr Babchuk 
> Subject: Re: [PATCH v5 3/8] xen/arm: allocate static shared memory to a
> specific owner domain
> 
> Hi Penny,
> 
> On 20/06/2022 06:11, Penny Zheng wrote:
> > If owner property is defined, then owner domain of a static shared
> > memory region is not the default dom_io anymore, but a specific domain.
> >
> > This commit implements allocating static shared memory to a specific
> > domain when owner property is defined.
> >
> > Coding flow for dealing borrower domain will be introduced later in
> > the following commits.
> >
> > Signed-off-by: Penny Zheng 
> > Reviewed-by: Stefano Stabellini 
> > ---
> > v5 change:
> > - no change
> > ---
> > v4 change:
> > - no changes
> > ---
> > v3 change:
> > - simplify the code since o_gbase is not used if the domain is dom_io
> > ---
> > v2 change:
> > - P2M mapping is restricted to normal domain
> > - in-code comment fix
> > ---
> >   xen/arch/arm/domain_build.c | 44 +++
> --
> >   1 file changed, 33 insertions(+), 11 deletions(-)
> >
> > diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> > index 91a5ace851..d4fd64e2bd 100644
> > --- a/xen/arch/arm/domain_build.c
> > +++ b/xen/arch/arm/domain_build.c
> > @@ -805,9 +805,11 @@ static mfn_t __init
> acquire_shared_memory_bank(struct domain *d,
> >*/
> >   static int __init allocate_shared_memory(struct domain *d,
> >u32 addr_cells, u32 size_cells,
> > - paddr_t pbase, paddr_t psize)
> > + paddr_t pbase, paddr_t psize,
> > + paddr_t gbase)
> >   {
> >   mfn_t smfn;
> > +int ret = 0;
> >
> >   dprintk(XENLOG_INFO,
> >   "Allocate static shared memory BANK
> > %#"PRIpaddr"-%#"PRIpaddr".\n", @@ -822,8 +824,18 @@ static int __init
> allocate_shared_memory(struct domain *d,
> >* DOMID_IO is the domain, like DOMID_XEN, that is not auto-
> translated.
> >* It sees RAM 1:1 and we do not need to create P2M mapping for it
> >*/
> > -ASSERT(d == dom_io);
> > -return 0;
> > +if ( d != dom_io )
> > +{
> > +ret = guest_physmap_add_pages(d, gaddr_to_gfn(gbase), smfn,
> > + PFN_DOWN(psize));
> 
> Coding style: this line is over 80 characters. And...
> 
> > +if ( ret )
> > +{
> > +printk(XENLOG_ERR
> > +   "Failed to map shared memory to %pd.\n", d);
> 
> ... this line could be merged with the previous one.
> 
> > +return ret;
> > +}
> > +}
> > +
> > +return ret;
> >   }
> >
> >   static int __init process_shm(struct domain *d, @@ -836,6 +848,8 @@
> > static int __init process_shm(struct domain *d,
> >   u32 shm_id;
> >   u32 addr_cells, size_cells;
> >   paddr_t gbase, pbase, psize;
> > +const char *role_str;
> > +bool owner_dom_io = true;
> 
> I think it would be best if role_str and owner_dom_io are defined within the
> loop. Same goes for all the other declarations.
> 
> >
> >   dt_for_each_child_node(node, shm_node)
> >   {
> > @@ -862,19 +876,27 @@ static int __init process_shm(struct domain *d,
> >   ASSERT(IS_ALIGNED(pbase, PAGE_SIZE) && IS_ALIGNED(psize,
> PAGE_SIZE));
> >   gbase = dt_read_number(cells, addr_cells);
> >
> > -/* TODO: Consider owner domain is not the default dom_io. */
> > +/*
> > + * "role" property is optional and if it is defined explicitly,
> > + * then the owner domain is not the default "dom_io" domain.
> > + */
> > +if ( dt_property_read_string(shm_node, "role", &role_str) == 0 )
> > +owner_dom_io = false
> IIUC, the role is per-region. However, owner_dom_io is first initialized to
> false outside the loop. Therefore, the variable may not be correct on the next
> region.
> 
> So I think you want to write:
> 
> owner_dom_io = !dt_property_read_string(...);
> 
> This can also be avoided if you reduce the scope of the variable (it is meant
> to only be used in the loop).
> 

Yes, it is a bug, thx!!! I'll reduce the scope.

> > +
> >   /*
> >* Per static shared memory region could be shared between 
> > multiple
> >* domains.
> > - * In case re-allocating the same shared memory region, we check
> > - * if it is already allocated to the default owner dom_io before
> > - * the actual allocation.
> > + * So when owner domain is the default dom_io, in case 
> > re-allocating
> > + * the same shared memory region, we check if it is already 
> > allocated
> > + * to the default owner dom_io before the actual allocation.
> >*/
> > -  

Re: [PATCH] xen/arm: smmu-v3: Fix MISRA C 2012 Rule 1.3 violations

2022-06-29 Thread Bertrand Marquis
Hi Xenia,

> On 28 Jun 2022, at 16:08, Xenia Ragiadakou  wrote:
> 
> The expression 1 << 31 produces undefined behaviour because the type of 
> integer
> constant 1 is (signed) int and the result of shifting 1 by 31 bits is not
> representable in the (signed) int type.
> Change the type of 1 to unsigned int by adding the U suffix.
> 
> Signed-off-by: Xenia Ragiadakou 
> ---
> Q_OVERFLOW_FLAG has already been fixed in upstream kernel code.
> For GBPA_UPDATE I will submit a patch.
> 
> xen/drivers/passthrough/arm/smmu-v3.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/drivers/passthrough/arm/smmu-v3.c 
> b/xen/drivers/passthrough/arm/smmu-v3.c
> index 1e857f915a..f2562acc38 100644
> --- a/xen/drivers/passthrough/arm/smmu-v3.c
> +++ b/xen/drivers/passthrough/arm/smmu-v3.c
> @@ -338,7 +338,7 @@ static int platform_get_irq_byname_optional(struct device 
> *dev,
> #define CR2_E2H   (1 << 0)
> 
> #define ARM_SMMU_GBPA 0x44
> -#define GBPA_UPDATE  (1 << 31)
> +#define GBPA_UPDATE  (1U << 31)
> #define GBPA_ABORT(1 << 20)
> 
> #define ARM_SMMU_IRQ_CTRL 0x50
> @@ -410,7 +410,7 @@ static int platform_get_irq_byname_optional(struct device 
> *dev,
> 
> #define Q_IDX(llq, p) ((p) & ((1 << (llq)->max_n_shift) - 1))
> #define Q_WRP(llq, p) ((p) & (1 << (llq)->max_n_shift))

Could also make sense to fix those 2 to be coherent.

> -#define Q_OVERFLOW_FLAG  (1 << 31)
> +#define Q_OVERFLOW_FLAG  (1U << 31)
> #define Q_OVF(p)  ((p) & Q_OVERFLOW_FLAG)
> #define Q_ENT(q, p)   ((q)->base +\
>Q_IDX(&((q)->llq), p) *\

Cheers
Bertrand




RE: [PATCH v5 2/8] xen/arm: allocate static shared memory to the default owner dom_io

2022-06-29 Thread Penny Zheng
Hi Julien

> -Original Message-
> From: Julien Grall 
> Sent: Saturday, June 25, 2022 2:22 AM
> To: Penny Zheng ; xen-devel@lists.xenproject.org
> Cc: Wei Chen ; Stefano Stabellini
> ; Bertrand Marquis ;
> Volodymyr Babchuk ; Andrew Cooper
> ; George Dunlap ;
> Jan Beulich ; Wei Liu 
> Subject: Re: [PATCH v5 2/8] xen/arm: allocate static shared memory to the
> default owner dom_io
> 
> Hi Penny,
> 
> On 20/06/2022 06:11, Penny Zheng wrote:
> > From: Penny Zheng 
> >
> > This commit introduces process_shm to cope with static shared memory
> > in domain construction.
> >
> > DOMID_IO will be the default owner of memory pre-shared among
> multiple
> > domains at boot time, when no explicit owner is specified.
> 
> The document in patch #1 suggest the page will be shared with dom_shared.
> But here you say "DOMID_IO".
> 
> Which one is correct?
> 

I’ll fix the documentation, DOM_IO is the last call.

> >
> > This commit only considers allocating static shared memory to dom_io
> > when owner domain is not explicitly defined in device tree, all the
> > left, including the "borrower" code path, the "explicit owner" code
> > path, shall be introduced later in the following patches.
> >
> > Signed-off-by: Penny Zheng 
> > Reviewed-by: Stefano Stabellini 
> > ---
> > v5 change:
> > - refine in-code comment
> > ---
> > v4 change:
> > - no changes
> > ---
> > v3 change:
> > - refine in-code comment
> > ---
> > v2 change:
> > - instead of introducing a new system domain, reuse the existing
> > dom_io
> > - make dom_io a non-auto-translated domain, then no need to create P2M
> > for it
> > - change dom_io definition and make it wider to support static shm
> > here too
> > - introduce is_shm_allocated_to_domio to check whether static shm is
> > allocated yet, instead of using shm_mask bitmap
> > - add in-code comment
> > ---
> >   xen/arch/arm/domain_build.c | 132
> +++-
> >   xen/common/domain.c |   3 +
> >   2 files changed, 134 insertions(+), 1 deletion(-)
> >
> > diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> > index 7ddd16c26d..91a5ace851 100644
> > --- a/xen/arch/arm/domain_build.c
> > +++ b/xen/arch/arm/domain_build.c
> > @@ -527,6 +527,10 @@ static bool __init
> append_static_memory_to_bank(struct domain *d,
> >   return true;
> >   }
> >
> > +/*
> > + * If cell is NULL, pbase and psize should hold valid values.
> > + * Otherwise, cell will be populated together with pbase and psize.
> > + */
> >   static mfn_t __init acquire_static_memory_bank(struct domain *d,
> >  const __be32 **cell,
> >  u32 addr_cells, u32
> > size_cells, @@ -535,7 +539,8 @@ static mfn_t __init
> acquire_static_memory_bank(struct domain *d,
> >   mfn_t smfn;
> >   int res;
> >
> > -device_tree_get_reg(cell, addr_cells, size_cells, pbase, psize);
> > +if ( cell )
> > +device_tree_get_reg(cell, addr_cells, size_cells, pbase,
> > + psize);
> 
> I think this is a bit of a hack. To me it sounds like this should be moved 
> out to
> a separate helper. This will also make the interface of
> acquire_shared_memory_bank() less questionable (see below).
> 

Ok,  I'll try to not reuse acquire_static_memory_bank in
acquire_shared_memory_bank.

> As this is v5, I would be OK with a follow-up for this split. But this 
> interface of
> acuiqre_shared_memory_bank() needs to change.
> 

I'll try to fix it in next version.

> >   ASSERT(IS_ALIGNED(*pbase, PAGE_SIZE) && IS_ALIGNED(*psize,
> > PAGE_SIZE));
> 
> In the context of your series, who is checking that both psize and pbase are
> suitably aligned?
> 

Actually, the whole parsing process is redundant for the static shared memory.
I've already parsed it and checked it before in process_shm.

> >   if ( PFN_DOWN(*psize) > UINT_MAX )
> >   {
> > @@ -759,6 +764,125 @@ static void __init assign_static_memory_11(struct
> domain *d,
> >   panic("Failed to assign requested static memory for direct-map
> domain %pd.",
> > d);
> >   }
> > +
> > +#ifdef CONFIG_STATIC_SHM
> > +/*
> > + * This function checks whether the static shared memory region is
> > + * already allocated to dom_io.
> > + */
> > +static bool __init is_shm_allocated_to_domio(paddr_t pbase) {
> > +struct page_info *page;
> > +
> > +page = maddr_to_page(pbase);
> > +ASSERT(page);
> 
> maddr_to_page() can never return NULL. If you want to check a page will be
> valid, then you should use mfn_valid().
> 
> However, the ASSERT() implies that the address was suitably checked before.
> But I can't find such check.
> 
> > +
> > +if ( page_get_owner(page) == NULL )
> > +return false;
> > +
> > +ASSERT(page_get_owner(page) == dom_io);
> Could this be hit because of a wrong device-tree? If yes, then this should not
> be an ASSERT() (they are not suitable to check user input).
> 

Yes, it can happen, I'll c