Re: [PATCH 7/9] xen/x86: hook up xen_banner() also for PVH

2021-09-28 Thread Juergen Gross

On 23.09.21 17:31, Jan Beulich wrote:

On 23.09.2021 17:25, Juergen Gross wrote:

On 23.09.21 17:19, Jan Beulich wrote:

On 23.09.2021 17:15, Juergen Gross wrote:

On 23.09.21 17:10, Jan Beulich wrote:

On 23.09.2021 16:59, Juergen Gross wrote:

On 07.09.21 12:11, Jan Beulich wrote:

This was effectively lost while dropping PVHv1 code. Move the function
and arrange for it to be called the same way as done in PV mode. Clearly
this then needs re-introducing the XENFEAT_mmu_pt_update_preserve_ad
check that was recently removed, as that's a PV-only feature.

Signed-off-by: Jan Beulich 

--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -261,6 +261,18 @@ int xen_vcpu_setup(int cpu)
return ((per_cpu(xen_vcpu, cpu) == NULL) ? -ENODEV : 0);
 }
 
+void __init xen_banner(void)

+{
+   unsigned version = HYPERVISOR_xen_version(XENVER_version, NULL);
+   struct xen_extraversion extra;


Please add a blank line here.


Oops.


+   HYPERVISOR_xen_version(XENVER_extraversion, &extra);
+
+   pr_info("Booting paravirtualized kernel on %s\n", pv_info.name);


Is this correct? I don't think the kernel needs to be paravirtualized
with PVH (at least not to the same extent as for PV).


What else do you suggest the message to say? Simply drop
"paravirtualized"? To some extent it is applicable imo, further
qualified by pv_info.name. And that's how it apparently was with
PVHv1.


The string could be selected depending on CONFIG_XEN_PV.


Hmm, now I'm confused: Doesn't this setting control whether the kernel
can run in PV mode? If so, that functionality being present should have
no effect on the functionality of the kernel when running in PVH mode.
So what you suggest would end up in misleading information imo.


Hmm, yes, I mixed "paravirtualized" with "capable to run
paravirtualized".

So the string should depend on xen_pv_domain().


But that's already expressed by pv_info.name then being "Xen PV".


True. Okay, I'm fine with just dropping "paravirtualized".


Juergen
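For illustration only, here is a user-space sketch of the banner wording settled on above, i.e. with "paravirtualized" dropped and the mode left to pv_info.name. The helper format_xen_banner() is hypothetical and not part of the patch; the "Xen version" line is an assumption added for the example, relying on XENVER_version packing the major number in the upper 16 bits and the minor number in the lower 16.

```c
#include <stdio.h>

/*
 * User-space sketch, not kernel code. format_xen_banner() is a
 * hypothetical helper; pv_name stands in for pv_info.name and
 * extraversion for the string returned by XENVER_extraversion.
 */
int format_xen_banner(char *buf, size_t len, const char *pv_name,
                      unsigned int version, const char *extraversion)
{
    /* Major version lives in the upper 16 bits, minor in the lower 16. */
    return snprintf(buf, len,
                    "Booting kernel on %s\nXen version: %u.%u%s\n",
                    pv_name, version >> 16, version & 0xffff,
                    extraversion);
}
```

Since the name passed in already says which mode is in use (for instance "Xen PV"), the first line needs no further qualifier.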




Re: [PATCH v5 2/2] tools/xenstore: set open file descriptor limit for xenstored

2021-09-28 Thread Juergen Gross

On 28.09.21 17:26, Ian Jackson wrote:

Juergen Gross writes ("Re: [PATCH v5 2/2] tools/xenstore: set open file descriptor 
limit for xenstored"):

Hmm, maybe I should just use:

prlimit --nofile=$XENSTORED_MAX_OPEN_FDS \
 $XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid $XENSTORED_ARGS


I guess that would probably work (although it involves another
exec) but I don't understand what's wrong with ulimit, which is a
shell builtin.


My main concern with ulimit is that this would set the open file limit
for _all_ children of the script. I don't think right now this is a real
problem, but it feels wrong to me (systemd-notify ought to be fine, but
you never know ...).


Juergen
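The scoping concern above can be sketched in C: a limit set in a forked child (which is roughly what a prlimit-style wrapper does before exec) affects only that child, while a limit set in the script itself would be inherited by everything started later. The function name child_only_nofile_limit() is hypothetical, and the sketch only lowers the soft limit, which is an assumption made to keep it runnable without privileges.

```c
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if the child saw the new soft limit while the parent's
 * limits stayed untouched, 0 if not, and -1 on error.
 */
int child_only_nofile_limit(rlim_t soft)
{
    struct rlimit before, after;
    int status;
    pid_t pid;

    if (getrlimit(RLIMIT_NOFILE, &before) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: change only its own limit; a wrapper would exec here. */
        struct rlimit lim = before;

        lim.rlim_cur = soft;
        if (setrlimit(RLIMIT_NOFILE, &lim) != 0 ||
            getrlimit(RLIMIT_NOFILE, &lim) != 0)
            _exit(2);
        _exit(lim.rlim_cur == soft ? 0 : 1);
    }

    if (waitpid(pid, &status, 0) < 0 ||
        getrlimit(RLIMIT_NOFILE, &after) != 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) == 2)
        return -1;

    /* The parent's limits must be exactly what they were before. */
    return WEXITSTATUS(status) == 0 &&
           after.rlim_cur == before.rlim_cur &&
           after.rlim_max == before.rlim_max;
}
```

Calling child_only_nofile_limit(32) should report success on a typical system, which is the behaviour one gets from wrapping only the xenstored invocation.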




Re: [PATCH v3] tools/xl: fix autoballoon regex

2021-09-28 Thread Dmitry Isaykin
Thanks! That's a good idea. I will do it in the v4 patch.

It seems that "dom0_mem=" is a correct setting. It means "give all memory not used by the hypervisor to dom0".

15:36, 28 September 2021, Anthony PERARD:

> On Thu, Sep 16, 2021 at 03:15:21PM +0300, Dmitry Isaykin wrote:
> > This regex is used for auto-balloon mode detection based on Xen command line.
> >
> > The case of specifying a negative size was handled incorrectly.
> > From misc/xen-command-line documentation:
> >
> > dom0_mem (x86)
> > = List of ( min: | max: |  )
> >
> > If a size is positive, it represents an absolute value.
> > If a size is negative, it is subtracted from the total available memory.
> >
> > Also add support for [tT] granularity suffix.
> > Also add support for memory fractions (i.e. '50%' or '1G+25%').
> >
> > Signed-off-by: Dmitry Isaykin
> > ---
> >  ret = regcomp(,
> > -  "(^| )dom0_mem=((|min:|max:)[0-9]+[bBkKmMgG]?,?)+($| )",
> > +  "(^| )dom0_mem=((|min:|max:)(-?[0-9]+[bBkKmMgGtT]?)?(\+?[0-9]+%)?,?)+($| )",
>
> It seems that by trying to match fractions, the new regex would match
> too much. For example, if there is " dom0_mem= " on the command line, xl
> would detect it as autoballoon=off, while it isn't the case without this
> patch. I don't know if it is possible to have "dom0_mem=" on the command
> line as I don't know what Xen would do in this case.
>
> It might be better to make the regex more complicated and match
> fraction like they are described in the doc, something like:
> (  | (\+)?% )
> unless xen doesn't boot with bogus value for dom0_mem, but I haven't
> checked. (we could use CPP macros to avoid duplicating the regex.)
>
> Also,  is supposed to be < 100, so [0-9]{1,2} would be better to
> only match no more than 2 digit.
>
> Thought?
>
> Thanks,
>
> --
> Anthony PERARD

--
Sent from the Yandex.Mail mobile app
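The over-matching discussed in this thread can be checked with a small POSIX regex harness. This is only a sketch: cmdline_matches(), OLD_PATTERN and NEW_PATTERN are hypothetical names, the two patterns are copied from the quoted hunk (with the backslash doubled for a C string literal), and the empty alternative in "(|min:|max:)" is assumed to be accepted by the C library, as it is by glibc.

```c
#include <regex.h>
#include <stddef.h>

/* Pattern before the patch, as quoted in the hunk. */
const char OLD_PATTERN[] =
    "(^| )dom0_mem=((|min:|max:)[0-9]+[bBkKmMgG]?,?)+($| )";

/* Pattern proposed in v3, as quoted in the hunk. */
const char NEW_PATTERN[] =
    "(^| )dom0_mem=((|min:|max:)(-?[0-9]+[bBkKmMgGtT]?)?(\\+?[0-9]+%)?,?)+($| )";

/* Returns 1 on match, 0 on no match, -1 if the pattern does not compile. */
int cmdline_matches(const char *pattern, const char *cmdline)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_NOSUB | REG_EXTENDED) != 0)
        return -1;

    rc = regexec(&re, cmdline, 0, NULL, 0);
    regfree(&re);

    return rc == 0;
}
```

With these definitions, "dom0_mem=-1G" and "dom0_mem=1G+25%" are rejected by the old pattern and accepted by the new one, while an empty "dom0_mem= " is accepted only by the new pattern because every part of the size group became optional.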

[xen-unstable test] 165236: regressions - trouble: blocked/broken/fail/pass

2021-09-28 Thread osstest service owner
flight 165236 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/165236/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-qemuu-rhel6hvm-intel broken
 test-amd64-i386-xl   broken
 test-amd64-i386-xl-pvshim broken
 test-amd64-i386-xl-qemut-debianhvm-amd64 broken
 test-amd64-i386-xl-qemut-debianhvm-i386-xsm broken
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm broken
 test-amd64-i386-xl-qemut-win7-amd64 broken
 test-amd64-i386-xl-qemut-ws16-amd64 broken
 test-amd64-i386-xl-qemuu-debianhvm-amd64 broken
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow broken
 test-amd64-i386-xl-qemuu-ovmf-amd64 broken
 test-amd64-i386-xl-qemuu-win7-amd64 broken
 test-amd64-i386-xl-qemuu-ws16-amd64 broken
 test-amd64-i386-xl-shadow broken
 test-amd64-i386-xl-vhd   broken
 test-amd64-i386-xl-xsm   broken
 test-arm64-arm64-xl-credit1  broken
 test-arm64-arm64-xl-credit2  broken
 test-arm64-arm64-xl-seattle  broken
 test-arm64-arm64-xl-thunderx broken
 test-arm64-arm64-xl-vhd  broken
 test-arm64-arm64-xl-xsm  broken
 test-amd64-amd64-xl-xsm  broken
 test-amd64-amd64-xl-shadow   broken
 test-amd64-amd64-xl-rtds broken
 test-amd64-amd64-xl-qemuu-ws16-amd64 broken
 test-amd64-amd64-xl-qemuu-win7-amd64 broken
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict   broken
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm broken
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow broken
 test-amd64-amd64-xl-qemut-ws16-amd64 broken
 test-amd64-amd64-xl-qemut-win7-amd64 broken
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm   broken
 test-amd64-amd64-xl-qemut-debianhvm-amd64   broken
 test-amd64-amd64-xl-qcow2 broken
 test-amd64-amd64-xl-pvshim   broken
 build-armhf  broken
 test-amd64-amd64-xl-pvhv2-intel broken
 test-amd64-amd64-xl-pvhv2-amd broken
 test-amd64-amd64-xl-credit2  broken
 test-amd64-amd64-dom0pvh-xl-amd broken
 test-amd64-amd64-dom0pvh-xl-intel broken
 test-amd64-amd64-xl-credit1  broken
 test-amd64-amd64-libvirt broken
 test-amd64-amd64-xl  broken
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm  broken
 test-amd64-amd64-qemuu-nested-intel broken
 test-amd64-amd64-libvirt-vhd broken
 test-amd64-amd64-libvirt-xsm broken
 test-amd64-amd64-qemuu-nested-amd broken
 test-amd64-amd64-migrupgrade broken
 test-amd64-amd64-pair broken
 test-amd64-amd64-pygrub  broken
 test-amd64-amd64-qemuu-freebsd11-amd64 broken
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict broken
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm broken
 test-amd64-coresched-amd64-xl broken
 test-amd64-coresched-i386-xl broken
 test-amd64-i386-freebsd10-amd64 broken
 test-amd64-i386-freebsd10-i386 broken
 test-amd64-i386-libvirt  broken
 test-amd64-i386-libvirt-pair broken
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm   broken
 test-amd64-i386-libvirt-raw  broken
 test-amd64-i386-libvirt-xsm  broken
 test-amd64-i386-livepatch broken
 test-amd64-i386-migrupgrade  broken
 test-amd64-i386-pair broken
 test-amd64-i386-qemut-rhel6hvm-amd broken
 test-amd64-i386-qemut-rhel6hvm-intel broken
 test-amd64-i386-qemuu-rhel6hvm-amd broken
 test-xtf-amd64-amd64-2   broken
 test-xtf-amd64-amd64-4   broken
 test-amd64-amd64-xl-qemuu-ws16-amd64 5 host-install(5) broken REGR. vs. 164945
 test-amd64-i386-xl-qemut-ws16-amd64  5 host-install(5) broken REGR. vs. 164945
 test-amd64-amd64-xl-pvhv2-amd  5 host-install(5)   broken REGR. vs. 164945
 test-amd64-i386-qemut-rhel6hvm-amd  5 host-install(5)  broken REGR. vs. 164945
 

Re: [PATCH v1 1/8] x86/xen: update xen_oldmem_pfn_is_ram() documentation

2021-09-28 Thread Boris Ostrovsky


On 9/28/21 2:22 PM, David Hildenbrand wrote:
> The callback is only used for the vmcore nowadays.
>
> Signed-off-by: David Hildenbrand 


Reviewed-by: Boris Ostrovsky 





Re: [PATCH v1 2/8] x86/xen: simplify xen_oldmem_pfn_is_ram()

2021-09-28 Thread Boris Ostrovsky


On 9/28/21 2:22 PM, David Hildenbrand wrote:
> Let's simplify return handling.
>
> Signed-off-by: David Hildenbrand 
> ---
>  arch/x86/xen/mmu_hvm.c | 11 ++-
>  1 file changed, 2 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/xen/mmu_hvm.c b/arch/x86/xen/mmu_hvm.c
> index b242d1f4b426..eb61622df75b 100644
> --- a/arch/x86/xen/mmu_hvm.c
> +++ b/arch/x86/xen/mmu_hvm.c
> @@ -21,23 +21,16 @@ static int xen_oldmem_pfn_is_ram(unsigned long pfn)
>   .domid = DOMID_SELF,
>   .pfn = pfn,
>   };
> - int ram;
>  
>   if (HYPERVISOR_hvm_op(HVMOP_get_mem_type, &a))
>   return -ENXIO;
>  
>   switch (a.mem_type) {
>   case HVMMEM_mmio_dm:
> - ram = 0;
> - break;
> - case HVMMEM_ram_rw:
> - case HVMMEM_ram_ro:
> + return 0;
>   default:
> - ram = 1;
> - break;
> + return 1;
>   }
> -
> - return ram;
>  }
>  #endif
>  


How about

    return a.mem_type != HVMMEM_mmio_dm;


The result should be promoted to int, and this has the added benefit of not
requiring changes in patch 4.


-boris
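As a quick sanity check of the suggestion above, the switch form from the patch and the single comparison can be put side by side. The enum below uses stand-in names and values for illustration only (the real HVMMEM_* constants live in Xen's public headers), and both function names are hypothetical.

```c
/* Stand-in memory types; not the real HVMMEM_* values. */
enum mem_type_sketch {
    SK_RAM_RW,
    SK_RAM_RO,
    SK_MMIO_DM,
    SK_OTHER
};

/* Shape of the code after the quoted patch: only mmio_dm is "not RAM". */
int pfn_is_ram_switch(enum mem_type_sketch t)
{
    switch (t) {
    case SK_MMIO_DM:
        return 0;
    default:
        return 1;
    }
}

/* Shape of the suggested one-liner; the comparison already yields an int. */
int pfn_is_ram_compare(enum mem_type_sketch t)
{
    return t != SK_MMIO_DM;
}
```

Both functions return 0 for the mmio_dm stand-in and 1 for every other value, so the one-liner keeps the behaviour of the simplified switch.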




[qemu-mainline test] 165234: regressions - trouble: blocked/broken/fail/pass

2021-09-28 Thread osstest service owner
flight 165234 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/165234/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl  broken
 test-armhf-armhf-xl-credit2  broken
 test-armhf-armhf-xl-cubietruck broken
 test-armhf-armhf-xl-multivcpu broken
 test-armhf-armhf-xl-rtds broken
 test-armhf-armhf-xl-vhd  broken
 test-amd64-i386-qemuu-rhel6hvm-amd broken
 test-amd64-coresched-i386-xl broken
 test-armhf-armhf-xl-credit1  broken
 test-armhf-armhf-xl-arndale  broken
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm broken
 test-amd64-i386-xl-vhd   broken
 test-armhf-armhf-xl-credit2  5 host-install(5)  broken REGR. vs. 164950
 test-armhf-armhf-xl-multivcpu  5 host-install(5)  broken REGR. vs. 164950
 test-armhf-armhf-xl-arndale  5 host-install(5)  broken REGR. vs. 164950
 test-armhf-armhf-xl-vhd  5 host-install(5)  broken REGR. vs. 164950
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm  5 host-install(5)  broken REGR. vs. 164950
 test-amd64-coresched-i386-xl  5 host-install(5)  broken REGR. vs. 164950
 test-amd64-i386-qemuu-rhel6hvm-amd  5 host-install(5)  broken REGR. vs. 164950
 test-armhf-armhf-xl-credit1  5 host-install(5)  broken REGR. vs. 164950
 test-armhf-armhf-xl  5 host-install(5)  broken REGR. vs. 164950
 test-armhf-armhf-xl-cubietruck  5 host-install(5)  broken REGR. vs. 164950
 test-amd64-i386-xl-xsm  12 debian-install  fail REGR. vs. 164950
 test-amd64-i386-xl-vhd  12 debian-di-install  fail REGR. vs. 164950
 test-arm64-arm64-libvirt-raw  17 guest-start/debian.repeat  fail REGR. vs. 164950
 build-armhf-libvirt  6 libvirt-build  fail REGR. vs. 164950

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds  5 host-install(5)  broken REGR. vs. 164950

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-vhd   13 capture-logs(13) broken never pass
 test-amd64-amd64-xl-qemuu-win7-amd64  19 guest-stop  fail like 164950
 test-amd64-amd64-qemuu-nested-amd  20 debian-hvm-install/l1/l2  fail like 164950
 test-amd64-i386-xl-qemuu-win7-amd64  19 guest-stop  fail like 164950
 test-amd64-i386-xl-qemuu-ws16-amd64  19 guest-stop  fail like 164950
 test-amd64-amd64-xl-qemuu-ws16-amd64  19 guest-stop  fail like 164950
 test-amd64-i386-xl-pvshim  14 guest-start  fail never pass
 test-amd64-amd64-libvirt-xsm  15 migrate-support-check  fail never pass
 test-amd64-amd64-libvirt  15 migrate-support-check  fail never pass
 test-amd64-i386-libvirt  15 migrate-support-check  fail never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-check  fail never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-check  fail never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-check  fail never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-check  fail never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-check  fail never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-check  fail never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-check  fail never pass
 test-arm64-arm64-xl  15 migrate-support-check  fail never pass
 test-arm64-arm64-xl  16 saverestore-support-check  fail never pass
 test-arm64-arm64-xl-thunderx  15 migrate-support-check  fail never pass
 test-arm64-arm64-xl-thunderx  16 saverestore-support-check  fail never pass
 test-arm64-arm64-libvirt-xsm  15 migrate-support-check  fail never pass
 test-arm64-arm64-libvirt-xsm  16 saverestore-support-check  fail never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-check  fail never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-check  fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm  13 migrate-support-check  fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm  13 migrate-support-check  fail never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-check  fail never pass
 test-arm64-arm64-libvirt-raw  14 migrate-support-check  fail never pass
 test-arm64-arm64-libvirt-raw  15 saverestore-support-check  fail never pass
 test-amd64-amd64-libvirt-vhd  14 migrate-support-check  fail never pass
 test-arm64-arm64-xl-vhd  14 

Re: [PATCH v3 2/3] arm/efi: Use dom0less configuration when using EFI boot

2021-09-28 Thread Stefano Stabellini
On Tue, 28 Sep 2021, Luca Fancellu wrote:
> This patch introduces the support for dom0less configuration
> when using UEFI boot on ARM, it permits the EFI boot to
> continue if no dom0 kernel is specified but at least one domU
> is found.
> 
> Introduce the new property "uefi,binary" for device tree boot
> module nodes that are subnode of "xen,domain" compatible nodes.
> The property holds a string containing the file name of the
> binary that shall be loaded by the uefi loader from the filesystem.
> 
> Update efi documentation about how to start a dom0less
> setup using UEFI
> 
> Signed-off-by: Luca Fancellu 

Some minor feedback about code style and comments below. With those
addressed:

Reviewed-by: Stefano Stabellini 


> ---
> Changes in v3:
> - fixed documentation
> - fixed name len in strlcpy
> - fixed some style issues
> - closed filesystem handle before calling blexit
> - passed runtime errors up to the stack instead
> of calling blexit
> - renamed names and function to make them more
> general in prevision to load also Dom0 kernel
> and ramdisk from DT
> Changes in v2:
> - remove array of struct file
> - fixed some int types
> - Made the code use filesystem even when configuration
> file is skipped.
> - add documentation of uefi,binary in booting.txt
> - add documentation on how to boot all configuration
> for Xen using UEFI in efi.pandoc
> ---
>  docs/misc/arm/device-tree/booting.txt |  21 ++
>  docs/misc/efi.pandoc  | 203 +
>  xen/arch/arm/efi/efi-boot.h   | 305 +-
>  xen/arch/x86/efi/efi-boot.h   |   6 +
>  xen/common/efi/boot.c |  42 ++--
>  5 files changed, 562 insertions(+), 15 deletions(-)
> 
> diff --git a/docs/misc/arm/device-tree/booting.txt 
> b/docs/misc/arm/device-tree/booting.txt
> index cf878b478e..354bb43fe1 100644
> --- a/docs/misc/arm/device-tree/booting.txt
> +++ b/docs/misc/arm/device-tree/booting.txt
> @@ -190,6 +190,13 @@ The kernel sub-node has the following properties:
>  
>  Command line parameters for the guest kernel.
>  
> +- uefi,binary (UEFI boot only)
> +
> +String property that specifies the file name to be loaded by the UEFI 
> boot
> +for this module. If this is specified, there is no need to specify the 
> reg
> +property because it will be created by the UEFI stub on boot.
> +This option is needed only when UEFI boot is used.
> +
>  The ramdisk sub-node has the following properties:
>  
>  - compatible
> @@ -201,6 +208,13 @@ The ramdisk sub-node has the following properties:
>  Specifies the physical address of the ramdisk in RAM and its
>  length.
>  
> +- uefi,binary (UEFI boot only)
> +
> +String property that specifies the file name to be loaded by the UEFI 
> boot
> +for this module. If this is specified, there is no need to specify the 
> reg
> +property because it will be created by the UEFI stub on boot.
> +This option is needed only when UEFI boot is used.
> +
>  
>  Example
>  ===
> @@ -265,6 +279,13 @@ The dtb sub-node should have the following properties:
>  Specifies the physical address of the device tree binary fragment
>  RAM and its length.
>  
> +- uefi,binary (UEFI boot only)
> +
> +String property that specifies the file name to be loaded by the UEFI 
> boot
> +for this module. If this is specified, there is no need to specify the 
> reg
> +property because it will be created by the UEFI stub on boot.
> +This option is needed only when UEFI boot is used.
> +
>  As an example:
>  
>  module@0xc00 {
> diff --git a/docs/misc/efi.pandoc b/docs/misc/efi.pandoc
> index e289c5e7ba..800e67a233 100644
> --- a/docs/misc/efi.pandoc
> +++ b/docs/misc/efi.pandoc
> @@ -167,3 +167,206 @@ sbsign \
>   --output xen.signed.efi \
>   xen.unified.efi
>  ```
> +
> +## UEFI boot and dom0less on ARM
> +
> +Dom0less feature is supported by ARM and it is possible to use it when Xen is
> +started as an EFI application.
> +The way to specify the domU domains is by Device Tree as specified in the
> +[dom0less](dom0less.html) documentation page under the "Device Tree
> +configuration" section, but instead of declaring the reg property in the boot
> +module, the user must specify the "uefi,binary" property containing the name
> +of the binary file that has to be loaded in memory.
> +The UEFI stub will load the binary in memory and it will add the reg property
> +accordingly.
> +
> +An example here:
> +
> +domU1 {
> + #address-cells = <1>;
> + #size-cells = <1>;
> + compatible = "xen,domain";
> + memory = <0 0x2>;
> + cpus = <1>;
> + vpl011;
> +
> + module@1 {
> + compatible = "multiboot,kernel", "multiboot,module";
> + uefi,binary = "vmlinuz-3.0.31-0.4-xen";
> + bootargs = "console=ttyAMA0";
> + };
> + module@2 {
> + compatible = "multiboot,ramdisk", "multiboot,module";
> + uefi,binary = 

Re: [PATCH v3 3/3] arm/efi: load dom0 modules from DT using UEFI

2021-09-28 Thread Stefano Stabellini
On Tue, 28 Sep 2021, Luca Fancellu wrote:
> Add support to load Dom0 boot modules from
> the device tree using the uefi,binary property.
> 
> Update documentation about that.
> 
> Signed-off-by: Luca Fancellu 

It is great how simple this patch is!

The patch looks all correct. Only one question: do we need a check to
make sure the dom0 ramdisk is not loaded twice? Once via uefi,binary and
another time via the config file? In other words...

> ---
> Changes in v3:
> - new patch
> ---
>  docs/misc/arm/device-tree/booting.txt |  8 
>  docs/misc/efi.pandoc  | 64 +--
>  xen/arch/arm/efi/efi-boot.h   | 36 +--
>  xen/common/efi/boot.c | 12 ++---
>  4 files changed, 108 insertions(+), 12 deletions(-)
> 
> diff --git a/docs/misc/arm/device-tree/booting.txt 
> b/docs/misc/arm/device-tree/booting.txt
> index 354bb43fe1..e73f6476d4 100644
> --- a/docs/misc/arm/device-tree/booting.txt
> +++ b/docs/misc/arm/device-tree/booting.txt
> @@ -70,6 +70,14 @@ Each node contains the following properties:
>   priority of this field vs. other mechanisms of specifying the
>   bootargs for the kernel.
>  
> +- uefi,binary (UEFI boot only)
> +
> + String property that specifies the file name to be loaded by the UEFI
> + boot for this module. If this is specified, there is no need to specify
> + the reg property because it will be created by the UEFI stub on boot.
> + This option is needed only when UEFI boot is used, the node needs to be
> + compatible with multiboot,kernel or multiboot,ramdisk.
> +
>  Examples
>  
>  
> diff --git a/docs/misc/efi.pandoc b/docs/misc/efi.pandoc
> index 800e67a233..4cebc47a18 100644
> --- a/docs/misc/efi.pandoc
> +++ b/docs/misc/efi.pandoc
> @@ -167,6 +167,28 @@ sbsign \
>   --output xen.signed.efi \
>   xen.unified.efi
>  ```
> +## UEFI boot and Dom0 modules on ARM
> +
> +When booting using UEFI on ARM, it is possible to specify the Dom0 modules
> +directly from the device tree without using the Xen configuration file, here 
> an
> +example:
> +
> +chosen {
> + #size-cells = <0x1>;
> + #address-cells = <0x1>;
> + xen,xen-bootargs = "[Xen boot arguments]"
> +
> + module@1 {
> + compatible = "multiboot,kernel", "multiboot,module";
> + uefi,binary = "vmlinuz-3.0.31-0.4-xen";
> + bootargs = "[domain 0 command line options]";
> + };
> +
> + module@2 {
> + compatible = "multiboot,ramdisk", "multiboot,module";
> + uefi,binary = "initrd-3.0.31-0.4-xen";
> + };
> +}
>  
>  ## UEFI boot and dom0less on ARM
>  
> @@ -326,10 +348,10 @@ chosen {
>  ### Boot Xen, Dom0 and DomU(s)
>  
>  This configuration is a mix of the two configuration above, to boot this one
> -the configuration file must be processed so the /chosen node must have the
> -"uefi,cfg-load" property.
> +the configuration file can be processed or the Dom0 modules can be read from
> +the device tree.
>  
> -Here an example:
> +Here the first example:
>  
>  Xen configuration file:
>  
> @@ -369,4 +391,40 @@ chosen {
>  };
>  ```
>  
> +Here the second example:
> +
> +Device tree:
> +
> +```
> +chosen {
> + #size-cells = <0x1>;
> + #address-cells = <0x1>;
> + xen,xen-bootargs = "[Xen boot arguments]"
> +
> + module@1 {
> + compatible = "multiboot,kernel", "multiboot,module";
> + uefi,binary = "vmlinuz-3.0.31-0.4-xen";
> + bootargs = "[domain 0 command line options]";
> + };
> +
> + module@2 {
> + compatible = "multiboot,ramdisk", "multiboot,module";
> + uefi,binary = "initrd-3.0.31-0.4-xen";
> + };
> +
> + domU1 {
> + #size-cells = <0x1>;
> + #address-cells = <0x1>;
> + compatible = "xen,domain";
> + cpus = <0x1>;
> + memory = <0x0 0xc>;
> + vpl011;
>  
> + module@1 {
> + compatible = "multiboot,kernel", "multiboot,module";
> + uefi,binary = "Image-domu1.bin";
> + bootargs = "console=ttyAMA0 root=/dev/ram0 rw";
> + };
> + };
> +};
> +```
> diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
> index 4f7c913f86..df63387136 100644
> --- a/xen/arch/arm/efi/efi-boot.h
> +++ b/xen/arch/arm/efi/efi-boot.h
> @@ -31,8 +31,10 @@ static unsigned int __initdata modules_idx;
>  #define ERROR_MISSING_DT_PROPERTY   (-3)
>  #define ERROR_RENAME_MODULE_NAME(-4)
>  #define ERROR_SET_REG_PROPERTY  (-5)
> +#define ERROR_DOM0_ALREADY_FOUND(-6)
>  #define ERROR_DT_MODULE_DOMU(-1)
>  #define ERROR_DT_CHOSEN_NODE(-2)
> +#define ERROR_DT_MODULE_DOM0(-3)
>  
>  void noreturn efi_xen_start(void *fdt_ptr, uint32_t fdt_size);
>  void __flush_dcache_area(const void *vaddr, unsigned long size);
> @@ -45,7 +47,8 @@ static int allocate_module_file(EFI_FILE_HANDLE 

Re: [PATCH v3 1/3] arm/efi: Introduce uefi,cfg-load DT property

2021-09-28 Thread Stefano Stabellini
On Tue, 28 Sep 2021, Luca Fancellu wrote:
> Introduce the uefi,cfg-load DT property of /chosen
> node for ARM whose presence decide whether to force
> the load of the UEFI Xen configuration file.
> 
> The logic is that if any multiboot,module is found in
> the DT, then the uefi,cfg-load property is used to see
> if the UEFI Xen configuration file is needed.
> 
> Modify a comment in efi_arch_use_config_file, removing
> the part that states "dom0 required" because it's not
> true anymore with this commit.
> 
> Signed-off-by: Luca Fancellu 

The patch looks good. Only one minor change: given that this is a Xen
parameter that we are introducing and not a parameter defined by UEFI
Forum, I think uefi,cfg-load should be called xen,uefi-cfg-load instead.
Because "xen," is our prefix, while "uefi," is not.

With that minor change:

Reviewed-by: Stefano Stabellini 


Note that the uefi,binary property is different because that property is
for xen,domain nodes, so we are already in a Xen specific namespace when
we are using it. Instead this property is for /chosen which is not a Xen
specific node.
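To make the behaviour under review easier to follow, the decision taken by efi_arch_use_config_file() in the quoted patch below can be reduced to a pure function of three facts. This is a simplified sketch, not the patch itself: should_load_cfg_file() and its parameters are hypothetical, and the case where the /chosen node cannot be found is folded into "property absent" handling only when modules are present and the device tree exists.

```c
#include <stdbool.h>

/*
 * have_fdt:              a device tree was found in the EFI config table
 * have_multiboot_module: some node is compatible with "multiboot,module"
 * have_cfg_load_prop:    /chosen carries the cfg-load property
 */
bool should_load_cfg_file(bool have_fdt, bool have_multiboot_module,
                          bool have_cfg_load_prop)
{
    bool load_cfg_file = true;

    /* Modules already described in the DT: skip the file unless forced. */
    if (have_fdt && have_multiboot_module && !have_cfg_load_prop)
        load_cfg_file = false;

    /* No device tree at all always means the configuration file is used. */
    return !have_fdt || load_cfg_file;
}
```

Whatever name the property ends up with, only the presence of the boolean property matters to this logic, so the rename suggested above would not change the truth table.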



> ---
> v3 changes:
> - add documentation to misc/arm/device-tree/booting.txt
> - Modified variable name and logic from skip_cfg_file to
> load_cfg_file
> - Add in the commit message that I'm modifying a comment.
> v2 changes:
> - Introduced uefi,cfg-load property
> - Add documentation about the property
> ---
>  docs/misc/arm/device-tree/booting.txt |  8 
>  docs/misc/efi.pandoc  |  2 ++
>  xen/arch/arm/efi/efi-boot.h   | 28 ++-
>  3 files changed, 33 insertions(+), 5 deletions(-)
> 
> diff --git a/docs/misc/arm/device-tree/booting.txt 
> b/docs/misc/arm/device-tree/booting.txt
> index 44cd9e1a9a..cf878b478e 100644
> --- a/docs/misc/arm/device-tree/booting.txt
> +++ b/docs/misc/arm/device-tree/booting.txt
> @@ -121,6 +121,14 @@ A Xen-aware bootloader would set xen,xen-bootargs for 
> Xen, xen,dom0-bootargs
>  for Dom0 and bootargs for native Linux.
>  
>  
> +UEFI boot and DT
> +
> +
> +When Xen is booted using UEFI, it doesn't read the configuration file if any
> +multiboot module is specified. To force Xen to load the configuration file, 
> the
> +boolean property uefi,cfg-load must be declared in the /chosen node.
> +
> +
>  Creating Multiple Domains directly from Xen
>  ===
>  
> diff --git a/docs/misc/efi.pandoc b/docs/misc/efi.pandoc
> index ac3cd58cae..e289c5e7ba 100644
> --- a/docs/misc/efi.pandoc
> +++ b/docs/misc/efi.pandoc
> @@ -14,6 +14,8 @@ loaded the modules and describes them in the device tree 
> provided to Xen.  If a
>  bootloader provides a device tree containing modules then any configuration
>  files are ignored, and the bootloader is responsible for populating all
>  relevant device tree nodes.
> +The property "uefi,cfg-load" can be specified in the /chosen node to force 
> Xen
> +to load the configuration file even if multiboot modules are found.
>  
>  Once built, `make install-xen` will place the resulting binary directly into
>  the EFI boot partition, provided `EFI_VENDOR` is set in the environment (and
> diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
> index cf9c37153f..4f1b01757d 100644
> --- a/xen/arch/arm/efi/efi-boot.h
> +++ b/xen/arch/arm/efi/efi-boot.h
> @@ -581,22 +581,40 @@ static void __init 
> efi_arch_load_addr_check(EFI_LOADED_IMAGE *loaded_image)
>  
>  static bool __init efi_arch_use_config_file(EFI_SYSTEM_TABLE *SystemTable)
>  {
> +bool load_cfg_file = true;
>  /*
>   * For arm, we may get a device tree from GRUB (or other bootloader)
>   * that contains modules that have already been loaded into memory.  In
> - * this case, we do not use a configuration file, and rely on the
> - * bootloader to have loaded all required modules and appropriate
> - * options.
> + * this case, we search for the property uefi,cfg-load in the /chosen 
> node
> + * to decide whether to skip the UEFI Xen configuration file or not.
>   */
>  
>  fdt = lookup_fdt_config_table(SystemTable);
>  dtbfile.ptr = fdt;
>  dtbfile.need_to_free = false; /* Config table memory can't be freed. */
> -if ( !fdt || fdt_node_offset_by_compatible(fdt, 0, "multiboot,module") < 
> 0 )
> +
> +if ( fdt_node_offset_by_compatible(fdt, 0, "multiboot,module") > 0 )
> +{
> +/* Locate chosen node */
> +int node = fdt_subnode_offset(fdt, 0, "chosen");
> +const void *cfg_load_prop;
> +int cfg_load_len;
> +
> +if ( node > 0 )
> +{
> +/* Check if uefi,cfg-load property exists */
> +cfg_load_prop = fdt_getprop(fdt, node, "uefi,cfg-load",
> &cfg_load_len);
> +if ( !cfg_load_prop )
> +load_cfg_file = false;
> +}
> +}
> +
> +if ( !fdt || load_cfg_file )
>  {
>  /*
>   

Re: [PATCH v4 6/6] x86: change asm/debugger.h to xen/debugger.h

2021-09-28 Thread Andrew Cooper
On 28/09/2021 21:30, Bobby Eshleman wrote:
> This commit allows non-x86 architecture to omit the file asm/debugger.h
> if they do not require it.  It changes debugger.h to be a general
> xen/debugger.h which, if CONFIG_CRASH_DEBUG, resolves to include
> asm/debugger.h.
>
> It also changes all asm/debugger.h includes to xen/debugger.h.
>
> Because it is no longer required, arm/debugger.h is removed.
>
> Signed-off-by: Bobby Eshleman 

Julien also acked this patch.

> diff --git a/xen/include/xen/debugger.h b/xen/include/xen/debugger.h
> new file mode 100644
> index 00..ddaa4a938b
> --- /dev/null
> +++ b/xen/include/xen/debugger.h
> @@ -0,0 +1,51 @@
> +/**
> + * Generic hooks into arch-dependent Xen.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; If not, see .
> + *
> + * Each debugger should define two functions here:
> + *
> + * 1. debugger_trap_fatal():
> + *  Called when Xen is about to give up and crash. Typically you will use 
> this
> + *  hook to drop into a debug session. It can also be used to hook off
> + *  deliberately caused traps (which you then handle and return non-zero).
> + *
> + * 2. debugger_trap_immediate():
> + *  Called if we want to drop into a debugger now.  This is essentially the
> + *  same as debugger_trap_fatal, except that we use the current register 
> state
> + *  rather than the state which was in effect when we took the trap.
> + *  For example: if we're dying because of an unhandled exception, we call
> + *  debugger_trap_fatal; if we're dying because of a panic() we call
> + *  debugger_trap_immediate().

This comment is now duplicated in x86's asm/debugger.h.  The x86 copy
wants deleting as part of this move.

> + */
> +
> +#ifndef __XEN_DEBUGGER_H__
> +#define __XEN_DEBUGGER_H__
> +
> +#ifdef CONFIG_CRASH_DEBUG
> +
> +#include <asm/debugger.h>
> +
> +#else

#include  because you need bool and false to make this compile.

~Andrew

> +
> +struct cpu_user_regs;
> +
> +static inline bool debugger_trap_fatal(
> +unsigned int vector, const struct cpu_user_regs *regs)
> +{
> +return false;
> +}
> +
> +static inline void debugger_trap_immediate(void)
> +{
> +}
> +
> +#endif /* CONFIG_CRASH_DEBUG */
> +
> +#endif /* __XEN_DEBUGGER_H__ */
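The point about needing bool and false in the stub branch can be seen in a stand-alone sketch of the same header pattern. This is a user-space approximation: <stdbool.h> stands in for whichever Xen header supplies bool, and handle_fatal_trap() is a hypothetical caller that is not part of the series.

```c
#include <stdbool.h>   /* stand-in for the Xen header that provides bool */

struct cpu_user_regs;  /* opaque here; only pointers to it are passed */

#ifdef CONFIG_CRASH_DEBUG
/* A real debugger would supply these; only declared in this sketch. */
bool debugger_trap_fatal(unsigned int vector,
                         const struct cpu_user_regs *regs);
void debugger_trap_immediate(void);
#else
/* Stubs: without the bool definition above, this branch fails to compile. */
static inline bool debugger_trap_fatal(unsigned int vector,
                                       const struct cpu_user_regs *regs)
{
    (void)vector;
    (void)regs;
    return false;
}

static inline void debugger_trap_immediate(void)
{
}
#endif

/* Hypothetical caller: returns 1 if a debugger consumed the trap. */
int handle_fatal_trap(unsigned int vector, const struct cpu_user_regs *regs)
{
    if (debugger_trap_fatal(vector, regs))
        return 1;

    return 0;
}
```

With CONFIG_CRASH_DEBUG undefined, callers compile unchanged and the stub reports that nothing handled the trap.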





Re: [PATCH v4 4/6] x86/gdbsx: expand dbg_rw_mem() inline

2021-09-28 Thread Andrew Cooper
On 28/09/2021 21:30, Bobby Eshleman wrote:
> Because dbg_rw_mem() has only a single call site, this commit
> expands it inline.
>
> Signed-off-by: Bobby Eshleman 

Acked-by: Andrew Cooper 



Re: [PATCH v4 5/6] arch/x86: move domain_pause_for_debugger() to domain.h

2021-09-28 Thread Andrew Cooper
On 28/09/2021 21:30, Bobby Eshleman wrote:
> domain_pause_for_debugger() was previously in debugger.h.  This commit
> moves it to domain.h because its implementation is in domain.c.
>
> Signed-off-by: Bobby Eshleman 

Reviewed-by: Andrew Cooper 



Re: [PATCH v4 3/6] arch/x86: rename debug.c to gdbsx.c

2021-09-28 Thread Andrew Cooper
On 28/09/2021 21:30, Bobby Eshleman wrote:
> diff --git a/xen/include/asm-x86/gdbsx.h b/xen/include/asm-x86/gdbsx.h
> new file mode 100644
> index 00..473229a7fb
> --- /dev/null
> +++ b/xen/include/asm-x86/gdbsx.h
> @@ -0,0 +1,19 @@
> +#ifndef __X86_GDBX_H__
> +#define __X86_GDBX_H__
> +
> +#include 

The errno include wants to move below

However, you need to avoid latent build errors based on the order of
includes.  I'd include public/domctl.h which will get you both domid_t
and struct xen_domctl_gdbsx_memio.

> +
> +#ifdef CONFIG_GDBSX
> +
> +int gdbsx_guest_mem_io(domid_t domid, struct xen_domctl_gdbsx_memio *iop);
> +
> +#else
> +

... specifically here.

~Andrew

> +static inline int gdbsx_guest_mem_io(domid_t domid, struct xen_domctl_gdbsx_memio *iop)
> +{
> +return -EOPNOTSUPP;
> +}
> +
> +#endif
> +
> +#endif





Re: [PATCH v4 1/6] arm/traps: remove debugger_trap_fatal() calls

2021-09-28 Thread Andrew Cooper
On 28/09/2021 21:30, Bobby Eshleman wrote:
> ARM doesn't actually use debugger_trap_* anything, and is stubbed out.
>
> This commit simply removes the unneeded calls.
>
> Signed-off-by: Bobby Eshleman 

Julien already acked this patch on v3.  You should carry the tag on
future revisions.

Acked-by: Julien Grall 




[PATCH] x86/traps: Fix typo in do_entry_CP()

2021-09-28 Thread Andrew Cooper
The call to debugger_trap_entry() should pass the correct vector.  The
break-for-gdbsx logic is in practice unreachable because PV guests can't
generate #CP, but it will interfere with anyone inserting custom debugging
into debugger_trap_entry().

Fixes: 5ad05b9c2490 ("x86/traps: Implement #CP handler and extend #PF for shadow stacks")
Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 
CC: Bobby Eshleman 
---
 xen/arch/x86/traps.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index c2e2603c394b..63676b0a68ff 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -2047,7 +2047,7 @@ void do_entry_CP(struct cpu_user_regs *regs)
 const char *err = "??";
 unsigned int ec = regs->error_code;
 
-if ( debugger_trap_entry(TRAP_debug, regs) )
+if ( debugger_trap_entry(X86_EXC_CP, regs) )
 return;
 
 /* Decode ec if possible */
-- 
2.11.0




Re: [PATCH v4 2/6] x86/debugger: separate Xen and guest debugging debugger_trap_* functions

2021-09-28 Thread Andrew Cooper
On 28/09/2021 21:30, Bobby Eshleman wrote:
> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> index e60af16ddd..772e2a5bfc 100644
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -858,13 +858,20 @@ static void do_trap(struct cpu_user_regs *regs)
>  if ( regs->error_code & X86_XEC_EXT )
>  goto hardware_trap;
>  
> -if ( debugger_trap_entry(trapnr, regs) )
> -return;
> -
>  ASSERT(trapnr < 32);
>  
>  if ( guest_mode(regs) )
>  {
> +struct vcpu *curr = current;
> +if ( (trapnr == TRAP_debug || trapnr == TRAP_int3) &&
> +  guest_kernel_mode(curr, regs) &&
> +  curr->domain->debugger_attached )
> +{
> +if ( trapnr != TRAP_debug )
> +curr->arch.gdbsx_vcpu_event = trapnr;
> +domain_pause_for_debugger();
> +return;
> +}

This is unreachable.  do_trap() isn't used for TRAP_debug or TRAP_int3.

> @@ -2014,9 +2021,6 @@ void do_entry_CP(struct cpu_user_regs *regs)
>  const char *err = "??";
>  unsigned int ec = regs->error_code;
>  
> -if ( debugger_trap_entry(TRAP_debug, regs) )
> -return;
> -
>  /* Decode ec if possible */
>  if ( ec < ARRAY_SIZE(errors) && errors[ec][0] )
>  err = errors[ec];
> @@ -2028,6 +2032,12 @@ void do_entry_CP(struct cpu_user_regs *regs)
>   */
>  if ( guest_mode(regs) )
>  {
> +struct vcpu *curr = current;
> +if ( guest_kernel_mode(curr, regs) && curr->domain->debugger_attached )
> +{
> +domain_pause_for_debugger();
> +return;
> +}

Urgh.  The TRAP_debug above was a copy/paste error.

I'll submit a patch, as it wants backporting for a couple of releases,
after which there should be no additions in do_entry_CP().

Everything else looks good.

~Andrew




[PATCH v4 6/6] x86: change asm/debugger.h to xen/debugger.h

2021-09-28 Thread Bobby Eshleman
This commit allows non-x86 architectures to omit the file asm/debugger.h
if they do not require it.  It changes debugger.h to be a general
xen/debugger.h which, if CONFIG_CRASH_DEBUG is enabled, resolves to include
asm/debugger.h.

It also changes all asm/debugger.h includes to xen/debugger.h.

Because it is no longer required, arm/debugger.h is removed.

Signed-off-by: Bobby Eshleman 
---
Changes in v4:
- Replace #include  with `struct cpu_user_regs`

 xen/arch/x86/traps.c   |  1 +
 xen/common/domain.c|  2 +-
 xen/common/gdbstub.c   |  2 +-
 xen/common/keyhandler.c|  2 +-
 xen/common/shutdown.c  |  2 +-
 xen/drivers/char/console.c |  2 +-
 xen/include/asm-arm/debugger.h | 15 --
 xen/include/asm-x86/debugger.h | 15 --
 xen/include/xen/debugger.h | 51 ++
 9 files changed, 57 insertions(+), 35 deletions(-)
 delete mode 100644 xen/include/asm-arm/debugger.h
 create mode 100644 xen/include/xen/debugger.h

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 742fa9e2ca..36d7fc6238 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 6b71c6d6a9..a87d814b38 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -33,7 +34,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
diff --git a/xen/common/gdbstub.c b/xen/common/gdbstub.c
index 848c1f4327..1d7b98cdac 100644
--- a/xen/common/gdbstub.c
+++ b/xen/common/gdbstub.c
@@ -38,12 +38,12 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
diff --git a/xen/common/keyhandler.c b/xen/common/keyhandler.c
index 8b9f378371..1eafaef9b2 100644
--- a/xen/common/keyhandler.c
+++ b/xen/common/keyhandler.c
@@ -3,6 +3,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -20,7 +21,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 
 static unsigned char keypress_key;
diff --git a/xen/common/shutdown.c b/xen/common/shutdown.c
index abde48aa4c..a933ee001e 100644
--- a/xen/common/shutdown.c
+++ b/xen/common/shutdown.c
@@ -2,13 +2,13 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
 #include 
 #include 
-#include 
 #include 
 
 /* opt_noreboot: If true, machine will need manual reset on error. */
diff --git a/xen/drivers/char/console.c b/xen/drivers/char/console.c
index 7d0a603d03..3d1cdde821 100644
--- a/xen/drivers/char/console.c
+++ b/xen/drivers/char/console.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -26,7 +27,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include  /* for do_console_io */
 #include 
diff --git a/xen/include/asm-arm/debugger.h b/xen/include/asm-arm/debugger.h
deleted file mode 100644
index ac776efa78..00
--- a/xen/include/asm-arm/debugger.h
+++ /dev/null
@@ -1,15 +0,0 @@
-#ifndef __ARM_DEBUGGER_H__
-#define __ARM_DEBUGGER_H__
-
-#define debugger_trap_fatal(v, r) (0)
-#define debugger_trap_immediate() ((void) 0)
-
-#endif /* __ARM_DEBUGGER_H__ */
-/*
- * Local variables:
- * mode: C
- * c-file-style: "BSD"
- * c-basic-offset: 4
- * indent-tabs-mode: nil
- * End:
- */
diff --git a/xen/include/asm-x86/debugger.h b/xen/include/asm-x86/debugger.h
index 8f6222956e..b9eeed395c 100644
--- a/xen/include/asm-x86/debugger.h
+++ b/xen/include/asm-x86/debugger.h
@@ -25,9 +25,6 @@
 #include 
 #include 
 #include 
-
-#ifdef CONFIG_CRASH_DEBUG
-
 #include 
 
 static inline bool debugger_trap_fatal(
@@ -40,16 +37,4 @@ static inline bool debugger_trap_fatal(
 /* Int3 is a trivial way to gather cpu_user_regs context. */
 #define debugger_trap_immediate() __asm__ __volatile__ ( "int3" );
 
-#else
-
-static inline bool debugger_trap_fatal(
-unsigned int vector, struct cpu_user_regs *regs)
-{
-return false;
-}
-
-#define debugger_trap_immediate() ((void)0)
-
-#endif
-
 #endif /* __X86_DEBUGGER_H__ */
diff --git a/xen/include/xen/debugger.h b/xen/include/xen/debugger.h
new file mode 100644
index 00..ddaa4a938b
--- /dev/null
+++ b/xen/include/xen/debugger.h
@@ -0,0 +1,51 @@
+/**
+ * Generic hooks into arch-dependent Xen.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see .
+ *
+ * Each debugger should define two functions here:
+ *
+ * 1. 

[PATCH v4 5/6] arch/x86: move domain_pause_for_debugger() to domain.h

2021-09-28 Thread Bobby Eshleman
domain_pause_for_debugger() was previously in debugger.h.  This commit
moves it to domain.h because its implementation is in domain.c.

Signed-off-by: Bobby Eshleman 
---
Changes in v3:
- domain_pause_for_debugger() is now moved into debugger.h, not a new
  file debugger.c

Changes in v4:
- Don't unnecessarily include 

 xen/arch/x86/nmi.c | 1 -
 xen/arch/x86/traps.c   | 1 -
 xen/include/asm-x86/debugger.h | 2 --
 xen/include/asm-x86/domain.h   | 2 ++
 4 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/nmi.c b/xen/arch/x86/nmi.c
index ab94a96c4d..11d5f5a917 100644
--- a/xen/arch/x86/nmi.c
+++ b/xen/arch/x86/nmi.c
@@ -30,7 +30,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 772e2a5bfc..742fa9e2ca 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -62,7 +62,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
diff --git a/xen/include/asm-x86/debugger.h b/xen/include/asm-x86/debugger.h
index ed4d5c829b..8f6222956e 100644
--- a/xen/include/asm-x86/debugger.h
+++ b/xen/include/asm-x86/debugger.h
@@ -26,8 +26,6 @@
 #include 
 #include 
 
-void domain_pause_for_debugger(void);
-
 #ifdef CONFIG_CRASH_DEBUG
 
 #include 
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 92d54de0b9..de854b5bfa 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -672,6 +672,8 @@ void update_guest_memory_policy(struct vcpu *v,
 
 void domain_cpu_policy_changed(struct domain *d);
 
+void domain_pause_for_debugger(void);
+
 bool update_runstate_area(struct vcpu *);
 bool update_secondary_system_time(struct vcpu *,
   struct vcpu_time_info *);
-- 
2.32.0




[PATCH v4 4/6] x86/gdbsx: expand dbg_rw_mem() inline

2021-09-28 Thread Bobby Eshleman
Because dbg_rw_mem() has only a single call site, this commit
expands it inline.

Signed-off-by: Bobby Eshleman 
---
Changes in v4:
- Add DCO

 xen/arch/x86/gdbsx.c | 30 +-
 1 file changed, 9 insertions(+), 21 deletions(-)

diff --git a/xen/arch/x86/gdbsx.c b/xen/arch/x86/gdbsx.c
index adea0f017b..9c8827c6c4 100644
--- a/xen/arch/x86/gdbsx.c
+++ b/xen/arch/x86/gdbsx.c
@@ -151,33 +151,21 @@ static unsigned int dbg_rw_guest_mem(struct domain *dp, unsigned long addr,
 return len;
 }
 
-/*
- * addr is guest addr
- * buf is debugger buffer.
- * if toaddr, then addr = buf (write to addr), else buf = addr (rd from guest)
- * pgd3: value of init_mm.pgd[3] in guest. see above.
- * Returns: number of bytes remaining to be copied.
- */
-static unsigned int dbg_rw_mem(unsigned long gva, XEN_GUEST_HANDLE_PARAM(void) buf,
-unsigned int len, domid_t domid, bool toaddr,
-uint64_t pgd3)
+int gdbsx_guest_mem_io(domid_t domid, struct xen_domctl_gdbsx_memio *iop)
 {
 struct domain *d = rcu_lock_domain_by_id(domid);
 
-if ( d )
+if ( d && !d->is_dying )
 {
-if ( !d->is_dying )
-len = dbg_rw_guest_mem(d, gva, buf, len, toaddr, pgd3);
-rcu_unlock_domain(d);
+iop->remain = dbg_rw_guest_mem(
+d, iop->gva, guest_handle_from_ptr(iop->uva, void),
+iop->len, iop->gwr, iop->pgd3val);
 }
+else
+iop->remain = iop->len;
 
-return len;
-}
-
-int gdbsx_guest_mem_io(domid_t domid, struct xen_domctl_gdbsx_memio *iop)
-{
-iop->remain = dbg_rw_mem(iop->gva, guest_handle_from_ptr(iop->uva, void),
- iop->len, domid, iop->gwr, iop->pgd3val);
+if ( d )
+rcu_unlock_domain(d);
 
 return iop->remain ? -EFAULT : 0;
 }
-- 
2.32.0




[PATCH v4 3/6] arch/x86: rename debug.c to gdbsx.c

2021-09-28 Thread Bobby Eshleman
This commit renames debug.c to gdbsx.c to clarify its purpose.

The function gdbsx_guest_mem_io() is moved from domctl.c to gdbsx.c.

Although gdbsx_guest_mem_io() is conditionally removed from its single
call site in domctl.c upon !CONFIG_GDBSX and so no stub is technically
necessary, this commit adds a stub that would preserve the functioning
of that call site if the #ifdef CONFIG_GDBSX were to ever be removed or
the function were to ever be called outside of such an ifdef block.

Signed-off-by: Bobby Eshleman 
---
Changes in v4:
- Alphabetize Makefile addition
- Fix broken header guard
- Include errno.h in gdbsx.h

 xen/arch/x86/Makefile |  2 +-
 xen/arch/x86/domctl.c | 12 +---
 xen/arch/x86/{debug.c => gdbsx.c} | 12 ++--
 xen/include/asm-x86/debugger.h|  6 --
 xen/include/asm-x86/gdbsx.h   | 19 +++
 5 files changed, 31 insertions(+), 20 deletions(-)
 rename xen/arch/x86/{debug.c => gdbsx.c} (93%)
 create mode 100644 xen/include/asm-x86/gdbsx.h

diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index fe38cfd544..9fa2ea9aa1 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -20,7 +20,6 @@ obj-y += cpuid.o
 obj-$(CONFIG_PV) += compat.o
 obj-$(CONFIG_PV32) += x86_64/compat.o
 obj-$(CONFIG_KEXEC) += crash.o
-obj-$(CONFIG_GDBSX) += debug.o
 obj-y += delay.o
 obj-y += desc.o
 obj-bin-y += dmi_scan.init.o
@@ -32,6 +31,7 @@ obj-y += emul-i8254.o
 obj-y += extable.o
 obj-y += flushtlb.o
 obj-$(CONFIG_CRASH_DEBUG) += gdbstub.o
+obj-$(CONFIG_GDBSX) += gdbsx.o
 obj-y += hypercall.o
 obj-y += i387.o
 obj-y += i8259.o
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 26a76d2be9..a492fe140e 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -33,20 +34,9 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 
-#ifdef CONFIG_GDBSX
-static int gdbsx_guest_mem_io(domid_t domid, struct xen_domctl_gdbsx_memio *iop)
-{
-iop->remain = dbg_rw_mem(iop->gva, guest_handle_from_ptr(iop->uva, void),
- iop->len, domid, iop->gwr, iop->pgd3val);
-
-return iop->remain ? -EFAULT : 0;
-}
-#endif
-
 static int update_domain_cpu_policy(struct domain *d,
 xen_domctl_cpu_policy_t *xdpc)
 {
diff --git a/xen/arch/x86/debug.c b/xen/arch/x86/gdbsx.c
similarity index 93%
rename from xen/arch/x86/debug.c
rename to xen/arch/x86/gdbsx.c
index d90dc93056..adea0f017b 100644
--- a/xen/arch/x86/debug.c
+++ b/xen/arch/x86/gdbsx.c
@@ -19,7 +19,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 typedef unsigned long dbgva_t;
@@ -158,7 +158,7 @@ static unsigned int dbg_rw_guest_mem(struct domain *dp, unsigned long addr,
  * pgd3: value of init_mm.pgd[3] in guest. see above.
  * Returns: number of bytes remaining to be copied.
  */
-unsigned int dbg_rw_mem(unsigned long gva, XEN_GUEST_HANDLE_PARAM(void) buf,
+static unsigned int dbg_rw_mem(unsigned long gva, XEN_GUEST_HANDLE_PARAM(void) buf,
 unsigned int len, domid_t domid, bool toaddr,
 uint64_t pgd3)
 {
@@ -174,6 +174,14 @@ unsigned int dbg_rw_mem(unsigned long gva, XEN_GUEST_HANDLE_PARAM(void) buf,
 return len;
 }
 
+int gdbsx_guest_mem_io(domid_t domid, struct xen_domctl_gdbsx_memio *iop)
+{
+iop->remain = dbg_rw_mem(iop->gva, guest_handle_from_ptr(iop->uva, void),
+ iop->len, domid, iop->gwr, iop->pgd3val);
+
+return iop->remain ? -EFAULT : 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-x86/debugger.h b/xen/include/asm-x86/debugger.h
index cd6b9477f7..ed4d5c829b 100644
--- a/xen/include/asm-x86/debugger.h
+++ b/xen/include/asm-x86/debugger.h
@@ -54,10 +54,4 @@ static inline bool debugger_trap_fatal(
 
 #endif
 
-#ifdef CONFIG_GDBSX
-unsigned int dbg_rw_mem(unsigned long gva, XEN_GUEST_HANDLE_PARAM(void) buf,
-unsigned int len, domid_t domid, bool toaddr,
-uint64_t pgd3);
-#endif
-
 #endif /* __X86_DEBUGGER_H__ */
diff --git a/xen/include/asm-x86/gdbsx.h b/xen/include/asm-x86/gdbsx.h
new file mode 100644
index 00..473229a7fb
--- /dev/null
+++ b/xen/include/asm-x86/gdbsx.h
@@ -0,0 +1,19 @@
+#ifndef __X86_GDBX_H__
+#define __X86_GDBX_H__
+
+#include 
+
+#ifdef CONFIG_GDBSX
+
+int gdbsx_guest_mem_io(domid_t domid, struct xen_domctl_gdbsx_memio *iop);
+
+#else
+
+static inline int gdbsx_guest_mem_io(domid_t domid, struct xen_domctl_gdbsx_memio *iop)
+{
+return -EOPNOTSUPP;
+}
+
+#endif
+
+#endif
-- 
2.32.0




[PATCH v4 2/6] x86/debugger: separate Xen and guest debugging debugger_trap_* functions

2021-09-28 Thread Bobby Eshleman
The functions debugger_trap_fatal(), debugger_trap_immediate(), and
debugger_trap_entry() are generic hook functions for debugger support.
In practice, debugger_trap_fatal() and debugger_trap_immediate() are
only used in the debugging of Xen itself and debugger_trap_entry() is
only used in the debugging of guests. That is, debugger_trap_entry() is
part of gdbsx functionality and not the Xen gdbstub. This is evidenced by
debugger_trap_entry()'s usage of domain_pause_for_debugger(). Because of
this, debugger_trap_entry() may not belong alongside the Xen debug
functions.

This commit fixes this by expanding debugger_trap_entry() inline into
its usage sites in x86/traps.c and stubbing out
domain_pause_for_debugger() when !CONFIG_GDBSX. This places what
was debugger_trap_entry() under the scope of gdbsx instead of gdbstub.

The function calls that caused an effective no-op and early exit out of
debugger_trap_entry() are removed completely (when the trapnr is not
int3/debug).

This commit is one of a series geared towards removing the unnecessary
requirement that all architectures implement .

Signed-off-by: Bobby Eshleman 
---
Changes in v4:
- Reword commit message for accuracy (make weaker claims)
- Fix "if { return } else if { return }" anti-pattern

 xen/arch/x86/domain.c  |  2 +-
 xen/arch/x86/traps.c   | 50 --
 xen/include/asm-x86/debugger.h | 42 ++--
 3 files changed, 33 insertions(+), 61 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index ef1812dc14..70894ff826 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -2541,7 +2541,7 @@ __initcall(init_vcpu_kick_softirq);
 
 void domain_pause_for_debugger(void)
 {
-#ifdef CONFIG_CRASH_DEBUG
+#ifdef CONFIG_GDBSX
 struct vcpu *curr = current;
 struct domain *d = curr->domain;
 
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index e60af16ddd..772e2a5bfc 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -858,13 +858,20 @@ static void do_trap(struct cpu_user_regs *regs)
 if ( regs->error_code & X86_XEC_EXT )
 goto hardware_trap;
 
-if ( debugger_trap_entry(trapnr, regs) )
-return;
-
 ASSERT(trapnr < 32);
 
 if ( guest_mode(regs) )
 {
+struct vcpu *curr = current;
+if ( (trapnr == TRAP_debug || trapnr == TRAP_int3) &&
+  guest_kernel_mode(curr, regs) &&
+  curr->domain->debugger_attached )
+{
+if ( trapnr != TRAP_debug )
+curr->arch.gdbsx_vcpu_event = trapnr;
+domain_pause_for_debugger();
+return;
+}
 pv_inject_hw_exception(trapnr,
(TRAP_HAVE_EC & (1u << trapnr))
? regs->error_code : X86_EVENT_NO_EC);
@@ -1094,9 +1101,6 @@ void do_invalid_op(struct cpu_user_regs *regs)
 int id = -1, lineno;
 const struct virtual_region *region;
 
-if ( debugger_trap_entry(TRAP_invalid_op, regs) )
-return;
-
 if ( likely(guest_mode(regs)) )
 {
 if ( pv_emulate_invalid_op(regs) )
@@ -1201,8 +1205,7 @@ void do_invalid_op(struct cpu_user_regs *regs)
 
 void do_int3(struct cpu_user_regs *regs)
 {
-if ( debugger_trap_entry(TRAP_int3, regs) )
-return;
+struct vcpu *curr = current;
 
 if ( !guest_mode(regs) )
 {
@@ -1216,6 +1219,13 @@ void do_int3(struct cpu_user_regs *regs)
 return;
 }
 
+if ( guest_kernel_mode(curr, regs) && curr->domain->debugger_attached )
+{
+curr->arch.gdbsx_vcpu_event = TRAP_int3;
+domain_pause_for_debugger();
+return;
+}
+
 pv_inject_hw_exception(TRAP_int3, X86_EVENT_NO_EC);
 }
 
@@ -1492,9 +1502,6 @@ void do_page_fault(struct cpu_user_regs *regs)
 /* fixup_page_fault() might change regs->error_code, so cache it here. */
 error_code = regs->error_code;
 
-if ( debugger_trap_entry(TRAP_page_fault, regs) )
-return;
-
 perfc_incr(page_faults);
 
 /* Any shadow stack access fault is a bug in Xen. */
@@ -1593,9 +1600,6 @@ void do_general_protection(struct cpu_user_regs *regs)
 struct vcpu *v = current;
 #endif
 
-if ( debugger_trap_entry(TRAP_gp_fault, regs) )
-return;
-
 if ( regs->error_code & X86_XEC_EXT )
 goto hardware_gp;
 
@@ -1888,9 +1892,6 @@ void do_debug(struct cpu_user_regs *regs)
 /* Stash dr6 as early as possible. */
 dr6 = read_debugreg(6);
 
-if ( debugger_trap_entry(TRAP_debug, regs) )
-return;
-
 /*
  * At the time of writing (March 2018), on the subject of %dr6:
  *
@@ -1995,6 +1996,12 @@ void do_debug(struct cpu_user_regs *regs)
 return;
 }
 
+if ( guest_kernel_mode(v, regs) && v->domain->debugger_attached )
+{
+domain_pause_for_debugger();
+return;
+}
+
 /* Save debug status register where guest OS can peek at it */
 v->arch.dr6 |= (dr6 & 

[PATCH v4 1/6] arm/traps: remove debugger_trap_fatal() calls

2021-09-28 Thread Bobby Eshleman
ARM doesn't actually use debugger_trap_* anything, and is stubbed out.

This commit simply removes the unneeded calls.

Signed-off-by: Bobby Eshleman 
---
 xen/arch/arm/traps.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 4ccb6e7d18..889650ba63 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -41,7 +41,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -1266,10 +1265,6 @@ int do_bug_frame(const struct cpu_user_regs *regs, vaddr_t pc)
 
 case BUGFRAME_bug:
 printk("Xen BUG at %s%s:%d\n", prefix, filename, lineno);
-
-if ( debugger_trap_fatal(TRAP_invalid_op, regs) )
-return 0;
-
 show_execution_state(regs);
 panic("Xen BUG at %s%s:%d\n", prefix, filename, lineno);
 
@@ -1281,8 +1276,6 @@ int do_bug_frame(const struct cpu_user_regs *regs, vaddr_t pc)
 
 printk("Assertion '%s' failed at %s%s:%d\n",
predicate, prefix, filename, lineno);
-if ( debugger_trap_fatal(TRAP_invalid_op, regs) )
-return 0;
 show_execution_state(regs);
 panic("Assertion '%s' failed at %s%s:%d\n",
   predicate, prefix, filename, lineno);
-- 
2.32.0




[PATCH v4 0/6] Remove unconditional arch dependency on asm/debugger.h

2021-09-28 Thread Bobby Eshleman
This series removes the unconditional requirement that all architectures
implement asm/debugger.h. It additionally removes arm's debugger.h and
disentangles some of the x86 gdbsx/gdbstub/generic debugger code.

Additionally, this series does the following:
- Provides generic stubs when !CONFIG_CRASH_DEBUG
- Adds stronger separation between gdbstub, gdbsx, and generic debugger
  code.
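
As a rough illustration of the "generic stubs when !CONFIG_CRASH_DEBUG" item above, a header following this pattern could look like the sketch below. It is only a sketch under stated assumptions: <stdbool.h> stands in for whichever Xen header supplies bool and false, and the two prototypes in the CONFIG_CRASH_DEBUG branch stand in for including the arch-specific header. Only the two stub signatures are taken from the patch itself.

```c
/* Sketch of a generic debugger header with config-dependent stubs. */
#ifndef SKETCH_DEBUGGER_H
#define SKETCH_DEBUGGER_H

/* Assumption: stands in for the Xen header that provides bool/false. */
#include <stdbool.h>

/*
 * A forward declaration is enough here because the hooks only take a
 * pointer to the structure and never look inside it.
 */
struct cpu_user_regs;

#ifdef CONFIG_CRASH_DEBUG

/* Assumption: in the real header this branch includes the arch header. */
bool debugger_trap_fatal(unsigned int vector,
                         const struct cpu_user_regs *regs);
void debugger_trap_immediate(void);

#else

/* No debugger built in: report "not handled" so the caller carries on. */
static inline bool debugger_trap_fatal(
    unsigned int vector, const struct cpu_user_regs *regs)
{
    (void)vector;
    (void)regs;
    return false;
}

/* No debugger built in: dropping into it is a no-op. */
static inline void debugger_trap_immediate(void)
{
}

#endif /* CONFIG_CRASH_DEBUG */

#endif /* SKETCH_DEBUGGER_H */
```

With this shape, common code can call both hooks unconditionally, and an architecture without a debugger needs no header of its own.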

v4 simply includes the review feedback from v3 with no other big changes
(as was the case for v3 in comparison to v2).

Bobby Eshleman (6):
  arm/traps: remove debugger_trap_fatal() calls
  x86/debugger: separate Xen and guest debugging debugger_trap_*
functions
  arch/x86: rename debug.c to gdbsx.c
  x86/gdbsx: expand dbg_rw_mem() inline
  arch/x86: move domain_pause_for_debugger() to domain.h
  x86: change asm/debugger.h to xen/debugger.h

 xen/arch/arm/traps.c  |  7 
 xen/arch/x86/Makefile |  2 +-
 xen/arch/x86/domain.c |  2 +-
 xen/arch/x86/domctl.c | 12 +-
 xen/arch/x86/{debug.c => gdbsx.c} | 28 ++---
 xen/arch/x86/nmi.c|  1 -
 xen/arch/x86/traps.c  | 52 +++--
 xen/common/domain.c   |  2 +-
 xen/common/gdbstub.c  |  2 +-
 xen/common/keyhandler.c   |  2 +-
 xen/common/shutdown.c |  2 +-
 xen/drivers/char/console.c|  2 +-
 xen/include/asm-arm/debugger.h| 15 ---
 xen/include/asm-x86/debugger.h| 65 +--
 xen/include/asm-x86/domain.h  |  2 +
 xen/include/asm-x86/gdbsx.h   | 19 +
 xen/include/xen/debugger.h| 51 
 17 files changed, 125 insertions(+), 141 deletions(-)
 rename xen/arch/x86/{debug.c => gdbsx.c} (89%)
 delete mode 100644 xen/include/asm-arm/debugger.h
 create mode 100644 xen/include/asm-x86/gdbsx.h
 create mode 100644 xen/include/xen/debugger.h

-- 
2.32.0




Re: [Stratos-dev] Xen Rust VirtIO demos work breakdown for Project Stratos

2021-09-28 Thread Oleksandr Tyshchenko
On Tue, Sep 28, 2021 at 9:26 AM Stefano Stabellini 
wrote:

Hi Stefano, all

[Sorry for the possible format issues]


On Mon, 27 Sep 2021, Christopher Clark wrote:
> > On Mon, Sep 27, 2021 at 3:06 AM Alex Bennée via Stratos-dev <stratos-...@op-lists.linaro.org> wrote:
> >
> >   Marek Marczykowski-Górecki writes:
> >
> >   > [[PGP Signed Part:Undecided]]
> >   > On Fri, Sep 24, 2021 at 05:02:46PM +0100, Alex Bennée wrote:
> >   >> Hi,
> >   >
> >   > Hi,
> >   >
> >   >> 2.1 Stable ABI for foreignmemory mapping to non-dom0 ([STR-57])
> >   >> ───
> >   >>
> >   >>   Currently the foreign memory mapping support only works for dom0 due
> >   >>   to reference counting issues. If we are to support backends running in
> >   >>   their own domains this will need to get fixed.
> >   >>
> >   >>   Estimate: 8w
> >   >>
> >   >>
> >   >> [STR-57] 
> >   >
> >   > I'm pretty sure it was discussed before, but I can't find relevant
> >   > (part of) thread right now: does your model assumes the backend (running
> >   > outside of dom0) will gain ability to map (or access in other way)
> >   > _arbitrary_ memory page of a frontend domain? Or worse: any domain?
> >
> >   The aim is for some DomU's to host backends for other DomU's instead of
> >   all backends being in Dom0. Those backend DomU's would have to be
> >   considered trusted because as you say the default memory model of VirtIO
> >   is to have full access to the frontend domains memory map.
> >
> >
> > I share Marek's concern. I believe that there are Xen-based systems that
> > will want to run guests using VirtIO devices without extending
> > this level of trust to the backend domains.
>
> From a safety perspective, it would be challenging to deploy a system
> with privileged backends. From a safety perspective, it would be a lot
> easier if the backend were unprivileged.
>
> This is one of those times where safety and security requirements are
> actually aligned.


Well, foreign memory mapping has one advantage in the context of the Virtio
use case: the Virtio infrastructure in the Guest doesn't require any
modifications to run on top of Xen.
The only issue with foreign memory here is that Guest memory is actually
mapped without the Guest's agreement, which doesn't perfectly fit into the
security model (although there is one more issue, with XSA-300, but I think
it will go away sooner or later; at least there are some attempts to
eliminate it).
While the ability to map any part of Guest memory is not an issue for a
backend running in Dom0 (which we usually trust), it will certainly violate
the Xen security model if we want to run the backend in another domain, so I
completely agree with the existing concern.

It was discussed before [1], but I couldn't find any decisions regarding
that. As I understand it, one of the possible ideas is to have some entity
in Xen (PV IOMMU/virtio-iommu/whatever) that works in protection mode: it
denies all foreign mapping requests from a backend running in DomU by
default and only allows requests for mappings which were *implicitly*
granted by the Guest before.
For example, Xen could be informed which MMIOs hold the queue PFN and
notify registers (as it traps the accesses to these registers anyway) and
could theoretically parse the frontend request and retrieve the descriptors
to decide which GFNs are actually *allowed*.
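
To make the deny-by-default idea above a bit more concrete, here is a minimal, purely hypothetical sketch in C. None of these names (struct mapping_policy, policy_grant(), policy_allows()) exist in Xen; they only illustrate a per-frontend allow list of GFN ranges that a mapping request from an unprivileged backend would be checked against. How the ranges get populated (for example by parsing virtqueue descriptors) is left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limit on tracked ranges, chosen only for the sketch. */
#define MAX_GRANTED_RANGES 8

/* A contiguous run of guest frame numbers the frontend has exposed. */
struct gfn_range {
    uint64_t start;
    uint64_t count;
};

/* Per-frontend allow list; an empty list means every request is denied. */
struct mapping_policy {
    struct gfn_range granted[MAX_GRANTED_RANGES];
    size_t nr;
};

/* Record a range as allowed. Returns false if the table is full. */
static bool policy_grant(struct mapping_policy *p,
                         uint64_t start, uint64_t count)
{
    if ( p->nr >= MAX_GRANTED_RANGES || count == 0 )
        return false;

    p->granted[p->nr].start = start;
    p->granted[p->nr].count = count;
    p->nr++;

    return true;
}

/* Deny by default: a GFN is mappable only if some granted range covers it. */
static bool policy_allows(const struct mapping_policy *p, uint64_t gfn)
{
    for ( size_t i = 0; i < p->nr; i++ )
    {
        const struct gfn_range *r = &p->granted[i];

        if ( gfn >= r->start && gfn - r->start < r->count )
            return true;
    }

    return false;
}
```

A fresh policy rejects every GFN; after policy_grant(&p, 0x1000, 4) the frames 0x1000 to 0x1003 are accepted and 0x1004 is still rejected.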

I can't say for sure (sorry, I'm not familiar enough with the topic), but by
implementing the virtio-iommu device in Xen we could probably avoid Guest
modifications altogether. Of course, for this to work the Virtio
infrastructure in the Guest should use the DMA API, as mentioned in [1].

Would the “restricted foreign mapping” solution retain the Xen security
model and be accepted by the Xen community? I wonder, has someone already
looked in this direction? Are there any pitfalls here, and is this even
feasible?

[1]
https://lore.kernel.org/xen-devel/464e91ec-2b53-2338-43c7-a018087fc...@arm.com/



-- 
Regards,

Oleksandr Tyshchenko


[xen-unstable-smoke test] 165243: regressions - FAIL

2021-09-28 Thread osstest service owner
flight 165243 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/165243/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf                   6 xen-build                fail REGR. vs. 165233

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl           1 build-check(1)               blocked  n/a
 test-amd64-amd64-libvirt     15 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm      15 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm      16 saverestore-support-check    fail   never pass

version targeted for testing:
 xen  9c3b9800e2019c93ab22da69e4a0b22d6fb059ec
baseline version:
 xen  890ceb9453171c85e881103e65dbb5cdcf81659e

Last test of basis   165233  2021-09-28 12:00:26 Z    0 days
Testing same since   165243  2021-09-28 17:00:26 Z    0 days    1 attempts


People who touched revisions under test:
  Igor Druzhinin 
  Jan Beulich 

jobs:
 build-arm64-xsm                                              pass
 build-amd64                                                  pass
 build-armhf                                                  fail
 build-amd64-libvirt                                          pass
 test-armhf-armhf-xl                                          blocked
 test-arm64-arm64-xl-xsm                                      pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64                    pass
 test-amd64-amd64-libvirt                                     pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit 9c3b9800e2019c93ab22da69e4a0b22d6fb059ec
Author: Igor Druzhinin 
Date:   Tue Sep 28 16:04:50 2021 +0200

pci: fix handling of PCI bridges with subordinate bus number 0xff

Bus number 0xff is valid according to the PCI spec. Using u8 typed sub_bus
and assigning 0xff to it will result in the following loop getting stuck.

for ( ; sec_bus <= sub_bus; sec_bus++ ) {...}

Just change its type to unsigned int similarly to what is already done in
dmar_scope_add_buses().

Signed-off-by: Igor Druzhinin 
Reviewed-by: Jan Beulich 
Reviewed-by: Bertrand Marquis 
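
The wrap-around described in this commit message can be reproduced outside Xen with a small sketch. The helper names and the iteration cap below are assumptions added purely so the demonstration terminates; only the loop shape `for ( ; sec_bus <= sub_bus; sec_bus++ )` comes from the commit message.

```c
#include <stdint.h>

/*
 * Walk buses sec_bus..sub_bus with 8-bit variables. When sub_bus is 0xff
 * the condition "sec_bus <= sub_bus" can never become false, because
 * sec_bus wraps from 0xff back to 0. The cap exists only so this sketch
 * returns instead of spinning forever.
 */
static unsigned int scan_u8(uint8_t sec_bus, uint8_t sub_bus,
                            unsigned int cap)
{
    unsigned int iterations = 0;

    for ( ; sec_bus <= sub_bus; sec_bus++ )
    {
        if ( ++iterations >= cap )
            break;
    }

    return iterations;
}

/*
 * Same loop with unsigned int variables: sec_bus can reach 0x100, the
 * comparison fails, and the loop ends after visiting each bus once.
 */
static unsigned int scan_uint(unsigned int sec_bus, unsigned int sub_bus,
                              unsigned int cap)
{
    unsigned int iterations = 0;

    for ( ; sec_bus <= sub_bus; sec_bus++ )
    {
        if ( ++iterations >= cap )
            break;
    }

    return iterations;
}
```

Calling scan_uint(0xfe, 0xff, 1000) visits two buses and stops, while scan_u8(0xfe, 0xff, 1000) only stops because it hits the cap of 1000.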

commit 1578322ac6bc4d66800a5a3caf6685f556b64054
Author: Jan Beulich 
Date:   Tue Sep 28 16:03:38 2021 +0200

x86/PVH: actually show Dom0's register state from debug key '0'

vcpu_show_registers() didn't do anything for HVM so far. Note though
that some extra hackery is needed for VMX - see the code comment.

Note further that the show_guest_stack() invocation is left alone here:
While strictly speaking guest_kernel_mode() should be predicated by a
PV / !HVM check, show_guest_stack() itself will bail immediately for
HVM.

While there and despite not being PVH-specific, take the opportunity and
filter offline vCPU-s: There's not really any register state associated
with them, so avoid spamming the log with useless information while
still leaving an indication of the fact.

Signed-off-by: Jan Beulich 
Reviewed-by: Roger Pau Monné 
(qemu changes not included)



[linux-linus test] 165231: regressions - FAIL

2021-09-28 Thread osstest service owner
flight 165231 linux-linus real [real]
flight 165244 linux-linus real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/165231/
http://logs.test-lab.xenproject.org/osstest/logs/165244/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl-credit1 18 guest-start/debian.repeat fail REGR. vs. 152332

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-rtds 20 guest-localmigrate/x10 fail in 165225 pass in 165231
 test-armhf-armhf-xl-credit1  14 guest-start  fail in 165225 pass in 165231
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow 20 guest-start/debianhvm.repeat fail pass in 165225

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail baseline untested
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stop fail like 152332
 test-armhf-armhf-libvirt 16 saverestore-support-check fail like 152332
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop fail like 152332
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stop fail like 152332
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop fail like 152332
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 152332
 test-armhf-armhf-libvirt-raw 15 saverestore-support-check fail like 152332
 test-amd64-amd64-libvirt 15 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-check fail never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qcow2 14 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-check fail never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-check fail never pass
 test-arm64-arm64-xl  15 migrate-support-check fail never pass
 test-arm64-arm64-xl  16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-raw 14 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-check fail never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-check fail never pass
 test-armhf-armhf-xl  15 migrate-support-check fail never pass
 test-armhf-armhf-xl  16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-check fail never pass

version targeted for testing:
 linux                0513e464f9007b70b96740271a948ca5ab6e7dd7
baseline version:
 linux                deacdb3e3979979016fcd0ffd518c320a62ad166

Last test of basis   152332  2020-07-31 19:41:23 

[PATCH v1 7/8] virtio-mem: factor out hotplug specifics from virtio_mem_remove() into virtio_mem_deinit_hotplug()

2021-09-28 Thread David Hildenbrand
Let's prepare for a new virtio-mem kdump mode in which we don't actually
hot(un)plug any memory but only observe the state of device blocks.

Signed-off-by: David Hildenbrand 
---
 drivers/virtio/virtio_mem.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index 1be3ee7f684d..76d8aef3cfd2 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -2667,9 +2667,8 @@ static int virtio_mem_probe(struct virtio_device *vdev)
return rc;
 }
 
-static void virtio_mem_remove(struct virtio_device *vdev)
+static void virtio_mem_deinit_hotplug(struct virtio_mem *vm)
 {
-   struct virtio_mem *vm = vdev->priv;
unsigned long mb_id;
int rc;
 
@@ -2716,7 +2715,8 @@ static void virtio_mem_remove(struct virtio_device *vdev)
 * away. Warn at least.
 */
if (virtio_mem_has_memory_added(vm)) {
-   dev_warn(&vdev->dev, "device still has system memory added\n");
+   dev_warn(&vm->vdev->dev,
+"device still has system memory added\n");
} else {
virtio_mem_delete_resource(vm);
kfree_const(vm->resource_name);
@@ -2730,6 +2730,13 @@ static void virtio_mem_remove(struct virtio_device *vdev)
} else {
vfree(vm->bbm.bb_states);
}
+}
+
+static void virtio_mem_remove(struct virtio_device *vdev)
+{
+   struct virtio_mem *vm = vdev->priv;
+
+   virtio_mem_deinit_hotplug(vm);
 
/* reset the device and cleanup the queues */
vdev->config->reset(vdev);
-- 
2.31.1




[PATCH v1 5/8] virtio-mem: factor out hotplug specifics from virtio_mem_init() into virtio_mem_init_hotplug()

2021-09-28 Thread David Hildenbrand
Let's prepare for a new virtio-mem kdump mode in which we don't actually
hot(un)plug any memory but only observe the state of device blocks.

Signed-off-by: David Hildenbrand 
---
 drivers/virtio/virtio_mem.c | 81 -
 1 file changed, 44 insertions(+), 37 deletions(-)

diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index bef8ad6bf466..2ba7e8d6ba8d 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -2392,41 +2392,10 @@ static int virtio_mem_init_vq(struct virtio_mem *vm)
return 0;
 }
 
-static int virtio_mem_init(struct virtio_mem *vm)
+static int virtio_mem_init_hotplug(struct virtio_mem *vm)
 {
const struct range pluggable_range = mhp_get_pluggable_range(true);
uint64_t sb_size, addr;
-   uint16_t node_id;
-
-   if (!vm->vdev->config->get) {
-   dev_err(&vm->vdev->dev, "config access disabled\n");
-   return -EINVAL;
-   }
-
-   /*
-* We don't want to (un)plug or reuse any memory when in kdump. The
-* memory is still accessible (but not mapped).
-*/
-   if (is_kdump_kernel()) {
-   dev_warn(&vm->vdev->dev, "disabled in kdump kernel\n");
-   return -EBUSY;
-   }
-
-   /* Fetch all properties that can't change. */
-   virtio_cread_le(vm->vdev, struct virtio_mem_config, plugged_size,
-   &vm->plugged_size);
-   virtio_cread_le(vm->vdev, struct virtio_mem_config, block_size,
-   &vm->device_block_size);
-   virtio_cread_le(vm->vdev, struct virtio_mem_config, node_id,
-   &node_id);
-   vm->nid = virtio_mem_translate_node_id(vm, node_id);
-   virtio_cread_le(vm->vdev, struct virtio_mem_config, addr, &vm->addr);
-   virtio_cread_le(vm->vdev, struct virtio_mem_config, region_size,
-   &vm->region_size);
-
-   /* Determine the nid for the device based on the lowest address. */
-   if (vm->nid == NUMA_NO_NODE)
-   vm->nid = memory_add_physaddr_to_nid(vm->addr);
 
/* bad device setup - warn only */
if (!IS_ALIGNED(vm->addr, memory_block_size_bytes()))
@@ -2496,10 +2465,6 @@ static int virtio_mem_init(struct virtio_mem *vm)
  vm->offline_threshold);
}
 
-   dev_info(&vm->vdev->dev, "start address: 0x%llx", vm->addr);
-   dev_info(&vm->vdev->dev, "region size: 0x%llx", vm->region_size);
-   dev_info(&vm->vdev->dev, "device block size: 0x%llx",
-(unsigned long long)vm->device_block_size);
dev_info(&vm->vdev->dev, "memory block size: 0x%lx",
 memory_block_size_bytes());
if (vm->in_sbm)
@@ -2508,10 +2473,52 @@ static int virtio_mem_init(struct virtio_mem *vm)
else
dev_info(&vm->vdev->dev, "big block size: 0x%llx",
 (unsigned long long)vm->bbm.bb_size);
+
+   return 0;
+}
+
+static int virtio_mem_init(struct virtio_mem *vm)
+{
+   uint16_t node_id;
+
+   if (!vm->vdev->config->get) {
+   dev_err(&vm->vdev->dev, "config access disabled\n");
+   return -EINVAL;
+   }
+
+   /*
+* We don't want to (un)plug or reuse any memory when in kdump. The
+* memory is still accessible (but not mapped).
+*/
+   if (is_kdump_kernel()) {
+   dev_warn(&vm->vdev->dev, "disabled in kdump kernel\n");
+   return -EBUSY;
+   }
+
+   /* Fetch all properties that can't change. */
+   virtio_cread_le(vm->vdev, struct virtio_mem_config, plugged_size,
+   &vm->plugged_size);
+   virtio_cread_le(vm->vdev, struct virtio_mem_config, block_size,
+   &vm->device_block_size);
+   virtio_cread_le(vm->vdev, struct virtio_mem_config, node_id,
+   &node_id);
+   vm->nid = virtio_mem_translate_node_id(vm, node_id);
+   virtio_cread_le(vm->vdev, struct virtio_mem_config, addr, &vm->addr);
+   virtio_cread_le(vm->vdev, struct virtio_mem_config, region_size,
+   &vm->region_size);
+
+   /* Determine the nid for the device based on the lowest address. */
+   if (vm->nid == NUMA_NO_NODE)
+   vm->nid = memory_add_physaddr_to_nid(vm->addr);
+
+   dev_info(&vm->vdev->dev, "start address: 0x%llx", vm->addr);
+   dev_info(&vm->vdev->dev, "region size: 0x%llx", vm->region_size);
+   dev_info(&vm->vdev->dev, "device block size: 0x%llx",
+(unsigned long long)vm->device_block_size);
if (vm->nid != NUMA_NO_NODE && IS_ENABLED(CONFIG_NUMA))
dev_info(&vm->vdev->dev, "nid: %d", vm->nid);
 
-   return 0;
+   return virtio_mem_init_hotplug(vm);
 }
 
 static int virtio_mem_create_resource(struct virtio_mem *vm)
-- 
2.31.1




[PATCH v1 4/8] proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks

2021-09-28 Thread David Hildenbrand
Let's support multiple registered callbacks, making sure that
registering vmcore callbacks cannot fail. Make the callback return a
bool instead of an int, handling how to deal with errors internally.
Drop unused HAVE_OLDMEM_PFN_IS_RAM.

We soon want to make use of this infrastructure from other drivers:
virtio-mem, registering one callback for each virtio-mem device, to
prevent reading unplugged virtio-mem memory.

Handle it via a generic vmcore_cb structure, prepared for future
extensions: for example, once we support virtio-mem on s390x where the
vmcore is completely constructed in the second kernel, we want to detect
and add plugged virtio-mem memory ranges to the vmcore in order for them
to get dumped properly.

Handle corner cases that are unexpected and shouldn't happen in sane
setups: registering a callback after the vmcore has already been opened
(warn only) and unregistering a callback after the vmcore has already been
opened (warn and essentially read only zeroes from that point on).
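
The idea of several callbacks being consulted for one PFN can be sketched in plain userspace C. This is only an illustration under stated assumptions: the names `cb_sketch`, `register_cb_sketch` and `vmcore_pfn_is_ram_sketch` are hypothetical, locking is left out entirely, and the rule that the first callback reporting "not RAM" decides the result is an assumption, since the corresponding hunk is cut off in this archive.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the vmcore_cb idea: one callback plus a list link. */
struct cb_sketch {
    bool (*pfn_is_ram)(struct cb_sketch *cb, unsigned long pfn);
    struct cb_sketch *next;
};

/* Head of the (hypothetical) list of registered callbacks. */
static struct cb_sketch *cb_list;

/* Registration only links the entry in, so it has no failure path. */
static void register_cb_sketch(struct cb_sketch *cb)
{
    cb->next = cb_list;
    cb_list = cb;
}

/* A pfn counts as RAM unless some registered callback says otherwise. */
static bool vmcore_pfn_is_ram_sketch(unsigned long pfn)
{
    struct cb_sketch *cb;

    for (cb = cb_list; cb != NULL; cb = cb->next) {
        if (cb->pfn_is_ram != NULL && !cb->pfn_is_ram(cb, pfn))
            return false;
    }
    return true;
}

/* Two hypothetical drivers, each excluding a single pfn. */
static bool hole_at_5(struct cb_sketch *cb, unsigned long pfn)
{
    (void)cb;
    return pfn != 5;
}

static bool hole_at_9(struct cb_sketch *cb, unsigned long pfn)
{
    (void)cb;
    return pfn != 9;
}
```

Because registration cannot fail, callers no longer need a WARN_ON() around it, which matches the call-site changes in the diff below.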

Signed-off-by: David Hildenbrand 
---
 arch/x86/kernel/aperture_64.c | 13 -
 arch/x86/xen/mmu_hvm.c| 15 +++---
 fs/proc/vmcore.c  | 99 ---
 include/linux/crash_dump.h| 26 +++--
 4 files changed, 113 insertions(+), 40 deletions(-)

diff --git a/arch/x86/kernel/aperture_64.c b/arch/x86/kernel/aperture_64.c
index 10562885f5fc..af3ba08b684b 100644
--- a/arch/x86/kernel/aperture_64.c
+++ b/arch/x86/kernel/aperture_64.c
@@ -73,12 +73,23 @@ static int gart_mem_pfn_is_ram(unsigned long pfn)
  (pfn >= aperture_pfn_start + aperture_page_count));
 }
 
+#ifdef CONFIG_PROC_VMCORE
+static bool gart_oldmem_pfn_is_ram(struct vmcore_cb *cb, unsigned long pfn)
+{
+   return !!gart_mem_pfn_is_ram(pfn);
+}
+
+static struct vmcore_cb gart_vmcore_cb = {
+   .pfn_is_ram = gart_oldmem_pfn_is_ram,
+};
+#endif
+
 static void __init exclude_from_core(u64 aper_base, u32 aper_order)
 {
aperture_pfn_start = aper_base >> PAGE_SHIFT;
aperture_page_count = (32 * 1024 * 1024) << aper_order >> PAGE_SHIFT;
 #ifdef CONFIG_PROC_VMCORE
-   WARN_ON(register_oldmem_pfn_is_ram(&gart_mem_pfn_is_ram));
+   register_vmcore_cb(&gart_vmcore_cb);
 #endif
 #ifdef CONFIG_PROC_KCORE
WARN_ON(register_mem_pfn_is_ram(&gart_mem_pfn_is_ram));
diff --git a/arch/x86/xen/mmu_hvm.c b/arch/x86/xen/mmu_hvm.c
index eb61622df75b..49bd4a6a5858 100644
--- a/arch/x86/xen/mmu_hvm.c
+++ b/arch/x86/xen/mmu_hvm.c
@@ -12,10 +12,10 @@
  * The kdump kernel has to check whether a pfn of the crashed kernel
  * was a ballooned page. vmcore is using this function to decide
  * whether to access a pfn of the crashed kernel.
- * Returns 0 if the pfn is not backed by a RAM page, the caller may
+ * Returns "false" if the pfn is not backed by a RAM page, the caller may
  * handle the pfn special in this case.
  */
-static int xen_oldmem_pfn_is_ram(unsigned long pfn)
+static bool xen_vmcore_pfn_is_ram(struct vmcore_cb *cb, unsigned long pfn)
 {
struct xen_hvm_get_mem_type a = {
.domid = DOMID_SELF,
@@ -23,15 +23,18 @@ static int xen_oldmem_pfn_is_ram(unsigned long pfn)
};
 
if (HYPERVISOR_hvm_op(HVMOP_get_mem_type, &a))
-   return -ENXIO;
+   return true;
 
switch (a.mem_type) {
case HVMMEM_mmio_dm:
-   return 0;
+   return false;
default:
-   return 1;
+   return true;
}
 }
+static struct vmcore_cb xen_vmcore_cb = {
+   .pfn_is_ram = xen_vmcore_pfn_is_ram,
+};
 #endif
 
 static void xen_hvm_exit_mmap(struct mm_struct *mm)
@@ -65,6 +68,6 @@ void __init xen_hvm_init_mmu_ops(void)
if (is_pagetable_dying_supported())
pv_ops.mmu.exit_mmap = xen_hvm_exit_mmap;
 #ifdef CONFIG_PROC_VMCORE
-   WARN_ON(register_oldmem_pfn_is_ram(&xen_oldmem_pfn_is_ram));
+   register_vmcore_cb(&xen_vmcore_cb);
 #endif
 }
diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
index a9bd80ab670e..7a04b2eca287 100644
--- a/fs/proc/vmcore.c
+++ b/fs/proc/vmcore.c
@@ -62,46 +62,75 @@ core_param(novmcoredd, vmcoredd_disabled, bool, 0);
 /* Device Dump Size */
 static size_t vmcoredd_orig_sz;
 
-/*
- * Returns > 0 for RAM pages, 0 for non-RAM pages, < 0 on error
- * The called function has to take care of module refcounting.
- */
-static int (*oldmem_pfn_is_ram)(unsigned long pfn);
-
-int register_oldmem_pfn_is_ram(int (*fn)(unsigned long pfn))
+static DECLARE_RWSEM(vmcore_cb_rwsem);
+/* List of registered vmcore callbacks. */
+static LIST_HEAD(vmcore_cb_list);
+/* Whether we had a surprise unregistration of a callback. */
+static bool vmcore_cb_unstable;
+/* Whether the vmcore has been opened once. */
+static bool vmcore_opened;
+
+void register_vmcore_cb(struct vmcore_cb *cb)
 {
-   if (oldmem_pfn_is_ram)
-   return -EBUSY;
-   oldmem_pfn_is_ram = fn;
-   return 0;
+   down_write(&vmcore_cb_rwsem);
+   INIT_LIST_HEAD(&cb->next);
+   

[PATCH v1 8/8] virtio-mem: kdump mode to sanitize /proc/vmcore access

2021-09-28 Thread David Hildenbrand
Although virtio-mem currently supports reading unplugged memory in the
hypervisor, this will change in the future, indicated to the device via
a new feature flag. We similarly sanitized /proc/kcore access recently. [1]

Let's register a vmcore callback, to allow vmcore code to check if a PFN
belonging to a virtio-mem device is either currently plugged and should
be dumped or is currently unplugged and should not be accessed, instead
mapping the shared zeropage or returning zeroes when reading.

This is important when not capturing /proc/vmcore via tools like
"makedumpfile" that can identify logically unplugged virtio-mem memory via
PG_offline in the memmap, but simply by e.g., copying the file.

Distributions that support virtio-mem+kdump have to make sure that the
virtio_mem module will be part of the kdump kernel or the kdump initrd;
dracut was recently [2] extended to include virtio-mem in the generated
initrd. As long as no special kdump kernels are used, this will
automatically make sure that virtio-mem will be around in the kdump initrd
and sanitize /proc/vmcore access -- with dracut.

With this series, we'll send one virtio-mem state request for every
~2 MiB chunk of virtio-mem memory indicated in the vmcore that we intend
to read/map.
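
A rough userspace sketch of how one answer per chunk can be reused for neighbouring addresses follows. It borrows the field names `last_block_addr` and `last_block_plugged` from the diff below; everything else (the struct, the `valid` flag, the fake device that reports even-numbered chunks as plugged, and the request counter) is hypothetical and only there to make the example self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct kdump_cache_sketch {
    uint64_t block_size;       /* chunk covered by one state request */
    uint64_t last_block_addr;  /* start of the chunk queried last */
    bool last_block_plugged;   /* cached answer for that chunk */
    bool valid;                /* hypothetical: cache holds an answer */
    unsigned int requests;     /* number of simulated device requests */
};

/* Hypothetical stand-in for the device: even-numbered chunks are plugged. */
static bool query_device_sketch(struct kdump_cache_sketch *c, uint64_t addr)
{
    c->requests++;
    return ((addr / c->block_size) % 2) == 0;
}

/* Answer "is this address plugged?" with at most one request per chunk. */
static bool addr_is_plugged_sketch(struct kdump_cache_sketch *c, uint64_t addr)
{
    uint64_t block;

    assert(c->block_size != 0);
    block = addr - (addr % c->block_size); /* round down to the chunk start */

    if (!c->valid || c->last_block_addr != block) {
        c->last_block_plugged = query_device_sketch(c, block);
        c->last_block_addr = block;
        c->valid = true;
    }
    return c->last_block_plugged;
}
```

Sequential reads of a vmcore touch consecutive pages, so a single cached chunk is enough to keep the request count near one per chunk in this sketch.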

In the future, we might want to allow building virtio-mem for kdump
mode only, even without CONFIG_MEMORY_HOTPLUG and friends: this way,
we could support special stripped-down kdump kernels that have many
other config options disabled; we'll tackle that once required. Further,
we might want to try sensing bigger blocks (e.g., memory sections)
first before falling back to device blocks on demand.

Tested with Fedora rawhide, which contains a recent kexec-tools version
(considering "System RAM (virtio_mem)" when creating the vmcore header) and
a recent dracut version (including the virtio_mem module in the kdump
initrd).

[1] https://lkml.kernel.org/r/20210526093041.8800-1-da...@redhat.com
[2] https://github.com/dracutdevs/dracut/pull/1157

Signed-off-by: David Hildenbrand 
---
 drivers/virtio/virtio_mem.c | 136 
 1 file changed, 124 insertions(+), 12 deletions(-)

diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index 76d8aef3cfd2..ec0b2ab37acb 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -223,6 +223,9 @@ struct virtio_mem {
 * When this lock is held the pointers can't change, ONLINE and
 * OFFLINE blocks can't change the state and no subblocks will get
 * plugged/unplugged.
+*
+* In kdump mode, used to serialize requests, last_block_addr and
+* last_block_plugged.
 */
struct mutex hotplug_mutex;
bool hotplug_active;
@@ -230,6 +233,9 @@ struct virtio_mem {
/* An error occurred we cannot handle - stop processing requests. */
bool broken;
 
+   /* Cached valued of is_kdump_kernel() when the device was probed. */
+   bool in_kdump;
+
/* The driver is being removed. */
spinlock_t removal_lock;
bool removing;
@@ -243,6 +249,13 @@ struct virtio_mem {
/* Memory notifier (online/offline events). */
struct notifier_block memory_notifier;
 
+#ifdef CONFIG_PROC_VMCORE
+   /* vmcore callback for /proc/vmcore handling in kdump mode */
+   struct vmcore_cb vmcore_cb;
+   uint64_t last_block_addr;
+   bool last_block_plugged;
+#endif /* CONFIG_PROC_VMCORE */
+
/* Next device in the list of virtio-mem devices. */
struct list_head next;
 };
@@ -2293,6 +2306,12 @@ static void virtio_mem_run_wq(struct work_struct *work)
uint64_t diff;
int rc;
 
+   if (unlikely(vm->in_kdump)) {
+   dev_warn_once(&vm->vdev->dev,
+"unexpected workqueue run in kdump kernel\n");
+   return;
+   }
+
hrtimer_cancel(&vm->retry_timer);
 
if (vm->broken)
@@ -2521,6 +2540,86 @@ static int virtio_mem_init_hotplug(struct virtio_mem *vm)
return rc;
 }
 
+#ifdef CONFIG_PROC_VMCORE
+static int virtio_mem_send_state_request(struct virtio_mem *vm, uint64_t addr,
+uint64_t size)
+{
+   const uint64_t nb_vm_blocks = size / vm->device_block_size;
+   const struct virtio_mem_req req = {
+   .type = cpu_to_virtio16(vm->vdev, VIRTIO_MEM_REQ_STATE),
+   .u.state.addr = cpu_to_virtio64(vm->vdev, addr),
+   .u.state.nb_blocks = cpu_to_virtio16(vm->vdev, nb_vm_blocks),
+   };
+   int rc = -ENOMEM;
+
+   dev_dbg(&vm->vdev->dev, "requesting state: 0x%llx - 0x%llx\n", addr,
+   addr + size - 1);
+
+   switch (virtio_mem_send_request(vm, &req)) {
+   case VIRTIO_MEM_RESP_ACK:
+   return virtio16_to_cpu(vm->vdev, vm->resp.u.state.state);
+   case VIRTIO_MEM_RESP_ERROR:
+   rc = -EINVAL;
+   break;
+   default:
+   break;
+   

[PATCH v1 6/8] virtio-mem: factor out hotplug specifics from virtio_mem_probe() into virtio_mem_init_hotplug()

2021-09-28 Thread David Hildenbrand
Let's prepare for a new virtio-mem kdump mode in which we don't actually
hot(un)plug any memory but only observe the state of device blocks.

Signed-off-by: David Hildenbrand 
---
 drivers/virtio/virtio_mem.c | 87 +++--
 1 file changed, 45 insertions(+), 42 deletions(-)

diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index 2ba7e8d6ba8d..1be3ee7f684d 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -260,6 +260,8 @@ static void virtio_mem_fake_offline_going_offline(unsigned long pfn,
 static void virtio_mem_fake_offline_cancel_offline(unsigned long pfn,
   unsigned long nr_pages);
 static void virtio_mem_retry(struct virtio_mem *vm);
+static int virtio_mem_create_resource(struct virtio_mem *vm);
+static void virtio_mem_delete_resource(struct virtio_mem *vm);
 
 /*
  * Register a virtio-mem device so it will be considered for the online_page
@@ -2395,7 +2397,8 @@ static int virtio_mem_init_vq(struct virtio_mem *vm)
 static int virtio_mem_init_hotplug(struct virtio_mem *vm)
 {
const struct range pluggable_range = mhp_get_pluggable_range(true);
-   uint64_t sb_size, addr;
+   uint64_t unit_pages, sb_size, addr;
+   int rc;
 
/* bad device setup - warn only */
if (!IS_ALIGNED(vm->addr, memory_block_size_bytes()))
@@ -2474,7 +2477,48 @@ static int virtio_mem_init_hotplug(struct virtio_mem *vm)
dev_info(&vm->vdev->dev, "big block size: 0x%llx",
 (unsigned long long)vm->bbm.bb_size);
 
+   /* create the parent resource for all memory */
+   rc = virtio_mem_create_resource(vm);
+   if (rc)
+   return rc;
+
+   /* use a single dynamic memory group to cover the whole memory device */
+   if (vm->in_sbm)
+   unit_pages = PHYS_PFN(memory_block_size_bytes());
+   else
+   unit_pages = PHYS_PFN(vm->bbm.bb_size);
+   rc = memory_group_register_dynamic(vm->nid, unit_pages);
+   if (rc < 0)
+   goto out_del_resource;
+   vm->mgid = rc;
+
+   /*
+* If we still have memory plugged, we have to unplug all memory first.
+* Registering our parent resource makes sure that this memory isn't
+* actually in use (e.g., trying to reload the driver).
+*/
+   if (vm->plugged_size) {
+   vm->unplug_all_required = true;
+   dev_info(&vm->vdev->dev, "unplugging all memory is required\n");
+   }
+
+   /* register callbacks */
+   vm->memory_notifier.notifier_call = virtio_mem_memory_notifier_cb;
+   rc = register_memory_notifier(&vm->memory_notifier);
+   if (rc)
+   goto out_unreg_group;
+   rc = register_virtio_mem_device(vm);
+   if (rc)
+   goto out_unreg_mem;
+
return 0;
+out_unreg_mem:
+   unregister_memory_notifier(&vm->memory_notifier);
+out_unreg_group:
+   memory_group_unregister(vm->mgid);
+out_del_resource:
+   virtio_mem_delete_resource(vm);
+   return rc;
 }
 
 static int virtio_mem_init(struct virtio_mem *vm)
@@ -2578,7 +2622,6 @@ static bool virtio_mem_has_memory_added(struct virtio_mem *vm)
 static int virtio_mem_probe(struct virtio_device *vdev)
 {
struct virtio_mem *vm;
-   uint64_t unit_pages;
int rc;
 
BUILD_BUG_ON(sizeof(struct virtio_mem_req) != 24);
@@ -2608,40 +2651,6 @@ static int virtio_mem_probe(struct virtio_device *vdev)
if (rc)
goto out_del_vq;
 
-   /* create the parent resource for all memory */
-   rc = virtio_mem_create_resource(vm);
-   if (rc)
-   goto out_del_vq;
-
-   /* use a single dynamic memory group to cover the whole memory device */
-   if (vm->in_sbm)
-   unit_pages = PHYS_PFN(memory_block_size_bytes());
-   else
-   unit_pages = PHYS_PFN(vm->bbm.bb_size);
-   rc = memory_group_register_dynamic(vm->nid, unit_pages);
-   if (rc < 0)
-   goto out_del_resource;
-   vm->mgid = rc;
-
-   /*
-* If we still have memory plugged, we have to unplug all memory first.
-* Registering our parent resource makes sure that this memory isn't
-* actually in use (e.g., trying to reload the driver).
-*/
-   if (vm->plugged_size) {
-   vm->unplug_all_required = true;
-   dev_info(&vm->vdev->dev, "unplugging all memory is required\n");
-   }
-
-   /* register callbacks */
-   vm->memory_notifier.notifier_call = virtio_mem_memory_notifier_cb;
-   rc = register_memory_notifier(&vm->memory_notifier);
-   if (rc)
-   goto out_unreg_group;
-   rc = register_virtio_mem_device(vm);
-   if (rc)
-   goto out_unreg_mem;
-
virtio_device_ready(vdev);
 
/* trigger a config update to start processing the requested_size */
@@ -2649,12 +2658,6 @@ static int 

[PATCH v3 17/17] xen/arm: Add linux,pci-domain property for hwdom if not available.

2021-09-28 Thread Rahul Singh
If the property is not present in the device tree node for the host bridge,
Xen will create it while building the dtb for hwdom and assign the already
allocated segment to the host bridge, so that Xen and Linux use the same
segment for the host bridges.

Signed-off-by: Rahul Singh 
---
Change in v3:
- Use is_pci_passthrough_enabled()
Change in v2:
- Add linux,pci-domain only when pci-passthrough command line option is enabled
---
 xen/arch/arm/domain_build.c| 16 
 xen/arch/arm/pci/pci-host-common.c | 21 +
 xen/include/asm-arm/pci.h  |  9 +
 3 files changed, 46 insertions(+)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 1731ae2028..026c9e5c6c 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -743,6 +743,22 @@ static int __init write_properties(struct domain *d, struct kernel_info *kinfo,
 return res;
 }
 
+if ( is_pci_passthrough_enabled() && dt_device_type_is_equal(node, "pci") )
+{
+if ( !dt_find_property(node, "linux,pci-domain", NULL) )
+{
+uint16_t segment;
+
+res = pci_get_host_bridge_segment(node, &segment);
+if ( res < 0 )
+return res;
+
+res = fdt_property_cell(kinfo->fdt, "linux,pci-domain", segment);
+if ( res )
+return res;
+}
+}
+
 /*
  * Override the property "status" to disable the device when it's
  * marked for passthrough.
diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
index c5941b10e9..593beeb48c 100644
--- a/xen/arch/arm/pci/pci-host-common.c
+++ b/xen/arch/arm/pci/pci-host-common.c
@@ -255,6 +255,27 @@ struct pci_host_bridge *pci_find_host_bridge(uint16_t segment, uint8_t bus)
 
 return NULL;
 }
+
+/*
+ * This function will lookup an hostbridge based on config space address.
+ */
+int pci_get_host_bridge_segment(const struct dt_device_node *node,
+uint16_t *segment)
+{
+struct pci_host_bridge *bridge;
+
+list_for_each_entry( bridge, &pci_host_bridges, node )
+{
+if ( bridge->dt_node != node )
+continue;
+
+*segment = bridge->segment;
+return 0;
+}
+
+return -EINVAL;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
index 5532ce3977..7cb2e2f1ed 100644
--- a/xen/include/asm-arm/pci.h
+++ b/xen/include/asm-arm/pci.h
@@ -90,6 +90,8 @@ int pci_generic_config_write(struct pci_host_bridge *bridge, pci_sbdf_t sbdf,
 void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
pci_sbdf_t sbdf, uint32_t where);
 struct pci_host_bridge *pci_find_host_bridge(uint16_t segment, uint8_t bus);
+int pci_get_host_bridge_segment(const struct dt_device_node *node,
+uint16_t *segment);
 
 static always_inline bool is_pci_passthrough_enabled(void)
 {
@@ -104,5 +106,12 @@ static always_inline bool is_pci_passthrough_enabled(void)
 return false;
 }
 
+static inline int pci_get_host_bridge_segment(const struct dt_device_node *node,
+  uint16_t *segment)
+{
+ASSERT_UNREACHABLE();
+return -EINVAL;
+}
+
 #endif  /*!CONFIG_HAS_PCI*/
 #endif /* __ARM_PCI_H__ */
-- 
2.17.1




[PATCH v3 16/17] arm/libxl: Emulated PCI device tree node in libxl

2021-09-28 Thread Rahul Singh
libxl will create an emulated PCI device tree node in the device tree to
enable the guest OS to discover the virtual PCI during guest boot.
The emulated PCI device tree node will only be created when at least one
device is assigned to the guest.

A new area has been reserved in the arm guest physical map at
which the VPCI bus is declared in the device tree (reg and ranges
parameters of the node).
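
As an illustration of how one "ranges" entry of such a node is laid out, here is a standalone sketch that mirrors the cell order used by fdt_property_vpci_ranges() in the diff below: one flags cell, the PCI bus address, the CPU address, then the size. The helper names are hypothetical, the example does not use libfdt, and it assumes at most two cells per value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store `val` as `cells` 32-bit big-endian cells, most significant first. */
static void put_cells_sketch(uint8_t **p, unsigned int cells, uint64_t val)
{
    unsigned int i, b;

    assert(cells <= 2); /* the sketch only handles up to 64-bit values */

    for (i = 0; i < cells; i++) {
        uint32_t cell = (uint32_t)(val >> (32 * (cells - 1 - i)));

        for (b = 0; b < 4; b++)
            *(*p)++ = (uint8_t)(cell >> (24 - 8 * b));
    }
}

/*
 * Encode one entry: 1 flags cell + addr_cells (PCI address) +
 * addr_cells (CPU address, identity mapped here) + size_cells (size).
 * Returns the number of bytes written.
 */
static size_t encode_range_sketch(uint8_t *buf, unsigned int addr_cells,
                                  unsigned int size_cells, uint32_t flags,
                                  uint64_t addr, uint64_t size)
{
    uint8_t *p = buf;

    put_cells_sketch(&p, 1, flags);
    put_cells_sketch(&p, addr_cells, addr);
    put_cells_sketch(&p, addr_cells, addr);
    put_cells_sketch(&p, size_cells, size);

    return (size_t)(p - buf);
}
```

With two address cells and two size cells each entry takes seven cells, i.e. 28 bytes, which matches the `(addr_cells*2)+size_cells+1` sizing in the patch.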

Signed-off-by: Rahul Singh 
---
Change in v3:
- Make GUEST_VPCI_MEM_ADDR address 2MB aligned
Change in v2:
- enable domain_vpci_init() when XEN_DOMCTL_CDF_vpci is set for domain.
---
 tools/include/libxl.h|   6 ++
 tools/libs/light/libxl_arm.c | 105 +++
 tools/libs/light/libxl_create.c  |   3 +
 tools/libs/light/libxl_types.idl |   1 +
 tools/xl/xl_parse.c  |   2 +
 xen/include/public/arch-arm.h|  10 +++
 6 files changed, 127 insertions(+)

diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index b9ba16d698..3362073b21 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -358,6 +358,12 @@
  */
 #define LIBXL_HAVE_BUILDINFO_ARM_VUART 1
 
+/*
+ * LIBXL_HAVE_BUILDINFO_ARM_VPCI indicates that the toolstack supports virtual
+ * PCI for ARM.
+ */
+#define LIBXL_HAVE_BUILDINFO_ARM_VPCI 1
+
 /*
  * LIBXL_HAVE_BUILDINFO_GRANT_LIMITS indicates that libxl_domain_build_info
  * has the max_grant_frames and max_maptrack_frames fields.
diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index e3140a6e00..52f1ddce48 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -269,6 +269,58 @@ static int fdt_property_regs(libxl__gc *gc, void *fdt,
 return fdt_property(fdt, "reg", regs, sizeof(regs));
 }
 
+static int fdt_property_values(libxl__gc *gc, void *fdt,
+const char *name, unsigned num_cells, ...)
+{
+uint32_t prop[num_cells];
+be32 *cells = &prop[0];
+int i;
+va_list ap;
+uint32_t arg;
+
+va_start(ap, num_cells);
+for (i = 0 ; i < num_cells; i++) {
+arg = va_arg(ap, uint32_t);
+set_cell(&cells, 1, arg);
+}
+va_end(ap);
+
+return fdt_property(fdt, name, prop, sizeof(prop));
+}
+
+static int fdt_property_vpci_ranges(libxl__gc *gc, void *fdt,
+unsigned addr_cells,
+unsigned size_cells,
+unsigned num_regs, ...)
+{
+uint32_t regs[num_regs*((addr_cells*2)+size_cells+1)];
+be32 *cells = &regs[0];
+int i;
+va_list ap;
+uint64_t arg;
+
+va_start(ap, num_regs);
+for (i = 0 ; i < num_regs; i++) {
+/* Set the memory bit field */
+arg = va_arg(ap, uint32_t);
+set_cell(&cells, 1, arg);
+
+/* Set the vpci bus address */
+arg = addr_cells ? va_arg(ap, uint64_t) : 0;
+set_cell(&cells, addr_cells , arg);
+
+/* Set the cpu bus address where vpci address is mapped */
+set_cell(&cells, addr_cells, arg);
+
+/* Set the vpci size requested */
+arg = size_cells ? va_arg(ap, uint64_t) : 0;
+set_cell(&cells, size_cells, arg);
+}
+va_end(ap);
+
+return fdt_property(fdt, "ranges", regs, sizeof(regs));
+}
+
 static int make_root_properties(libxl__gc *gc,
 const libxl_version_info *vers,
 void *fdt)
@@ -668,6 +720,53 @@ static int make_vpl011_uart_node(libxl__gc *gc, void *fdt,
 return 0;
 }
 
+static int make_vpci_node(libxl__gc *gc, void *fdt,
+const struct arch_info *ainfo,
+struct xc_dom_image *dom)
+{
+int res;
+const uint64_t vpci_ecam_base = GUEST_VPCI_ECAM_BASE;
+const uint64_t vpci_ecam_size = GUEST_VPCI_ECAM_SIZE;
+const char *name = GCSPRINTF("pcie@%"PRIx64, vpci_ecam_base);
+
+res = fdt_begin_node(fdt, name);
+if (res) return res;
+
+res = fdt_property_compat(gc, fdt, 1, "pci-host-ecam-generic");
+if (res) return res;
+
+res = fdt_property_string(fdt, "device_type", "pci");
+if (res) return res;
+
+res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS,
+GUEST_ROOT_SIZE_CELLS, 1, vpci_ecam_base, vpci_ecam_size);
+if (res) return res;
+
+res = fdt_property_values(gc, fdt, "bus-range", 2, 0, 255);
+if (res) return res;
+
+res = fdt_property_cell(fdt, "#address-cells", 3);
+if (res) return res;
+
+res = fdt_property_cell(fdt, "#size-cells", 2);
+if (res) return res;
+
+res = fdt_property_string(fdt, "status", "okay");
+if (res) return res;
+
+res = fdt_property_vpci_ranges(gc, fdt, GUEST_ROOT_ADDRESS_CELLS,
+GUEST_ROOT_SIZE_CELLS, 2,
+GUEST_VPCI_ADDR_TYPE_MEM, GUEST_VPCI_MEM_ADDR, GUEST_VPCI_MEM_SIZE,
+GUEST_VPCI_ADDR_TYPE_PREFETCH_MEM, GUEST_VPCI_PREFETCH_MEM_ADDR,
+GUEST_VPCI_PREFETCH_MEM_SIZE);
+if (res) return res;
+
+res = fdt_end_node(fdt);
+if (res) return res;
+
+return 0;
+}
+
 static const struct arch_info *get_arch_info(libxl__gc *gc,
 

[PATCH v3 15/17] xen/arm: Transitional change to build HAS_VPCI on ARM.

2021-09-28 Thread Rahul Singh
This patch will be reverted once we add support for VPCI MSI/MSI-X
on ARM.

Signed-off-by: Rahul Singh 
---
Change in v3: none
Change in v2: Patch introduced in v2
---
 xen/drivers/vpci/Makefile | 3 ++-
 xen/drivers/vpci/header.c | 2 ++
 xen/include/asm-arm/pci.h | 8 
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/vpci/Makefile b/xen/drivers/vpci/Makefile
index 55d1bdfda0..1a1413b93e 100644
--- a/xen/drivers/vpci/Makefile
+++ b/xen/drivers/vpci/Makefile
@@ -1 +1,2 @@
-obj-y += vpci.o header.o msi.o msix.o
+obj-y += vpci.o header.o
+obj-$(CONFIG_HAS_PCI_MSI) += msi.o msix.o
diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index ba9a036202..f8cd55e7c0 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -96,8 +96,10 @@ static void modify_decoding(const struct pci_dev *pdev, 
uint16_t cmd,
  * FIXME: punching holes after the p2m has been set up might be racy for
  * DomU usage, needs to be revisited.
  */
+#ifdef CONFIG_HAS_PCI_MSI
 if ( map && !rom_only && vpci_make_msix_hole(pdev) )
 return;
+#endif
 
 for ( i = 0; i < ARRAY_SIZE(header->bars); i++ )
 {
diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
index 49c9622902..5532ce3977 100644
--- a/xen/include/asm-arm/pci.h
+++ b/xen/include/asm-arm/pci.h
@@ -26,6 +26,14 @@ struct arch_pci_dev {
 struct device dev;
 };
 
+/* Arch-specific MSI data for vPCI. */
+struct vpci_arch_msi {
+};
+
+/* Arch-specific MSI-X entry data for vPCI. */
+struct vpci_arch_msix_entry {
+};
+
 /*
  * struct to hold the mappings of a config space window. This
  * is expected to be used as sysdata for PCI controllers that
-- 
2.17.1




[PATCH v1 3/8] proc/vmcore: let pfn_is_ram() return a bool

2021-09-28 Thread David Hildenbrand
The callback should deal with errors internally; it doesn't make sense to
expose these via pfn_is_ram(). We'll rework the callbacks next. Right now
we consider errors as if "it's RAM"; no functional change.
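
A minimal standalone sketch of the effect of this conversion, in plain C
rather than kernel code. The callback type and demo_callback() are
hypothetical names used only for illustration; the point is that any
non-zero callback result, including a negative error code, is folded into
"it's RAM":

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the registered oldmem callback. */
typedef int (*pfn_is_ram_fn)(unsigned long pfn);

/* Illustrative callback: pfn 1 is not RAM, pfn 2 fails with an error. */
static int demo_callback(unsigned long pfn)
{
    if (pfn == 1)
        return 0;
    if (pfn == 2)
        return -ENXIO;
    return 1;
}

/* Mirrors the patched logic: a pfn is RAM unless the callback says no. */
static bool pfn_is_ram_sketch(pfn_is_ram_fn fn, unsigned long pfn)
{
    bool ret = true;

    if (fn)
        ret = !!fn(pfn);    /* negative error values also map to true */

    return ret;
}
```

With no callback registered every pfn is reported as RAM, which matches
the default in the patch.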

Signed-off-by: David Hildenbrand 
---
 fs/proc/vmcore.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
index 9a15334da208..a9bd80ab670e 100644
--- a/fs/proc/vmcore.c
+++ b/fs/proc/vmcore.c
@@ -84,11 +84,11 @@ void unregister_oldmem_pfn_is_ram(void)
 }
 EXPORT_SYMBOL_GPL(unregister_oldmem_pfn_is_ram);
 
-static int pfn_is_ram(unsigned long pfn)
+static bool pfn_is_ram(unsigned long pfn)
 {
int (*fn)(unsigned long pfn);
/* pfn is ram unless fn() checks pagetype */
-   int ret = 1;
+   bool ret = true;
 
/*
 * Ask hypervisor if the pfn is really ram.
@@ -97,7 +97,7 @@ static int pfn_is_ram(unsigned long pfn)
 */
fn = oldmem_pfn_is_ram;
if (fn)
-   ret = fn(pfn);
+   ret = !!fn(pfn);
 
return ret;
 }
@@ -124,7 +124,7 @@ ssize_t read_from_oldmem(char *buf, size_t count,
nr_bytes = count;
 
/* If pfn is not ram, return zeros for sparse dump files */
-   if (pfn_is_ram(pfn) == 0)
+   if (!pfn_is_ram(pfn))
memset(buf, 0, nr_bytes);
else {
if (encrypted)
-- 
2.31.1




[PATCH v1 2/8] x86/xen: simplify xen_oldmem_pfn_is_ram()

2021-09-28 Thread David Hildenbrand
Let's simplify return handling.

Signed-off-by: David Hildenbrand 
---
 arch/x86/xen/mmu_hvm.c | 11 ++-
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/arch/x86/xen/mmu_hvm.c b/arch/x86/xen/mmu_hvm.c
index b242d1f4b426..eb61622df75b 100644
--- a/arch/x86/xen/mmu_hvm.c
+++ b/arch/x86/xen/mmu_hvm.c
@@ -21,23 +21,16 @@ static int xen_oldmem_pfn_is_ram(unsigned long pfn)
.domid = DOMID_SELF,
.pfn = pfn,
};
-   int ram;
 
    if (HYPERVISOR_hvm_op(HVMOP_get_mem_type, &a))
return -ENXIO;
 
switch (a.mem_type) {
case HVMMEM_mmio_dm:
-   ram = 0;
-   break;
-   case HVMMEM_ram_rw:
-   case HVMMEM_ram_ro:
+   return 0;
default:
-   ram = 1;
-   break;
+   return 1;
}
-
-   return ram;
 }
 #endif
 
-- 
2.31.1




[PATCH v3 14/17] xen/arm: Enable the existing x86 virtual PCI support for ARM.

2021-09-28 Thread Rahul Singh
The existing VPCI support available for x86 is adapted for Arm.
When a device is added to XEN via the hypercall
“PHYSDEVOP_pci_device_add”, a VPCI handler for config space
accesses is added to XEN to emulate the PCI device's config space.

An MMIO trap handler for the PCI ECAM space is registered in XEN
so that when a guest tries to access the PCI config space, XEN
will trap the access and emulate the read/write using VPCI and
not the real PCI hardware.
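
As a rough illustration of what such a trap handler has to work out, the
sketch below splits an offset into the ECAM window into bus, device,
function and register, assuming the standard ECAM layout (4 KiB of config
space per function). struct ecam_access and both helpers are hypothetical
names, not part of this patch; the access check repeats the rules of
vpci_mmio_access_allowed() further down:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded form of an ECAM access; not a Xen type. */
struct ecam_access {
    uint8_t bus;
    uint8_t dev;
    uint8_t fn;
    uint16_t reg;
};

/* Split an offset into the ECAM window into its components. */
static struct ecam_access ecam_decode(uint64_t offset)
{
    struct ecam_access a;

    a.bus = (offset >> 20) & 0xff;  /* bits 27:20 */
    a.dev = (offset >> 15) & 0x1f;  /* bits 19:15 */
    a.fn  = (offset >> 12) & 0x07;  /* bits 14:12 */
    a.reg = offset & 0x0fff;        /* bits 11:0, as in REGISTER_OFFSET() */

    return a;
}

/* Bounded access size, and the register must be aligned to that size. */
static bool access_allowed(unsigned int reg, unsigned int len)
{
    if (len > 8)
        return false;
    if (reg & (len - 1))
        return false;
    return true;
}
```

For example, offset 0x113010 decodes to bus 1, device 2, function 3,
register 0x10.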

For Dom0less systems scan_pci_devices() would be used to discover the
PCI devices in XEN, and the VPCI handlers will be added during XEN boot.

Signed-off-by: Rahul Singh 
---
Change in v3:
- Use is_pci_passthrough_enabled() in place of pci_passthrough_enabled variable
- Reject XEN_DOMCTL_CDF_vpci for x86 in arch_sanitise_domain_config()
- Remove IS_ENABLED(CONFIG_HAS_VPCI) from has_vpci()
Change in v2:
- Add new XEN_DOMCTL_CDF_vpci flag
- modify has_vpci() to include XEN_DOMCTL_CDF_vpci
- enable vpci support when pci-passthrough option is enabled.
---
 xen/arch/arm/Makefile |   1 +
 xen/arch/arm/domain.c |   8 ++-
 xen/arch/arm/domain_build.c   |   3 +
 xen/arch/arm/vpci.c   | 102 ++
 xen/arch/arm/vpci.h   |  36 
 xen/arch/x86/domain.c |   6 ++
 xen/common/domain.c   |   2 +-
 xen/drivers/passthrough/pci.c |  12 
 xen/include/asm-arm/domain.h  |   8 ++-
 xen/include/asm-x86/pci.h |   2 -
 xen/include/public/arch-arm.h |   7 +++
 xen/include/public/domctl.h   |   4 +-
 xen/include/xen/pci.h |   2 +
 13 files changed, 186 insertions(+), 7 deletions(-)
 create mode 100644 xen/arch/arm/vpci.c
 create mode 100644 xen/arch/arm/vpci.h

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 44d7cc81fa..fb9c976ea2 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -7,6 +7,7 @@ ifneq ($(CONFIG_NO_PLAT),y)
 obj-y += platforms/
 endif
 obj-$(CONFIG_TEE) += tee/
+obj-$(CONFIG_HAS_VPCI) += vpci.o
 
 obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
 obj-y += bootfdt.init.o
diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 19c756ac3d..fbb52f78f1 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -39,6 +39,7 @@
 #include 
 #include 
 
+#include "vpci.h"
 #include "vuart.h"
 
 DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
@@ -622,8 +623,8 @@ int arch_sanitise_domain_config(struct 
xen_domctl_createdomain *config)
 {
 unsigned int max_vcpus;
 
-/* HVM and HAP must be set. IOMMU may or may not be */
-if ( (config->flags & ~XEN_DOMCTL_CDF_iommu) !=
+/* HVM and HAP must be set. IOMMU and VPCI may or may not be */
+if ( (config->flags & ~XEN_DOMCTL_CDF_iommu & ~XEN_DOMCTL_CDF_vpci) !=
  (XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap) )
 {
 dprintk(XENLOG_INFO, "Unsupported configuration %#x\n",
@@ -767,6 +768,9 @@ int arch_domain_create(struct domain *d,
 if ( is_hardware_domain(d) && (rc = domain_vuart_init(d)) )
 goto fail;
 
+if ( (rc = domain_vpci_init(d)) != 0 )
+goto fail;
+
 return 0;
 
 fail:
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index d233d634c1..1731ae2028 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -2773,6 +2773,9 @@ void __init create_dom0(void)
 if ( iommu_enabled )
 dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu;
 
+if ( is_pci_passthrough_enabled() )
+dom0_cfg.flags |= XEN_DOMCTL_CDF_vpci;
+
 dom0 = domain_create(0, &dom0_cfg, true);
 if ( IS_ERR(dom0) || (alloc_dom0_vcpu0(dom0) == NULL) )
 panic("Error creating domain 0\n");
diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
new file mode 100644
index 00..76c12b9281
--- /dev/null
+++ b/xen/arch/arm/vpci.c
@@ -0,0 +1,102 @@
+/*
+ * xen/arch/arm/vpci.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+#include 
+
+#include 
+
+#define REGISTER_OFFSET(addr)  ( (addr) & 0x0fff)
+
+/* Do some sanity checks. */
+static bool vpci_mmio_access_allowed(unsigned int reg, unsigned int len)
+{
+/* Check access size. */
+if ( len > 8 )
+return false;
+
+/* Check that access is size aligned. */
+if ( (reg & (len - 1)) )
+return false;
+
+return true;
+}
+
+static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
+  register_t *r, void *p)
+{
+unsigned int reg;
+pci_sbdf_t sbdf;
+unsigned long data = ~0UL;
+unsigned int size = 1U << info->dabt.size;
+
+sbdf.sbdf 

[PATCH v1 1/8] x86/xen: update xen_oldmem_pfn_is_ram() documentation

2021-09-28 Thread David Hildenbrand
The callback is only used for the vmcore nowadays.

Signed-off-by: David Hildenbrand 
---
 arch/x86/xen/mmu_hvm.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/arch/x86/xen/mmu_hvm.c b/arch/x86/xen/mmu_hvm.c
index 57409373750f..b242d1f4b426 100644
--- a/arch/x86/xen/mmu_hvm.c
+++ b/arch/x86/xen/mmu_hvm.c
@@ -9,12 +9,9 @@
 
 #ifdef CONFIG_PROC_VMCORE
 /*
- * This function is used in two contexts:
- * - the kdump kernel has to check whether a pfn of the crashed kernel
- *   was a ballooned page. vmcore is using this function to decide
- *   whether to access a pfn of the crashed kernel.
- * - the kexec kernel has to check whether a pfn was ballooned by the
- *   previous kernel. If the pfn is ballooned, handle it properly.
+ * The kdump kernel has to check whether a pfn of the crashed kernel
+ * was a ballooned page. vmcore is using this function to decide
+ * whether to access a pfn of the crashed kernel.
  * Returns 0 if the pfn is not backed by a RAM page, the caller may
  * handle the pfn special in this case.
  */
-- 
2.31.1




[PATCH v1 0/8] proc/vmcore: sanitize access to virtio-mem memory

2021-09-28 Thread David Hildenbrand
As so often with virtio-mem changes that mess with common MM
infrastructure, this might be a good candidate to go via Andrew's tree.

--

After removing /dev/kmem, sanitizing /proc/kcore and handling /dev/mem,
this series tackles the last sane way a VM could accidentally access
logically unplugged memory managed by a virtio-mem device: /proc/vmcore

When dumping memory via "makedumpfile", PG_offline pages, used by
virtio-mem to flag logically unplugged memory, are already properly
excluded; however, especially when accessing/copying /proc/vmcore "the
usual way", we can still end up reading logically unplugged memory that is
part of a virtio-mem device.

Patch #1-#3 are cleanups. Patch #4 extends the existing oldmem_pfn_is_ram
mechanism. Patch #5-#7 are virtio-mem refactorings for patch #8, which
implements the virtio-mem logic to query the state of device blocks.
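
To make the intended effect concrete, here is a small userspace sketch of
the "zero-fill instead of touching the page" behaviour. It is not the
kernel implementation: the callback type, demo_pfn_is_ram() and the flat
old_mem array are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096UL

/* Hypothetical query: returns true if the old kernel's pfn may be read. */
typedef bool (*pfn_is_ram_fn)(unsigned long pfn);

/* Illustrative policy: odd pfns are treated as logically unplugged. */
static bool demo_pfn_is_ram(unsigned long pfn)
{
    return (pfn % 2) == 0;
}

/*
 * Copy one page of "old memory" into buf, or zero-fill buf when the page
 * must not be touched. old_mem is a flat array standing in for the
 * crashed kernel's memory.
 */
static void read_old_page(const unsigned char *old_mem, unsigned long pfn,
                          unsigned char *buf, pfn_is_ram_fn is_ram)
{
    if (is_ram && !is_ram(pfn))
        memset(buf, 0, SKETCH_PAGE_SIZE);
    else
        memcpy(buf, old_mem + pfn * SKETCH_PAGE_SIZE, SKETCH_PAGE_SIZE);
}
```

Without a registered callback every page is copied as-is, which is the
situation this series wants to avoid for virtio-mem.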

Patch #8:

"
Although virtio-mem currently supports reading unplugged memory in the
hypervisor, this will change in the future, indicated to the device via
a new feature flag. We similarly sanitized /proc/kcore access recently.
[...]
Distributions that support virtio-mem+kdump have to make sure that the
virtio_mem module will be part of the kdump kernel or the kdump initrd;
dracut was recently [2] extended to include virtio-mem in the generated
initrd. As long as no special kdump kernels are used, this will
automatically make sure that virtio-mem will be around in the kdump initrd
and sanitize /proc/vmcore access -- with dracut.
"

This is the last remaining bit to support
VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE [3] in the Linux implementation of
virtio-mem.

Note: this is best-effort. We'll never be able to control what runs inside
the second kernel, really, but we also don't have to care: we only care
about sane setups where we don't want our VM getting zapped once we
touch the wrong memory location while dumping. While we usually expect sane
setups to use "makedumpfile", nothing really speaks against just copying
/proc/vmcore, especially in environments where HWpoisoning isn't typically
expected. Also, we really don't want to put all our trust completely on the
memmap, so sanitizing also makes sense when just using "makedumpfile".

[1] https://lkml.kernel.org/r/20210526093041.8800-1-da...@redhat.com
[2] https://github.com/dracutdevs/dracut/pull/1157
[3] https://lists.oasis-open.org/archives/virtio-comment/202109/msg00021.html

Cc: Andrew Morton 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: "H. Peter Anvin" 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: Dave Young 
Cc: Baoquan He 
Cc: Vivek Goyal 
Cc: Michal Hocko 
Cc: Oscar Salvador 
Cc: Mike Rapoport 
Cc: "Rafael J. Wysocki" 
Cc: x...@kernel.org
Cc: xen-devel@lists.xenproject.org
Cc: virtualizat...@lists.linux-foundation.org
Cc: ke...@lists.infradead.org
Cc: linux-fsde...@vger.kernel.org
Cc: linux...@kvack.org

David Hildenbrand (8):
  x86/xen: update xen_oldmem_pfn_is_ram() documentation
  x86/xen: simplify xen_oldmem_pfn_is_ram()
  proc/vmcore: let pfn_is_ram() return a bool
  proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore
callbacks
  virtio-mem: factor out hotplug specifics from virtio_mem_init() into
virtio_mem_init_hotplug()
  virtio-mem: factor out hotplug specifics from virtio_mem_probe() into
virtio_mem_init_hotplug()
  virtio-mem: factor out hotplug specifics from virtio_mem_remove() into
virtio_mem_deinit_hotplug()
  virtio-mem: kdump mode to sanitize /proc/vmcore access

 arch/x86/kernel/aperture_64.c |  13 +-
 arch/x86/xen/mmu_hvm.c|  31 ++--
 drivers/virtio/virtio_mem.c   | 297 --
 fs/proc/vmcore.c  | 105 
 include/linux/crash_dump.h|  26 ++-
 5 files changed, 332 insertions(+), 140 deletions(-)


base-commit: 5816b3e6577eaa676ceb00a848f0fd65fe2adc29
-- 
2.31.1




[PATCH v3 13/17] xen/arm: Implement pci access functions

2021-09-28 Thread Rahul Singh
Implement generic pci access functions to read/write the configuration
space.
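
The shape of these accessors can be sketched in standalone C as follows.
Everything here is simplified and hypothetical (struct bridge_sketch, the
single-entry bridge table, demo_read()); in particular, returning an
all-ones value of the access width on failure is an assumption about the
intended error value, not something taken from the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, much reduced host bridge: one segment and a bus range. */
struct bridge_sketch {
    uint16_t segment;
    uint8_t bus_start, bus_end;
    uint32_t (*read)(uint8_t bus, unsigned int reg, unsigned int len);
};

/* All-ones value of len bytes (len is 1, 2 or 4); assumed error result. */
static uint32_t err_value(unsigned int len)
{
    return (len >= 4) ? 0xffffffffU : ((1U << (len * 8)) - 1);
}

/* Illustrative device: every register reads as 0x12345678, truncated. */
static uint32_t demo_read(uint8_t bus, unsigned int reg, unsigned int len)
{
    (void)bus;
    (void)reg;
    return 0x12345678U & err_value(len);
}

static struct bridge_sketch bridges[] = {
    { .segment = 0, .bus_start = 0, .bus_end = 15, .read = demo_read },
};

/* Look up the bridge that owns the given segment and bus number. */
static const struct bridge_sketch *find_bridge(uint16_t seg, uint8_t bus)
{
    for (size_t i = 0; i < sizeof(bridges) / sizeof(bridges[0]); i++) {
        if (bridges[i].segment != seg)
            continue;
        if (bus < bridges[i].bus_start || bus > bridges[i].bus_end)
            continue;
        return &bridges[i];
    }
    return NULL;
}

static uint32_t config_read(uint16_t seg, uint8_t bus, unsigned int reg,
                            unsigned int len)
{
    const struct bridge_sketch *b = find_bridge(seg, bus);

    if (!b || !b->read)
        return err_value(len);

    return b->read(bus, reg, len);
}

/* One thin wrapper per access width, generated as in the patch. */
#define SKETCH_OP_READ(size, type)                                     \
static type conf_read##size(uint16_t seg, uint8_t bus,                 \
                            unsigned int reg)                          \
{                                                                      \
    return (type)config_read(seg, bus, reg, size / 8);                 \
}

SKETCH_OP_READ(8, uint8_t)
SKETCH_OP_READ(16, uint16_t)
SKETCH_OP_READ(32, uint32_t)
```

A read that targets a segment or bus without a matching bridge yields
all-ones, which is how an absent device usually appears on PCI.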

Signed-off-by: Rahul Singh 
---
Change in v3:
- Remove PRI_pci as not used.
- Replace uint32_t sbdf to pci_sbdf_t sbdf to avoid typecast
Change in v2: Fixed comments
---
 xen/arch/arm/pci/pci-access.c  | 57 ++
 xen/arch/arm/pci/pci-host-common.c | 19 ++
 xen/include/asm-arm/pci.h  |  1 +
 3 files changed, 77 insertions(+)

diff --git a/xen/arch/arm/pci/pci-access.c b/xen/arch/arm/pci/pci-access.c
index 3cd14a4b87..9f9aac43d7 100644
--- a/xen/arch/arm/pci/pci-access.c
+++ b/xen/arch/arm/pci/pci-access.c
@@ -16,6 +16,7 @@
 #include 
 
 #define INVALID_VALUE (~0U)
+#define PCI_ERR_VALUE(len) GENMASK(0, len * 8)
 
 int pci_generic_config_read(struct pci_host_bridge *bridge, pci_sbdf_t sbdf,
 uint32_t reg, uint32_t len, uint32_t *value)
@@ -72,6 +73,62 @@ int pci_generic_config_write(struct pci_host_bridge *bridge, 
pci_sbdf_t sbdf,
 return 0;
 }
 
+static uint32_t pci_config_read(pci_sbdf_t sbdf, unsigned int reg,
+unsigned int len)
+{
+uint32_t val = PCI_ERR_VALUE(len);
+struct pci_host_bridge *bridge = pci_find_host_bridge(sbdf.seg, sbdf.bus);
+
+if ( unlikely(!bridge) )
+return val;
+
+if ( unlikely(!bridge->ops->read) )
+return val;
+
+bridge->ops->read(bridge, sbdf, reg, len, &val);
+
+return val;
+}
+
+static void pci_config_write(pci_sbdf_t sbdf, unsigned int reg,
+ unsigned int len, uint32_t val)
+{
+struct pci_host_bridge *bridge = pci_find_host_bridge(sbdf.seg, sbdf.bus);
+
+if ( unlikely(!bridge) )
+return;
+
+if ( unlikely(!bridge->ops->write) )
+return;
+
+bridge->ops->write(bridge, sbdf, reg, len, val);
+}
+
+/*
+ * Wrappers for all PCI configuration access functions.
+ */
+
+#define PCI_OP_WRITE(size, type)\
+void pci_conf_write##size(pci_sbdf_t sbdf,  \
+  unsigned int reg, type val)   \
+{   \
+pci_config_write(sbdf, reg, size / 8, val); \
+}
+
+#define PCI_OP_READ(size, type) \
+type pci_conf_read##size(pci_sbdf_t sbdf,   \
+  unsigned int reg) \
+{   \
+return pci_config_read(sbdf, reg, size / 8);\
+}
+
+PCI_OP_READ(8, uint8_t)
+PCI_OP_READ(16, uint16_t)
+PCI_OP_READ(32, uint32_t)
+PCI_OP_WRITE(8, uint8_t)
+PCI_OP_WRITE(16, uint16_t)
+PCI_OP_WRITE(32, uint32_t)
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/arm/pci/pci-host-common.c 
b/xen/arch/arm/pci/pci-host-common.c
index a08e06cea1..c5941b10e9 100644
--- a/xen/arch/arm/pci/pci-host-common.c
+++ b/xen/arch/arm/pci/pci-host-common.c
@@ -236,6 +236,25 @@ err_exit:
 return err;
 }
 
+/*
+ * This function will look up a host bridge based on the segment and bus
+ * number.
+ */
+struct pci_host_bridge *pci_find_host_bridge(uint16_t segment, uint8_t bus)
+{
+struct pci_host_bridge *bridge;
+
+list_for_each_entry( bridge, _host_bridges, node )
+{
+if ( bridge->segment != segment )
+continue;
+if ( (bus < bridge->cfg->busn_start) || (bus > bridge->cfg->busn_end) )
+continue;
+return bridge;
+}
+
+return NULL;
+}
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
index bb7eda6705..49c9622902 100644
--- a/xen/include/asm-arm/pci.h
+++ b/xen/include/asm-arm/pci.h
@@ -81,6 +81,7 @@ int pci_generic_config_write(struct pci_host_bridge *bridge, 
pci_sbdf_t sbdf,
  uint32_t reg, uint32_t len, uint32_t value);
 void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
pci_sbdf_t sbdf, uint32_t where);
+struct pci_host_bridge *pci_find_host_bridge(uint16_t segment, uint8_t bus);
 
 static always_inline bool is_pci_passthrough_enabled(void)
 {
-- 
2.17.1




[PATCH v3 12/17] xen/arm: Add support for Xilinx ZynqMP PCI host controller

2021-09-28 Thread Rahul Singh
From: Oleksandr Andrushchenko 

Add support for Xilinx ZynqMP PCI host controller to map the PCI config
space to the XEN memory.

This patch also helps to show how the generic infrastructure for PCI
host-bridge discovery is meant to be used, for future reference.
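
The controller-specific part is essentially the lookup of the "cfg" entry
in the "reg-names" property. A minimal sketch of such a string-list lookup
is shown below; match_string() is a hypothetical helper that only
approximates what dt_property_match_string() is used for, and the example
names in the usage note are illustrative:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Find needle in a buffer of back-to-back NUL-terminated strings, which
 * is how a device tree string-list property is laid out. Returns the
 * index of the match or -EINVAL.
 */
static int match_string(const char *list, size_t list_len,
                        const char *needle)
{
    const char *p = list;
    const char *end = list + list_len;
    int index = 0;

    while (p < end) {
        const char *nul = memchr(p, '\0', (size_t)(end - p));

        if (!nul)
            return -EINVAL;     /* last string is not terminated */
        if (!strcmp(p, needle))
            return index;
        p = nul + 1;
        index++;
    }

    return -EINVAL;             /* not found */
}
```

With a list of "breg", "pcireg", "cfg" the helper returns 2 for "cfg",
and that index is then used to pick the matching "reg" entry.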

Signed-off-by: Oleksandr Andrushchenko 
---
Change in v3:
- nwl_cfg_reg_index(..) as static function
- Add support for pci_host_generic_probe() 
Change in v2:
- Add more info in commit msg
---
 xen/arch/arm/pci/Makefile  |  1 +
 xen/arch/arm/pci/pci-host-zynqmp.c | 63 ++
 2 files changed, 64 insertions(+)
 create mode 100644 xen/arch/arm/pci/pci-host-zynqmp.c

diff --git a/xen/arch/arm/pci/Makefile b/xen/arch/arm/pci/Makefile
index 6f32fbbe67..1d045ade01 100644
--- a/xen/arch/arm/pci/Makefile
+++ b/xen/arch/arm/pci/Makefile
@@ -3,3 +3,4 @@ obj-y += pci-access.o
 obj-y += pci-host-generic.o
 obj-y += pci-host-common.o
 obj-y += ecam.o
+obj-y += pci-host-zynqmp.o
diff --git a/xen/arch/arm/pci/pci-host-zynqmp.c 
b/xen/arch/arm/pci/pci-host-zynqmp.c
new file mode 100644
index 00..6ccbfd15c9
--- /dev/null
+++ b/xen/arch/arm/pci/pci-host-zynqmp.c
@@ -0,0 +1,63 @@
+/*
+ * Based on Linux drivers/pci/controller/pci-host-common.c
+ * Based on Linux drivers/pci/controller/pci-host-generic.c
+ * Based on xen/arch/arm/pci/pci-host-generic.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#include 
+#include 
+#include 
+
+static int nwl_cfg_reg_index(struct dt_device_node *np)
+{
+return dt_property_match_string(np, "reg-names", "cfg");
+}
+
+/* ECAM ops */
+const struct pci_ecam_ops nwl_pcie_ops = {
+.bus_shift  = 20,
+.cfg_reg_index = nwl_cfg_reg_index,
+.pci_ops= {
+.map_bus= pci_ecam_map_bus,
+.read   = pci_generic_config_read,
+.write  = pci_generic_config_write,
+}
+};
+
+static const struct dt_device_match nwl_pcie_dt_match[] = {
+{ .compatible = "xlnx,nwl-pcie-2.11" },
+{ },
+};
+
+static int pci_host_generic_probe(struct dt_device_node *dev,
+  const void *data)
+{
+return pci_host_common_probe(dev, &nwl_pcie_ops);
+}
+
+DT_DEVICE_START(pci_gen, "PCI HOST ZYNQMP", DEVICE_PCI)
+.dt_match = nwl_pcie_dt_match,
+.init = pci_host_generic_probe,
+DT_DEVICE_END
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.17.1




[PATCH v3 11/17] xen/arm: PCI host bridge discovery within XEN on ARM

2021-09-28 Thread Rahul Singh
XEN during boot will read the PCI device tree node “reg” property
and will map the PCI config space to the XEN memory.

As of now only "pci-host-ecam-generic" compatible boards are supported.

"linux,pci-domain" device tree property assigns a fixed PCI domain
number to a host bridge; otherwise an unstable (across boots) unique
number will be assigned by Linux. XEN accesses the PCI devices based on
Segment:Bus:Device:Function. A segment number in XEN is the same as a
domain number in Linux. Segment number and domain number have to be in
sync to access the correct PCI devices.

XEN will read the “linux,pci-domain” property from the device tree node
and configure the host bridge segment number accordingly. If this
property is not available, XEN will allocate a unique segment number
to the host bridge.
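
The segment selection described above could look roughly like this. It is
only a sketch: the allocation policy (lowest free number), MAX_SEGMENTS
and assign_segment() are assumptions for illustration and are not taken
from pci-host-common.c:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_SEGMENTS 8   /* arbitrary limit for this sketch */

static bool segment_used[MAX_SEGMENTS];

/*
 * dt_domain is the value of "linux,pci-domain", or negative when the
 * property is absent. Returns the chosen segment or -1 on failure.
 */
static int assign_segment(int dt_domain)
{
    if (dt_domain >= 0) {
        if (dt_domain >= MAX_SEGMENTS || segment_used[dt_domain])
            return -1;               /* out of range or already taken */
        segment_used[dt_domain] = true;
        return dt_domain;
    }

    /* No property: hand out the lowest segment number still free. */
    for (int seg = 0; seg < MAX_SEGMENTS; seg++) {
        if (!segment_used[seg]) {
            segment_used[seg] = true;
            return seg;
        }
    }

    return -1;
}
```

Bridges with an explicit domain number keep it, so XEN and Linux agree on
Segment:Bus:Device:Function for those bridges.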

Signed-off-by: Rahul Singh 
---
Change in v3:
- Modify commit msg based on received comments.
- Remove added struct match_table{} struct in struct device{}
- Replace uint32_t sbdf to pci_sbdf_t sbdf to avoid typecast
- Remove bus_start,bus_end and void *sysdata from struct pci_host_bridge{}
- Move "#include " in "xen/pci.h" after pci_sbdf_t sbdf declaration
- Add pci_host_generic_probe() function 
Change in v2:
- Add more info in commit msg
- Add callback to parse register index.
- Merge patch pci_ecam_operation into this patch to avoid confusion
- Add new struct in struct device for match table
---
 xen/arch/arm/pci/Makefile   |   4 +
 xen/arch/arm/pci/ecam.c |  61 +++
 xen/arch/arm/pci/pci-access.c   |  83 ++
 xen/arch/arm/pci/pci-host-common.c  | 247 
 xen/arch/arm/pci/pci-host-generic.c |  46 ++
 xen/include/asm-arm/pci.h   |  56 +++
 xen/include/xen/pci.h   |   3 +-
 7 files changed, 499 insertions(+), 1 deletion(-)
 create mode 100644 xen/arch/arm/pci/ecam.c
 create mode 100644 xen/arch/arm/pci/pci-access.c
 create mode 100644 xen/arch/arm/pci/pci-host-common.c
 create mode 100644 xen/arch/arm/pci/pci-host-generic.c

diff --git a/xen/arch/arm/pci/Makefile b/xen/arch/arm/pci/Makefile
index a98035df4c..6f32fbbe67 100644
--- a/xen/arch/arm/pci/Makefile
+++ b/xen/arch/arm/pci/Makefile
@@ -1 +1,5 @@
 obj-y += pci.o
+obj-y += pci-access.o
+obj-y += pci-host-generic.o
+obj-y += pci-host-common.o
+obj-y += ecam.o
diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
new file mode 100644
index 00..602d00799c
--- /dev/null
+++ b/xen/arch/arm/pci/ecam.c
@@ -0,0 +1,61 @@
+/*
+ * Based on Linux drivers/pci/ecam.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#include 
+#include 
+
+/*
+ * Function to implement the pci_ops->map_bus method.
+ */
+void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
+   pci_sbdf_t sbdf, uint32_t where)
+{
+const struct pci_config_window *cfg = bridge->cfg;
+struct pci_ecam_ops *ops =
+container_of(bridge->ops, struct pci_ecam_ops, pci_ops);
+unsigned int devfn_shift = ops->bus_shift - 8;
+void __iomem *base;
+
+unsigned int busn = PCI_BUS(sbdf.bdf);
+
+if ( busn < cfg->busn_start || busn > cfg->busn_end )
+return NULL;
+
+busn -= cfg->busn_start;
+base = cfg->win + (busn << ops->bus_shift);
+
+return base + (PCI_DEVFN2(sbdf.bdf) << devfn_shift) + where;
+}
+
+/* ECAM ops */
+const struct pci_ecam_ops pci_generic_ecam_ops = {
+.bus_shift  = 20,
+.pci_ops= {
+.map_bus= pci_ecam_map_bus,
+.read   = pci_generic_config_read,
+.write  = pci_generic_config_write,
+}
+};
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/pci/pci-access.c b/xen/arch/arm/pci/pci-access.c
new file mode 100644
index 00..3cd14a4b87
--- /dev/null
+++ b/xen/arch/arm/pci/pci-access.c
@@ -0,0 +1,83 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received 

[PATCH v3 10/17] xen/arm: Add cmdline boot option "pci-passthrough = <boolean>"

2021-09-28 Thread Rahul Singh
Add cmdline boot option "pci-passthrough = <boolean>" to enable or
disable the PCI passthrough support on ARM.
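
A small standalone sketch of how a boolean option like this can gate an
operation. The parser and the accepted spellings are assumptions for the
example and not XEN's actual option parser; device_add_sketch() merely
stands in for the guarded PHYSDEVOP handlers:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static bool passthrough_enabled = false;   /* disabled by default */

/* Hypothetical parser: returns 1/0 for true/false, -1 if unrecognised. */
static int parse_bool_sketch(const char *s)
{
    static const char *const yes[] = { "1", "on", "yes", "true", "enable" };
    static const char *const no[] = { "0", "off", "no", "false", "disable" };

    for (size_t i = 0; i < sizeof(yes) / sizeof(yes[0]); i++)
        if (!strcmp(s, yes[i]))
            return 1;

    for (size_t i = 0; i < sizeof(no) / sizeof(no[0]); i++)
        if (!strcmp(s, no[i]))
            return 0;

    return -1;
}

/* Apply the option value; unrecognised input leaves the flag untouched. */
static void set_passthrough(const char *arg)
{
    int v = parse_bool_sketch(arg);

    if (v >= 0)
        passthrough_enabled = (v == 1);
}

/* Stand-in for an operation that is refused while the flag is off. */
static int device_add_sketch(void)
{
    if (!passthrough_enabled)
        return -ENOSYS;

    return 0;
}
```

Keeping the check behind one accessor lets x86 hard-wire it to true while
Arm consults the command line.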

Signed-off-by: Rahul Singh 
---
Change in v3:
- Remove "define pci_passthrough_enabled (false)"
- Fixed coding style and minor comment
Change in v2:
- Add option in xen-command-line.pandoc
- Change pci option to pci-passthrough
- modify option from custom_param to boolean param
---
 docs/misc/xen-command-line.pandoc |  7 +++
 xen/arch/arm/pci/pci.c| 14 ++
 xen/common/physdev.c  |  6 ++
 xen/include/asm-arm/pci.h | 11 +++
 xen/include/asm-x86/pci.h |  8 
 5 files changed, 46 insertions(+)

diff --git a/docs/misc/xen-command-line.pandoc 
b/docs/misc/xen-command-line.pandoc
index 177e656f12..c8bf96ccf8 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -1808,6 +1808,13 @@ All numbers specified must be hexadecimal ones.
 
 This option can be specified more than once (up to 8 times at present).
 
+### pci-passthrough (arm)
+> `= <boolean>`
+
+> Default: `false`
+
+Flag to enable or disable support for PCI passthrough
+
 ### pcid (x86)
 > `=  | xpti=`
 
diff --git a/xen/arch/arm/pci/pci.c b/xen/arch/arm/pci/pci.c
index e359bab9ea..84d8f0d634 100644
--- a/xen/arch/arm/pci/pci.c
+++ b/xen/arch/arm/pci/pci.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 /*
@@ -62,8 +63,21 @@ static int __init acpi_pci_init(void)
 }
 #endif
 
+/*
+ * By default pci passthrough is disabled
+ */
+bool __read_mostly pci_passthrough_enabled = false;
+boolean_param("pci-passthrough", pci_passthrough_enabled);
+
 static int __init pci_init(void)
 {
+/*
+ * Enable PCI passthrough only when it has been enabled explicitly
+ * (pci-passthrough=on)
+ */
+if ( !pci_passthrough_enabled )
+return 0;
+
 pci_segments_init();
 
 if ( acpi_disabled )
diff --git a/xen/common/physdev.c b/xen/common/physdev.c
index 20a5530269..2d5fc886fc 100644
--- a/xen/common/physdev.c
+++ b/xen/common/physdev.c
@@ -19,6 +19,9 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 struct pci_dev_info pdev_info;
 nodeid_t node;
 
+if ( !is_pci_passthrough_enabled() )
+return -ENOSYS;
+
 ret = -EFAULT;
 if ( copy_from_guest(, arg, 1) != 0 )
 break;
@@ -54,6 +57,9 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 case PHYSDEVOP_pci_device_remove: {
 struct physdev_pci_device dev;
 
+if ( !is_pci_passthrough_enabled() )
+return -ENOSYS;
+
 ret = -EFAULT;
 if ( copy_from_guest(&dev, arg, 1) != 0 )
 break;
diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
index 7dd9eb3dba..0cf849e26f 100644
--- a/xen/include/asm-arm/pci.h
+++ b/xen/include/asm-arm/pci.h
@@ -19,14 +19,25 @@
 
 #define pci_to_dev(pcidev) (&(pcidev)->arch.dev)
 
+extern bool_t pci_passthrough_enabled;
+
 /* Arch pci dev struct */
 struct arch_pci_dev {
 struct device dev;
 };
 
+static always_inline bool is_pci_passthrough_enabled(void)
+{
+return pci_passthrough_enabled;
+}
 #else   /*!CONFIG_HAS_PCI*/
 
 struct arch_pci_dev { };
 
+static always_inline bool is_pci_passthrough_enabled(void)
+{
+return false;
+}
+
 #endif  /*!CONFIG_HAS_PCI*/
 #endif /* __ARM_PCI_H__ */
diff --git a/xen/include/asm-x86/pci.h b/xen/include/asm-x86/pci.h
index cc05045e9c..3f806ce7a8 100644
--- a/xen/include/asm-x86/pci.h
+++ b/xen/include/asm-x86/pci.h
@@ -32,4 +32,12 @@ bool_t pci_ro_mmcfg_decode(unsigned long mfn, unsigned int 
*seg,
 extern int pci_mmcfg_config_num;
 extern struct acpi_mcfg_allocation *pci_mmcfg_config;
 
+/*
+ * Unlike ARM, PCI passthrough is always enabled for x86.
+ */
+static always_inline bool is_pci_passthrough_enabled(void)
+{
+return true;
+}
+
 #endif /* __X86_PCI_H__ */
-- 
2.17.1




[PATCH v3 09/17] xen/arm: Add support for PCI init to initialize the PCI driver.

2021-09-28 Thread Rahul Singh
pci_init(..) will be called during xen startup to initialize and probe
the PCI host-bridge driver.
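
The probing loop amounts to "try every node, tolerate the two expected
failures, stop on anything else". A reduced sketch, with integer node ids
and a hypothetical demo_probe() standing in for device_init():

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef int (*probe_fn)(int node);   /* hypothetical per-node init hook */

/* Illustrative outcomes for a handful of node ids. */
static int demo_probe(int node)
{
    switch (node) {
    case 1:
        return 0;           /* host bridge initialised */
    case 2:
        return -ENODEV;     /* present in the tree but not usable */
    case 3:
        return -ENOMEM;     /* a real failure */
    default:
        return -EBADF;      /* not a PCI device at all */
    }
}

/* Probe all nodes; only unexpected errors abort the walk. */
static int init_all(const int *nodes, size_t n, probe_fn probe)
{
    for (size_t i = 0; i < n; i++) {
        int rc = probe(nodes[i]);

        if (!rc || rc == -EBADF || rc == -ENODEV)
            continue;

        return rc;
    }

    return 0;
}
```

Nodes after the first real failure are not probed, matching the early
return in dt_pci_init().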

Signed-off-by: Rahul Singh 
---
Change in v3:
- Some nit for device_init(..) return logic
- Remove inline from acpi_pci_init(..)
- Modify return value for acpi_pci_init(..) to return -EOPNOTSUPP
Change in v2:
- ACPI init function to return int
- pci_segments_init() called before dt/acpi init
---
 xen/arch/arm/pci/pci.c   | 51 
 xen/include/asm-arm/device.h |  1 +
 2 files changed, 52 insertions(+)

diff --git a/xen/arch/arm/pci/pci.c b/xen/arch/arm/pci/pci.c
index a7a7bc3213..e359bab9ea 100644
--- a/xen/arch/arm/pci/pci.c
+++ b/xen/arch/arm/pci/pci.c
@@ -12,6 +12,10 @@
  * along with this program.  If not, see .
  */
 
+#include 
+#include 
+#include 
+#include 
 #include 
 
 /*
@@ -22,6 +26,53 @@ int arch_pci_clean_pirqs(struct domain *d)
 return 0;
 }
 
+static int __init dt_pci_init(void)
+{
+struct dt_device_node *np;
+int rc;
+
+dt_for_each_device_node(dt_host, np)
+{
+rc = device_init(np, DEVICE_PCI, NULL);
+/*
+ * Ignore the following error codes:
+ *   - EBADF: Indicate the current device is not a pci device.
+ *   - ENODEV: The pci device is not present or cannot be used by
+ * Xen.
+ */
+if( !rc || rc == -EBADF || rc == -ENODEV )
+continue;
+
+return rc;
+}
+
+return 0;
+}
+
+#ifdef CONFIG_ACPI
+static int __init acpi_pci_init(void)
+{
+printk(XENLOG_ERR "ACPI pci init not supported \n");
+return -EOPNOTSUPP;
+}
+#else
+static int __init acpi_pci_init(void)
+{
+return -EINVAL;
+}
+#endif
+
+static int __init pci_init(void)
+{
+pci_segments_init();
+
+if ( acpi_disabled )
+return dt_pci_init();
+else
+return acpi_pci_init();
+}
+__initcall(pci_init);
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-arm/device.h b/xen/include/asm-arm/device.h
index ee7cff2d44..5ecd5e7bd1 100644
--- a/xen/include/asm-arm/device.h
+++ b/xen/include/asm-arm/device.h
@@ -34,6 +34,7 @@ enum device_class
 DEVICE_SERIAL,
 DEVICE_IOMMU,
 DEVICE_GIC,
+DEVICE_PCI,
 /* Use for error */
 DEVICE_UNKNOWN,
 };
-- 
2.17.1




[PATCH v3 08/17] xen/device-tree: Add dt_get_pci_domain_nr helper

2021-09-28 Thread Rahul Singh
Based on Linux commit 41e5c0f81d3e676d671d96a0a1fafb27abfbd9d7

Import the Linux helper of_get_pci_domain_nr. This function will try to
obtain the host bridge domain number by finding a property called
"linux,pci-domain" of the given device node.

Signed-off-by: Rahul Singh 
---
Change in v3:
- Modify commit message to include upstream Linux commit-id not stable
  Linux commit-id
- Remove return value as those are not valid for XEN
Change in v2: Patch introduced in v2
---
 xen/common/device_tree.c  | 12 
 xen/include/xen/device_tree.h | 17 +
 2 files changed, 29 insertions(+)

diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
index 53160d61f8..ea93da1725 100644
--- a/xen/common/device_tree.c
+++ b/xen/common/device_tree.c
@@ -2183,6 +2183,18 @@ void __init dt_unflatten_host_device_tree(void)
 dt_alias_scan();
 }
 
+int dt_get_pci_domain_nr(struct dt_device_node *node)
+{
+u32 domain;
+int error;
+
+error = dt_property_read_u32(node, "linux,pci-domain", &domain);
+if ( !error )
+return -EINVAL;
+
+return (u16)domain;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
index 3ffe3eb3d2..2297c59ce6 100644
--- a/xen/include/xen/device_tree.h
+++ b/xen/include/xen/device_tree.h
@@ -832,6 +832,23 @@ int dt_count_phandle_with_args(const struct dt_device_node 
*np,
const char *list_name,
const char *cells_name);
 
+/**
+ * dt_get_pci_domain_nr - Find the host bridge domain number
+ *of the given device node.
+ * @node: Device tree node with the domain information.
+ *
+ * This function will try to obtain the host bridge domain number by finding
+ * a property called "linux,pci-domain" of the given device node.
+ *
+ * Return:
+ * * > 0- On success, an associated domain number.
+ * * -EINVAL- The property "linux,pci-domain" does not exist.
+ *
+ * Returns the associated domain number from DT in the range [0-0xffff], or
+ * a negative value if the required property is not found.
+ */
+int dt_get_pci_domain_nr(struct dt_device_node *node);
+
 #ifdef CONFIG_DEVICE_TREE_DEBUG
 #define dt_dprintk(fmt, args...)  \
 printk(XENLOG_DEBUG fmt, ## args)
-- 
2.17.1
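
For readers following the helper above, here is a minimal, self-contained sketch of the same lookup pattern outside of Xen. Everything prefixed with mock_ is hypothetical and exists only for illustration; it is not the Xen device tree API. The sketch assumes, as in the patch, a property reader that returns true on success, and it shows the truncation of the 32-bit property value to 16 bits.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a device tree node: one named u32 property. */
struct mock_node {
    const char *prop_name;
    uint32_t prop_value;
};

/* Mimics a bool-returning property reader: true means the value was read. */
static bool mock_read_u32(const struct mock_node *node, const char *name,
                          uint32_t *out)
{
    assert(out != NULL);

    if ( node == NULL || node->prop_name == NULL ||
         strcmp(node->prop_name, name) != 0 )
        return false;

    *out = node->prop_value;
    return true;
}

/* Returns the domain number truncated to 16 bits, or -EINVAL if absent. */
static int mock_get_pci_domain_nr(const struct mock_node *node)
{
    uint32_t domain;

    if ( !mock_read_u32(node, "linux,pci-domain", &domain) )
        return -EINVAL;

    return (uint16_t)domain;
}
```

Because the reader returns a boolean rather than an errno value, the failure test is a logical negation, which is easy to misread when the result is stored in a variable named like an error code.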




[PATCH v3 07/17] xen/device-tree: Add dt_property_read_u32_array helper

2021-09-28 Thread Rahul Singh
Based on Linux commit a67e9472da423ec47a3586920b526ebaedf25fc3

Import the Linux helper of_property_read_u32_array. This function finds
and reads an array of 32 bit integers from a property.

Signed-off-by: Rahul Singh 
---
Change in v3:
- Modify commit message to include upstream Linux commit-id not stable
  Linux commit-id
Change in v2: Patch introduced in v2
---
 xen/include/xen/device_tree.h | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
index 1693fb8e8c..3ffe3eb3d2 100644
--- a/xen/include/xen/device_tree.h
+++ b/xen/include/xen/device_tree.h
@@ -392,6 +392,36 @@ int dt_property_read_variable_u32_array(const struct 
dt_device_node *np,
 const char *propname, u32 *out_values,
 size_t sz_min, size_t sz_max);
 
+/**
+ * dt_property_read_u32_array - Find and read an array of 32 bit integers
+ * from a property.
+ *
+ * @np: device node from which the property value is to be read.
+ * @propname:   name of the property to be searched.
+ * @out_values: pointer to return value, modified only if return value is 0.
+ * @sz: number of array elements to read
+ *
+ * Search for a property in a device node and read 32-bit value(s) from
+ * it.
+ *
+ * Return: 0 on success, -EINVAL if the property does not exist,
+ * -ENODATA if property does not have a value, and -EOVERFLOW if the
+ * property data isn't large enough.
+ *
+ * The out_values is modified only if a valid u32 value can be decoded.
+ */
+static inline int dt_property_read_u32_array(const struct dt_device_node *np,
+ const char *propname,
+ u32 *out_values, size_t sz)
+{
+int ret = dt_property_read_variable_u32_array(np, propname, out_values,
+  sz, 0);
+if ( ret >= 0 )
+return 0;
+else
+return ret;
+}
+
 /**
  * dt_property_read_bool - Check if a property exists
  * @np: node to get the value
-- 
2.17.1




[PATCH v3 06/17] xen/device-tree: Add dt_property_read_variable_u32_array helper

2021-09-28 Thread Rahul Singh
Based on Linux commit a67e9472da423ec47a3586920b526ebaedf25fc3

Import the Linux helper of_property_read_variable_u32_array. This
function finds and reads an array of 32 bit integers from a property,
with bounds on the minimum and maximum array size.

Signed-off-by: Rahul Singh 
---
Change in v3:
- Modify commit message to include upstream Linux commit-id not stable 
  Linux commit-id
Change in v2: Patch introduced in v2
---
 xen/common/device_tree.c  | 61 +++
 xen/include/xen/device_tree.h | 26 +++
 2 files changed, 87 insertions(+)

diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
index 03d25a81ce..53160d61f8 100644
--- a/xen/common/device_tree.c
+++ b/xen/common/device_tree.c
@@ -208,6 +208,67 @@ int dt_property_read_string(const struct dt_device_node 
*np,
 return 0;
 }
 
+/**
+ * dt_find_property_value_of_size
+ *
+ * @np: device node from which the property value is to be read.
+ * @propname:   name of the property to be searched.
+ * @min:minimum allowed length of property value
+ * @max:maximum allowed length of property value (0 means unlimited)
+ * @len:if !=NULL, actual length is written to here
+ *
+ * Search for a property in a device node and validate the requested size.
+ *
+ * Return: The property value on success, -EINVAL if the property does not
+ * exist, -ENODATA if property does not have a value, and -EOVERFLOW if the
+ * property data is too small or too large.
+ */
+static void *dt_find_property_value_of_size(const struct dt_device_node *np,
+const char *propname, u32 min,
+u32 max, size_t *len)
+{
+const struct dt_property *prop = dt_find_property(np, propname, NULL);
+
+if ( !prop )
+return ERR_PTR(-EINVAL);
+if ( !prop->value )
+return ERR_PTR(-ENODATA);
+if ( prop->length < min )
+return ERR_PTR(-EOVERFLOW);
+if ( max && prop->length > max )
+return ERR_PTR(-EOVERFLOW);
+
+if ( len )
+*len = prop->length;
+
+return prop->value;
+}
+
+int dt_property_read_variable_u32_array(const struct dt_device_node *np,
+const char *propname, u32 *out_values,
+size_t sz_min, size_t sz_max)
+{
+size_t sz, count;
+const __be32 *val = dt_find_property_value_of_size(np, propname,
+(sz_min * sizeof(*out_values)),
+(sz_max * sizeof(*out_values)),
+&sz);
+
+if ( IS_ERR(val) )
+return PTR_ERR(val);
+
+if ( !sz_max )
+sz = sz_min;
+else
+sz /= sizeof(*out_values);
+
+count = sz;
+while ( count-- )
+*out_values++ = be32_to_cpup(val++);
+
+return sz;
+}
+
 int dt_property_match_string(const struct dt_device_node *np,
  const char *propname, const char *string)
 {
diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
index b02696be94..1693fb8e8c 100644
--- a/xen/include/xen/device_tree.h
+++ b/xen/include/xen/device_tree.h
@@ -366,6 +366,32 @@ bool_t dt_property_read_u32(const struct dt_device_node 
*np,
 bool_t dt_property_read_u64(const struct dt_device_node *np,
 const char *name, u64 *out_value);
 
+
+/**
+ * dt_property_read_variable_u32_array - Find and read an array of 32 bit
+ * integers from a property, with bounds on the minimum and maximum array size.
+ *
+ * @np: device node from which the property value is to be read.
+ * @propname:   name of the property to be searched.
+ * @out_values: pointer to return found values.
+ * @sz_min: minimum number of array elements to read
+ * @sz_max: maximum number of array elements to read, if zero there is no
+ *  upper limit on the number of elements in the dts entry but only
+ *  sz_min will be read.
+ *
+ * Search for a property in a device node and read 32-bit value(s) from
+ * it.
+ *
+ * Return: The number of elements read on success, -EINVAL if the property
+ * does not exist, -ENODATA if property does not have a value, and -EOVERFLOW
+ * if the property data is smaller than sz_min or longer than sz_max.
+ *
+ * The out_values is modified only if a valid u32 value can be decoded.
+ */
+int dt_property_read_variable_u32_array(const struct dt_device_node *np,
+const char *propname, u32 *out_values,
+size_t sz_min, size_t sz_max);
+
 /**
  * dt_property_read_bool - Check if a property exists
  * @np: node to get the value
-- 
2.17.1
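
The bounds semantics described in the kernel-doc above can be sketched in isolation as follows. This is an illustrative sketch only: struct mock_prop, load_be32 and mock_read_variable_u32_array are hypothetical names, and the property is modelled as a raw big-endian byte buffer rather than a real dt_property.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a property: raw big-endian bytes plus length. */
struct mock_prop {
    const uint8_t *value;  /* NULL means the property has no value */
    size_t length;         /* length in bytes */
};

/* Decode one big-endian 32-bit cell regardless of host endianness. */
static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Read between sz_min and sz_max cells. sz_max == 0 means "no upper
 * limit on the property size, but read only sz_min cells".
 */
static int mock_read_variable_u32_array(const struct mock_prop *prop,
                                        uint32_t *out_values,
                                        size_t sz_min, size_t sz_max)
{
    size_t sz, i;

    assert(out_values != NULL);

    if ( prop == NULL )
        return -EINVAL;
    if ( prop->value == NULL )
        return -ENODATA;
    if ( prop->length < sz_min * sizeof(uint32_t) )
        return -EOVERFLOW;
    if ( sz_max && prop->length > sz_max * sizeof(uint32_t) )
        return -EOVERFLOW;

    sz = sz_max ? prop->length / sizeof(uint32_t) : sz_min;

    for ( i = 0; i < sz; i++ )
        out_values[i] = load_be32(prop->value + i * sizeof(uint32_t));

    return (int)sz;
}
```

Note that the return value is the number of cells actually read, which is why a fixed-size wrapper has to map any non-negative result to 0.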




[PATCH v3 05/17] xen/arm: Add PHYSDEVOP_pci_device_* support for ARM

2021-09-28 Thread Rahul Singh
The hardware domain is in charge of doing the PCI enumeration. It will
discover the PCI devices and then communicate with XEN via the hypercall
PHYSDEVOP_pci_device_add(..) to add the PCI devices in XEN.

Also implement PHYSDEVOP_pci_device_remove(..) to remove the PCI device.

As most of the code for PHYSDEVOP_pci_device_* is the same between x86
and ARM, move the code to a common file to avoid duplication.

Signed-off-by: Rahul Singh 
---
Change in v3: Fixed minor comment.
Change in v2:
- Add support for PHYSDEVOP_pci_device_remove()
- Move code to common code
---
 xen/arch/arm/physdev.c  |  5 +-
 xen/arch/x86/physdev.c  | 50 +---
 xen/arch/x86/x86_64/physdev.c   |  4 +-
 xen/common/Makefile |  1 +
 xen/common/physdev.c| 81 +
 xen/include/asm-arm/hypercall.h |  2 -
 xen/include/asm-arm/numa.h  |  5 ++
 xen/include/asm-x86/hypercall.h |  9 ++--
 xen/include/xen/hypercall.h |  8 
 9 files changed, 106 insertions(+), 59 deletions(-)
 create mode 100644 xen/common/physdev.c

diff --git a/xen/arch/arm/physdev.c b/xen/arch/arm/physdev.c
index e91355fe22..4e00b03aab 100644
--- a/xen/arch/arm/physdev.c
+++ b/xen/arch/arm/physdev.c
@@ -8,10 +8,9 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 
-
-int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
+long arch_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
 gdprintk(XENLOG_DEBUG, "PHYSDEVOP cmd=%d: not implemented\n", cmd);
 return -ENOSYS;
diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
index 23465bcd00..c00cc99404 100644
--- a/xen/arch/x86/physdev.c
+++ b/xen/arch/x86/physdev.c
@@ -174,7 +174,7 @@ int physdev_unmap_pirq(domid_t domid, int pirq)
 }
 #endif /* COMPAT */
 
-ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
+ret_t arch_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
 int irq;
 ret_t ret;
@@ -480,54 +480,6 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) 
arg)
 break;
 }
 
-case PHYSDEVOP_pci_device_add: {
-struct physdev_pci_device_add add;
-struct pci_dev_info pdev_info;
-nodeid_t node;
-
-ret = -EFAULT;
-if ( copy_from_guest(&add, arg, 1) != 0 )
-break;
-
-pdev_info.is_extfn = !!(add.flags & XEN_PCI_DEV_EXTFN);
-if ( add.flags & XEN_PCI_DEV_VIRTFN )
-{
-pdev_info.is_virtfn = 1;
-pdev_info.physfn.bus = add.physfn.bus;
-pdev_info.physfn.devfn = add.physfn.devfn;
-}
-else
-pdev_info.is_virtfn = 0;
-
-if ( add.flags & XEN_PCI_DEV_PXM )
-{
-uint32_t pxm;
-size_t optarr_off = offsetof(struct physdev_pci_device_add, 
optarr) /
-sizeof(add.optarr[0]);
-
-if ( copy_from_guest_offset(&pxm, arg, optarr_off, 1) )
-break;
-
-node = pxm_to_node(pxm);
-}
-else
-node = NUMA_NO_NODE;
-
-ret = pci_add_device(add.seg, add.bus, add.devfn, &pdev_info, node);
-break;
-}
-
-case PHYSDEVOP_pci_device_remove: {
-struct physdev_pci_device dev;
-
-ret = -EFAULT;
-if ( copy_from_guest(&dev, arg, 1) != 0 )
-break;
-
-ret = pci_remove_device(dev.seg, dev.bus, dev.devfn);
-break;
-}
-
 case PHYSDEVOP_prepare_msix:
 case PHYSDEVOP_release_msix: {
 struct physdev_pci_device dev;
diff --git a/xen/arch/x86/x86_64/physdev.c b/xen/arch/x86/x86_64/physdev.c
index 0a50cbd4d8..5f72652ff7 100644
--- a/xen/arch/x86/x86_64/physdev.c
+++ b/xen/arch/x86/x86_64/physdev.c
@@ -9,9 +9,10 @@ EMIT_FILE;
 #include 
 #include 
 #include 
-#include 
+#include 
 
 #define do_physdev_op compat_physdev_op
+#define arch_physdev_op arch_compat_physdev_op
 
 #define physdev_apic   compat_physdev_apic
 #define physdev_apic_t physdev_apic_compat_t
@@ -82,6 +83,7 @@ CHECK_physdev_pci_device
 typedef int ret_t;
 
 #include "../physdev.c"
+#include "../../../common/physdev.c"
 
 /*
  * Local variables:
diff --git a/xen/common/Makefile b/xen/common/Makefile
index 54de70d422..bcb1c8fb03 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -29,6 +29,7 @@ obj-y += notifier.o
 obj-y += page_alloc.o
 obj-$(CONFIG_HAS_PDX) += pdx.o
 obj-$(CONFIG_PERF_COUNTERS) += perfc.o
+obj-y += physdev.o
 obj-y += preempt.o
 obj-y += random.o
 obj-y += rangeset.o
diff --git a/xen/common/physdev.c b/xen/common/physdev.c
new file mode 100644
index 0000000000..20a5530269
--- /dev/null
+++ b/xen/common/physdev.c
@@ -0,0 +1,81 @@
+
+#include 
+#include 
+#include 
+
+#ifndef COMPAT
+typedef long ret_t;
+#endif
+
+ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
+{
+ret_t ret;
+
+switch ( cmd )
+{
+#ifdef CONFIG_HAS_PCI
+case PHYSDEVOP_pci_device_add: {
+struct physdev_pci_device_add add;
+struct 
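
The archived copy of this patch is cut off above, so the following is only a sketch of the split the commit message describes: a common handler that deals with the PCI device add/remove commands and, presumably, hands everything else to the per-architecture handler. All names and command numbers here are hypothetical, and the fallback in the default case is an assumption inferred from the do_physdev_op to arch_physdev_op renames, not something visible in the truncated diff.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical command numbers, for illustration only. */
enum {
    MOCK_CMD_PCI_DEVICE_ADD = 1,
    MOCK_CMD_PCI_DEVICE_REMOVE = 2,
    MOCK_CMD_ARCH_ONLY = 3,
};

/* Stand-in for the per-architecture handler. */
static long mock_arch_op(int cmd, void *arg)
{
    (void)arg;

    return cmd == MOCK_CMD_ARCH_ONLY ? 0 : -ENOSYS;
}

/* Stand-in for the common handler: shared commands first, then fall back. */
static long mock_common_op(int cmd, void *arg)
{
    assert(cmd >= 0);

    switch ( cmd )
    {
    case MOCK_CMD_PCI_DEVICE_ADD:
    case MOCK_CMD_PCI_DEVICE_REMOVE:
        /* A missing argument models a failed copy from the guest. */
        return arg != NULL ? 0 : -EFAULT;

    default:
        /* Assumed behaviour: unknown commands go to the arch handler. */
        return mock_arch_op(cmd, arg);
    }
}
```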

[PATCH v3 04/17] xen/arm: xc_domain_ioport_permission(..) not supported on ARM.

2021-09-28 Thread Rahul Singh
ARM architecture does not implement I/O ports. Ignore this call on ARM
to avoid the overhead of making a hypercall just for Xen to return
-ENOSYS.

Signed-off-by: Rahul Singh 
Reviewed-by: Stefano Stabellini 
---
Change in v3: Added Reviewed-by: Stefano Stabellini 
Change in v2:
- Instead of returning success in XEN, ignored the call in xl.
---
 tools/libs/ctrl/xc_domain.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/tools/libs/ctrl/xc_domain.c b/tools/libs/ctrl/xc_domain.c
index 23322b70b5..25c95f6596 100644
--- a/tools/libs/ctrl/xc_domain.c
+++ b/tools/libs/ctrl/xc_domain.c
@@ -1348,6 +1348,14 @@ int xc_domain_ioport_permission(xc_interface *xch,
 uint32_t nr_ports,
 uint32_t allow_access)
 {
+#if defined(__arm__) || defined(__aarch64__)
+/*
+ * The ARM architecture does not implement I/O ports.
+ * Avoid the overhead of making a hypercall just for Xen to return -ENOSYS.
+ * It is safe to ignore this call on ARM so we just return 0.
+ */
+return 0;
+#else
 DECLARE_DOMCTL;
 
 domctl.cmd = XEN_DOMCTL_ioport_permission;
@@ -1357,6 +1365,7 @@ int xc_domain_ioport_permission(xc_interface *xch,
 domctl.u.ioport_permission.allow_access = allow_access;
 
 return do_domctl(xch, &domctl);
+#endif
 }
 
 int xc_availheap(xc_interface *xch,
-- 
2.17.1
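
The early-return pattern used in this patch can be tried out with the small sketch below. It is not libxc code: mock_hypercall, mock_ioport_permission and MOCK_HAS_IO_PORTS are hypothetical names, and the counter exists only so the effect of the preprocessor gate can be observed.

```c
#include <assert.h>
#include <stdint.h>

/* Architectures without an I/O port space skip the call entirely. */
#if defined(__arm__) || defined(__aarch64__)
#define MOCK_HAS_IO_PORTS 0
#else
#define MOCK_HAS_IO_PORTS 1
#endif

/* Counts how often the (mock) hypercall path was taken. */
static unsigned int mock_hypercalls_issued;

/* Stand-in for the real hypercall wrapper. */
static int mock_hypercall(uint32_t first_port, uint32_t nr_ports, int allow)
{
    (void)first_port;
    (void)nr_ports;
    (void)allow;
    mock_hypercalls_issued++;

    return 0;
}

static int mock_ioport_permission(uint32_t first_port, uint32_t nr_ports,
                                  int allow)
{
    assert(nr_ports > 0);
#if !MOCK_HAS_IO_PORTS
    /* Nothing to configure: report success without calling down. */
    (void)first_port;
    (void)allow;
    return 0;
#else
    return mock_hypercall(first_port, nr_ports, allow);
#endif
}
```

The caller sees success on both kinds of architecture; only the number of calls that reach the lower layer differs.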




[PATCH v3 03/17] xen/arm: solve compilation error on ARM with ACPI && HAS_PCI

2021-09-28 Thread Rahul Singh
prelink.o: In function `pcie_aer_get_firmware_first’:
drivers/passthrough/pci.c:1251: undefined reference to `apei_hest_parse'

A compilation error is observed when ACPI and HAS_PCI are enabled for the
ARM architecture. APEI is not supported on ARM yet, so move the code under
the CONFIG_X86 flag to gate it for ARM.

Signed-off-by: Rahul Singh 
Acked-by: Stefano Stabellini 
---
Change in v3: Added Acked-by: Stefano Stabellini 
Change in v2: Add in code comment "APEI not supported on ARM yet"
---
 xen/drivers/passthrough/pci.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index 8996403161..d774a6154e 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -1150,7 +1150,8 @@ void __hwdom_init setup_hwdom_pci_devices(
 pcidevs_unlock();
 }
 
-#ifdef CONFIG_ACPI
+/* APEI not supported on ARM yet. */
+#if defined(CONFIG_ACPI) && defined(CONFIG_X86)
 #include 
 #include 
 
-- 
2.17.1




[PATCH v3 02/17] xen/arm: pci: Add stubs to allow selecting HAS_PCI

2021-09-28 Thread Rahul Singh
In a follow-up we will enable PCI support in Xen on Arm (i.e. select
HAS_PCI).

The generic code expects the arch to implement a few functions:
arch_iommu_use_permitted()
arch_pci_clean_pirqs()

Note that this is not yet sufficient to enable HAS_PCI and will be
addressed in follow-ups.

Signed-off-by: Rahul Singh 
Reviewed-by: Stefano Stabellini 
---
Change in v3:
- Modify commit message.
- Added Reviewed-by: Stefano Stabellini 
Change in v2:
- Remove pci_conf_read*(..) dummy implementation
- Add in code comment for arch_pci_clean_pirqs() and arch_iommu_use_permitted()
- Fixed minor comments.
---
 xen/arch/arm/Makefile   |  1 +
 xen/arch/arm/pci/Makefile   |  1 +
 xen/arch/arm/pci/pci.c  | 33 +
 xen/drivers/passthrough/arm/iommu.c |  9 
 xen/include/asm-arm/pci.h   | 31 ---
 5 files changed, 72 insertions(+), 3 deletions(-)
 create mode 100644 xen/arch/arm/pci/Makefile
 create mode 100644 xen/arch/arm/pci/pci.c

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 3d3b97b5b4..44d7cc81fa 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -2,6 +2,7 @@ obj-$(CONFIG_ARM_32) += arm32/
 obj-$(CONFIG_ARM_64) += arm64/
 obj-$(CONFIG_ARM_64) += efi/
 obj-$(CONFIG_ACPI) += acpi/
+obj-$(CONFIG_HAS_PCI) += pci/
 ifneq ($(CONFIG_NO_PLAT),y)
 obj-y += platforms/
 endif
diff --git a/xen/arch/arm/pci/Makefile b/xen/arch/arm/pci/Makefile
new file mode 100644
index 0000000000..a98035df4c
--- /dev/null
+++ b/xen/arch/arm/pci/Makefile
@@ -0,0 +1 @@
+obj-y += pci.o
diff --git a/xen/arch/arm/pci/pci.c b/xen/arch/arm/pci/pci.c
new file mode 100644
index 0000000000..a7a7bc3213
--- /dev/null
+++ b/xen/arch/arm/pci/pci.c
@@ -0,0 +1,33 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#include 
+
+/*
+ * PIRQ event channels are not supported on Arm, so nothing to do.
+ */
+int arch_pci_clean_pirqs(struct domain *d)
+{
+return 0;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/drivers/passthrough/arm/iommu.c 
b/xen/drivers/passthrough/arm/iommu.c
index db3b07a571..ee653a9c48 100644
--- a/xen/drivers/passthrough/arm/iommu.c
+++ b/xen/drivers/passthrough/arm/iommu.c
@@ -135,3 +135,12 @@ void arch_iommu_domain_destroy(struct domain *d)
 void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
 {
 }
+
+/*
+ * Unlike x86, Arm doesn't support mem-sharing, mem-paging and log-dirty (yet).
+ * So there is no restriction to use the IOMMU.
+ */
+bool arch_iommu_use_permitted(const struct domain *d)
+{
+return true;
+}
diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
index de13359f65..7dd9eb3dba 100644
--- a/xen/include/asm-arm/pci.h
+++ b/xen/include/asm-arm/pci.h
@@ -1,7 +1,32 @@
-#ifndef __X86_PCI_H__
-#define __X86_PCI_H__
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
 
+#ifndef __ARM_PCI_H__
+#define __ARM_PCI_H__
+
+#ifdef CONFIG_HAS_PCI
+
+#define pci_to_dev(pcidev) (&(pcidev)->arch.dev)
+
+/* Arch pci dev struct */
 struct arch_pci_dev {
+struct device dev;
 };
 
-#endif /* __X86_PCI_H__ */
+#else   /*!CONFIG_HAS_PCI*/
+
+struct arch_pci_dev { };
+
+#endif  /*!CONFIG_HAS_PCI*/
+#endif /* __ARM_PCI_H__ */
-- 
2.17.1




[PATCH v3 01/17] xen/pci: Refactor MSI code that implements MSI functionality within XEN

2021-09-28 Thread Rahul Singh
On Arm, the initial plan is to only support GICv3 ITS which doesn't
require us to manage the MSIs because the HW will protect against
spoofing. Move the code under CONFIG_HAS_PCI_MSI flag to gate the code
for ARM.

No functional change intended.

Signed-off-by: Rahul Singh 
Reviewed-by: Daniel P. Smith 
---
Change in v3: none 
Change in v2: Fixed minor comments
---
 xen/arch/x86/Kconfig |  1 +
 xen/drivers/passthrough/Makefile |  1 +
 xen/drivers/passthrough/msi.c| 83 
 xen/drivers/passthrough/pci.c| 54 +
 xen/drivers/pci/Kconfig  |  4 ++
 xen/include/xen/msi.h| 43 +
 xen/xsm/flask/hooks.c|  8 +--
 7 files changed, 149 insertions(+), 45 deletions(-)
 create mode 100644 xen/drivers/passthrough/msi.c
 create mode 100644 xen/include/xen/msi.h

diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
index 1f83518ee0..b4abfca46f 100644
--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -20,6 +20,7 @@ config X86
select HAS_NS16550
select HAS_PASSTHROUGH
select HAS_PCI
+   select HAS_PCI_MSI
select HAS_PDX
select HAS_SCHED_GRANULARITY
select HAS_UBSAN
diff --git a/xen/drivers/passthrough/Makefile b/xen/drivers/passthrough/Makefile
index 445690e3e5..a5efa22714 100644
--- a/xen/drivers/passthrough/Makefile
+++ b/xen/drivers/passthrough/Makefile
@@ -7,3 +7,4 @@ obj-y += iommu.o
 obj-$(CONFIG_HAS_PCI) += pci.o
 obj-$(CONFIG_HAS_DEVICE_TREE) += device_tree.o
 obj-$(CONFIG_HAS_PCI) += ats.o
+obj-$(CONFIG_HAS_PCI_MSI) += msi.o
diff --git a/xen/drivers/passthrough/msi.c b/xen/drivers/passthrough/msi.c
new file mode 100644
index 0000000000..ce1a450f6f
--- /dev/null
+++ b/xen/drivers/passthrough/msi.c
@@ -0,0 +1,83 @@
+#include 
+#include 
+#include 
+#include 
+
+int pdev_msix_assign(struct domain *d, struct pci_dev *pdev)
+{
+int rc;
+
+if ( pdev->msix )
+{
+rc = pci_reset_msix_state(pdev);
+if ( rc )
+return rc;
+msixtbl_init(d);
+}
+
+return 0;
+}
+
+int pdev_msi_init(struct pci_dev *pdev)
+{
+unsigned int pos;
+
+INIT_LIST_HEAD(&pdev->msi_list);
+
+pos = pci_find_cap_offset(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
+  PCI_FUNC(pdev->devfn), PCI_CAP_ID_MSI);
+if ( pos )
+{
+uint16_t ctrl = pci_conf_read16(pdev->sbdf, msi_control_reg(pos));
+
+pdev->msi_maxvec = multi_msi_capable(ctrl);
+}
+
+pos = pci_find_cap_offset(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
+  PCI_FUNC(pdev->devfn), PCI_CAP_ID_MSIX);
+if ( pos )
+{
+struct arch_msix *msix = xzalloc(struct arch_msix);
+uint16_t ctrl;
+
+if ( !msix )
+return -ENOMEM;
+
+spin_lock_init(&msix->table_lock);
+
+ctrl = pci_conf_read16(pdev->sbdf, msix_control_reg(pos));
+msix->nr_entries = msix_table_size(ctrl);
+
+pdev->msix = msix;
+}
+
+return 0;
+}
+
+void pdev_msi_deinit(struct pci_dev *pdev)
+{
+XFREE(pdev->msix);
+}
+
+void pdev_dump_msi(const struct pci_dev *pdev)
+{
+const struct msi_desc *msi;
+
+if ( list_empty(&pdev->msi_list) )
+return;
+
+printk(" - MSIs < ");
+list_for_each_entry ( msi, &pdev->msi_list, list )
+printk("%d ", msi->irq);
+printk(">");
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index fc4fa2e5c3..8996403161 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -32,8 +32,8 @@
 #include 
 #include 
 #include 
+#include 
 #include 
-#include 
 #include "ats.h"
 
 struct pci_seg {
@@ -314,6 +314,7 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 
bus, u8 devfn)
 {
 struct pci_dev *pdev;
 unsigned int pos;
+int rc;
 
 list_for_each_entry ( pdev, &pseg->alldevs_list, alldevs_list )
 if ( pdev->bus == bus && pdev->devfn == devfn )
@@ -327,35 +328,12 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, 
u8 bus, u8 devfn)
 *((u8*) &pdev->bus) = bus;
 *((u8*) &pdev->devfn) = devfn;
 pdev->domain = NULL;
-INIT_LIST_HEAD(&pdev->msi_list);
-
-pos = pci_find_cap_offset(pseg->nr, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
-  PCI_CAP_ID_MSI);
-if ( pos )
-{
-uint16_t ctrl = pci_conf_read16(pdev->sbdf, msi_control_reg(pos));
-
-pdev->msi_maxvec = multi_msi_capable(ctrl);
-}
 
-pos = pci_find_cap_offset(pseg->nr, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
-  PCI_CAP_ID_MSIX);
-if ( pos )
+rc = pdev_msi_init(pdev);
+if ( rc )
 {
-struct arch_msix *msix = xzalloc(struct arch_msix);
-uint16_t ctrl;
-
-if ( !msix )
-{
-xfree(pdev);
-return NULL;
- 

[PATCH v3 00/17] PCI devices passthrough on Arm

2021-09-28 Thread Rahul Singh
Hello All,

The purpose of this patch series is to add PCI passthrough support to Xen on
Arm. PCI passthrough support on ARM is a collaboration between EPAM and ARM.
ARM submitted a partial RFC [1][2] last year to get early feedback. We have
tried to address all the comments and have added more features to this patch
series.

A working POC with all the features can be found at [3]. The POC has also been
tested on x86 to check that there is no regression on x86. The design
presentation can be found at [4].

PCI passthrough support is divided into different patches. This patch series
includes the following features:

Preparatory work to implement the PCI passthrough support for the ARM:
- Refactor MSI code.
- Fixed compilation error when HAS_PCI enabled for ARM.

Discovering PCI Host Bridge in XEN:
- PCI init to initialize the PCI driver.
- PCI host bridge discovery in XEN and map the PCI ECAM configuration space to
  the XEN memory.
- PCI access functions.

Discovering PCI devices:
- To support the PCI passthrough, XEN should be aware of the PCI
  devices.
- Hardware domain is in charge of doing the PCI enumeration and will discover
  the PCI devices and then communicate to the XEN via a hypercall to add the
  PCI devices in XEN.

Enable the existing x86 virtual PCI support for ARM:
- Add a VPCI trap handler for each PCI device added, for config space
  access.
- Register the trap handler in XEN for each of the host bridge PCI ECAM config
  space access.

Emulated PCI device tree node in libxl:
- Create a virtual PCI device tree node in libxl to enable the guest OS to
  discover the virtual PCI during guest boot.

This patch series does not include the following features. They will be
sent for review in the next version of the patch series once the initial
patch series is merged.

- VPCI support for DOMU guests (Non-identity mappings guest view of the BARs)
- Virtual bus topology implementation
- IOMMU related changes (generic, SMMUv2, SMMUv3)
- MSI support for DOMU guests.
- Virtual ITS support for DOMU guests

[1] https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg01184.html
[2] 
https://lists.xenproject.org/archives/html/xen-devel/2020-07/threads.html#01184
[3] 
https://gitlab.com/rahsingh/xen-integration/-/commits/pci-passthrough-upstream-all
[4] 
https://static.sched.com/hosted_files/xen2021/e4/PCI_Device_Passthrough_On_Arm.pdf

Oleksandr Andrushchenko (1):
  xen/arm: Add support for Xilinx ZynqMP PCI host controller

Rahul Singh (16):
  xen/pci: Refactor MSI code that implements MSI functionality within
XEN
  xen/arm: pci: Add stubs to allow selecting HAS_PCI
  xen/arm: solve compilation error on ARM with ACPI && HAS_PCI
  xen/arm: xc_domain_ioport_permission(..) not supported on ARM.
  xen/arm: Add PHYSDEVOP_pci_device_* support for ARM
  xen/device-tree: Add dt_property_read_variable_u32_array helper
  xen/device-tree: Add dt_property_read_u32_array helper
  xen/device-tree: Add dt_get_pci_domain_nr helper
  xen/arm: Add support for PCI init to initialize the PCI driver.
  xen/arm: Add cmdline boot option "pci-passthrough = "
  xen/arm: PCI host bridge discovery within XEN on ARM
  xen/arm: Implement pci access functions
  xen/arm: Enable the existing x86 virtual PCI support for ARM.
  xen/arm: Transitional change to build HAS_VPCI on ARM.
  arm/libxl: Emulated PCI device tree node in libxl
  xen/arm: Add linux,pci-domain property for hwdom if not available.

 docs/misc/xen-command-line.pandoc   |   7 +
 tools/include/libxl.h   |   6 +
 tools/libs/ctrl/xc_domain.c |   9 +
 tools/libs/light/libxl_arm.c| 105 ++
 tools/libs/light/libxl_create.c |   3 +
 tools/libs/light/libxl_types.idl|   1 +
 tools/xl/xl_parse.c |   2 +
 xen/arch/arm/Makefile   |   2 +
 xen/arch/arm/domain.c   |   8 +-
 xen/arch/arm/domain_build.c |  19 ++
 xen/arch/arm/pci/Makefile   |   6 +
 xen/arch/arm/pci/ecam.c |  61 ++
 xen/arch/arm/pci/pci-access.c   | 140 ++
 xen/arch/arm/pci/pci-host-common.c  | 287 
 xen/arch/arm/pci/pci-host-generic.c |  46 +
 xen/arch/arm/pci/pci-host-zynqmp.c  |  63 ++
 xen/arch/arm/pci/pci.c  |  98 ++
 xen/arch/arm/physdev.c  |   5 +-
 xen/arch/arm/vpci.c | 102 ++
 xen/arch/arm/vpci.h |  36 
 xen/arch/x86/Kconfig|   1 +
 xen/arch/x86/domain.c   |   6 +
 xen/arch/x86/physdev.c  |  50 +
 xen/arch/x86/x86_64/physdev.c   |   4 +-
 xen/common/Makefile |   1 +
 xen/common/device_tree.c|  73 +++
 xen/common/domain.c |   2 +-
 xen/common/physdev.c|  87 +
 xen/drivers/passthrough/Makefile|   1 +
 xen/drivers/passthrough/arm/iommu.c |   9 +
 xen/drivers/passthrough/msi.c   |  83 
 xen/drivers/passthrough/pci.c   |  69 +++
 

[adhoc test] 165241: all pass

2021-09-28 Thread iwj
[adhoc adhoc] 
harness 3a3089c9: mfi-common: Drop Linux dom0 i386 tests for newer Lin...
165241: all pass

flight 165241 xen-unstable adhoc [adhoc]
http://logs.test-lab.xenproject.org/osstest/logs/165241/

Perfect :-)
All tests in this flight passed as required

jobs:
 build-arm64-pvopspass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary




[xen-unstable-smoke test] 165233: tolerable all pass - PUSHED

2021-09-28 Thread osstest service owner
flight 165233 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/165233/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  890ceb9453171c85e881103e65dbb5cdcf81659e
baseline version:
 xen  1c3ed9c908732d19660fbe83580674d585464d4c

Last test of basis   165230  2021-09-28 05:00:28 Z0 days
Testing same since   165233  2021-09-28 12:00:26 Z0 days1 attempts


People who touched revisions under test:
  Anthony PERARD 
  Ian Jackson 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   1c3ed9c908..890ceb9453  890ceb9453171c85e881103e65dbb5cdcf81659e -> smoke



Re: [PATCH v2 14/17] xen/arm: Enable the existing x86 virtual PCI support for ARM.

2021-09-28 Thread Rahul Singh
Hi Jan

> On 24 Sep 2021, at 8:44 am, Jan Beulich  wrote:
> 
> On 22.09.2021 13:35, Rahul Singh wrote:
>> @@ -623,7 +624,7 @@ int arch_sanitise_domain_config(struct 
>> xen_domctl_createdomain *config)
>> unsigned int max_vcpus;
>> 
>> /* HVM and HAP must be set. IOMMU may or may not be */
>> -if ( (config->flags & ~XEN_DOMCTL_CDF_iommu) !=
>> +if ( (config->flags & ~XEN_DOMCTL_CDF_iommu & ~XEN_DOMCTL_CDF_vpci) !=
>>  (XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap) )
>> {
>> dprintk(XENLOG_INFO, "Unsupported configuration %#x\n",
> 
> While you accept the new flag here and ...
> 
>> --- a/xen/common/domain.c
>> +++ b/xen/common/domain.c
>> @@ -483,7 +483,7 @@ static int sanitise_domain_config(struct 
>> xen_domctl_createdomain *config)
>>  ~(XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
>>XEN_DOMCTL_CDF_s3_integrity | XEN_DOMCTL_CDF_oos_off |
>>XEN_DOMCTL_CDF_xs_domain | XEN_DOMCTL_CDF_iommu |
>> -   XEN_DOMCTL_CDF_nested_virt) )
>> +   XEN_DOMCTL_CDF_nested_virt | XEN_DOMCTL_CDF_vpci) )
>> {
>> dprintk(XENLOG_INFO, "Unknown CDF flags %#x\n", config->flags);
>> return -EINVAL;
> 
> ... here, you need to somehow reject it on x86, until DomU support
> there gets added (unless I have misunderstood things and you're
> aiming at enabling that support for x86 here at the same time). I
> cannot spot existing code which would take care of such a newly
> added flag.

Ok. I will reject the flag in x86 arch_sanitise_domain_config().
> 
> 
>> --- a/xen/include/asm-x86/pci.h
>> +++ b/xen/include/asm-x86/pci.h
>> @@ -6,8 +6,6 @@
>> #define CF8_ADDR_HI(cf8) (  ((cf8) & 0x0f000000) >> 16)
>> #define CF8_ENABLED(cf8) (!!((cf8) & 0x80000000))
>> 
>> -#define MMCFG_BDF(addr)  ( ((addr) & 0x0000) >> 12)
> 
> While there was a reason for the padding blank after the first
> opening parentheses here, ...
> 
>> --- a/xen/include/xen/pci.h
>> +++ b/xen/include/xen/pci.h
>> @@ -41,6 +41,8 @@
>> #define PCI_SBDF3(s,b,df) \
>> ((pci_sbdf_t){ .sbdf = (((s) & 0x) << 16) | PCI_BDF2(b, df) })
>> 
>> +#define MMCFG_BDF(addr)  ( ((addr) & 0x0000) >> 12)
> 
> ... that blank ends up bogus here.
Ack. I will remove the extra blank in the next version.

Regards,
Rahul
> 
> Jan
> 




[PATCH v3 3/3] arm/efi: load dom0 modules from DT using UEFI

2021-09-28 Thread Luca Fancellu
Add support to load Dom0 boot modules from
the device tree using the uefi,binary property.

Update documentation about that.

Signed-off-by: Luca Fancellu 
---
Changes in v3:
- new patch
---
 docs/misc/arm/device-tree/booting.txt |  8 
 docs/misc/efi.pandoc  | 64 +--
 xen/arch/arm/efi/efi-boot.h   | 36 +--
 xen/common/efi/boot.c | 12 ++---
 4 files changed, 108 insertions(+), 12 deletions(-)

diff --git a/docs/misc/arm/device-tree/booting.txt 
b/docs/misc/arm/device-tree/booting.txt
index 354bb43fe1..e73f6476d4 100644
--- a/docs/misc/arm/device-tree/booting.txt
+++ b/docs/misc/arm/device-tree/booting.txt
@@ -70,6 +70,14 @@ Each node contains the following properties:
priority of this field vs. other mechanisms of specifying the
bootargs for the kernel.
 
+- uefi,binary (UEFI boot only)
+
+   String property that specifies the file name to be loaded by the UEFI
+   boot for this module. If this is specified, there is no need to specify
+   the reg property because it will be created by the UEFI stub on boot.
+   This option is needed only when UEFI boot is used, the node needs to be
+   compatible with multiboot,kernel or multiboot,ramdisk.
+
 Examples
 
 
diff --git a/docs/misc/efi.pandoc b/docs/misc/efi.pandoc
index 800e67a233..4cebc47a18 100644
--- a/docs/misc/efi.pandoc
+++ b/docs/misc/efi.pandoc
@@ -167,6 +167,28 @@ sbsign \
--output xen.signed.efi \
xen.unified.efi
 ```
+## UEFI boot and Dom0 modules on ARM
+
+When booting using UEFI on ARM, it is possible to specify the Dom0 modules
+directly from the device tree without using the Xen configuration file, here an
+example:
+
+chosen {
+   #size-cells = <0x1>;
+   #address-cells = <0x1>;
+   xen,xen-bootargs = "[Xen boot arguments]"
+
+   module@1 {
+   compatible = "multiboot,kernel", "multiboot,module";
+   uefi,binary = "vmlinuz-3.0.31-0.4-xen";
+   bootargs = "[domain 0 command line options]";
+   };
+
+   module@2 {
+   compatible = "multiboot,ramdisk", "multiboot,module";
+   uefi,binary = "initrd-3.0.31-0.4-xen";
+   };
+}
 
 ## UEFI boot and dom0less on ARM
 
@@ -326,10 +348,10 @@ chosen {
 ### Boot Xen, Dom0 and DomU(s)
 
 This configuration is a mix of the two configuration above, to boot this one
-the configuration file must be processed so the /chosen node must have the
-"uefi,cfg-load" property.
+the configuration file can be processed or the Dom0 modules can be read from
+the device tree.
 
-Here an example:
+Here the first example:
 
 Xen configuration file:
 
@@ -369,4 +391,40 @@ chosen {
 };
 ```
 
+Here the second example:
+
+Device tree:
+
+```
+chosen {
+   #size-cells = <0x1>;
+   #address-cells = <0x1>;
+   xen,xen-bootargs = "[Xen boot arguments]"
+
+   module@1 {
+   compatible = "multiboot,kernel", "multiboot,module";
+   uefi,binary = "vmlinuz-3.0.31-0.4-xen";
+   bootargs = "[domain 0 command line options]";
+   };
+
+   module@2 {
+   compatible = "multiboot,ramdisk", "multiboot,module";
+   uefi,binary = "initrd-3.0.31-0.4-xen";
+   };
+
+   domU1 {
+   #size-cells = <0x1>;
+   #address-cells = <0x1>;
+   compatible = "xen,domain";
+   cpus = <0x1>;
+   memory = <0x0 0xc>;
+   vpl011;
 
+   module@1 {
+   compatible = "multiboot,kernel", "multiboot,module";
+   uefi,binary = "Image-domu1.bin";
+   bootargs = "console=ttyAMA0 root=/dev/ram0 rw";
+   };
+   };
+};
+```
diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
index 4f7c913f86..df63387136 100644
--- a/xen/arch/arm/efi/efi-boot.h
+++ b/xen/arch/arm/efi/efi-boot.h
@@ -31,8 +31,10 @@ static unsigned int __initdata modules_idx;
 #define ERROR_MISSING_DT_PROPERTY   (-3)
 #define ERROR_RENAME_MODULE_NAME(-4)
 #define ERROR_SET_REG_PROPERTY  (-5)
+#define ERROR_DOM0_ALREADY_FOUND(-6)
 #define ERROR_DT_MODULE_DOMU(-1)
 #define ERROR_DT_CHOSEN_NODE(-2)
+#define ERROR_DT_MODULE_DOM0(-3)
 
 void noreturn efi_xen_start(void *fdt_ptr, uint32_t fdt_size);
 void __flush_dcache_area(const void *vaddr, unsigned long size);
@@ -45,7 +47,8 @@ static int allocate_module_file(EFI_FILE_HANDLE dir_handle,
 static int handle_module_node(EFI_FILE_HANDLE dir_handle,
   int module_node_offset,
   int reg_addr_cells,
-  int reg_size_cells);
+  int reg_size_cells,
+  bool is_domu_module);
 static bool is_boot_module(int dt_module_offset);
 static int handle_dom0less_domain_node(EFI_FILE_HANDLE dir_handle,
   

[PATCH v3 2/3] arm/efi: Use dom0less configuration when using EFI boot

2021-09-28 Thread Luca Fancellu
This patch introduces support for the dom0less configuration
when using UEFI boot on ARM. It permits the EFI boot to
continue if no dom0 kernel is specified but at least one domU
is found.

Introduce the new property "uefi,binary" for device tree boot
module nodes that are subnodes of "xen,domain" compatible nodes.
The property holds a string containing the file name of the
binary that shall be loaded by the UEFI loader from the filesystem.

Update the EFI documentation on how to start a dom0less
setup using UEFI.

Signed-off-by: Luca Fancellu 
---
Changes in v3:
- fixed documentation
- fixed name len in strlcpy
- fixed some style issues
- closed filesystem handle before calling blexit
- passed runtime errors up to the stack instead
of calling blexit
- renamed names and function to make them more
general in prevision to load also Dom0 kernel
and ramdisk from DT
Changes in v2:
- remove array of struct file
- fixed some int types
- Made the code use filesystem even when configuration
file is skipped.
- add documentation of uefi,binary in booting.txt
- add documentation on how to boot all configuration
for Xen using UEFI in efi.pandoc
---
 docs/misc/arm/device-tree/booting.txt |  21 ++
 docs/misc/efi.pandoc  | 203 +
 xen/arch/arm/efi/efi-boot.h   | 305 +-
 xen/arch/x86/efi/efi-boot.h   |   6 +
 xen/common/efi/boot.c |  42 ++--
 5 files changed, 562 insertions(+), 15 deletions(-)

diff --git a/docs/misc/arm/device-tree/booting.txt 
b/docs/misc/arm/device-tree/booting.txt
index cf878b478e..354bb43fe1 100644
--- a/docs/misc/arm/device-tree/booting.txt
+++ b/docs/misc/arm/device-tree/booting.txt
@@ -190,6 +190,13 @@ The kernel sub-node has the following properties:
 
 Command line parameters for the guest kernel.
 
+- uefi,binary (UEFI boot only)
+
+String property that specifies the file name to be loaded by the UEFI boot
+for this module. If this is specified, there is no need to specify the reg
+property because it will be created by the UEFI stub on boot.
+This option is needed only when UEFI boot is used.
+
 The ramdisk sub-node has the following properties:
 
 - compatible
@@ -201,6 +208,13 @@ The ramdisk sub-node has the following properties:
 Specifies the physical address of the ramdisk in RAM and its
 length.
 
+- uefi,binary (UEFI boot only)
+
+String property that specifies the file name to be loaded by the UEFI boot
+for this module. If this is specified, there is no need to specify the reg
+property because it will be created by the UEFI stub on boot.
+This option is needed only when UEFI boot is used.
+
 
 Example
 ===
@@ -265,6 +279,13 @@ The dtb sub-node should have the following properties:
 Specifies the physical address of the device tree binary fragment
 RAM and its length.
 
+- uefi,binary (UEFI boot only)
+
+String property that specifies the file name to be loaded by the UEFI boot
+for this module. If this is specified, there is no need to specify the reg
+property because it will be created by the UEFI stub on boot.
+This option is needed only when UEFI boot is used.
+
 As an example:
 
 module@0xc00 {
diff --git a/docs/misc/efi.pandoc b/docs/misc/efi.pandoc
index e289c5e7ba..800e67a233 100644
--- a/docs/misc/efi.pandoc
+++ b/docs/misc/efi.pandoc
@@ -167,3 +167,206 @@ sbsign \
--output xen.signed.efi \
xen.unified.efi
 ```
+
+## UEFI boot and dom0less on ARM
+
+Dom0less feature is supported by ARM and it is possible to use it when Xen is
+started as an EFI application.
+The way to specify the domU domains is by Device Tree as specified in the
+[dom0less](dom0less.html) documentation page under the "Device Tree
+configuration" section, but instead of declaring the reg property in the boot
+module, the user must specify the "uefi,binary" property containing the name
+of the binary file that has to be loaded in memory.
+The UEFI stub will load the binary in memory and it will add the reg property
+accordingly.
+
+An example here:
+
+domU1 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   compatible = "xen,domain";
+   memory = <0 0x2>;
+   cpus = <1>;
+   vpl011;
+
+   module@1 {
+   compatible = "multiboot,kernel", "multiboot,module";
+   uefi,binary = "vmlinuz-3.0.31-0.4-xen";
+   bootargs = "console=ttyAMA0";
+   };
+   module@2 {
+   compatible = "multiboot,ramdisk", "multiboot,module";
+   uefi,binary = "initrd-3.0.31-0.4-xen";
+   };
+   module@3 {
+   compatible = "multiboot,ramdisk", "multiboot,module";
+   uefi,binary = "passthrough.dtb";
+   };
+};
+
+## How to boot different Xen setup using UEFI
+
+These are the different ways to boot a Xen system from UEFI:
+
+ - Boot Xen and Dom0 (minimum required)
+ - Boot Xen and DomU(s) (true dom0less, only on ARM)
+ - 

[PATCH v3 0/3] arm/efi: Add dom0less support to UEFI boot

2021-09-28 Thread Luca Fancellu
This series introduces a way to start a dom0less setup when Xen is booting as
an EFI application.
Using the device tree it's now possible to fetch from the disk and load in
memory all the modules needed to start any domU defined in the DT.
Dom0less for now is supported only by the arm architecture.

Luca Fancellu (3):
  arm/efi: Introduce uefi,cfg-load DT property
  arm/efi: Use dom0less configuration when using EFI boot
  arm/efi: load dom0 modules from DT using UEFI

 docs/misc/arm/device-tree/booting.txt |  37 +++
 docs/misc/efi.pandoc  | 263 +++
 xen/arch/arm/efi/efi-boot.h   | 361 +-
 xen/arch/x86/efi/efi-boot.h   |   6 +
 xen/common/efi/boot.c |  54 ++--
 5 files changed, 696 insertions(+), 25 deletions(-)

-- 
2.17.1




[PATCH v3 1/3] arm/efi: Introduce uefi,cfg-load DT property

2021-09-28 Thread Luca Fancellu
Introduce the uefi,cfg-load DT property of the /chosen
node for ARM, whose presence decides whether to force
the load of the UEFI Xen configuration file.

The logic is that if any multiboot,module is found in
the DT, then the uefi,cfg-load property is used to see
if the UEFI Xen configuration file is needed.
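
Roughly, the decision can be sketched as follows. This is a simplified illustration, not the code from this patch: the function and parameter names are hypothetical, and the booleans stand in for the results of the device tree lookups (whether an FDT was found, whether it contains a multiboot,module node, and whether /chosen carries uefi,cfg-load). The case of a missing /chosen node is left out of the sketch.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: returns true when the UEFI Xen configuration
 * file should be loaded.
 */
static bool use_config_file(bool have_fdt, bool dt_has_modules,
                            bool dt_has_cfg_load)
{
    /* No device tree at all: the configuration file is the only source. */
    if ( !have_fdt )
        return true;

    /* Device tree without modules: the configuration file is needed. */
    if ( !dt_has_modules )
        return true;

    /* Modules are present: load the file only if explicitly requested. */
    return dt_has_cfg_load;
}
```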

Modify a comment in efi_arch_use_config_file, removing
the part that states "dom0 required" because it's not
true anymore with this commit.

Signed-off-by: Luca Fancellu 
---
v3 changes:
- add documentation to misc/arm/device-tree/booting.txt
- Modified variable name and logic from skip_cfg_file to
load_cfg_file
- Add in the commit message that I'm modifying a comment.
v2 changes:
- Introduced uefi,cfg-load property
- Add documentation about the property
---
 docs/misc/arm/device-tree/booting.txt |  8 
 docs/misc/efi.pandoc  |  2 ++
 xen/arch/arm/efi/efi-boot.h   | 28 ++-
 3 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/docs/misc/arm/device-tree/booting.txt 
b/docs/misc/arm/device-tree/booting.txt
index 44cd9e1a9a..cf878b478e 100644
--- a/docs/misc/arm/device-tree/booting.txt
+++ b/docs/misc/arm/device-tree/booting.txt
@@ -121,6 +121,14 @@ A Xen-aware bootloader would set xen,xen-bootargs for Xen, 
xen,dom0-bootargs
 for Dom0 and bootargs for native Linux.
 
 
+UEFI boot and DT
+
+
+When Xen is booted using UEFI, it doesn't read the configuration file if any
+multiboot module is specified. To force Xen to load the configuration file, the
+boolean property uefi,cfg-load must be declared in the /chosen node.
+
+
 Creating Multiple Domains directly from Xen
 ===
 
diff --git a/docs/misc/efi.pandoc b/docs/misc/efi.pandoc
index ac3cd58cae..e289c5e7ba 100644
--- a/docs/misc/efi.pandoc
+++ b/docs/misc/efi.pandoc
@@ -14,6 +14,8 @@ loaded the modules and describes them in the device tree 
provided to Xen.  If a
 bootloader provides a device tree containing modules then any configuration
 files are ignored, and the bootloader is responsible for populating all
 relevant device tree nodes.
+The property "uefi,cfg-load" can be specified in the /chosen node to force Xen
+to load the configuration file even if multiboot modules are found.
 
 Once built, `make install-xen` will place the resulting binary directly into
 the EFI boot partition, provided `EFI_VENDOR` is set in the environment (and
diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
index cf9c37153f..4f1b01757d 100644
--- a/xen/arch/arm/efi/efi-boot.h
+++ b/xen/arch/arm/efi/efi-boot.h
@@ -581,22 +581,40 @@ static void __init 
efi_arch_load_addr_check(EFI_LOADED_IMAGE *loaded_image)
 
 static bool __init efi_arch_use_config_file(EFI_SYSTEM_TABLE *SystemTable)
 {
+bool load_cfg_file = true;
 /*
  * For arm, we may get a device tree from GRUB (or other bootloader)
  * that contains modules that have already been loaded into memory.  In
- * this case, we do not use a configuration file, and rely on the
- * bootloader to have loaded all required modules and appropriate
- * options.
+ * this case, we search for the property uefi,cfg-load in the /chosen node
+ * to decide whether to skip the UEFI Xen configuration file or not.
  */
 
 fdt = lookup_fdt_config_table(SystemTable);
 dtbfile.ptr = fdt;
 dtbfile.need_to_free = false; /* Config table memory can't be freed. */
-if ( !fdt || fdt_node_offset_by_compatible(fdt, 0, "multiboot,module") < 0 
)
+
+if ( fdt_node_offset_by_compatible(fdt, 0, "multiboot,module") > 0 )
+{
+/* Locate chosen node */
+int node = fdt_subnode_offset(fdt, 0, "chosen");
+const void *cfg_load_prop;
+int cfg_load_len;
+
+if ( node > 0 )
+{
+/* Check if uefi,cfg-load property exists */
+cfg_load_prop = fdt_getprop(fdt, node, "uefi,cfg-load",
+&cfg_load_len);
+if ( !cfg_load_prop )
+load_cfg_file = false;
+}
+}
+
+if ( !fdt || load_cfg_file )
 {
 /*
  * We either have no FDT, or one without modules, so we must have a
- * Xen EFI configuration file to specify modules.  (dom0 required)
+ * Xen EFI configuration file to specify modules.
  */
 return true;
 }
-- 
2.17.1




Re: [xen-unstable test] 164996: regressions - FAIL

2021-09-28 Thread Ian Jackson
Jan Beulich writes ("Re: [xen-unstable test] 164996: regressions - FAIL"):
> Ian, for your setting up of a one-off flight (as just talked about),
> you can find the patch at
> https://lists.xen.org/archives/html/xen-devel/2021-09/msg01691.html
> (and perhaps in your mailbox). In case that's easier I've also attached
> it here.
...
> [DELETED ATTACHMENT linux-5.15-rc2-xen-privcmd-mmap-kvcalloc.patch, plain 
> text]

Thanks.  The attachment didn't git-am but I managed to make a tree
with it in (but a bogus commit message).

I have a repro of 165218 test-arm64-arm64-libvirt-raw (that's the last
xen-unstable flight) running.  If all goes well it will rebuild Linux
from my branch (new flight 165241) and then run the test using that
kernel (new flight 165242).  I have told it to report to the people on
this thread (and the list).

It will probably report in an hour or two (since it needs to rebuild a
kernel and then negotiate to get a host to run the repro on).
I didn't ask it to keep the host for me, but it ought to publish the
logs and as I say, send an email report here.

Ian.

For my reference:

./mg-transient-task ./mg-repro-setup -P -E...,i...@xenproject.org,... 165218 
test-arm64-arm64-libvirt-raw X --rebuild 
+linux=https://xenbits.xen.org/git-http/people/iwj/linux.git#164996-fix 
alloc:equiv-rochester




Re: [PATCH v2 10/11] xen/arm: Do not map PCI ECAM and MMIO space to Domain-0's p2m

2021-09-28 Thread Stefano Stabellini
On Tue, 28 Sep 2021, Oleksandr Andrushchenko wrote:
> [snip]
> >> Sorry I didn't follow your explanation.
> >>
> >> My suggestion is to remove the #ifdef CONFIG_HAS_PCI completely from
> >> map_range_to_domain. At the beginning of map_range_to_domain, there is
> >> already this line:
> >>
> >> bool need_mapping = !dt_device_for_passthrough(dev);
> >>
> >> We can change it into:
> >>
> >> bool need_mapping = !dt_device_for_passthrough(dev) &&
> >>       !mr_data->skip_mapping;
> >>
> >>
> >> Then, in map_device_children we can set mr_data->skip_mapping to true
> >> for PCI devices.
> > This is the key. I am fine with this, but it just means we move the
> > check to the outside of this function which looks good. Will do
> >
> >>There is already a pci check there:
> >>
> >>if ( dt_device_type_is_equal(dev, "pci") )
> >>
> >> so it should be easy to do. What am I missing?
> >>
> >>
> >
> I did some experiments. If we move the check to map_device_children
> it is not enough because part of the ranges is still mapped at
> handle_device level:
> 
> handle_device:
> (XEN) --- /axi/pcie@fd0e need_mapping 1 addr fd0e
> (XEN) --- /axi/pcie@fd0e need_mapping 1 addr fd48
> (XEN) --- /axi/pcie@fd0e need_mapping 1 addr 80
> 
> map_device_children:
> (XEN) Mapping children of /axi/pcie@fd0e to guest skip 1
> (XEN) --- /axi/pcie@fd0e need_mapping 0 addr e000
> (XEN) --- /axi/pcie@fd0e need_mapping 0 addr 6
> 
> pci_host_bridge_mappings:
> (XEN) --- /axi/pcie@fd0e need_mapping 1 addr fd0e
> (XEN) --- /axi/pcie@fd0e need_mapping 1 addr fd48
> 
> So, I did more intrusive change:
> 
> @@ -1540,6 +1534,12 @@ static int __init handle_device(struct domain *d, 
> struct dt_device_node *dev,
>   int res;
>   u64 addr, size;
>   bool need_mapping = !dt_device_for_passthrough(dev);
> +    struct map_range_data mr_data = {
> +    .d = d,
> +    .p2mt = p2mt,
> +    .skip_mapping = is_pci_passthrough_enabled() &&
> +    (device_get_class(dev) == DEVICE_PCI)
> +    };
> 
> With this I see that now mappings are done correctly:
> 
> handle_device:
> (XEN) --- /axi/pcie@fd0e need_mapping 0 addr fd0e
> (XEN) --- /axi/pcie@fd0e need_mapping 0 addr fd48
> (XEN) --- /axi/pcie@fd0e need_mapping 0 addr 80
> 
> map_device_children:
> (XEN) Mapping children of /axi/pcie@fd0e to guest skip 1
> (XEN) --- /axi/pcie@fd0e need_mapping 0 addr e000
> (XEN) --- /axi/pcie@fd0e need_mapping 0 addr 6
> 
> pci_host_bridge_mappings:
> (XEN) --- /axi/pcie@fd0e need_mapping 1 addr fd0e
> (XEN) --- /axi/pcie@fd0e need_mapping 1 addr fd48
> 
> So, handle_device seems to be the right place. While at it I have also
> optimized the way we set up struct map_range_data mr_data in both
> handle_device and map_device_children: I removed the structure
> initialization from within the relevant loop and also pass mr_data to
> map_device_children, so it doesn't need to create its own copy and
> perform yet another computation for .skip_mapping. That computation
> needs to know not only that dev is a PCI device (this is done by the
> dt_device_type_is_equal(dev, "pci") check), but also to account for
> is_pci_passthrough_enabled().
> 
> Thus, the change will be more intrusive, but I hope it will simplify things.
> 
> I am attaching the fixup patch just in case you want more details.

Yes, thanks, this is what I had in mind. Hopefully the resulting
combined patch will be simpler.
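
For the record, the shape of the logic discussed above could be sketched like this. It is only an illustration under assumptions: struct map_range_sketch is a hypothetical stand-in for struct map_range_data with just the one field, and the boolean parameters replace the real dt_device_for_passthrough(), is_pci_passthrough_enabled() and device class checks.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct map_range_data (one field only). */
struct map_range_sketch {
    bool skip_mapping;
};

/* Computed once per device, before looping over its ranges. */
static bool compute_skip_mapping(bool pci_passthrough_enabled, bool dev_is_pci)
{
    return pci_passthrough_enabled && dev_is_pci;
}

/* A range is mapped only if neither reason to skip it applies. */
static bool need_mapping(bool dev_for_passthrough,
                         const struct map_range_sketch *mr)
{
    return !dev_for_passthrough && !mr->skip_mapping;
}
```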

Cheers,

Stefano

Re: [PATCH V3 1/3] xen: Introduce "gpaddr_bits" field to XEN_SYSCTL_physinfo

2021-09-28 Thread Oleksandr



On 28.09.21 09:28, Michal Orzel wrote:

Hi Oleksandr,


Hi Michal





On 24.09.2021 00:48, Oleksandr Tyshchenko wrote:

From: Oleksandr Tyshchenko 

We need to pass info about maximum supported guest address
space size to the toolstack on Arm in order to properly
calculate the base and size of the extended region (safe range)
for the guest. The extended region is unused address space which
could be safely used by domain for foreign/grant mappings on Arm.
The extended region itself will be handled by the subsequent
patch.

Use p2m_ipa_bits variable on Arm, the x86 equivalent is
hap_paddr_bits.

As we change the size of structure bump the interface version.
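
As a rough illustration of how a consumer might use the new field, here is a sketch under assumptions: the struct below is a hypothetical fragment showing only the field and padding layout described in the V3 changelog, not the real xen_sysctl_physinfo, and guest_addr_space_size() is an invented helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fragment: one uint8_t field followed by 7 bytes of padding. */
struct physinfo_tail_sketch {
    uint8_t gpaddr_bits;
    uint8_t pad[7];
};

/* Size in bytes of a guest physical address space of the given width. */
static uint64_t guest_addr_space_size(uint8_t gpaddr_bits)
{
    /* A shift by 64 or more would be undefined behaviour. */
    assert(gpaddr_bits < 64);

    return UINT64_C(1) << gpaddr_bits;
}
```

With a 40-bit guest address space, for example, the helper yields 1 TiB, which bounds where an extended region could be placed.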

Suggested-by: Julien Grall 
Signed-off-by: Oleksandr Tyshchenko 
---
Please note that the review comments for the RFC version [1] haven't been
addressed yet. They are not forgotten; some clarification is needed. They will
be addressed in the next version.

[1] 
https://lore.kernel.org/xen-devel/973f5344-aa10-3ad6-ff02-ad5f358ad...@citrix.com/

Changes RFC -> V2:
- update patch subject/description
- replace arch-specific sub-struct with common gpaddr_bits
  field and update code to reflect that

Changes V2 -> V3:
- make the field uint8_t and add uint8_t pad[7] after
- remove leading blanks in libxl.h
---
  tools/include/libxl.h| 7 +++
  tools/libs/light/libxl.c | 2 ++
  tools/libs/light/libxl_types.idl | 2 ++
  xen/arch/arm/sysctl.c| 2 ++
  xen/arch/x86/sysctl.c| 2 ++
  xen/include/public/sysctl.h  | 4 +++-
  6 files changed, 18 insertions(+), 1 deletion(-)


Don't you want to print gpaddr_bits field of xen_sysctl_physinfo from 
output_physinfo (xl_info.c)?


Good point, will do, thank you.




Apart from that:
Reviewed-by: Michal Orzel 


Thanks




Cheers


--
Regards,

Oleksandr Tyshchenko




[xen-unstable test] 165227: regressions - FAIL

2021-09-28 Thread osstest service owner
flight 165227 xen-unstable real [real]
flight 165235 xen-unstable real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/165227/
http://logs.test-lab.xenproject.org/osstest/logs/165235/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-arm64-arm64-libvirt-raw 17 guest-start/debian.repeat fail REGR. vs. 164945

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stop fail like 164945
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 164945
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop fail like 164945
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 164945
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 164945
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail like 164945
 test-armhf-armhf-libvirt-raw 15 saverestore-support-check fail like 164945
 test-armhf-armhf-libvirt 16 saverestore-support-check fail like 164945
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stop fail like 164945
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 164945
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 164945
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop fail like 164945
 test-amd64-amd64-libvirt 15 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-i386-xl-pvshim 14 guest-start fail never pass
 test-amd64-i386-libvirt 15 migrate-support-check fail never pass
 test-amd64-i386-libvirt-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-seattle 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-seattle 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit2 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit1 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit1 16 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl 15 migrate-support-check fail never pass
 test-arm64-arm64-xl 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 16 saverestore-support-check fail never pass
 test-amd64-i386-libvirt-raw 14 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-check fail never pass
 test-arm64-arm64-xl-vhd 14 migrate-support-check fail never pass
 test-arm64-arm64-xl-vhd 15 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit2 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit1 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit1 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt 15 migrate-support-check fail never pass
 test-armhf-armhf-xl 15 migrate-support-check fail never pass
 test-armhf-armhf-xl 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-check  

Re: [PATCH v5 2/2] tools/xenstore: set open file descriptor limit for xenstored

2021-09-28 Thread Ian Jackson
Juergen Gross writes ("Re: [PATCH v5 2/2] tools/xenstore: set open file 
descriptor limit for xenstored"):
> Hmm, maybe I should just use:
> 
> prlimit --nofile=$XENSTORED_MAX_OPEN_FDS \
> $XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid $XENSTORED_ARGS

I guess that would probably work (although it involves another
exec) but I don't understand what's wrong with ulimit, which is a
shell builtin.

I think this script has to run only on Linux and all reasonable Linux
/bin/sh have `ulimit`.  (I have checked dash and bash.)

So I think just

  ulimit -n $XENSTORED_MAX_OPEN_FDS

  $XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid $XENSTORED_ARGS

will DTRT.  You could also do this

  ulimit -n $XENSTORED_MAX_OPEN_FDS || true

which will arrange that if, somehow, this fails, the system is likely
to continue to mostly-work despite the error.  Whether that would be
desirable is a matter of taste I think.

(I have RTFM again, and setting -H and -S separately is not needed;
omitting -H or -S means to set both.)
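
For reference, what `ulimit -n N` amounts to at the system call level can be sketched in C with setrlimit(). This is an illustration only, not something the xenstored launch script needs; set_nofile_limit() is a hypothetical helper name. It sets the soft and hard limits together, matching the behaviour of `ulimit -n` when neither -H nor -S is given.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper: set both the soft and the hard open-file limit
 * to max_fds.  Returns 0 on success, -1 with errno set on failure
 * (for example when an unprivileged process tries to raise the hard limit).
 */
static int set_nofile_limit(rlim_t max_fds)
{
    struct rlimit rl = { .rlim_cur = max_fds, .rlim_max = max_fds };

    return setrlimit(RLIMIT_NOFILE, &rl);
}
```

The limit is inherited across exec, which is why setting it in the script before starting $XENSTORED is enough.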

Ian.



Re: [xen-unstable test] 164996: regressions - FAIL

2021-09-28 Thread Jan Beulich
On 23.09.2021 04:56, Julien Grall wrote:
> We could push the patch in the branch we have. However the Linux we use is
> now fairly old (I think I did a push last year) and not even the latest
> stable.

I don't think that's a problem here - this looks to be 5.4.17-ish, which
the patch should be good for (and it does apply cleanly to plain 5.4.0).

Ian, for your setting up of a one-off flight (as just talked about),
you can find the patch at
https://lists.xen.org/archives/html/xen-devel/2021-09/msg01691.html
(and perhaps in your mailbox). In case that's easier I've also attached
it here.

Jan
xen/privcmd: replace kcalloc() by kvcalloc() when allocating empty pages

Osstest has been suffering test failures for a little while from order-4
allocation failures, resulting from alloc_empty_pages() calling
kcalloc(). As there's no need for physically contiguous space here,
switch to kvcalloc().
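
To put a number on "order-4": an order-n allocation is 2^n physically contiguous pages. A rough userspace sketch of the arithmetic follows, assuming 4 KiB pages; alloc_order() and pages_array_order() are illustrative helper names, not kernel functions, and overflow of the size multiplication is ignored.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE ((size_t)4096)  /* assumption: 4 KiB pages */

/* Smallest order such that (SKETCH_PAGE_SIZE << order) covers bytes. */
static unsigned int alloc_order(size_t bytes)
{
    unsigned int order = 0;
    size_t span = SKETCH_PAGE_SIZE;

    while ( span < bytes )
    {
        span <<= 1;
        order++;
    }

    return order;
}

/* Order needed for a contiguous array of numpgs pointers. */
static unsigned int pages_array_order(size_t numpgs, size_t ptr_size)
{
    return alloc_order(numpgs * ptr_size);
}
```

Under these assumptions and with 8-byte pointers, an array for more than 4096 and up to 8192 pages needs 64 KiB of contiguous memory, i.e. order 4.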

Signed-off-by: Jan Beulich 
Cc: sta...@vger.kernel.org
Reviewed-by: Juergen Gross 
---
RFC: I cannot really test this, as alloc_empty_pages() only gets used in
 the auto-translated case (i.e. on Arm or PVH Dom0, the latter of
 which I'm not trusting enough yet to actually start playing with
 guests).

There are quite a few more kcalloc() where it's not immediately clear
how large the element counts could possibly grow nor whether it would be
fine to replace them (i.e. physically contiguous space not required).

I wasn't sure whether to Cc stable@ here; the issue certainly has been
present for quite some time. But it didn't look to cause issues until
recently.

--- a/drivers/xen/privcmd.c
+++ b/drivers/xen/privcmd.c
@@ -420,7 +420,7 @@ static int alloc_empty_pages(struct vm_a
int rc;
struct page **pages;
 
-   pages = kcalloc(numpgs, sizeof(pages[0]), GFP_KERNEL);
+   pages = kvcalloc(numpgs, sizeof(pages[0]), GFP_KERNEL);
if (pages == NULL)
return -ENOMEM;
 
@@ -428,7 +428,7 @@ static int alloc_empty_pages(struct vm_a
if (rc != 0) {
pr_warn("%s Could not alloc %d pfns rc:%d\n", __func__,
numpgs, rc);
-   kfree(pages);
+   kvfree(pages);
return -ENOMEM;
}
BUG_ON(vma->vm_private_data != NULL);
@@ -912,7 +912,7 @@ static void privcmd_close(struct vm_area
else
pr_crit("unable to unmap MFN range: leaking %d pages. rc=%d\n",
numpgs, rc);
-   kfree(pages);
+   kvfree(pages);
 }
 
 static vm_fault_t privcmd_fault(struct vm_fault *vmf)


Re: [XEN PATCH v5] xen: rework `checkpolicy` detection when using "randconfig"

2021-09-28 Thread Luca Fancellu



> On 28 Sep 2021, at 16:04, Anthony PERARD  wrote:
> 
> This will help prevent the CI loop from having build failures when
> `checkpolicy` isn't available when doing "randconfig" jobs.
> 
> To prevent "randconfig" from selecting XSM_FLASK_POLICY when
> `checkpolicy` isn't available, we will actually override the config
> output with the use of KCONFIG_ALLCONFIG.
> 
> Doing it this way still allows a user/developer to set XSM_FLASK_POLICY
> even when "checkpolicy" isn't available. It also prevents the build
> system from resetting the config when "checkpolicy" isn't available
> anymore. And XSM_FLASK_POLICY is still selected automatically when
> `checkpolicy` is available.
> But this also works well for "randconfig", as it will not select
> XSM_FLASK_POLICY when "checkpolicy" is missing.
> 
> This patch makes it easy to add more overrides which depend on the
> environment.
> 
> Also, move the check out of Config.mk and into xen/ build system.
> Nothing in tools/ is using that information as it's done by
> ./configure.
> 
> We named the new file ".allconfig.tmp" as ".*.tmp" are already ignored
> via .gitignore.
> 
> Signed-off-by: Anthony PERARD 

Reviewed-by: Luca Fancellu 

> ---
> v5:
> - remove changes to common/Kconfig in order to avoid change in
>  behavior for "default y if m" in future Kconfig update as the current
>  behavior doesn't seem to be explicitly documented.
> 
> v4:
> - keep XEN_ prefix for HAS_CHECKPOLICY
> - rework .allconfig.tmp file generation, so it is easier to read.
> - remove .allconfig.tmp on clean, .*.tmp files aren't all cleaned yet,
>  maybe for another time.
> - add information about file name choice and Kconfig change in patch
>  description.
> 
> v3:
> - use KCONFIG_ALLCONFIG
> - don't override XSM_FLASK_POLICY value unless we do randconfig.
> - no more changes to the current behavior of kconfig, only to
>  randconfig.
> 
> v2 was "[XEN PATCH v2] xen: allow XSM_FLASK_POLICY only if checkpolicy binary 
> is available"
> ---
> Config.mk|  6 --
> xen/Makefile | 20 +---
> 2 files changed, 17 insertions(+), 9 deletions(-)
> 
> diff --git a/Config.mk b/Config.mk
> index e85bf186547f..d5490e35d03d 100644
> --- a/Config.mk
> +++ b/Config.mk
> @@ -137,12 +137,6 @@ export XEN_HAS_BUILD_ID=y
> build_id_linker := --build-id=sha1
> endif
> 
> -ifndef XEN_HAS_CHECKPOLICY
> -CHECKPOLICY ?= checkpolicy
> -XEN_HAS_CHECKPOLICY := $(shell $(CHECKPOLICY) -h 2>&1 | grep -q xen && 
> echo y || echo n)
> -export XEN_HAS_CHECKPOLICY
> -endif
> -
> define buildmakevars2shellvars
> export PREFIX="$(prefix)";\
> export XEN_SCRIPT_DIR="$(XEN_SCRIPT_DIR)";\
> diff --git a/xen/Makefile b/xen/Makefile
> index f47423dacd9a..7c2ffce0fc77 100644
> --- a/xen/Makefile
> +++ b/xen/Makefile
> @@ -17,6 +17,8 @@ export XEN_BUILD_HOST   ?= $(shell hostname)
> PYTHON_INTERPRETER:= $(word 1,$(shell which python3 python python2 
> 2>/dev/null) python)
> export PYTHON ?= $(PYTHON_INTERPRETER)
> 
> +export CHECKPOLICY   ?= checkpolicy
> +
> export BASEDIR := $(CURDIR)
> export XEN_ROOT := $(BASEDIR)/..
> 
> @@ -178,6 +180,8 @@ CFLAGS += $(CLANG_FLAGS)
> export CLANG_FLAGS
> endif
> 
> +export XEN_HAS_CHECKPOLICY := $(call success,$(CHECKPOLICY) -h 2>&1 | grep 
> -q xen)
> +
> export root-make-done := y
> endif # root-make-done
> 
> @@ -189,14 +193,24 @@ ifeq ($(config-build),y)
> # *config targets only - make sure prerequisites are updated, and descend
> # in tools/kconfig to make the *config target
> 
> +# Create a file for KCONFIG_ALLCONFIG which depends on the environment.
> +# This will be use by kconfig targets 
> allyesconfig/allmodconfig/allnoconfig/randconfig
> +filechk_kconfig_allconfig = \
> +$(if $(findstring n,$(XEN_HAS_CHECKPOLICY)), echo 
> 'CONFIG_XSM_FLASK_POLICY=n';) \
> +$(if $(KCONFIG_ALLCONFIG), cat $(KCONFIG_ALLCONFIG);) \
> +:
> +
> +.allconfig.tmp: FORCE
> + set -e; { $(call filechk_kconfig_allconfig); } > $@
> +
> config: FORCE
>   $(MAKE) $(kconfig) $@
> 
> # Config.mk tries to include .config file, don't try to remake it
> %/.config: ;
> 
> -%config: FORCE
> - $(MAKE) $(kconfig) $@
> +%config: .allconfig.tmp FORCE
> + $(MAKE) $(kconfig) KCONFIG_ALLCONFIG=$< $@
> 
> else # !config-build
> 
> @@ -368,7 +382,7 @@ _clean: delete-unfresh-files
>   -o -name "*.gcno" -o -name ".*.cmd" -o -name "lib.a" \) -exec 
> rm -f {} \;
>   rm -f include/asm $(TARGET) $(TARGET).gz $(TARGET).efi 
> $(TARGET).efi.map $(TARGET)-syms $(TARGET)-syms.map *~ core
>   rm -f asm-offsets.s include/asm-*/asm-offsets.h
> - rm -f .banner
> + rm -f .banner .allconfig.tmp
> 
> .PHONY: _distclean
> _distclean: clean
> -- 
> Anthony PERARD
> 
> 




Re: [XEN PATCH v5] xen: rework `checkpolicy` detection when using "randconfig"

2021-09-28 Thread Jan Beulich
On 28.09.2021 17:04, Anthony PERARD wrote:
> This will help prevent the CI loop from having build failures when
> `checkpolicy` isn't available when doing "randconfig" jobs.
> 
> To prevent "randconfig" from selecting XSM_FLASK_POLICY when
> `checkpolicy` isn't available, we will actually override the config
> output with the use of KCONFIG_ALLCONFIG.
> 
> Doing this way still allow a user/developer to set XSM_FLASK_POLICY
> even when "checkpolicy" isn't available. It also prevent the build
> system from reset the config when "checkpolicy" isn't available
> anymore. And XSM_FLASK_POLICY is still selected automatically when
> `checkpolicy` is available.
> But this also work well for "randconfig", as it will not select
> XSM_FLASK_POLICY when "checkpolicy" is missing.
> 
> This patch allows to easily add more override which depends on the
> environment.
> 
> Also, move the check out of Config.mk and into xen/ build system.
> Nothing in tools/ is using that information as it's done by
> ./configure.
> 
> We named the new file ".allconfig.tmp" as ".*.tmp" are already ignored
> via .gitignore.
> 
> Signed-off-by: Anthony PERARD 

Reviewed-by: Jan Beulich 




Re: [XEN PATCH v4] xen: rework `checkpolicy` detection when using "randconfig"

2021-09-28 Thread Anthony PERARD
On Tue, Sep 28, 2021 at 03:34:00PM +0100, Anthony PERARD wrote:
> On Tue, Sep 28, 2021 at 03:46:01PM +0200, Jan Beulich wrote:
> > On 28.09.2021 10:39, Anthony PERARD wrote:
> > > This will help prevent the CI loop from having build failures when
> > > `checkpolicy` isn't available when doing "randconfig" jobs.
> > > 
> > > To prevent "randconfig" from selecting XSM_FLASK_POLICY when
> > > `checkpolicy` isn't available, we will actually override the config
> > > output with the use of KCONFIG_ALLCONFIG.
> > > 
> > > Doing this way still allow a user/developer to set XSM_FLASK_POLICY
> > > even when "checkpolicy" isn't available. It also prevent the build
> > > system from reset the config when "checkpolicy" isn't available
> > > anymore. And XSM_FLASK_POLICY is still selected automatically when
> > > `checkpolicy` is available.
> > > But this also work well for "randconfig", as it will not select
> > > XSM_FLASK_POLICY when "checkpolicy" is missing.
> > > 
> > > This patch allows to easily add more override which depends on the
> > > environment.
> > > 
> > > Also, move the check out of Config.mk and into xen/ build system.
> > > Nothing in tools/ is using that information as it's done by
> > > ./configure.
> > > 
> > > We named the new file ".allconfig.tmp" as ".*.tmp" are already ignored
> > > via .gitignore.
> > > 
> > > Remove '= y' in Kconfig as it isn't needed, only a value "y" is true,
> > > anything else is considered false.
> > 
> > Seeing you say this explicitly makes me wonder - is this actually true?
> 
> I've checked that this was true by empirical testing before sending the
> patch. But the documentation isn't clear to me about the meaning of
> 'default y if "m"'. So would you rather keep '= y' just to stay on the
> safe side?

I've sent v5 with this change to the Kconfig file removed.

-- 
Anthony PERARD



[XEN PATCH v5] xen: rework `checkpolicy` detection when using "randconfig"

2021-09-28 Thread Anthony PERARD
This will help prevent the CI loop from having build failures when
`checkpolicy` isn't available when doing "randconfig" jobs.

To prevent "randconfig" from selecting XSM_FLASK_POLICY when
`checkpolicy` isn't available, we will actually override the config
output with the use of KCONFIG_ALLCONFIG.

Doing it this way still allows a user/developer to set XSM_FLASK_POLICY
even when "checkpolicy" isn't available. It also prevents the build
system from resetting the config when "checkpolicy" isn't available
anymore. And XSM_FLASK_POLICY is still selected automatically when
`checkpolicy` is available.
But this also works well for "randconfig", as it will not select
XSM_FLASK_POLICY when "checkpolicy" is missing.

This patch makes it easy to add more overrides which depend on the
environment.

Also, move the check out of Config.mk and into xen/ build system.
Nothing in tools/ is using that information as it's done by
./configure.

We named the new file ".allconfig.tmp" as ".*.tmp" are already ignored
via .gitignore.

Signed-off-by: Anthony PERARD 
---
v5:
- remove changes to common/Kconfig in order to avoid a change in
  behavior for "default y if m" in a future Kconfig update, as the current
  behavior doesn't seem to be explicitly documented.

v4:
- keep XEN_ prefix for HAS_CHECKPOLICY
- rework .allconfig.tmp file generation, so it is easier to read.
- remove .allconfig.tmp on clean, .*.tmp files aren't all cleaned yet,
  maybe for another time.
- add information about file name choice and Kconfig change in patch
  description.

v3:
- use KCONFIG_ALLCONFIG
- don't override XSM_FLASK_POLICY value unless we do randconfig.
- no more changes to the current behavior of kconfig, only to
  randconfig.

v2 was "[XEN PATCH v2] xen: allow XSM_FLASK_POLICY only if checkpolicy binary is available"
---
 Config.mk|  6 --
 xen/Makefile | 20 +---
 2 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/Config.mk b/Config.mk
index e85bf186547f..d5490e35d03d 100644
--- a/Config.mk
+++ b/Config.mk
@@ -137,12 +137,6 @@ export XEN_HAS_BUILD_ID=y
 build_id_linker := --build-id=sha1
 endif
 
-ifndef XEN_HAS_CHECKPOLICY
-CHECKPOLICY ?= checkpolicy
-XEN_HAS_CHECKPOLICY := $(shell $(CHECKPOLICY) -h 2>&1 | grep -q xen && echo y || echo n)
-export XEN_HAS_CHECKPOLICY
-endif
-
 define buildmakevars2shellvars
 export PREFIX="$(prefix)";\
 export XEN_SCRIPT_DIR="$(XEN_SCRIPT_DIR)";\
diff --git a/xen/Makefile b/xen/Makefile
index f47423dacd9a..7c2ffce0fc77 100644
--- a/xen/Makefile
+++ b/xen/Makefile
@@ -17,6 +17,8 @@ export XEN_BUILD_HOST ?= $(shell hostname)
 PYTHON_INTERPRETER := $(word 1,$(shell which python3 python python2 2>/dev/null) python)
 export PYTHON  ?= $(PYTHON_INTERPRETER)
 
+export CHECKPOLICY ?= checkpolicy
+
 export BASEDIR := $(CURDIR)
 export XEN_ROOT := $(BASEDIR)/..
 
@@ -178,6 +180,8 @@ CFLAGS += $(CLANG_FLAGS)
 export CLANG_FLAGS
 endif
 
+export XEN_HAS_CHECKPOLICY := $(call success,$(CHECKPOLICY) -h 2>&1 | grep -q xen)
+
 export root-make-done := y
 endif # root-make-done
 
@@ -189,14 +193,24 @@ ifeq ($(config-build),y)
 # *config targets only - make sure prerequisites are updated, and descend
 # in tools/kconfig to make the *config target
 
+# Create a file for KCONFIG_ALLCONFIG which depends on the environment.
+# This will be used by kconfig targets allyesconfig/allmodconfig/allnoconfig/randconfig
+filechk_kconfig_allconfig = \
+$(if $(findstring n,$(XEN_HAS_CHECKPOLICY)), echo 'CONFIG_XSM_FLASK_POLICY=n';) \
+$(if $(KCONFIG_ALLCONFIG), cat $(KCONFIG_ALLCONFIG);) \
+:
+
+.allconfig.tmp: FORCE
+	set -e; { $(call filechk_kconfig_allconfig); } > $@
+
 config: FORCE
 	$(MAKE) $(kconfig) $@
 
 # Config.mk tries to include .config file, don't try to remake it
 %/.config: ;
 
-%config: FORCE
-	$(MAKE) $(kconfig) $@
+%config: .allconfig.tmp FORCE
+	$(MAKE) $(kconfig) KCONFIG_ALLCONFIG=$< $@
 
 else # !config-build
 
@@ -368,7 +382,7 @@ _clean: delete-unfresh-files
 	-o -name "*.gcno" -o -name ".*.cmd" -o -name "lib.a" \) -exec rm -f {} \;
 	rm -f include/asm $(TARGET) $(TARGET).gz $(TARGET).efi $(TARGET).efi.map $(TARGET)-syms $(TARGET)-syms.map *~ core
 	rm -f asm-offsets.s include/asm-*/asm-offsets.h
-	rm -f .banner
+	rm -f .banner .allconfig.tmp
 
 .PHONY: _distclean
 _distclean: clean
-- 
Anthony PERARD
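
For readers who want the override logic spelled out, here is a minimal C sketch of what the `filechk_kconfig_allconfig` rule writes into `.allconfig.tmp`. It is only an illustration of the description above, not part of the patch: the function name `sketch_allconfig` and its parameters are hypothetical, and the user's KCONFIG_ALLCONFIG file is modelled as an in-memory string.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: build the text that the
 * Makefile rule would write into .allconfig.tmp.
 *
 * has_checkpolicy: "y" or "n", as in XEN_HAS_CHECKPOLICY.
 * user_allconfig:  contents of the user's KCONFIG_ALLCONFIG file, or NULL.
 *
 * Returns the snprintf() result, i.e. the length the full text needs.
 */
static inline int sketch_allconfig(char *out, size_t len,
                                   const char *has_checkpolicy,
                                   const char *user_allconfig)
{
    /* Mirrors $(findstring n,$(XEN_HAS_CHECKPOLICY)). */
    const char *override = strchr(has_checkpolicy, 'n')
                           ? "CONFIG_XSM_FLASK_POLICY=n\n" : "";

    /* Mirrors "cat $(KCONFIG_ALLCONFIG)" when the variable is set. */
    return snprintf(out, len, "%s%s", override,
                    user_allconfig ? user_allconfig : "");
}
```

With `has_checkpolicy` set to "n" the override line comes first and any user-supplied content follows; with "y" only the user-supplied content (possibly nothing) is emitted, so XSM_FLASK_POLICY is left for kconfig to decide.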




Re: [XEN PATCH v4] xen: rework `checkpolicy` detection when using "randconfig"

2021-09-28 Thread Luca Fancellu



> On 28 Sep 2021, at 09:39, Anthony PERARD  wrote:
> 
> This will help prevent the CI loop from having build failures when
> `checkpolicy` isn't available when doing "randconfig" jobs.
> 
> To prevent "randconfig" from selecting XSM_FLASK_POLICY when
> `checkpolicy` isn't available, we will actually override the config
> output with the use of KCONFIG_ALLCONFIG.
> 
> Doing this way still allow a user/developer to set XSM_FLASK_POLICY
> even when "checkpolicy" isn't available. It also prevent the build
> system from reset the config when "checkpolicy" isn't available
> anymore. And XSM_FLASK_POLICY is still selected automatically when
> `checkpolicy` is available.
> But this also work well for "randconfig", as it will not select
> XSM_FLASK_POLICY when "checkpolicy" is missing.
> 
> This patch allows to easily add more override which depends on the
> environment.
> 
> Also, move the check out of Config.mk and into xen/ build system.
> Nothing in tools/ is using that information as it's done by
> ./configure.
> 
> We named the new file ".allconfig.tmp" as ".*.tmp" are already ignored
> via .gitignore.
> 
> Remove '= y' in Kconfig as it isn't needed, only a value "y" is true,
> anything else is considered false.

I don’t know if it is true, I’m having a look here: 
https://www.kernel.org/doc/Documentation/kbuild/kconfig-language.txt

And the section “Menu dependencies” states that:

An expression can have a value of 'n', 'm' or 'y' (or 0, 1, 2
respectively for calculations).

So it seems to me that m and y are evaluated as true, am I wrong?
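
To make the rules from that section concrete, here is a small C sketch of the documented tristate arithmetic. It is an illustration only, not kconfig's code: the names are made up, and it models just what the document states ('n', 'm', 'y' as 0, 1, 2; `&&` as minimum; `||` as maximum; `!` as 2 minus the value; an entry being visible when its expression is 'm' or 'y'). How the tool treats "m" for a bool symbol when modules are disabled is the open question in this thread and is not modelled.

```c
#include <stdbool.h>

/* Tristate values as documented for Kconfig expressions: n=0, m=1, y=2. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/* '!' <expr> is documented as (2 - expr). */
static inline enum tristate tri_not(enum tristate a)
{
    return (enum tristate)(2 - a);
}

/* <expr> '&&' <expr> is documented as min(a, b). */
static inline enum tristate tri_and(enum tristate a, enum tristate b)
{
    return a < b ? a : b;
}

/* <expr> '||' <expr> is documented as max(a, b). */
static inline enum tristate tri_or(enum tristate a, enum tristate b)
{
    return a > b ? a : b;
}

/* A menu entry becomes visible when its expression is 'm' or 'y'. */
static inline bool tri_visible(enum tristate e)
{
    return e != TRI_N;
}
```

Under these rules `tri_visible(TRI_M)` is true, which is why a bare condition and an explicit `= y` comparison are not obviously equivalent.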

Cheers,
Luca

> 
> Signed-off-by: Anthony PERARD 
> ---
> v4:
> - keep XEN_ prefix for HAS_CHECKPOLICY
> - rework .allconfig.tmp file generation, so it is easier to read.
> - remove .allconfig.tmp on clean, .*.tmp files aren't all cleaned yet,
>  maybe for another time.
> - add information about file name choice and Kconfig change in patch
>  description.
> 
> v3:
> - use KCONFIG_ALLCONFIG
> - don't override XSM_FLASK_POLICY value unless we do randconfig.
> - no more changes to the current behavior of kconfig, only to
>  randconfig.
> 
> v2 was "[XEN PATCH v2] xen: allow XSM_FLASK_POLICY only if checkpolicy binary 
> is available"
> ---
> Config.mk  |  6 --
> xen/Makefile   | 20 +---
> xen/common/Kconfig |  2 +-
> 3 files changed, 18 insertions(+), 10 deletions(-)
> 
> diff --git a/Config.mk b/Config.mk
> index e85bf186547f..d5490e35d03d 100644
> --- a/Config.mk
> +++ b/Config.mk
> @@ -137,12 +137,6 @@ export XEN_HAS_BUILD_ID=y
> build_id_linker := --build-id=sha1
> endif
> 
> -ifndef XEN_HAS_CHECKPOLICY
> -CHECKPOLICY ?= checkpolicy
> -XEN_HAS_CHECKPOLICY := $(shell $(CHECKPOLICY) -h 2>&1 | grep -q xen && 
> echo y || echo n)
> -export XEN_HAS_CHECKPOLICY
> -endif
> -
> define buildmakevars2shellvars
> export PREFIX="$(prefix)";\
> export XEN_SCRIPT_DIR="$(XEN_SCRIPT_DIR)";\
> diff --git a/xen/Makefile b/xen/Makefile
> index f47423dacd9a..7c2ffce0fc77 100644
> --- a/xen/Makefile
> +++ b/xen/Makefile
> @@ -17,6 +17,8 @@ export XEN_BUILD_HOST   ?= $(shell hostname)
> PYTHON_INTERPRETER:= $(word 1,$(shell which python3 python python2 
> 2>/dev/null) python)
> export PYTHON ?= $(PYTHON_INTERPRETER)
> 
> +export CHECKPOLICY   ?= checkpolicy
> +
> export BASEDIR := $(CURDIR)
> export XEN_ROOT := $(BASEDIR)/..
> 
> @@ -178,6 +180,8 @@ CFLAGS += $(CLANG_FLAGS)
> export CLANG_FLAGS
> endif
> 
> +export XEN_HAS_CHECKPOLICY := $(call success,$(CHECKPOLICY) -h 2>&1 | grep 
> -q xen)
> +
> export root-make-done := y
> endif # root-make-done
> 
> @@ -189,14 +193,24 @@ ifeq ($(config-build),y)
> # *config targets only - make sure prerequisites are updated, and descend
> # in tools/kconfig to make the *config target
> 
> +# Create a file for KCONFIG_ALLCONFIG which depends on the environment.
> +# This will be use by kconfig targets 
> allyesconfig/allmodconfig/allnoconfig/randconfig
> +filechk_kconfig_allconfig = \
> +$(if $(findstring n,$(XEN_HAS_CHECKPOLICY)), echo 
> 'CONFIG_XSM_FLASK_POLICY=n';) \
> +$(if $(KCONFIG_ALLCONFIG), cat $(KCONFIG_ALLCONFIG);) \
> +:
> +
> +.allconfig.tmp: FORCE
> + set -e; { $(call filechk_kconfig_allconfig); } > $@
> +
> config: FORCE
>   $(MAKE) $(kconfig) $@
> 
> # Config.mk tries to include .config file, don't try to remake it
> %/.config: ;
> 
> -%config: FORCE
> - $(MAKE) $(kconfig) $@
> +%config: .allconfig.tmp FORCE
> + $(MAKE) $(kconfig) KCONFIG_ALLCONFIG=$< $@
> 
> else # !config-build
> 
> @@ -368,7 +382,7 @@ _clean: delete-unfresh-files
>   -o -name "*.gcno" -o -name ".*.cmd" -o -name "lib.a" \) -exec 
> rm -f {} \;
>   rm -f include/asm $(TARGET) $(TARGET).gz $(TARGET).efi 
> $(TARGET).efi.map $(TARGET)-syms $(TARGET)-syms.map *~ core
>   rm -f asm-offsets.s include/asm-*/asm-offsets.h
> - rm -f .banner
> + rm -f .banner .allconfig.tmp
> 
> 

Re: [PATCH v2 10/11] xen/arm: Do not map PCI ECAM and MMIO space to Domain-0's p2m

2021-09-28 Thread Oleksandr Andrushchenko
[snip]
>> Sorry I didn't follow your explanation.
>>
>> My suggestion is to remove the #ifdef CONFIG_HAS_PCI completely from
>> map_range_to_domain. At the beginning of map_range_to_domain, there is
>> already this line:
>>
>> bool need_mapping = !dt_device_for_passthrough(dev);
>>
>> We can change it into:
>>
>> bool need_mapping = !dt_device_for_passthrough(dev) &&
>>       !mr_data->skip_mapping;
>>
>>
>> Then, in map_device_children we can set mr_data->skip_mapping to true
>> for PCI devices.
> This is the key. I am fine with this, but it just means we move the
> check to the outside of this function which looks good. Will do
>
>>There is already a pci check there:
>>
>>if ( dt_device_type_is_equal(dev, "pci") )
>>
>> so it should be easy to do. What am I missing?
>>
>>
>
I did some experiments. If we move the check to map_device_children
it is not enough because part of the ranges is still mapped at handle_device
level:

handle_device:
(XEN) --- /axi/pcie@fd0e need_mapping 1 addr fd0e
(XEN) --- /axi/pcie@fd0e need_mapping 1 addr fd48
(XEN) --- /axi/pcie@fd0e need_mapping 1 addr 80

map_device_children:
(XEN) Mapping children of /axi/pcie@fd0e to guest skip 1
(XEN) --- /axi/pcie@fd0e need_mapping 0 addr e000
(XEN) --- /axi/pcie@fd0e need_mapping 0 addr 6

pci_host_bridge_mappings:
(XEN) --- /axi/pcie@fd0e need_mapping 1 addr fd0e
(XEN) --- /axi/pcie@fd0e need_mapping 1 addr fd48

So I made a more intrusive change:

@@ -1540,6 +1534,12 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
  int res;
  u64 addr, size;
  bool need_mapping = !dt_device_for_passthrough(dev);
+    struct map_range_data mr_data = {
+    .d = d,
+    .p2mt = p2mt,
+    .skip_mapping = is_pci_passthrough_enabled() &&
+    (device_get_class(dev) == DEVICE_PCI)
+    };

With this I see that now mappings are done correctly:

handle_device:
(XEN) --- /axi/pcie@fd0e need_mapping 0 addr fd0e
(XEN) --- /axi/pcie@fd0e need_mapping 0 addr fd48
(XEN) --- /axi/pcie@fd0e need_mapping 0 addr 80

map_device_children:
(XEN) Mapping children of /axi/pcie@fd0e to guest skip 1
(XEN) --- /axi/pcie@fd0e need_mapping 0 addr e000
(XEN) --- /axi/pcie@fd0e need_mapping 0 addr 6

pci_host_bridge_mappings:
(XEN) --- /axi/pcie@fd0e need_mapping 1 addr fd0e
(XEN) --- /axi/pcie@fd0e need_mapping 1 addr fd48

So, handle_device seems to be the right place. While at it, I have also
optimized the way we set up struct map_range_data mr_data in both
handle_device and map_device_children: I removed the structure initialization
from within the relevant loop and now pass mr_data to map_device_children,
so it doesn't need to create its own copy and perform yet another
computation of .skip_mapping. That computation needs to know not only
that dev is a PCI device (which is what the dt_device_type_is_equal(dev, "pci")
check does), but also to take is_pci_passthrough_enabled() into account.

Thus, the change will be more intrusive, but I hope it will simplify things.

I am attaching the fixup patch just in case you want more details.
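
The decision described above boils down to two boolean expressions. The following C sketch shows them in isolation; it is a simplified stand-in, not the Xen code: the struct and function names with a `sketch` suffix or prefix are hypothetical, and the results of dt_device_for_passthrough(), is_pci_passthrough_enabled() and the PCI class check are passed in as plain booleans.

```c
#include <stdbool.h>

/* Hypothetical stand-in; only the skip_mapping field name mirrors the patch. */
struct map_range_data_sketch {
    bool skip_mapping;
};

/*
 * Computed once per device in handle_device: skip the mapping only when
 * PCI passthrough is enabled AND the device is a PCI host bridge.
 */
static inline bool sketch_skip_mapping(bool pci_passthrough_enabled,
                                       bool dev_is_pci)
{
    return pci_passthrough_enabled && dev_is_pci;
}

/*
 * Evaluated per range in map_range_to_domain: map unless the device is
 * marked for passthrough or the caller asked to skip the mapping.
 */
static inline bool sketch_need_mapping(bool dev_for_passthrough,
                                       const struct map_range_data_sketch *mr_data)
{
    return !dev_for_passthrough && !mr_data->skip_mapping;
}
```

Because skip_mapping is carried in the data passed down, the same value applies to the ranges handled in handle_device and to those of the children, which matches the "need_mapping 0" lines in the second log.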

Thank you,

Oleksandr


From 07d6523be2535293d3e34ffd1c8508a0812a4cd8 Mon Sep 17 00:00:00 2001
From: Oleksandr Andrushchenko 
Date: Tue, 28 Sep 2021 13:24:42 +0300
Subject: [PATCH] Fixes: 4480fb1a5c83 ("xen/arm: Do not map PCI ECAM and MMIO
 space to Domain-0's p2m")

Since v2:
 - removed check in map_range_to_domain for PCI_DEV
   and moved it to handle_device, so the code is
   simpler
 - s/map_pci_bridge/skip_mapping
 - extended comment in pci_host_bridge_mappings
 - minor code restructure in construct_dom0
 - s/.need_p2m_mapping/.need_p2m_hwdom_mapping and related
   callbacks
 - unsigned int i; in pci_host_bridge_mappings

Signed-off-by: Oleksandr Andrushchenko 
---
 xen/arch/arm/domain_build.c| 43 +++---
 xen/arch/arm/pci/ecam.c|  8 +++---
 xen/arch/arm/pci/pci-host-common.c | 15 ++-
 xen/arch/arm/pci/pci-host-zynqmp.c |  2 +-
 xen/include/asm-arm/pci.h  | 12 -
 xen/include/asm-arm/setup.h|  2 +-
 6 files changed, 36 insertions(+), 46 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index e72c1b881cae..17f3db6a1f48 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -1386,7 +1386,8 @@ int __init map_range_to_domain(const struct dt_device_node *dev,
 {
 struct map_range_data *mr_data = data;
 struct domain *d = mr_data->d;
-bool need_mapping = !dt_device_for_passthrough(dev);
+bool need_mapping = !dt_device_for_passthrough(dev) &&
+!mr_data->skip_mapping;
 int res;
 
 /*
@@ -1409,13 +1410,6 @@ int __init map_range_to_domain(const struct dt_device_node *dev,
 }
 }
 
-#ifdef CONFIG_HAS_PCI
-if ( 

Re: [XEN PATCH v4] xen: rework `checkpolicy` detection when using "randconfig"

2021-09-28 Thread Anthony PERARD
On Tue, Sep 28, 2021 at 03:46:01PM +0200, Jan Beulich wrote:
> On 28.09.2021 10:39, Anthony PERARD wrote:
> > This will help prevent the CI loop from having build failures when
> > `checkpolicy` isn't available when doing "randconfig" jobs.
> > 
> > To prevent "randconfig" from selecting XSM_FLASK_POLICY when
> > `checkpolicy` isn't available, we will actually override the config
> > output with the use of KCONFIG_ALLCONFIG.
> > 
> > Doing this way still allow a user/developer to set XSM_FLASK_POLICY
> > even when "checkpolicy" isn't available. It also prevent the build
> > system from reset the config when "checkpolicy" isn't available
> > anymore. And XSM_FLASK_POLICY is still selected automatically when
> > `checkpolicy` is available.
> > But this also work well for "randconfig", as it will not select
> > XSM_FLASK_POLICY when "checkpolicy" is missing.
> > 
> > This patch allows to easily add more override which depends on the
> > environment.
> > 
> > Also, move the check out of Config.mk and into xen/ build system.
> > Nothing in tools/ is using that information as it's done by
> > ./configure.
> > 
> > We named the new file ".allconfig.tmp" as ".*.tmp" are already ignored
> > via .gitignore.
> > 
> > Remove '= y' in Kconfig as it isn't needed, only a value "y" is true,
> > anything else is considered false.
> 
> Seeing you say this explicitly makes me wonder - is this actually true?

I've checked that this was true by empirical testing before sending the
patch. But the documentation isn't clear to me about the meaning of
'default y if "m"'. So would you rather keep '= y' just to stay on the
safe side?

> At least when modules are enabled (which our kconfig is capable of even
> if we don't use that part of it), "m" is also "kind of" true, and the
> related logic really isn't quite boolean iirc.
> 
> Everything else looks good to me now, thanks.

Thanks,

-- 
Anthony PERARD



Re: [XEN PATCH v4] xen: rework `checkpolicy` detection when using "randconfig"

2021-09-28 Thread Jan Beulich
On 28.09.2021 10:39, Anthony PERARD wrote:
> This will help prevent the CI loop from having build failures when
> `checkpolicy` isn't available when doing "randconfig" jobs.
> 
> To prevent "randconfig" from selecting XSM_FLASK_POLICY when
> `checkpolicy` isn't available, we will actually override the config
> output with the use of KCONFIG_ALLCONFIG.
> 
> Doing this way still allow a user/developer to set XSM_FLASK_POLICY
> even when "checkpolicy" isn't available. It also prevent the build
> system from reset the config when "checkpolicy" isn't available
> anymore. And XSM_FLASK_POLICY is still selected automatically when
> `checkpolicy` is available.
> But this also work well for "randconfig", as it will not select
> XSM_FLASK_POLICY when "checkpolicy" is missing.
> 
> This patch allows to easily add more override which depends on the
> environment.
> 
> Also, move the check out of Config.mk and into xen/ build system.
> Nothing in tools/ is using that information as it's done by
> ./configure.
> 
> We named the new file ".allconfig.tmp" as ".*.tmp" are already ignored
> via .gitignore.
> 
> Remove '= y' in Kconfig as it isn't needed, only a value "y" is true,
> anything else is considered false.

Seeing you say this explicitly makes me wonder - is this actually true?
At least when modules are enabled (which our kconfig is capable of even
if we don't use that part of it), "m" is also "kind of" true, and the
related logic really isn't quite boolean iirc.

Everything else looks good to me now, thanks.

Jan




Ping: [PATCH 1/2] gnttab: remove guest_physmap_remove_page() call from gnttab_map_frame()

2021-09-28 Thread Jan Beulich
On 22.09.2021 12:00, Roger Pau Monné wrote:
> On Wed, Sep 22, 2021 at 11:42:30AM +0200, Jan Beulich wrote:
>> On 22.09.2021 11:26, Roger Pau Monné wrote:
>>> On Tue, Sep 21, 2021 at 12:12:05PM +0200, Jan Beulich wrote:
>>>> On 21.09.2021 10:32, Roger Pau Monné wrote:
>>>>> On Mon, Sep 20, 2021 at 05:27:17PM +0200, Jan Beulich wrote:
>>>>>> On 20.09.2021 12:20, Roger Pau Monné wrote:
>>>>>>> On Mon, Sep 13, 2021 at 08:41:47AM +0200, Jan Beulich wrote:
>>>>>>>> --- a/xen/include/asm-arm/grant_table.h
>>>>>>>> +++ b/xen/include/asm-arm/grant_table.h
>>>>>>>> +if ( gfn_eq(ogfn, INVALID_GFN) || gfn_eq(ogfn, gfn) || \
>>>>>>>
>>>>>>> I'm slightly confused by these checks, don't you need to check for
>>>>>>> gfn_eq(gfn, INVALID_GFN) (not ogfn) in order to call
>>>>>>> guest_physmap_remove_page?
>>>>>>
>>>>>> Why? It's ogfn which gets passed to the function. And it indeed is the
>>>>>> prior GFN's mapping that we want to remove here.
>>>>>>
>>>>>>> Or assuming that ogfn is not invalid can be used to imply a removal?
>>>>>>
>>>>>> That implication can be (and on x86 is) used for the incoming argument,
>>>>>> i.e. "gfn". I don't think "ogfn" can serve this purpose.
>>>>>
>>>>> I guess I'm confused due to the ogfn checks done on the Arm side that
>>>>> are not performed on x86. So on Arm you always need to explicitly
>>>>> unhook the previous GFN before attempting to setup a new mapping,
>>>>> while on x86 you only need to do this when it's a removal in order to
>>>>> clear the entry?
>>>>
>>>> The difference isn't with guest_physmap_add_entry() (both x86 and
>>>> Arm only insert a new mapping there), but with
>>>> xenmem_add_to_physmap_one(): Arm's variant doesn't care about prior
>>>> mappings. And gnttab_map_frame() gets called only from there. This
>>>> is effectively what the first paragraph of the description is about.
>>>
>>> OK, sorry, it wasn't clear to me from the description. Could you
>>> explicitly mention in the description that the removal is moved into
>>> gnttab_set_frame_gfn on Arm in order to cope with the fact that
>>> xenmem_add_to_physmap_one doesn't perform it.
>>
>> Well, it's not really "in order to cope" - that's true for the placement
>> prior to this change as well, so not a justification for the change.
>> Nevertheless I've tried to make this more clear by changing the 1st
>> paragraph to:
>>
>> "Without holding appropriate locks, attempting to remove a prior mapping
>>  of the underlying page is pointless, as the same (or another) mapping
>>  could be re-established by a parallel request on another vCPU. Move the
>>  code to Arm's gnttab_set_frame_gfn(); it cannot be dropped there since
>>  xenmem_add_to_physmap_one() doesn't call it either (unlike on x86). Of
>>  course this new placement doesn't improve things in any way as far as
>>  the security of grant status frame mappings goes (see XSA-379). Proper
>>  locking would be needed here to allow status frames to be mapped
>>  securely."
> 
> Thanks, this is indeed much clearer IMO:
> 
> Acked-by: Roger Pau Monné 

Any chance of an Arm ack (or otherwise) here?

Thanks, Jan

> Albeit I still think we need to fix Arm side to do the removal in
> xenmem_add_to_physmap_one (or the x86 side to not do it).
> 
> Thanks, Roger.
> 




Re: [PATCH v2 10/11] vpci: Add initial support for virtual PCI bus topology

2021-09-28 Thread Oleksandr Andrushchenko

On 28.09.21 11:17, Michal Orzel wrote:
>
> On 28.09.2021 09:59, Jan Beulich wrote:
>> On 28.09.2021 09:48, Michal Orzel wrote:
>>> On 23.09.2021 14:55, Oleksandr Andrushchenko wrote:
>>>> --- a/xen/drivers/passthrough/pci.c
>>>> +++ b/xen/drivers/passthrough/pci.c
>>>> @@ -833,6 +833,63 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn)
>>>>       return ret;
>>>>   }
>>>>   
>>>> +static struct vpci_dev *pci_find_virtual_device(const struct domain *d,
>>>> +                                                const struct pci_dev *pdev)
>>>> +{
>>>> +    struct vpci_dev *vdev;
>>>> +
>>>> +    list_for_each_entry ( vdev, &d->vdev_list, list )
>>>> +        if ( vdev->pdev == pdev )
>>>> +            return vdev;
>>>> +    return NULL;
>>>> +}
>>>> +
>>>> +int pci_add_virtual_device(struct domain *d, const struct pci_dev *pdev)
>>>> +{
>>>> +    struct vpci_dev *vdev;
>>>> +
>>>> +    ASSERT(!pci_find_virtual_device(d, pdev));
>>>> +
>>>> +    /* Each PCI bus supports 32 devices/slots at max. */
>>>> +    if ( d->vpci_dev_next > 31 )
>>>> +        return -ENOSPC;
>>>> +
>>>> +    vdev = xzalloc(struct vpci_dev);
>>>> +    if ( !vdev )
>>>> +        return -ENOMEM;
>>>> +
>>>> +    /* We emulate a single host bridge for the guest, so segment is always 0. */
>>>> +    *(u16*) &vdev->seg = 0;
>>> Empty line here would improve readability due to the asterisks being so
>>> close to each other.
Will add
>>> Apart from that:
>>> Reviewed-by: Michal Orzel 
>>>> +    /*
>>>> +     * The bus number is set to 0, so virtual devices are seen
>>>> +     * as embedded endpoints behind the root complex.
>>>> +     */
>>>> +    *((u8*) &vdev->bus) = 0;
>>>> +    *((u8*) &vdev->devfn) = PCI_DEVFN(d->vpci_dev_next++, 0);
>> All of these casts are (a) malformed and (b) unnecessary in the first
>> place, afaics at least.
>>
> Agree.
> *((u8*) &vdev->bus) = 0;
> is the same as:
> vdev->bus = 0;

Overengineering at its best ;)

Will fix that
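
For reference, the cast-free form could look roughly like the C sketch below. It is a simplified stand-in rather than the actual fix: the struct, the macro and the function carry a `sketch` marker because they are hypothetical, the fields are assumed to be plain (non-const) integers, and the slot counter is passed in instead of living in struct domain. The devfn encoding (slot in bits 7:3, function in bits 2:0) is the usual PCI one.

```c
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the structure in the patch. */
struct vpci_dev_sketch {
    uint16_t seg;
    uint8_t bus;
    uint8_t devfn;
};

/* Usual PCI encoding: slot in bits 7:3, function in bits 2:0. */
#define SKETCH_PCI_DEVFN(slot, func) \
    ((uint8_t)((((slot) & 0x1f) << 3) | ((func) & 0x07)))

/*
 * Assign segment 0, bus 0 and the next free slot using plain assignments.
 * Returns 0 on success, -1 when all 32 slots are taken (the patch returns
 * -ENOSPC in that case).
 */
static inline int sketch_assign_sbdf(struct vpci_dev_sketch *vdev,
                                     unsigned int *next_slot)
{
    if ( *next_slot > 31 )
        return -1;

    vdev->seg = 0;
    vdev->bus = 0;
    vdev->devfn = SKETCH_PCI_DEVFN((*next_slot)++, 0);

    return 0;
}
```

If the fields had to stay const-qualified for other reasons, the initialization would need a different approach; that trade-off is outside what this sketch covers.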

>> Jan
>>
Thank you,

Oleksandr


[qemu-mainline test] 165226: regressions - FAIL

2021-09-28 Thread osstest service owner
flight 165226 qemu-mainline real [real]
flight 165232 qemu-mainline real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/165226/
http://logs.test-lab.xenproject.org/osstest/logs/165232/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-arm64-arm64-libvirt-raw 17 guest-start/debian.repeat fail REGR. vs. 164950

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop fail like 164950
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 164950
 test-armhf-armhf-libvirt 16 saverestore-support-check fail  like 164950
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 164950
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 164950
 test-armhf-armhf-libvirt-raw 15 saverestore-support-check fail  like 164950
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 164950
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop fail like 164950
 test-amd64-i386-xl-pvshim 14 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-check fail   never pass
 test-amd64-i386-libvirt  15 migrate-support-check fail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl  15 migrate-support-check fail   never pass
 test-arm64-arm64-xl  16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check fail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-check fail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-check fail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-check fail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-check fail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-check fail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-check fail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-check fail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-check fail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-check fail   never pass
 test-armhf-armhf-xl  15 migrate-support-check fail   never pass
 test-armhf-armhf-xl  16 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-check fail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-check fail   never pass

version targeted for testing:
 qemuu                de8ed1055c2ce18c95f597eb10df360dcb534f99
baseline version:
 qemuu                99c44988d5ba1866a411450c877ed818b1b70081

Last test of basis   164950  

Re: [PATCH v3] tools/xl: fix autoballoon regex

2021-09-28 Thread Anthony PERARD
On Thu, Sep 16, 2021 at 03:15:21PM +0300, Dmitry Isaykin wrote:
> This regex is used for auto-balloon mode detection based on Xen command line.
> 
> The case of specifying a negative size was handled incorrectly.
> From misc/xen-command-line documentation:
> 
> dom0_mem (x86)
> = List of ( min:<sz> | max:<sz> | <sz> )
> 
> If a size is positive, it represents an absolute value.
> If a size is negative, it is subtracted from the total available memory.
> 
> Also add support for [tT] granularity suffix.
> Also add support for memory fractions (i.e. '50%' or '1G+25%').
> 
> Signed-off-by: Dmitry Isaykin 
> ---
>  ret = regcomp(,
> -  "(^| )dom0_mem=((|min:|max:)[0-9]+[bBkKmMgG]?,?)+($| )",
> +  "(^| )dom0_mem=((|min:|max:)(-?[0-9]+[bBkKmMgGtT]?)?(\+?[0-9]+%)?,?)+($| )",

It seems that by trying to match fractions, the new regex would match
too much. For example, if there is " dom0_mem= " on the command line, xl
would detect it as autoballoon=off, while it isn't the case without this
patch. I don't know if it is possible to have "dom0_mem=" on the command
line as I don't know what Xen would do in this case.

It might be better to make the regex more complicated and match
fractions as they are described in the doc, something like:
( <size> | (<size>\+)?<frac>% )

unless xen doesn't boot with a bogus value for dom0_mem, but I haven't
checked. (We could use CPP macros to avoid duplicating the <size>
regex.)

Also, <frac> is supposed to be < 100, so [0-9]{1,2} would be better, to
match no more than 2 digits.

Thought?

Thanks,

-- 
Anthony PERARD
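
A minimal C sketch of the stricter pattern discussed in this mail,
assuming POSIX regcomp() with REG_EXTENDED. The macro names (DOM0_SIZE,
DOM0_FRAC, ...) and the function cmdline_sets_dom0_mem() are hypothetical
and not taken from xl. The sketch also treats the comma as a separator
between items and writes the optional prefix as (min:|max:)?, both of
which differ from the regex in the patch.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Building blocks, composed with CPP string concatenation so the
 * size expression is written only once. */
#define DOM0_SIZE "-?[0-9]+[bBkKmMgGtT]?"
#define DOM0_FRAC "[0-9]{1,2}%"
#define DOM0_SZ   "(" DOM0_SIZE "|(" DOM0_SIZE "\\+)?" DOM0_FRAC ")"
#define DOM0_ITEM "(min:|max:)?" DOM0_SZ
#define DOM0_MEM_RE "(^| )dom0_mem=" DOM0_ITEM "(," DOM0_ITEM ")*($| )"

/* Returns 1 if the command line carries a well-formed dom0_mem option,
 * 0 if it does not, and -1 if the pattern fails to compile. */
int cmdline_sets_dom0_mem(const char *cmdline)
{
    regex_t re;
    int rc;

    assert(cmdline != NULL);
    if (regcomp(&re, DOM0_MEM_RE, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, cmdline, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

With this shape every item needs at least one digit, so a bare
"dom0_mem=" no longer matches, and "dom0_mem=100%" is rejected because
the fraction is limited to two digits.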



Re: [PATCH v2] pci: fix handling of PCI bridges with subordinate bus number 0xff

2021-09-28 Thread Bertrand Marquis
Hi,

> On 24 Sep 2021, at 10:10, Igor Druzhinin  wrote:
> 
> Bus number 0xff is valid according to the PCI spec. Using u8 typed sub_bus
> and assigning 0xff to it will result in the following loop getting stuck.
> 
>for ( ; sec_bus <= sub_bus; sec_bus++ ) {...}
> 
> Just change its type to unsigned int similarly to what is already done in
> dmar_scope_add_buses().
> 
> Signed-off-by: Igor Druzhinin 
Reviewed-by: Bertrand Marquis 

Cheers
Bertrand


> ---
> v2:
> - fix free_pdev() as well
> - style fixes
> ---
> xen/drivers/passthrough/pci.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
> index fc4fa2e..d65cda8 100644
> --- a/xen/drivers/passthrough/pci.c
> +++ b/xen/drivers/passthrough/pci.c
> @@ -363,8 +363,7 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn)
> /* update bus2bridge */
> switch ( pdev->type = pdev_type(pseg->nr, bus, devfn) )
> {
> -u16 cap;
> -u8 sec_bus, sub_bus;
> +unsigned int cap, sec_bus, sub_bus;
> 
> case DEV_TYPE_PCIe2PCI_BRIDGE:
> case DEV_TYPE_LEGACY_PCI_BRIDGE:
> @@ -431,7 +430,7 @@ static void free_pdev(struct pci_seg *pseg, struct pci_dev *pdev)
> /* update bus2bridge */
> switch ( pdev->type )
> {
> -uint8_t sec_bus, sub_bus;
> +unsigned int sec_bus, sub_bus;
> 
> case DEV_TYPE_PCIe2PCI_BRIDGE:
> case DEV_TYPE_LEGACY_PCI_BRIDGE:
> -- 
> 2.7.4
> 
> 
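
The wraparound described in the commit message can be reproduced in
isolation. The following sketch uses hypothetical helper names and an
artificial iteration cap (so that it terminates); it is not the code
from pci.c.

```c
#include <assert.h>
#include <stdint.h>

/* Same loop shape as in the commit message, with 8-bit counters.
 * 'cap' only exists so the demonstration returns: with sub_bus == 0xff
 * the condition "sec_bus <= sub_bus" can never become false, because
 * sec_bus wraps from 0xff back to 0x00. */
unsigned int walk_buses_u8(uint8_t sec_bus, uint8_t sub_bus, unsigned int cap)
{
    unsigned int iterations = 0;

    assert(cap > 0);
    for ( ; sec_bus <= sub_bus; sec_bus++ )
    {
        if ( ++iterations >= cap )
            break;
    }
    return iterations;
}

/* The same loop with unsigned int counters: 0xff + 1 is 0x100, which is
 * greater than sub_bus, so the loop ends as intended. */
unsigned int walk_buses_uint(unsigned int sec_bus, unsigned int sub_bus,
                             unsigned int cap)
{
    unsigned int iterations = 0;

    assert(cap > 0);
    for ( ; sec_bus <= sub_bus; sec_bus++ )
    {
        if ( ++iterations >= cap )
            break;
    }
    return iterations;
}
```

For a bridge with secondary bus 0xfe and subordinate bus 0xff, the 8-bit
variant runs until it hits the cap, while the unsigned int variant
visits exactly two buses.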




Re: [PATCH v5 2/2] tools/xenstore: set open file descriptor limit for xenstored

2021-09-28 Thread Juergen Gross

On 28.09.21 14:02, Ian Jackson wrote:

> Juergen Gross writes ("[PATCH v5 2/2] tools/xenstore: set open file descriptor limit for xenstored"):
>> Add a configuration item for the maximum number of open file
>> descriptors xenstored should be allowed to have.
>>
>> The default should be "unlimited" in order not to restrict xenstored
>> in the number of domains it can support, but unfortunately the kernel
>> is normally limiting the maximum value via /proc/sys/fs/nr_open [1],
>> [2]. So check that file to exist and if it does, limit the maximum
>> value to the one specified by /proc/sys/fs/nr_open.
>>
>> As an aid for the admin configuring the value add a comment specifying
>> the common needs of xenstored for the different domain types.
> ...
>>   echo -n Starting $XENSTORED...
>> @@ -70,6 +89,7 @@ test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . @CONFIG_DIR@/@CONFIG_LEAF
>>   systemd-notify --booted 2>/dev/null || timeout_xenstore $XENSTORED || exit 1
>>   XS_PID=`cat @XEN_RUN_DIR@/xenstored.pid`
>>   echo $XS_OOM_SCORE >/proc/$XS_PID/oom_score_adj
>> + prlimit --pid $XS_PID --nofile=$XENSTORED_MAX_OPEN_FDS
>
> Thanks for this.  I have one comment/question, which I regret making
> rather late:
>
> I am uncomfortable with the use of prlimit here, because identifying
> processes by pid is typically inherently not 100% reliable.
>
> AIUI you are using it here because perhaps otherwise you would have to
> mess about with both systemd and non-systemd approaches.  But in fact
> this script "launch-xenstore" is simply a parent of xenstore.  It is
> run either by systemd or from the init script, and it runs $XENSTORED
> directly (so not via systemd or another process supervisor).
>
> fd limits are inherited, so I think you can use ulimit rather than
> prlimit ?
>
> If you use ulimit I think you must set the hard and soft limits,
> which requires two calls.
>
> If you can't use ulimit then we should try to make some argument that
> the prlimit can't target the wrong process eg due to a
> misconfiguration or stale pid file or something.  I think I see a way
> that such an argument could be constructed but it would be better just
> to use ulimit.


Hmm, maybe I should just use:

prlimit --nofile=$XENSTORED_MAX_OPEN_FDS \
   $XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid $XENSTORED_ARGS


Juergen




Re: [PATCH 01/11] xen: reserve flags for internal usage in xen_domctl_createdomain

2021-09-28 Thread Jan Beulich
On 23.09.2021 11:54, Julien Grall wrote:
> On 23/09/2021 08:11, Penny Zheng wrote:
>> From: Stefano Stabellini 
>>
>> We are passing an extra special boolean flag at domain creation to
>> specify whether we want the domain to be privileged (i.e. dom0) or
>> not. Another flag will be introduced later in this series.
>>
>> Reserve bits 16-31 from the existing flags bitfield in struct
>> xen_domctl_createdomain for internal Xen usage.
> 
> I am a bit split with this approach. This feels a bit of a hack to 
> reserve bits for internal purpose in external headers. But at the same 
> time I can see how this is easier to deal with it over repurposing the 
> last argument of domain_create().

I actually have trouble seeing why that's easier. It is a common thing
to widen a bool to "unsigned int flags" when more than one control is
needed. Plus this makes things needlessly harder once (in the future)
the low 16 bits are exhausted in the public interface.

Jan
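
As an illustration of the alternative described in this reply, a bool
parameter can be widened to an "unsigned int flags" argument so that
internal controls stay out of the public bitfield. All names below
(SKETCH_CDF_*, struct sketch_domain, sketch_domain_create()) are made up
for this sketch and are not Xen's actual interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical internal-only creation flags, passed separately from the
 * public configuration. */
#define SKETCH_CDF_PRIVILEGED (1u << 0)
#define SKETCH_CDF_SECOND     (1u << 1)
#define SKETCH_CDF_ALL        (SKETCH_CDF_PRIVILEGED | SKETCH_CDF_SECOND)

struct sketch_domain {
    uint32_t public_flags;   /* all 32 bits remain available publicly */
    bool is_privileged;
    bool second_control;
};

/* Formerly "bool is_priv"; widened so further controls can be added
 * without touching the public header. */
struct sketch_domain sketch_domain_create(uint32_t public_flags,
                                          unsigned int internal_flags)
{
    struct sketch_domain d;

    /* Reject internal bits this sketch does not know about. */
    assert((internal_flags & ~SKETCH_CDF_ALL) == 0);

    d.public_flags = public_flags;
    d.is_privileged = (internal_flags & SKETCH_CDF_PRIVILEGED) != 0;
    d.second_control = (internal_flags & SKETCH_CDF_SECOND) != 0;
    return d;
}
```

In this arrangement no bits of the public flags field need to be
reserved, which is the concern raised about exhausting the low 16 bits.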




[PATCH v5 2/2] tools/xenstore: set open file descriptor limit for xenstored

2021-09-28 Thread Ian Jackson
Juergen Gross writes ("[PATCH v5 2/2] tools/xenstore: set open file descriptor limit for xenstored"):
> Add a configuration item for the maximum number of open file
> descriptors xenstored should be allowed to have.
> 
> The default should be "unlimited" in order not to restrict xenstored
> in the number of domains it can support, but unfortunately the kernel
> is normally limiting the maximum value via /proc/sys/fs/nr_open [1],
> [2]. So check that file to exist and if it does, limit the maximum
> value to the one specified by /proc/sys/fs/nr_open.
> 
> As an aid for the admin configuring the value add a comment specifying
> the common needs of xenstored for the different domain types.
...
>   echo -n Starting $XENSTORED...
> @@ -70,6 +89,7 @@ test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . @CONFIG_DIR@/@CONFIG_LEAF
>   systemd-notify --booted 2>/dev/null || timeout_xenstore $XENSTORED || exit 1
>   XS_PID=`cat @XEN_RUN_DIR@/xenstored.pid`
>   echo $XS_OOM_SCORE >/proc/$XS_PID/oom_score_adj
> + prlimit --pid $XS_PID --nofile=$XENSTORED_MAX_OPEN_FDS

Thanks for this.  I have one comment/question, which I regret making
rather late:

I am uncomfortable with the use of prlimit here, because identifying
processes by pid is typically inherently not 100% reliable.

AIUI you are using it here because perhaps otherwise you would have to
mess about with both systemd and non-systemd approaches.  But in fact
this script "launch-xenstore" is simply a parent of xenstore.  It is
run either by systemd or from the init script, and it runs $XENSTORED
directly (so not via systemd or another process supervisor).

fd limits are inherited, so I think you can use ulimit rather than
prlimit ?

If you use ulimit I think you must set the hard and soft limits,
which requires two calls.

If you can't use ulimit then we should try to make some argument that
the prlimit can't target the wrong process eg due to a
misconfiguration or stale pid file or something.  I think I see a way
that such an argument could be constructed but it would be better just
to use ulimit.

Ian.
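
A rough C equivalent of the ulimit approach, as a sketch only: the limit
is set in the launching process and inherited across exec, so no pid
lookup is needed. setrlimit() takes the soft and the hard limit in one
struct rlimit, so a single call covers both. The helper names are
hypothetical and not part of any Xen tool.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <unistd.h>

/* Set both the soft and the hard RLIMIT_NOFILE of the calling process.
 * Returns 0 on success, -1 on failure (errno set by setrlimit). */
int set_nofile_limit(rlim_t max_fds)
{
    struct rlimit lim;

    lim.rlim_cur = max_fds;
    lim.rlim_max = max_fds;
    return setrlimit(RLIMIT_NOFILE, &lim);
}

/* Current soft limit of the calling process. */
rlim_t soft_nofile_limit(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        abort();
    return lim.rlim_cur;
}

/* Current hard limit of the calling process. */
rlim_t hard_nofile_limit(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        abort();
    return lim.rlim_max;
}

/* Apply the limit, then replace this process with the daemon. The new
 * program image inherits the limit. Only returns (with -1) on error. */
int launch_with_fd_limit(rlim_t max_fds, char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);
    if (set_nofile_limit(max_fds) != 0)
        return -1;
    execvp(argv[0], argv);
    return -1;
}
```

Note that an unprivileged process can lower its hard limit but cannot
raise it again afterwards.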



Re: [XEN PATCH] Config.mk: update OVMF to edk2-stable202108

2021-09-28 Thread Ian Jackson
Anthony PERARD writes ("Re: [XEN PATCH] Config.mk: update OVMF to edk2-stable202108"):
> On Tue, Aug 31, 2021 at 02:58:36PM +0100, Ian Jackson wrote:
> > Anthony PERARD writes ("[XEN PATCH] Config.mk: update OVMF to edk2-stable202108"):
> > > Update to the latest stable tag.
> > 
> > Thanks.  I am OK with this but I think we should hold off committing
> > it until the XSA fallout has been sorted.
> 
> Hopefully, this is sorted now. Time to commit the patch?

Well, things are still not good, but, yes, I have committed this.

git seems to have auto-merged this successfully with the Mini-OS
version update.

Ian.



Re: [RFC PATCH 00/10] security: Introduce qemu_security_policy_taint() API

2021-09-28 Thread P J P
On Tuesday, 14 September, 2021, 07:00:27 pm IST, P J P wrote:
>* Thanks so much for restarting this thread. I've been at it intermittently
>  for the last few months, thinking about how we could annotate the
>  source/module objects.
>
> -> [*] https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg04642.html
>
>* Last time we discussed having both compile and run time options for users
>  to be able to select the qualified objects/backends/devices as desired.
>
>* To confirm: How/Where is the security policy defined? Is it device/module
>  specific OR QEMU project wide?
>
>>> it feels like we need
>> 'secure': 'bool'
>
>* Though we started the (above [*]) discussion with '--security' option in mind,
>  I wonder if the 'secure' annotation is too specific, and if we could widen
>  its scope.
>
>
>Source annotations: I've been thinking over following approaches
>===
>
>1) Segregate the QEMU sources under
>
>  ../staging/ <= devel/experimental, not for production usage
>  ../src/ <= good for production usage, hence security relevant
>  ../deprecated/ <= Bad for production usage, not security relevant
>
>  - This is similar to Linux staging drivers' tree.
>  - Staging drivers are not considered for production usage and hence CVE allocation.
>  - At build time by default we only build sources under ../src/ tree.
>  - But we can still have options to build /staging/ and /deprecated/ trees.
>  - It's readily understandable to end users.
>
>2) pkgconfig(1) way:
>  - If we could define per device/backend a configuration (.pc) file which is
>    then used at build/run time to decide which sources are suitable for the
>    build or usage.
>
>  - I'm trying to experiment with this.
>
>3) We annotate QEMU devices/backends/modules with macros which define its status.
>  It can then be used at build/run times to decide if it's suitable for usage.
>  For ex:
>
>  $ cat annotsrc.h
>
>  #include 
>
>  enum SRCSTATUS {
>  DEVEL = 0,
>  STAGING,
>  PRODUCTION,
>  DEPRECATED
>  };
>
...
>
>
>* Approach 3) above is similar to the _security_policy_taint() API,
>  but works at the source/object file level, rather than specific
>  'struct type' field.
>
>* Does adding a field to struct type (ex. DeviceClass) scale to all
>  objects/modules/backends etc?
>  Does it have any limitations to include/cover other sources/objects?
>
>* I'd really appreciate your feedback/inputs/suggestions.


Ping...!?
---
  -P J P
http://feedmug.com



Re: Xen Rust VirtIO demos work breakdown for Project Stratos

2021-09-28 Thread Andrew Cooper
On 24/09/2021 17:02, Alex Bennée wrote:
> 1.1 Upstream an "official" rust crate for Xen ([STR-52])
> 
>
>   To start with we will want an upstream location for future work to be
>   based upon. The intention is the crate is independent of the version
>   of Xen it runs on (above the baseline version chosen). This will
>   entail:
>
>   • ☐ agreeing with upstream the name/location for the source

Probably github/xen-project/rust-bindings unless anyone has a better
suggestion.

We almost certainly want a companion repository configured as a
hello-world example using the bindings and (cross-)compiled for each
backend target.

>   • ☐ documenting the rules for the "stable" hypercall ABI

Easy.  There shall be no use of unstable interfaces at all.

This is the *only* way to avoid making the bindings dependent on the
version of the hypervisor, and will be a major improvement in the Xen
ecosystem.

Any unstable hypercall wanting to be used shall be stabilised in Xen
first, which has been vehemently agreed to at multiple dev summits in
the past, and will be a useful way of guiding the stabilisation effort.

>   • ☐ establish an internal interface to elide between ioctl mediated
> and direct hypercalls
>   • ☐ ensure the crate is multi-arch and has feature parity for arm64
>
>   As such we expect the implementation to be standalone, i.e. not
>   wrapping the existing Xen libraries for mediation. There should be a
>   close (1-to-1) mapping between the interfaces in the crate and the
>   eventual hypercall made to the hypervisor.
>
>   Estimate: 4w (elapsed likely longer due to discussion)
>
>
> [STR-52] 
> 
>
>
> 1.2 Basic Hypervisor Interactions hypercalls ([STR-53])
> ───
>
>   These are the bare minimum hypercalls implemented as both ioctl and
>   direct calls. These allow for a very basic binary to:
>
>   • ☐ console_io - output IO via the Xen console
>   • ☐ domctl stub - basic stub for domain control (different API?)
>   • ☐ sysctl stub - basic stub for system control (different API?)
>
>   The idea would be this provides enough hypercall interface to query
>   the list of domains and output their status via the xen console. There
>   is an open question about if the domctl and sysctl hypercalls are the
>   way to go.

console_io probably wants implementing as a backend to println!() or the
log module, because users of the crate won't want to change how they
printf()/etc depending on the target.

That said, console_io hypercalls only do anything for unprivileged VMs in
debug builds of the hypervisor.  This is fine for development, and less
fine in production, so logging ought to use the PV console instead (with
room for future expansion to an Argo transport).

domctl/sysctl are unstable interfaces.  I don't think they'll be
necessary for a basic virtio backend, and they will be the most
complicated hypercalls to stabilise.

>
>   Estimate: 6w
>
>
> [STR-53] 
> 
>
>
> 1.3 [#10] Access to XenStore service ([STR-54])
> ───
>
>   This is a shared configuration storage space accessed via either Unix
>   sockets (on dom0) or via the Xenbus. This is used to access
>   configuration information for the domain.
>
>   Is this needed for a backend though? Can everything just be passed
>   direct on the command line?

Currently, if you want a stubdom and you want to instruct it to shut
down cleanly, it needs xenstore.  Any stubdom which wants disk or
network needs xenstore too.

xenbus (the transport) does need to split between ioctl()'s and raw
hypercalls.  xenstore (the protocol) could be in the xen crate, or a
separate one as it is a piece of higher level functionality.

However, we should pay attention to non-xenstore usecases and not paint
ourselves into a corner.  Some security usecases would prefer not to use
shared memory, and e.g. might consider using an Argo transport instead
of the traditional grant-shared page.

>
>   Estimate: 4w
>
>
> [STR-54] 
> 

[libvirt test] 165228: regressions - FAIL

2021-09-28 Thread osstest service owner
flight 165228 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/165228/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-libvirt   6 libvirt-build fail REGR. vs. 151777
 build-i386-libvirt    6 libvirt-build fail REGR. vs. 151777
 build-arm64-libvirt   6 libvirt-build fail REGR. vs. 151777
 build-armhf-libvirt   6 libvirt-build fail REGR. vs. 151777

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-raw   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-raw  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   blocked  n/a

version targeted for testing:
 libvirt  65499b4f090bf8b5c0699f27e755921f774fb1ee
baseline version:
 libvirt  2c846fa6bcc11929c9fb857a22430fb9945654ad

Last test of basis   151777  2020-07-10 04:19:19 Z  445 days
Failing since        151818  2020-07-11 04:18:52 Z  444 days  435 attempts
Testing same since   165228  2021-09-28 04:18:53 Z    0 days    1 attempts


People who touched revisions under test:
  Adolfo Jayme Barrientos 
  Aleksandr Alekseev 
  Aleksei Zakharov 
  Andika Triwidada 
  Andrea Bolognani 
  Balázs Meskó 
  Barrett Schonefeld 
  Bastian Germann 
  Bastien Orivel 
  BiaoXiang Ye 
  Bihong Yu 
  Binfeng Wu 
  Bjoern Walk 
  Boris Fiuczynski 
  Brian Turek 
  Bruno Haible 
  Chris Mayo 
  Christian Borntraeger 
  Christian Ehrhardt 
  Christian Kirbach 
  Christian Schoenebeck 
  Cole Robinson 
  Collin Walling 
  Cornelia Huck 
  Cédric Bosdonnat 
  Côme Borsoi 
  Daniel Henrique Barboza 
  Daniel Letai 
  Daniel P. Berrange 
  Daniel P. Berrangé 
  Didik Supriadi 
  dinglimin 
  Dmytro Linkin 
  Eiichi Tsukata 
  Eric Farman 
  Erik Skultety 
  Fabian Affolter 
  Fabian Freyer 
  Fabiano Fidêncio 
  Fangge Jin 
  Farhan Ali 
  Fedora Weblate Translation 
  gongwei 
  Guoyi Tu
  Göran Uddeborg 
  Halil Pasic 
  Han Han 
  Hao Wang 
  Hela Basa 
  Helmut Grohne 
  Hiroki Narukawa 
  Ian Wienand 
  Jakob Meng 
  Jamie Strandboge 
  Jamie Strandboge 
  Jan Kuparinen 
  jason lee 
  Jean-Baptiste Holcroft 
  Jia Zhou 
  Jianan Gao 
  Jim Fehlig 
  Jin Yan 
  Jinsheng Zhang 
  Jiri Denemark 
  John Ferlan 
  Jonathan Watt 
  Jonathon Jongsma 
  Julio Faracco 
  Justin Gatzen 
  Ján Tomko 
  Kashyap Chamarthy 
  Kevin Locke 
  Kristina Hanicova 
  Laine Stump 
  Laszlo Ersek 
  Lee Yarwood 
  Lei Yang 
  Liao Pingfang 
  Lin Ma 
  Lin Ma 
  Lin Ma 
  Liu Yiding 
  Luke Yue 
  Luyao Zhong 
  Marc Hartmayer 
  Marc-André Lureau 
  Marek Marczykowski-Górecki 
  Markus Schade 
  Martin Kletzander 
  Masayoshi Mizuma 
  Matej Cepl 
  Matt Coleman 
  Matt Coleman 
  Mauro Matteo Cascella 
  Meina Li 
  Michal Privoznik 
  Michał Smyk 
  Milo Casagrande 
  Moshe Levi 
  Muha Aliss 
  Nathan 
  Neal Gompa 
  Nick Chevsky 
  Nick Shyrokovskiy 
  Nickys Music Group 
  Nico Pache 
  Nikolay Shirokovskiy 
  Olaf Hering 
  Olesya Gerasimenko 
  Orion Poplawski 
  Pany 
  Patrick Magauran 
  Paulo de Rezende Pinatti 
  Pavel Hrdina 
  Peng Liang 
  Peter Krempa 
  Pino Toscano 
  Pino Toscano 
  Piotr Drąg 
  Prathamesh Chavan 
  Richard W.M. Jones 
  Ricky Tigg 
  Robin Lee 
  Roman Bogorodskiy 
  Roman Bolshakov 
  Ryan Gahagan 
  Ryan Schmidt 
  Sam Hartman 
  Scott Shambarger 
  Sebastian Mitterle 
  SeongHyun Jo 
  Shalini Chellathurai Saroja 
  Shaojun Yang 
  Shi Lei 
  simmon 
  Simon Chopin 
  Simon Gaiser 
  Simon Rowe 
  Stefan Bader 
  Stefan Berger 
  Stefan Berger 
  Stefan Hajnoczi 
  Stefan Hajnoczi 
  Szymon Scholz 
  Thomas Huth 
  Tim Wiederhake 
  Tomáš Golembiovský 
  Tomáš Janoušek 
  Tuguoyi 
  Victor Toso 
  Ville Skyttä 
  Vinayak Kale 
  Wang Xin 
  WangJian 
  Weblate 
  Wei 

Re: [PATCH v2 01/11] vpci: Make vpci registers removal a dedicated function

2021-09-28 Thread Michal Orzel



On 23.09.2021 14:54, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko 
> 
> This is in preparation for dynamic assignment of the vpci register
> handlers depending on the domain: hwdom or guest.
> 
> Signed-off-by: Oleksandr Andrushchenko 
> 
> ---
> Since v1:
>  - constify struct pci_dev where possible
> ---
>  xen/drivers/vpci/vpci.c | 7 ++-
>  xen/include/xen/vpci.h  | 2 ++
>  2 files changed, 8 insertions(+), 1 deletion(-)
> 
Reviewed-by: Michal Orzel 




Re: [PATCH V5 12/12] net: netvsc: Add Isolation VM support for netvsc driver

2021-09-28 Thread Tianyu Lan

On 9/28/2021 1:39 PM, Christoph Hellwig wrote:
> On Mon, Sep 27, 2021 at 10:26:43PM +0800, Tianyu Lan wrote:
>> Hi Christoph:
>>     Gentle ping. The swiotlb and shared memory mapping changes in this
>> patchset need your review. Could you have a look?
>
> I'm a little too busy for a review of such a huge patchset right now.
> That being said here are my comments from a very quick review:

Hi Christoph:
      Thanks for your comments. Most patches in the series are Hyper-V
changes. I will split the patchset to make it easier to review.

>   - the bare memremap usage in swiotlb looks strange and I'd
>     definitively expect a well documented wrapper.

OK. Should the wrapper be in the DMA code? How about the
dma_map_decrypted() introduced in V4?

https://lkml.org/lkml/2021/8/27/605

>   - given that we can now hand out swiotlb memory for coherent mappings
>     we need to carefully audit what happens when this memremaped
>     memory gets mmaped or used through dma_get_sgtable

OK. I will check that.

>   - the netvsc changes I'm not happy with at all.  A large part of it
>     is that the driver already has a bad structure, but this series
>     is making it significantly worse.  We'll need to find a way
>     to use the proper dma mapping abstractions here.  One option
>     if you want to stick to the double vmapped buffer would be something
>     like using dma_alloc_noncontiguous plus a variant of
>     dma_vmap_noncontiguous that takes the shared_gpa_boundary into
>     account.

OK. I will do that.





[PATCH v5 2/2] tools/xenstore: set open file descriptor limit for xenstored

2021-09-28 Thread Juergen Gross
Add a configuration item for the maximum number of open file
descriptors xenstored should be allowed to have.

The default should be "unlimited" in order not to restrict xenstored
in the number of domains it can support, but unfortunately the kernel
is normally limiting the maximum value via /proc/sys/fs/nr_open [1],
[2]. So check that file to exist and if it does, limit the maximum
value to the one specified by /proc/sys/fs/nr_open.

As an aid for the admin configuring the value add a comment specifying
the common needs of xenstored for the different domain types.

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=60fd760fb9ff7034360bab7137c917c0330628c2
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0c2d64fb6cae9aae480f6a46cfe79f8d7d48b59f

Signed-off-by: Juergen Gross 
---
V2:
- set ulimit from launch script (Julien Grall)
- split off from original patch (Julien Grall)
V4:
- switch to directly configuring the limit of file descriptors instead
  of domains (Ian Jackson)
V5:
- use /proc/sys/fs/nr_open (Ian Jackson)
---
 .../Linux/init.d/sysconfig.xencommons.in  | 13 
 tools/hotplug/Linux/launch-xenstore.in| 20 +++
 2 files changed, 33 insertions(+)

diff --git a/tools/hotplug/Linux/init.d/sysconfig.xencommons.in b/tools/hotplug/Linux/init.d/sysconfig.xencommons.in
index b83101ab7e..433e4849af 100644
--- a/tools/hotplug/Linux/init.d/sysconfig.xencommons.in
+++ b/tools/hotplug/Linux/init.d/sysconfig.xencommons.in
@@ -32,6 +32,19 @@
 # Changing this requires a reboot to take effect.
 #XENSTORED=@XENSTORED@
 
+## Type: string
+## Default: unlimited
+#
+# Select maximum number of file descriptors xenstored is allowed to have
+# opened at one time.
+# For each HVM domain xenstored might need up to 5 open file descriptors,
+# PVH and PV domains will require up to 3 open file descriptors. Additionally
+# 20-30 file descriptors will be opened for internal uses.
+# The specified value (including "unlimited") will be capped by the contents
+# of /proc/sys/fs/nr_open if existing.
+# Only evaluated if XENSTORETYPE is "daemon".
+#XENSTORED_MAX_OPEN_FDS=unlimited
+
 ## Type: string
 ## Default: ""
 #
diff --git a/tools/hotplug/Linux/launch-xenstore.in b/tools/hotplug/Linux/launch-xenstore.in
index 1747c96065..7a0334d880 100644
--- a/tools/hotplug/Linux/launch-xenstore.in
+++ b/tools/hotplug/Linux/launch-xenstore.in
@@ -54,6 +54,7 @@ test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . @CONFIG_DIR@/@CONFIG_LEAF
 
 [ "$XENSTORETYPE" = "daemon" ] && {
[ -z "$XENSTORED_TRACE" ] || XENSTORED_ARGS="$XENSTORED_ARGS -T @XEN_LOG_DIR@/xenstored-trace.log"
+   [ -z "$XENSTORED_MAX_OPEN_FDS" ] && XENSTORED_MAX_OPEN_FDS=unlimited
[ -z "$XENSTORED" ] && XENSTORED=@XENSTORED@
[ -x "$XENSTORED" ] || {
echo "No xenstored found"
@@ -62,6 +63,24 @@ test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . @CONFIG_DIR@/@CONFIG_LEAF
[ -z "$XENSTORED_OOM_MEM_THRESHOLD" ] || XENSTORED_OOM_MEM_THRESHOLD=50
XS_OOM_SCORE=-$(($XENSTORED_OOM_MEM_THRESHOLD * 10))
 
+   [ "$XENSTORED_MAX_OPEN_FDS" = "unlimited" ] || {
+   [ -z "${XENSTORED_MAX_OPEN_FDS//[0-9]}" ] &&
+   [ -n "$XENSTORED_MAX_OPEN_FDS" ] || {
+   echo "XENSTORED_MAX_OPEN_FDS=$XENSTORED_MAX_OPEN_FDS invalid"
+   echo "Setting to default \"unlimited\"."
+   XENSTORED_MAX_OPEN_FDS=unlimited
+   }
+   }
+   [ -r /proc/sys/fs/nr_open ] && {
+   MAX_FDS=`cat /proc/sys/fs/nr_open`
+   [ "$XENSTORED_MAX_OPEN_FDS" = "unlimited" ] && XENSTORED_MAX_OPEN_FDS=$MAX_FDS
+   [ $XENSTORED_MAX_OPEN_FDS -gt $MAX_FDS ] && {
+   echo "XENSTORED_MAX_OPEN_FDS exceeds system limit."
+   echo "Setting to \"$MAX_FDS\"."
+   XENSTORED_MAX_OPEN_FDS=$MAX_FDS
+   }
+   }
+
rm -f @XEN_RUN_DIR@/xenstored.pid
 
echo -n Starting $XENSTORED...
@@ -70,6 +89,7 @@ test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . @CONFIG_DIR@/@CONFIG_LEAF
systemd-notify --booted 2>/dev/null || timeout_xenstore $XENSTORED || exit 1
XS_PID=`cat @XEN_RUN_DIR@/xenstored.pid`
echo $XS_OOM_SCORE >/proc/$XS_PID/oom_score_adj
+   prlimit --pid $XS_PID --nofile=$XENSTORED_MAX_OPEN_FDS
 
exit 0
 }
-- 
2.26.2
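
The validation and capping steps of the launch script above can be
sketched in C as follows. This is only an illustration under stated
assumptions: the function names and the FD_UNLIMITED marker are
hypothetical, and unlike the script the sketch falls back to "unlimited"
silently instead of printing a message.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define FD_UNLIMITED (-1L)   /* hypothetical marker for "unlimited" */

/* Returns the value of /proc/sys/fs/nr_open, or -1 if unavailable. */
long read_nr_open(void)
{
    FILE *f = fopen("/proc/sys/fs/nr_open", "r");
    long val = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &val) != 1)
        val = -1;
    fclose(f);
    return val;
}

/* Non-empty and made of decimal digits only. */
static int all_digits(const char *s)
{
    if (*s == '\0')
        return 0;
    for ( ; *s != '\0'; s++)
        if (!isdigit((unsigned char)*s))
            return 0;
    return 1;
}

/* configured: "unlimited" or a decimal number; anything else (including
 * the empty string) is treated as "unlimited".
 * nr_open: content of /proc/sys/fs/nr_open, or a negative value if that
 * file does not exist. */
long effective_fd_limit(const char *configured, long nr_open)
{
    long limit = FD_UNLIMITED;

    assert(configured != NULL);
    if (all_digits(configured)) {
        errno = 0;
        limit = strtol(configured, NULL, 10);
        if (errno == ERANGE)
            limit = LONG_MAX;
    }
    /* Cap by the system-wide maximum when it is known. */
    if (nr_open >= 0 && (limit == FD_UNLIMITED || limit > nr_open))
        limit = nr_open;
    return limit;
}
```

A caller would pass read_nr_open() as the second argument; keeping the
file access separate makes the capping rule easy to check on its own.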




[PATCH v5 1/2] tools/xenstore: set oom score for xenstore daemon on Linux

2021-09-28 Thread Juergen Gross
Xenstored is absolutely mandatory for a Xen host and it can't be
restarted, so being killed by OOM-killer in case of memory shortage is
to be avoided.

Set /proc/$pid/oom_score_adj (if available) by default to -500 (this
translates to 50% of dom0 memory size) in order to allow xenstored to
use large amounts of memory without being killed.

The percentage of dom0 memory above which the oom killer is allowed to
kill xenstored can be set via XENSTORED_OOM_MEM_THRESHOLD in
xencommons.

To make sure the pid file isn't a left-over from a previous run, delete
it before starting xenstored.

Signed-off-by: Juergen Gross 
Reviewed-by: Ian Jackson 
---
V2:
- set oom score from launch script (Julien Grall)
- split off open file descriptor limit setting (Julien Grall)
V3:
- make oom killer threshold configurable (Julien Grall)
V4:
- extend comment (Ian Jackson)
---
 tools/hotplug/Linux/init.d/sysconfig.xencommons.in | 9 +
 tools/hotplug/Linux/launch-xenstore.in | 6 ++
 2 files changed, 15 insertions(+)

diff --git a/tools/hotplug/Linux/init.d/sysconfig.xencommons.in b/tools/hotplug/Linux/init.d/sysconfig.xencommons.in
index 00cf7f91d4..b83101ab7e 100644
--- a/tools/hotplug/Linux/init.d/sysconfig.xencommons.in
+++ b/tools/hotplug/Linux/init.d/sysconfig.xencommons.in
@@ -48,6 +48,15 @@ XENSTORED_ARGS=
 # Only evaluated if XENSTORETYPE is "daemon".
 #XENSTORED_TRACE=[yes|on|1]
 
+## Type: integer
+## Default: 50
+#
+# Percentage of dom0 memory size the xenstore daemon can use before the
+# OOM killer is allowed to kill it.
+# The specified value is multiplied by -10 and echoed to
+# /proc/PID/oom_score_adj.
+#XENSTORED_OOM_MEM_THRESHOLD=50
+
 ## Type: string
 ## Default: @LIBEXEC@/boot/xenstore-stubdom.gz
 #
diff --git a/tools/hotplug/Linux/launch-xenstore.in b/tools/hotplug/Linux/launch-xenstore.in
index 019f9d6f4d..1747c96065 100644
--- a/tools/hotplug/Linux/launch-xenstore.in
+++ b/tools/hotplug/Linux/launch-xenstore.in
@@ -59,11 +59,17 @@ test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . @CONFIG_DIR@/@CONFIG_LEAF
echo "No xenstored found"
exit 1
}
+   [ -z "$XENSTORED_OOM_MEM_THRESHOLD" ] || XENSTORED_OOM_MEM_THRESHOLD=50
+   XS_OOM_SCORE=-$(($XENSTORED_OOM_MEM_THRESHOLD * 10))
+
+   rm -f @XEN_RUN_DIR@/xenstored.pid
 
echo -n Starting $XENSTORED...
$XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid $XENSTORED_ARGS
 
systemd-notify --booted 2>/dev/null || timeout_xenstore $XENSTORED || exit 1
+   XS_PID=`cat @XEN_RUN_DIR@/xenstored.pid`
+   echo $XS_OOM_SCORE >/proc/$XS_PID/oom_score_adj
 
exit 0
 }
-- 
2.26.2
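
The mapping from the configured percentage to the value written to
/proc/PID/oom_score_adj, as documented in the patch above (multiply by
-10, default 50), can be sketched in C. The function names are
hypothetical; treating a negative argument as "unset" and clamping at
100 are assumptions of this sketch, not behaviour of the script.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* percent < 0 stands for "not configured" and selects the default of 50.
 * Values above 100 are clamped so that the result stays within the
 * range accepted for oom_score_adj (-1000..1000). */
int oom_score_adj_for_threshold(int percent)
{
    if (percent < 0)
        percent = 50;
    if (percent > 100)
        percent = 100;
    return -(percent * 10);
}

/* Write the score for the given pid. Returns 0 on success, -1 on error
 * (for example if the proc file is not available). */
int write_oom_score_adj(pid_t pid, int score)
{
    char path[64];
    FILE *f;
    int ok;

    assert(score >= -1000 && score <= 1000);
    snprintf(path, sizeof(path), "/proc/%ld/oom_score_adj", (long)pid);
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    ok = fprintf(f, "%d\n", score) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

With the default threshold of 50 this yields -500, the value named in
the commit message.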




[PATCH v5 0/2] tools/xenstore: set resource limits of xenstored

2021-09-28 Thread Juergen Gross
Set some limits for xenstored in order to avoid it being killed by
OOM killer, or to run out of file descriptors.

Changes in V5:
- respect /proc/sys/fs/nr_open (Ian Jackson)

Changes in V4:
- add comments
- switch to configure open file descriptors directly

Changes in V3:
- make oom score configurable

Changes in V2:
- split into 2 patches
- set limits from start script

Juergen Gross (2):
  tools/xenstore: set oom score for xenstore daemon on Linux
  tools/xenstore: set open file descriptor limit for xenstored

 .../Linux/init.d/sysconfig.xencommons.in  | 22 
 tools/hotplug/Linux/launch-xenstore.in| 26 +++
 2 files changed, 48 insertions(+)

-- 
2.26.2




[xen-unstable-smoke test] 165230: tolerable all pass - PUSHED

2021-09-28 Thread osstest service owner
flight 165230 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/165230/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt     15 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm      15 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm      16 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl          15 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          16 saverestore-support-check    fail   never pass

version targeted for testing:
 xen  1c3ed9c908732d19660fbe83580674d585464d4c
baseline version:
 xen  2e46d73b4c7562f7b104e9e10fe302316af13959

Last test of basis   165223  2021-09-27 15:01:42 Z    0 days
Testing same since   165230  2021-09-28 05:00:28 Z    0 days    1 attempts


People who touched revisions under test:
  Oleksandr Tyshchenko 
  Stefano Stabellini 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   2e46d73b4c..1c3ed9c908  1c3ed9c908732d19660fbe83580674d585464d4c -> smoke



Re: [PATCH v2 05/11] xen/arm: Mark device as PCI while creating one

2021-09-28 Thread Oleksandr Andrushchenko

On 28.09.21 11:39, Jan Beulich wrote:
> On 28.09.2021 10:29, Oleksandr Andrushchenko wrote:
>> On 28.09.21 11:26, Jan Beulich wrote:
>>> On 28.09.2021 10:09, Oleksandr Andrushchenko wrote:
>>>> On 27.09.21 13:26, Jan Beulich wrote:
>>>>> On 27.09.2021 12:04, Oleksandr Andrushchenko wrote:
>>>>>> On 27.09.21 13:00, Jan Beulich wrote:
>>>>>>> On 27.09.2021 11:35, Oleksandr Andrushchenko wrote:
>>>>>>>> On 27.09.21 12:19, Jan Beulich wrote:
>>>>>>>>> On 27.09.2021 10:45, Oleksandr Andrushchenko wrote:
>>>>>>>>>> On 27.09.21 10:45, Jan Beulich wrote:
>>>>>>>>>>> On 23.09.2021 14:54, Oleksandr Andrushchenko wrote:
>>>>>>>>>>>> --- a/xen/drivers/passthrough/pci.c
>>>>>>>>>>>> +++ b/xen/drivers/passthrough/pci.c
>>>>>>>>>>>> @@ -328,6 +328,9 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn)
>>>>>>>>>>>>      *((u8*) &pdev->bus) = bus;
>>>>>>>>>>>>      *((u8*) &pdev->devfn) = devfn;
>>>>>>>>>>>>      pdev->domain = NULL;
>>>>>>>>>>>> +#ifdef CONFIG_ARM
>>>>>>>>>>>> +    pci_to_dev(pdev)->type = DEV_PCI;
>>>>>>>>>>>> +#endif
>>>>>>>>>>> I have to admit that I'm not happy about new CONFIG_
>>>>>>>>>>> conditionals
>>>>>>>>>>> here. I'd prefer to see this done by a new arch helper, unless
>>>>>>>>>>> there are
>>>>>>>>>>> obstacles I'm overlooking.
>>>>>>>>>> Do you mean something like arch_pci_alloc_pdev(dev)?
>>>>>>>>> I'd recommend against "alloc" in its name; "new" instead maybe?
>>>>>>>> I am fine with arch_pci_new_pdev, but arch prefix points to the fact
>>>>>>>> that
>>>>>>>> this is just an architecture specific part of the pdev allocation
>>>>>>>> rather than
>>>>>>>> actual pdev allocation itself, so with this respect
>>>>>>>> arch_pci_alloc_pdev seems
>>>>>>>> more natural to me.
>>>>>>> The bulk of the function is about populating the just allocated struct.
>>>>>>> There's no arch-specific part of the allocation (so far, leaving aside
>>>>>>> MSI-X), you only want an arch-specific part of the initialization. I
>>>>>>> would agree with "alloc" in the name if further allocation was to
>>>>>>> happen there.
>>>>>> Hm, then arch_pci_init_pdev sounds more reasonable
>>>>> Fine with me.
>>>> Do we want this to be void or returning an error code? If error code is
>>>> needed,
>>>> then we would also need a roll-back function, e.g. arch_pci_free_pdev or
>>>> arch_pci_release_pdev or arch_pci_fini_pdev or something, so it can be
>>>> used in
>>>> case of error or in free_pdev function.
>>> I'd start with void and make it return an error (and deal with necessary
>>> cleanup) only once a need arises.
>> Sounds reasonable. For x86 I think we can deal with:
>>
>> xen/include/xen/pci.h:
>>
>> #ifdef CONFIG_ARM
>> void arch_pci_init_pdev(struct pci_dev *pdev);
>> #else
>> static inline void arch_pci_init_pdev(struct pci_dev *pdev)
>> {
>> }
>> #endif
> But that's still #ifdef-ary. We have asm/pci.h.
Sure, will define it there
>
> Jan
>
>
Thank you,

Oleksandr


[XEN PATCH v4] xen: rework `checkpolicy` detection when using "randconfig"

2021-09-28 Thread Anthony PERARD
This will help prevent the CI loop from having build failures when
`checkpolicy` isn't available when doing "randconfig" jobs.

To prevent "randconfig" from selecting XSM_FLASK_POLICY when
`checkpolicy` isn't available, we will actually override the config
output with the use of KCONFIG_ALLCONFIG.

Doing it this way still allows a user/developer to set XSM_FLASK_POLICY
even when "checkpolicy" isn't available. It also prevents the build
system from resetting the config when "checkpolicy" isn't available
anymore. And XSM_FLASK_POLICY is still selected automatically when
`checkpolicy` is available.
This also works well for "randconfig", as it will not select
XSM_FLASK_POLICY when "checkpolicy" is missing.

This patch also makes it easy to add more overrides which depend on the
environment.

Also, move the check out of Config.mk and into xen/ build system.
Nothing in tools/ is using that information as it's done by
./configure.

We named the new file ".allconfig.tmp" as ".*.tmp" are already ignored
via .gitignore.

Remove '= y' in Kconfig as it isn't needed: only the value "y" is true,
anything else is considered false.

Signed-off-by: Anthony PERARD 
---
v4:
- keep XEN_ prefix for HAS_CHECKPOLICY
- rework .allconfig.tmp file generation, so it is easier to read.
- remove .allconfig.tmp on clean, .*.tmp files aren't all cleaned yet,
  maybe for another time.
- add information about file name choice and Kconfig change in patch
  description.

v3:
- use KCONFIG_ALLCONFIG
- don't override XSM_FLASK_POLICY value unless we do randconfig.
- no more changes to the current behavior of kconfig, only to
  randconfig.

v2 was "[XEN PATCH v2] xen: allow XSM_FLASK_POLICY only if checkpolicy binary 
is available"
---
 Config.mk  |  6 --
 xen/Makefile   | 20 +---
 xen/common/Kconfig |  2 +-
 3 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/Config.mk b/Config.mk
index e85bf186547f..d5490e35d03d 100644
--- a/Config.mk
+++ b/Config.mk
@@ -137,12 +137,6 @@ export XEN_HAS_BUILD_ID=y
 build_id_linker := --build-id=sha1
 endif
 
-ifndef XEN_HAS_CHECKPOLICY
-CHECKPOLICY ?= checkpolicy
-XEN_HAS_CHECKPOLICY := $(shell $(CHECKPOLICY) -h 2>&1 | grep -q xen && echo y || echo n)
-export XEN_HAS_CHECKPOLICY
-endif
-
 define buildmakevars2shellvars
 export PREFIX="$(prefix)";\
 export XEN_SCRIPT_DIR="$(XEN_SCRIPT_DIR)";\
diff --git a/xen/Makefile b/xen/Makefile
index f47423dacd9a..7c2ffce0fc77 100644
--- a/xen/Makefile
+++ b/xen/Makefile
@@ -17,6 +17,8 @@ export XEN_BUILD_HOST ?= $(shell hostname)
 PYTHON_INTERPRETER := $(word 1,$(shell which python3 python python2 2>/dev/null) python)
 export PYTHON  ?= $(PYTHON_INTERPRETER)
 
+export CHECKPOLICY ?= checkpolicy
+
 export BASEDIR := $(CURDIR)
 export XEN_ROOT := $(BASEDIR)/..
 
@@ -178,6 +180,8 @@ CFLAGS += $(CLANG_FLAGS)
 export CLANG_FLAGS
 endif
 
+export XEN_HAS_CHECKPOLICY := $(call success,$(CHECKPOLICY) -h 2>&1 | grep -q xen)
+
 export root-make-done := y
 endif # root-make-done
 
@@ -189,14 +193,24 @@ ifeq ($(config-build),y)
 # *config targets only - make sure prerequisites are updated, and descend
 # in tools/kconfig to make the *config target
 
+# Create a file for KCONFIG_ALLCONFIG which depends on the environment.
+# This will be used by kconfig targets allyesconfig/allmodconfig/allnoconfig/randconfig
+filechk_kconfig_allconfig = \
+$(if $(findstring n,$(XEN_HAS_CHECKPOLICY)), echo 'CONFIG_XSM_FLASK_POLICY=n';) \
+$(if $(KCONFIG_ALLCONFIG), cat $(KCONFIG_ALLCONFIG);) \
+:
+
+.allconfig.tmp: FORCE
+   set -e; { $(call filechk_kconfig_allconfig); } > $@
+
 config: FORCE
$(MAKE) $(kconfig) $@
 
 # Config.mk tries to include .config file, don't try to remake it
 %/.config: ;
 
-%config: FORCE
-   $(MAKE) $(kconfig) $@
+%config: .allconfig.tmp FORCE
+   $(MAKE) $(kconfig) KCONFIG_ALLCONFIG=$< $@
 
 else # !config-build
 
@@ -368,7 +382,7 @@ _clean: delete-unfresh-files
        -o -name "*.gcno" -o -name ".*.cmd" -o -name "lib.a" \) -exec rm -f {} \;
    rm -f include/asm $(TARGET) $(TARGET).gz $(TARGET).efi $(TARGET).efi.map $(TARGET)-syms $(TARGET)-syms.map *~ core
rm -f asm-offsets.s include/asm-*/asm-offsets.h
-   rm -f .banner
+   rm -f .banner .allconfig.tmp
 
 .PHONY: _distclean
 _distclean: clean
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index db687b1785e7..eb6c2edb7bfe 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -251,7 +251,7 @@ config XSM_FLASK_AVC_STATS
 
 config XSM_FLASK_POLICY
bool "Compile Xen with a built-in FLASK security policy"
-   default y if "$(XEN_HAS_CHECKPOLICY)" = "y"
+   default y if "$(XEN_HAS_CHECKPOLICY)"
depends on XSM_FLASK
---help---
  This includes a default XSM policy in the hypervisor so that the
-- 
Anthony PERARD
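
The recipe that generates `.allconfig.tmp` can be read as the following plain-shell sketch. This is an illustration under assumptions, not the code of the patch: the function names `detect_checkpolicy` and `gen_allconfig` are hypothetical, and the "y"/"n" result merely mirrors what `$(call success,...)` is expected to yield in the Makefile.

```shell
#!/bin/sh
# Report "y" when the checkpolicy binary advertises Xen support, else "n".
# A missing binary also ends up as "n".
detect_checkpolicy() {
    if "${CHECKPOLICY:-checkpolicy}" -h 2>&1 | grep -q xen; then
        echo y
    else
        echo n
    fi
}

# Emit the fragment handed to kconfig through KCONFIG_ALLCONFIG.
# $1: "y" or "n", the result of the detection above
# $2: optional path to an allconfig file supplied by the user
gen_allconfig() {
    # Force the option off only when checkpolicy is known to be unusable.
    if [ "$1" = "n" ]; then
        echo 'CONFIG_XSM_FLASK_POLICY=n'
    fi
    # Append whatever the user passed in, so their overrides are kept.
    if [ -n "${2:-}" ]; then
        cat "$2"
    fi
    # Always succeed, like the trailing ':' in the Makefile recipe.
    return 0
}
```

A caller would redirect the output into a file, for example `gen_allconfig "$(detect_checkpolicy)" > .allconfig.tmp`, and pass that file to kconfig.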




Re: [PATCH v2 05/11] xen/arm: Mark device as PCI while creating one

2021-09-28 Thread Jan Beulich
On 28.09.2021 10:29, Oleksandr Andrushchenko wrote:
> 
> On 28.09.21 11:26, Jan Beulich wrote:
>> On 28.09.2021 10:09, Oleksandr Andrushchenko wrote:
>>> On 27.09.21 13:26, Jan Beulich wrote:
>>>> On 27.09.2021 12:04, Oleksandr Andrushchenko wrote:
>>>>> On 27.09.21 13:00, Jan Beulich wrote:
>>>>>> On 27.09.2021 11:35, Oleksandr Andrushchenko wrote:
>>>>>>> On 27.09.21 12:19, Jan Beulich wrote:
>>>>>>>> On 27.09.2021 10:45, Oleksandr Andrushchenko wrote:
>>>>>>>>> On 27.09.21 10:45, Jan Beulich wrote:
>>>>>>>>>> On 23.09.2021 14:54, Oleksandr Andrushchenko wrote:
>>>>>>>>>>> --- a/xen/drivers/passthrough/pci.c
>>>>>>>>>>> +++ b/xen/drivers/passthrough/pci.c
>>>>>>>>>>> @@ -328,6 +328,9 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn)
>>>>>>>>>>>      *((u8*) &pdev->bus) = bus;
>>>>>>>>>>>      *((u8*) &pdev->devfn) = devfn;
>>>>>>>>>>>      pdev->domain = NULL;
>>>>>>>>>>> +#ifdef CONFIG_ARM
>>>>>>>>>>> +    pci_to_dev(pdev)->type = DEV_PCI;
>>>>>>>>>>> +#endif
>>>>>>>>>> I have to admit that I'm not happy about new CONFIG_
>>>>>>>>>> conditionals
>>>>>>>>>> here. I'd prefer to see this done by a new arch helper, unless there
>>>>>>>>>> are
>>>>>>>>>> obstacles I'm overlooking.
>>>>>>>>> Do you mean something like arch_pci_alloc_pdev(dev)?
>>>>>>>> I'd recommend against "alloc" in its name; "new" instead maybe?
>>>>>>> I am fine with arch_pci_new_pdev, but arch prefix points to the fact
>>>>>>> that
>>>>>>> this is just an architecture specific part of the pdev allocation
>>>>>>> rather than
>>>>>>> actual pdev allocation itself, so with this respect arch_pci_alloc_pdev
>>>>>>> seems
>>>>>>> more natural to me.
>>>>>> The bulk of the function is about populating the just allocated struct.
>>>>>> There's no arch-specific part of the allocation (so far, leaving aside
>>>>>> MSI-X), you only want an arch-specific part of the initialization. I
>>>>>> would agree with "alloc" in the name if further allocation was to
>>>>>> happen there.
>>>>> Hm, then arch_pci_init_pdev sounds more reasonable
>>>> Fine with me.
>>> Do we want this to be void or returning an error code? If error code is 
>>> needed,
>>> then we would also need a roll-back function, e.g. arch_pci_free_pdev or
>>> arch_pci_release_pdev or arch_pci_fini_pdev or something, so it can be used 
>>> in
>>> case of error or in free_pdev function.
>> I'd start with void and make it return an error (and deal with necessary
>> cleanup) only once a need arises.
> 
> Sounds reasonable. For x86 I think we can deal with:
> 
> xen/include/xen/pci.h:
> 
> #ifdef CONFIG_ARM
> void arch_pci_init_pdev(struct pci_dev *pdev);
> #else
> static inline void arch_pci_init_pdev(struct pci_dev *pdev)
> {
> }
> #endif

But that's still #ifdef-ary. We have asm/pci.h.

Jan




Re: sh_unshadow_for_p2m_change() vs p2m_set_entry()

2021-09-28 Thread Jan Beulich
On 27.09.2021 22:25, Tim Deegan wrote:
> At 13:31 +0200 on 24 Sep (1632490304), Jan Beulich wrote:
>> I'm afraid you're still my best guess to hopefully get an insight
>> on issues like this one.
> 
> I'm now very rusty on all this but I'll do my best!  I suspect I'll
> just be following you through the code.

Thanks much!

>> While doing IOMMU superpage work I was, just in the background,
>> considering in how far the superpage re-coalescing to be used there
>> couldn't be re-used for P2M / EPT / NPT. Which got me to think about
>> shadow mode's using of p2m-pt.c: That's purely software use of the
>> tables in that case, isn't it? In which case hardware support for
>> superpages shouldn't matter at all.
> 
> ISTR at the time we used the same table for p2m and NPT.
> If that's gone away, then yes, we could have superpages
> in the p2m without caring about hardware support.

No, that code is still used two ways, but it can't be used for the
same domain in both of these ways. IOW I'm wondering whether the
check for 2M pages to be usable shouldn't be "!hap || hap_2mb", as
opposed to the 1G check continuing to be "hap && hap_1gb". Of
course once I make that change, I may end up learning what
"potential errors" that other commit was talking about ...

As to the further parts of your reply, I guess I'll try to
transform this (largely supporting my observations) and the above
into one or more patches then.

Jan




Re: [PATCH v2 05/11] xen/arm: Mark device as PCI while creating one

2021-09-28 Thread Oleksandr Andrushchenko

On 28.09.21 11:26, Jan Beulich wrote:
> On 28.09.2021 10:09, Oleksandr Andrushchenko wrote:
>> On 27.09.21 13:26, Jan Beulich wrote:
>>> On 27.09.2021 12:04, Oleksandr Andrushchenko wrote:
>>>> On 27.09.21 13:00, Jan Beulich wrote:
>>>>> On 27.09.2021 11:35, Oleksandr Andrushchenko wrote:
>>>>>> On 27.09.21 12:19, Jan Beulich wrote:
>>>>>>> On 27.09.2021 10:45, Oleksandr Andrushchenko wrote:
>>>>>>>> On 27.09.21 10:45, Jan Beulich wrote:
>>>>>>>>> On 23.09.2021 14:54, Oleksandr Andrushchenko wrote:
>>>>>>>>>> --- a/xen/drivers/passthrough/pci.c
>>>>>>>>>> +++ b/xen/drivers/passthrough/pci.c
>>>>>>>>>> @@ -328,6 +328,9 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn)
>>>>>>>>>>      *((u8*) &pdev->bus) = bus;
>>>>>>>>>>      *((u8*) &pdev->devfn) = devfn;
>>>>>>>>>>      pdev->domain = NULL;
>>>>>>>>>> +#ifdef CONFIG_ARM
>>>>>>>>>> +    pci_to_dev(pdev)->type = DEV_PCI;
>>>>>>>>>> +#endif
>>>>>>>>> I have to admit that I'm not happy about new CONFIG_
>>>>>>>>> conditionals
>>>>>>>>> here. I'd prefer to see this done by a new arch helper, unless there
>>>>>>>>> are
>>>>>>>>> obstacles I'm overlooking.
>>>>>>>> Do you mean something like arch_pci_alloc_pdev(dev)?
>>>>>>> I'd recommend against "alloc" in its name; "new" instead maybe?
>>>>>> I am fine with arch_pci_new_pdev, but arch prefix points to the fact that
>>>>>> this is just an architecture specific part of the pdev allocation rather
>>>>>> than
>>>>>> actual pdev allocation itself, so with this respect arch_pci_alloc_pdev
>>>>>> seems
>>>>>> more natural to me.
>>>>> The bulk of the function is about populating the just allocated struct.
>>>>> There's no arch-specific part of the allocation (so far, leaving aside
>>>>> MSI-X), you only want an arch-specific part of the initialization. I
>>>>> would agree with "alloc" in the name if further allocation was to
>>>>> happen there.
>>>> Hm, then arch_pci_init_pdev sounds more reasonable
>>> Fine with me.
>> Do we want this to be void or returning an error code? If error code is 
>> needed,
>> then we would also need a roll-back function, e.g. arch_pci_free_pdev or
>> arch_pci_release_pdev or arch_pci_fini_pdev or something, so it can be used 
>> in
>> case of error or in free_pdev function.
> I'd start with void and make it return an error (and deal with necessary
> cleanup) only once a need arises.

Sounds reasonable. For x86 I think we can deal with:

xen/include/xen/pci.h:

#ifdef CONFIG_ARM
void arch_pci_init_pdev(struct pci_dev *pdev);
#else
static inline void arch_pci_init_pdev(struct pci_dev *pdev)
{
}
#endif

>
> Jan
>

Re: [PATCH v2 05/11] xen/arm: Mark device as PCI while creating one

2021-09-28 Thread Jan Beulich
On 28.09.2021 10:09, Oleksandr Andrushchenko wrote:
> 
> On 27.09.21 13:26, Jan Beulich wrote:
>> On 27.09.2021 12:04, Oleksandr Andrushchenko wrote:
>>> On 27.09.21 13:00, Jan Beulich wrote:
>>>> On 27.09.2021 11:35, Oleksandr Andrushchenko wrote:
>>>>> On 27.09.21 12:19, Jan Beulich wrote:
>>>>>> On 27.09.2021 10:45, Oleksandr Andrushchenko wrote:
>>>>>>> On 27.09.21 10:45, Jan Beulich wrote:
>>>>>>>> On 23.09.2021 14:54, Oleksandr Andrushchenko wrote:
>>>>>>>>> --- a/xen/drivers/passthrough/pci.c
>>>>>>>>> +++ b/xen/drivers/passthrough/pci.c
>>>>>>>>> @@ -328,6 +328,9 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn)
>>>>>>>>>      *((u8*) &pdev->bus) = bus;
>>>>>>>>>      *((u8*) &pdev->devfn) = devfn;
>>>>>>>>>      pdev->domain = NULL;
>>>>>>>>> +#ifdef CONFIG_ARM
>>>>>>>>> +    pci_to_dev(pdev)->type = DEV_PCI;
>>>>>>>>> +#endif
>>>>>>>> I have to admit that I'm not happy about new CONFIG_ conditionals
>>>>>>>> here. I'd prefer to see this done by a new arch helper, unless there
>>>>>>>> are
>>>>>>>> obstacles I'm overlooking.
>>>>>>> Do you mean something like arch_pci_alloc_pdev(dev)?
>>>>>> I'd recommend against "alloc" in its name; "new" instead maybe?
>>>>> I am fine with arch_pci_new_pdev, but arch prefix points to the fact that
>>>>> this is just an architecture specific part of the pdev allocation rather
>>>>> than
>>>>> actual pdev allocation itself, so with this respect arch_pci_alloc_pdev
>>>>> seems
>>>>> more natural to me.
>>>> The bulk of the function is about populating the just allocated struct.
>>>> There's no arch-specific part of the allocation (so far, leaving aside
>>>> MSI-X), you only want an arch-specific part of the initialization. I
>>>> would agree with "alloc" in the name if further allocation was to
>>>> happen there.
>>> Hm, then arch_pci_init_pdev sounds more reasonable
>> Fine with me.
> 
> Do we want this to be void or returning an error code? If error code is 
> needed,
> then we would also need a roll-back function, e.g. arch_pci_free_pdev or
> arch_pci_release_pdev or arch_pci_fini_pdev or something, so it can be used in
> case of error or in free_pdev function.

I'd start with void and make it return an error (and deal with necessary
cleanup) only once a need arises.

Jan




Re: Semantics of XEN_DOMCTL_SHADOW_OP_SET_ALLOCATION

2021-09-28 Thread Jan Beulich
On 27.09.2021 23:01, Andrew Cooper wrote:
> A recent ABI change in Xen caused a total breakage under the Xapi
> toolstack, and the investigation has led to this.

I'm curious which change this was; while it's likely one of mine, I
can't seem to be able to easily guess.

> First of all, the memory pool really needs renaming, because (not naming
> names) multiple developers were fooled into thinking that the bug was
> caused by a VM being unexpectedly in shadow mode.
> 
> Second, any MB value >= 0x100 will truncate to 0 between
> {hap,shadow}_domctl() and {hap,shadow}_set_allocation().

This wants fixing of course. I assume a patch is already in the works.
If not, let me know and I'll see about making one.

> But for the main issue, passing 0 in at the hypercall level is broken.
> 
> hap_enable() forces a minimum of 256 pages.  A subsequent hypercall
> trying to set 0 frees {tot 245, free 244, p2m 11} all the way down to
> {tot 1, free 0, p2m 11} before failing with -ENOMEM because there are no
> more free pages to free.  Getting -ENOMEM from this kind of operation
> isn't really correct.

It's questionable, but I wouldn't call it outright "not correct": The
function was requested to obtain memory (from the pool), so one may
view this as allocation. The set-allocation functions really are both
allocations and frees at the same time (moving pages from one pool to
another).

> Passing 0 cannot possibly function.  There are non-zero p2m frames by
> the time createdomain completes, as we allocate the top of the p2m, and
> they cannot be freed without the teardown logic which releases memory in
> the correct order.
> 
> In fact, passing anything lower than the current free size is guaranteed
> to fail.  Continuations also mean that you can't pick a value which is
> guaranteed not to fail, because even a well (poorly?) placed foreign map
> in dom0 could change the properties of the pool.

Well, I suppose outside of domain cleanup shrinking of the pool was
always meant as kind of a best effort operation.

> The shadow side rejects hypercall attempts using 0

I haven't been able to spot this rejection logic. Instead I'm getting
the impression that the BUG() at the bottom of _shadow_prealloc()
would be hit if shrinking the pool beyond what can really be freed
(i.e. in particular if any pages are in use for the p2m).

> (but can be bypassed
> with the above truncation bug), and will try to drop shadows to get down
> to the limit.  This represents a difference vs HAP, and in practice 1M
> granularity is probably enough to ensure that you can't fail to set the
> shadow allocation that low.  There is also a reachable BUG() somewhere
> in this path as reported several times on xen-devel, but I still haven't
> figured out how to tickle it.

Any pointer to one such report? I don't recall any, and hence it's
not clear to me whether that's the one in _shadow_prealloc().

> There is no code or working usecase for reducing the size of the shadow
> pool, except on domain destruction.  I think we should prohibit the
> ability to shrink the shadow pool, and defer most of this mess to anyone
> who turns up with a plausible usecase.

No present use case for reducing is a fair argument for dropping support
for doing so. That'll still mean an error return, which - according to
what you have written near the top - may still upset the Xapi tool stack.

Jan



